Commit dbc2a8c9 authored by Dennis Zhou, committed by David Sterba

btrfs: add async discard implementation overview

Give a brief overview for how async discard is implemented.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent 9ddf648f
@@ -12,6 +12,45 @@
#include "discard.h"
#include "free-space-cache.h"
/*
* This contains the logic to handle async discard.
*
* Async discard manages trimming of free space outside of transaction commit.
* Discarding is done by managing the block_groups on a LRU list based on free
* space recency. Two passes are used to first prioritize discarding extents
* and then allow for trimming in the bitmap the best opportunity to coalesce.
* The block_groups are maintained on multiple lists to allow for multiple
* passes with different discard filter requirements. A delayed work item is
* used to manage discarding with timeout determined by a max of the delay
* incurred by the iops rate limit, the byte rate limit, and the max delay of
* BTRFS_DISCARD_MAX_DELAY.
*
* Note, this only keeps track of block_groups that are explicitly for data.
* Mixed block_groups are not supported.
*
* The first list is special to manage discarding of fully free block groups.
* This is necessary because we issue a final trim for a full free block group
* after forgetting it. When a block group becomes unused, instead of directly
* being added to the unused_bgs list, we add it to this first list. Then
* from there, if it becomes fully discarded, we place it onto the unused_bgs
* list.
*
* The in-memory free space cache serves as the backing state for discard.
* Consequently this means there is no persistence. We opt to load all the
* block groups in as not discarded, so the mount case degenerates to the
* crashing case.
*
* As the free space cache uses bitmaps, there exists a tradeoff between
* ease/efficiency for find_free_extent() and the accuracy of discard state.
* Here we opt to let untrimmed regions merge with everything while only letting
* trimmed regions merge with other trimmed regions. This can cause
* overtrimming, but the coalescing benefit seems to be worth it. Additionally,
* bitmap state is tracked as a whole. If we're able to fully trim a bitmap,
* the trimmed flag is set on the bitmap. Otherwise, if an allocation comes in,
* this resets the state and we will retry trimming the whole bitmap. This is a
* tradeoff between discard state accuracy and the cost of accounting.
*/
/* This is an initial delay to give some chance for block reuse */
#define BTRFS_DISCARD_DELAY (120ULL * NSEC_PER_SEC)
#define BTRFS_DISCARD_UNUSED_DELAY (10ULL * NSEC_PER_SEC)
......
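
The merge rule described in the comment above (untrimmed regions merge with everything, trimmed regions merge only with other trimmed regions) can be sketched in a few lines of C. This is a minimal illustration, not the code from this commit: `enum trim_state`, `struct region`, `can_merge()` and `merge_right()` are hypothetical names invented for the example, and the sketch assumes the merged region keeps the state of the region doing the absorbing.

```c
#include <stdbool.h>

/* Hypothetical per-region discard state; not the kernel's own type. */
enum trim_state { UNTRIMMED, TRIMMED };

/* Hypothetical free-space region covering [start, start + bytes). */
struct region {
	unsigned long long start;
	unsigned long long bytes;
	enum trim_state state;
};

/*
 * An untrimmed region may absorb any neighbour. A trimmed region may only
 * absorb another trimmed region, so its trimmed state stays accurate.
 */
static bool can_merge(const struct region *into, const struct region *other)
{
	if (into->state == UNTRIMMED)
		return true;
	return other->state == TRIMMED;
}

/*
 * Try to absorb the region immediately to the right of 'into'.
 * Returns true and grows 'into' on success. The merged region keeps the
 * state of 'into' (an assumption of this sketch): untrimmed + trimmed
 * becomes untrimmed, which is where overtrimming comes from.
 */
static bool merge_right(struct region *into, const struct region *right)
{
	if (into->start + into->bytes != right->start)
		return false;	/* not adjacent */
	if (!can_merge(into, right))
		return false;	/* would lose track of untrimmed space */

	into->bytes += right->bytes;
	return true;
}
```

With these definitions, an untrimmed 4 KiB region at offset 0 absorbs a trimmed 4 KiB region at offset 4096 and the result is an untrimmed 8 KiB region, so the second half would be trimmed again later. In the opposite arrangement, a trimmed region followed by an untrimmed one, the merge is refused and both regions stay separate.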