• Qu Wenruo's avatar
    btrfs: defrag: avoid unnecessary defrag caused by incorrect extent size · e42b9d8b
    Qu Wenruo authored
    [BUG]
    With the following file extent layout, defrag would do unnecessary IO
    and result more on-disk space usage.
    
      # mkfs.btrfs -f $dev
      # mount $dev $mnt
      # xfs_io -f -c "pwrite 0 40m" $mnt/foobar
      # sync
      # xfs_io -f -c "pwrite 40m 16k" $mnt/foobar
      # sync
    
    Above command would lead to the following file extent layout:
    
            item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
                    generation 7 type 1 (regular)
                    extent data disk byte 298844160 nr 41943040
                    extent data offset 0 nr 41943040 ram 41943040
                    extent compression 0 (none)
            item 7 key (257 EXTENT_DATA 41943040) itemoff 15763 itemsize 53
                    generation 8 type 1 (regular)
                    extent data disk byte 13631488 nr 16384
                    extent data offset 0 nr 16384 ram 16384
                    extent compression 0 (none)
    
    Which is mostly fine. We can allow the final 16K to be merged with the
    previous 40M, but it's upon the end users' preference.
    
    But if we defrag the file using the default parameters, it would result
    worse file layout:
    
     # btrfs filesystem defrag $mnt/foobar
     # sync
    
            item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
                    generation 7 type 1 (regular)
                    extent data disk byte 298844160 nr 41943040
                    extent data offset 0 nr 8650752 ram 41943040
                    extent compression 0 (none)
            item 7 key (257 EXTENT_DATA 8650752) itemoff 15763 itemsize 53
                    generation 9 type 1 (regular)
                    extent data disk byte 340787200 nr 33292288
                    extent data offset 0 nr 33292288 ram 33292288
                    extent compression 0 (none)
            item 8 key (257 EXTENT_DATA 41943040) itemoff 15710 itemsize 53
                    generation 8 type 1 (regular)
                    extent data disk byte 13631488 nr 16384
                    extent data offset 0 nr 16384 ram 16384
                    extent compression 0 (none)
    
    Note the original 40M extent is still there, but a new 32M extent is
    created for no benefit at all.
    
    [CAUSE]
    There is an existing check to make sure we won't defrag a large enough
    extent (the threshold is by default 32M).
    
    But the check is using the length to the end of the extent:
    
    	range_len = em->len - (cur - em->start);
    
    	/* Skip too large extent */
    	if (range_len >= extent_thresh)
    		goto next;
    
    This means, for the first 8MiB of the extent, the range_len is always
    smaller than the default threshold, and would not be defragged.
    But after the first 8MiB, the remaining part would fit the requirement,
    and be defragged.
    
    Such different behavior inside the same extent caused the above problem,
    and we should avoid different defrag decision inside the same extent.
    
    [FIX]
    Instead of using @range_len, just use @em->len, so that we have a
    consistent decision among the same file extent.
    
    Now with this fix, we won't touch the extent, thus not making it any
    worse.
    Reported-by: default avatarFilipe Manana <fdmanana@suse.com>
    Fixes: 0cb5950f ("btrfs: fix deadlock when reserving space during defrag")
    CC: stable@vger.kernel.org # 6.1+
    Reviewed-by: default avatarBoris Burkov <boris@bur.io>
    Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
    Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    e42b9d8b
defrag.c 40.1 KB