• Jan Kara's avatar
    ext4: use buckets for cr 1 block scan instead of rbtree · 83e80a6e
    Jan Kara authored
    Using rbtree for sorting groups by average fragment size is relatively
    expensive (needs rbtree update on every block freeing or allocation) and
    leads to wide spreading of allocations because selection of block group
    is very sentitive both to changes in free space and amount of blocks
    allocated. Furthermore selecting group with the best matching average
    fragment size is not necessary anyway, even more so because the
    variability of fragment sizes within a group is likely large so average
    is not telling much. We just need a group with large enough average
    fragment size so that we have high probability of finding large enough
    free extent and we don't want average fragment size to be too big so
    that we are likely to find free extent only somewhat larger than what we
    need.
    
    So instead of maintaing rbtree of groups sorted by fragment size keep
    bins (lists) or groups where average fragment size is in the interval
    [2^i, 2^(i+1)). This structure requires less updates on block allocation
    / freeing, generally avoids chaotic spreading of allocations into block
    groups, and still is able to quickly (even faster that the rbtree)
    provide a block group which is likely to have a suitably sized free
    space extent.
    
    This patch reduces number of block groups used when untarring archive
    with medium sized files (size somewhat above 64k which is default
    mballoc limit for avoiding locality group preallocation) to about half
    and thus improves write speeds for eMMC flash significantly.
    
    Fixes: 196e402a ("ext4: improve cr 0 / cr 1 group scanning")
    CC: stable@kernel.org
    Reported-and-tested-by: default avatarStefan Wahren <stefan.wahren@i2se.com>
    Tested-by: default avatarOjaswin Mujoo <ojaswin@linux.ibm.com>
    Signed-off-by: default avatarJan Kara <jack@suse.cz>
    Reviewed-by: default avatarRitesh Harjani (IBM) <ritesh.list@gmail.com>
    Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
    Link: https://lore.kernel.org/r/20220908092136.11770-5-jack@suse.czSigned-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
    83e80a6e
mballoc.h 5.85 KB