    btrfs: introduce delayed_refs_rsv · ba2c4d4e
    Josef Bacik authored
    Traditionally we've had voodoo in btrfs to account for the space that
    delayed refs may take up by having a global_block_rsv.  This works most
    of the time, except when it doesn't.  We've had issues reported and seen
    in production where sometimes the global reserve is exhausted during
    transaction commit before we can run all of our delayed refs, resulting
    in an aborted transaction.  Because of this voodoo we have equally
    dubious flushing semantics around throttling delayed refs which we often
    get wrong.
    
    So instead give them their own block_rsv.  This way we can always know
    exactly how much outstanding space we need for delayed refs.  This
    allows us to make sure we are constantly filling that reservation up
    with space, and allows us to put more precise pressure on the enospc
    system.  Instead of doing math to see if it's a good time to throttle,
    the normal enospc code will be invoked if we have a lot of delayed refs
    pending, and they will be run via the normal flushing mechanism.
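    
    As a rough illustration of the accounting described above, the sketch
    below models a reserve that tracks how much space the pending delayed
    refs need versus how much has actually been set aside.  The struct and
    function names (toy_rsv and friends) are hypothetical and much simpler
    than the kernel's real block_rsv code; it only shows why a dedicated
    reserve makes the outstanding shortfall directly computable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a block reserve (not the kernel struct). */
struct toy_rsv {
    uint64_t size;     /* bytes the pending delayed refs are expected to need */
    uint64_t reserved; /* bytes actually set aside so far */
};

/* Queueing more delayed refs grows the requirement. */
void toy_rsv_add_need(struct toy_rsv *rsv, uint64_t bytes)
{
    rsv->size += bytes;
}

/* Bytes still missing; a non-zero value is what would put pressure on flushing. */
uint64_t toy_rsv_shortfall(const struct toy_rsv *rsv)
{
    return rsv->size > rsv->reserved ? rsv->size - rsv->reserved : 0;
}

/* Move up to `avail` bytes into the reserve; returns how many were consumed. */
uint64_t toy_rsv_refill(struct toy_rsv *rsv, uint64_t avail)
{
    uint64_t want = toy_rsv_shortfall(rsv);
    uint64_t take = avail < want ? avail : want;

    rsv->reserved += take;
    return take;
}

/* Running delayed refs shrinks the requirement; returns the excess handed back. */
uint64_t toy_rsv_release(struct toy_rsv *rsv, uint64_t bytes)
{
    uint64_t excess = 0;

    assert(bytes <= rsv->size); /* cannot release more than was ever needed */
    rsv->size -= bytes;
    if (rsv->reserved > rsv->size) {
        excess = rsv->reserved - rsv->size;
        rsv->reserved = rsv->size;
    }
    return excess;
}
```

    In this toy model a caller would check toy_rsv_shortfall() after
    queueing refs and refill from whatever space is available; no separate
    throttling heuristic is needed because the shortfall is always known.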
    
    For now the delayed_refs_rsv will hold the reservations for the delayed
    refs, the block group updates, and deleting csums.  We could have a
    separate rsv for the block group updates, but the csum deletion stuff is
    still handled via the delayed_refs so that will stay there.
    
    Historical background:
    
    The global reserve has grown to cover everything we don't reserve space
    explicitly for, and we've grown a lot of weird ad-hoc heuristics to know
    if we're running short on space and when it's time to force a commit.  A
    failure rate of 20-40 file systems when we run hundreds of thousands of
    them isn't super high, but cleaning up this code will make things less
    ugly and more predictable.
    
    Thus the delayed refs rsv.  We always know how many delayed refs we have
    outstanding, and although running them generates more we can use the
    global reserve for that spillover, which fits better into its desired
    use than a full-blown reservation.  This first approach is to simply
    take the number of items we're reserving space for and multiply that by
    2 in order to save enough space for the delayed refs that could be
    generated.  This is a naive approach and will probably evolve, but for
    now it works.
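    
    The doubling rule above can be sketched as follows.  The per-item byte
    cost and the function name are assumptions made for illustration only;
    the real patch derives the per-item cost from filesystem parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-item metadata cost; a placeholder, not a btrfs constant. */
#define TOY_BYTES_PER_ITEM 16384ULL

/*
 * Naive estimate: reserve twice the space of the items being reserved for,
 * so the extra half can cover delayed refs generated along the way.
 */
uint64_t toy_delayed_refs_reservation(uint64_t nr_items)
{
    /* Refuse inputs that would overflow the 64-bit byte count. */
    assert(nr_items <= UINT64_MAX / (2 * TOY_BYTES_PER_ITEM));
    return nr_items * TOY_BYTES_PER_ITEM * 2;
}
```

    With the assumed 16 KiB per item, reserving for 3 items would set aside
    98304 bytes in this sketch.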
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Reviewed-by: David Sterba <dsterba@suse.com> # high-level review
    [ added background notes from the cover letter ]
    Signed-off-by: David Sterba <dsterba@suse.com>
disk-io.c 124 KB