• Omar Sandoval's avatar
    Btrfs: rework delayed ref total_bytes_pinned accounting · d7eae340
    Omar Sandoval authored
    The total_bytes_pinned counter is completely broken when accounting
    delayed refs:
    
    - If two drops for the same extent are merged, we will decrement
      total_bytes_pinned twice but only increment it once.
    - If an add is merged into a drop or vice versa, we will decrement the
      total_bytes_pinned counter but never increment it.
    - If multiple references to an extent are dropped, we will account it
      multiple times, potentially vastly over-estimating the number of bytes
      that will be freed by a commit and doing unnecessary work when we're
      close to ENOSPC.
    
    The last issue is relatively minor, but the first two make the
    total_bytes_pinned counter leak or underflow very often. These
    accounting issues were introduced in b150a4f1 ("Btrfs: use a percpu
    to keep track of possibly pinned bytes"), but they were papered over by
    zeroing out the counter on every commit until d288db5d ("Btrfs: fix
    race of using total_bytes_pinned").
    
    We need to make sure that an extent is accounted as pinned exactly once
    if and only if we will drop references to it when when the transaction
    is committed. Ideally we would only add to total_bytes_pinned when the
    *last* reference is dropped, but this information isn't readily
    available for data extents. Again, this over-estimation can lead to
    extra commits when we're close to ENOSPC, but it's not as bad as before.
    
    The fix implemented here is to increment total_bytes_pinned when the
    total refmod count for an extent goes negative and decrement it if the
    refmod count goes back to non-negative or after we've run all of the
    delayed refs for that extent.
    Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
    Tested-by: default avatarHolger Hoffstätte <holger@applied-asynchrony.com>
    Reviewed-by: default avatarLiu Bo <bo.li.liu@oracle.com>
    Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
    d7eae340
extent-tree.c 300 KB