    btrfs: update generation of hole file extent item when merging holes · e6e3dec6
    Filipe Manana authored
    When punching a hole into a file range that is adjacent to a hole and we
    are not using the no-holes feature, we expand the range of the adjacent
    file extent item that represents a hole, to save metadata space.
    
    However, we don't update the generation of the hole file extent item, which
    means a full fsync will not log that file extent item if the fsync happens
    in a later transaction (since commit 7f30c072 ("btrfs: stop copying
    old file extents when doing a full fsync")).
    
    For example, if we do this:
    
        $ mkfs.btrfs -f -O ^no-holes /dev/sdb
        $ mount /dev/sdb /mnt
        $ xfs_io -f -c "pwrite -S 0xab 2M 2M" /mnt/foobar
        $ sync
    
    We end up with 2 file extent items in our file:
    
    1) One that represents the hole for the file range [0, 2M), with a
       generation of 7;
    
    2) Another one that represents an extent covering the range [2M, 4M).
    
    After that if we do the following:
    
        $ xfs_io -c "fpunch 2M 2M" /mnt/foobar
    
    We end up with a single file extent item in the file, which represents a
    hole for the range [0, 4M) and with a generation of 7 - because we end up
    dropping the data extent for range [2M, 4M) and then update the file
    extent item that represented the hole at [0, 2M), by increasing its
    length from 2M to 4M.
    
    Then doing a full fsync and power failing:
    
        $ xfs_io -c "fsync" /mnt/foobar
        <power failure>
    
    will result in the full fsync not logging the file extent item that
    represents the hole for the range [0, 4M), because its generation is 7,
    which is lower than the generation of the current transaction (8).
    As a consequence, after mounting the filesystem again (after log replay),
    the region [2M, 4M) does not have a hole, it still points to the
    previous data extent.
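    
    The skipped item can be illustrated with a small userspace model. This is
    only a sketch under stated assumptions, not kernel code: FileExtentItem
    and items_to_log() are hypothetical names, and the filter reproduces only
    the rule described above (items whose generation is lower than the
    current transaction's generation are not logged).

```python
from dataclasses import dataclass


@dataclass
class FileExtentItem:
    """Toy stand-in for a btrfs file extent item (not the on-disk format)."""
    offset: int       # start of the file range, in bytes
    length: int       # length of the file range, in bytes
    generation: int   # transaction that created or last updated the item
    is_hole: bool


def items_to_log(items, current_generation):
    """Hypothetical model of the full fsync filter described above:
    items from older transactions are assumed persisted and are skipped."""
    return [it for it in items if it.generation >= current_generation]


MiB = 1024 * 1024

# State after the fpunch in transaction 8: a single hole item covering
# [0, 4M), expanded in place but still carrying the old generation 7.
merged_hole = FileExtentItem(offset=0, length=4 * MiB, generation=7, is_hole=True)

print(items_to_log([merged_hole], current_generation=8))  # prints []
```

    In this model the expanded hole is the only item in the file, so nothing
    describing the range [2M, 4M) reaches the log.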
    
    So fix this by always updating the generation of existing file extent
    items representing holes when we merge/expand them. This solves the
    problem and it's the same approach as when we merge prealloc extents that
    got written (at btrfs_mark_extent_written()). Setting the generation to
    the current transaction's generation is also what we do when merging
    the new hole extent map with the previous one or the next one.
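    
    A minimal userspace sketch of the fixed merge behaviour follows. It
    assumes a simplified hole item with only an offset, a length and a
    generation; HoleItem and expand_hole() are hypothetical names used for
    illustration and do not correspond to functions in file.c.

```python
from dataclasses import dataclass


@dataclass
class HoleItem:
    """Toy stand-in for a file extent item that represents a hole."""
    offset: int       # start of the hole, in bytes
    length: int       # length of the hole, in bytes
    generation: int   # transaction that created or last updated the item


def expand_hole(hole, punched_offset, punched_length, current_generation):
    """Merge an adjacent punched range into an existing hole item."""
    if punched_offset == hole.offset + hole.length:
        # Existing hole is to the left of the punched range: extend its end.
        hole.length += punched_length
    elif punched_offset + punched_length == hole.offset:
        # Existing hole is to the right: move its start back and grow it.
        hole.offset = punched_offset
        hole.length += punched_length
    else:
        raise ValueError("punched range is not adjacent to the hole")
    # The fix: always stamp the expanded item with the current transaction's
    # generation, so a later full fsync no longer treats it as old.
    hole.generation = current_generation
    return hole


MiB = 1024 * 1024

hole = HoleItem(offset=0, length=2 * MiB, generation=7)
expand_hole(hole, punched_offset=2 * MiB, punched_length=2 * MiB,
            current_generation=8)
print(hole)  # prints HoleItem(offset=0, length=4194304, generation=8)
```

    With the generation set to 8, the expanded item is no longer older than
    the current transaction in the example above, so the full fsync logs it.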
    
    A test case for fstests, covering both cases of hole file extent item
    merging (to the left and to the right), will be sent soon.
    
    Fixes: 7f30c072 ("btrfs: stop copying old file extents when doing a full fsync")
    CC: stable@vger.kernel.org # 5.18+
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
file.c 106 KB