1. 08 Feb, 2021 40 commits
    • Josef Bacik's avatar
      btrfs: fix reloc root leak with 0 ref reloc roots on recovery · c78a10ae
      Josef Bacik authored
      When recovering a relocation, if we run into a reloc root that has 0
      refs we simply add it to the reloc_control->reloc_roots list, and then
      clean it up later.  The problem with this is __del_reloc_root() doesn't
      do anything if the root isn't in the radix tree, which in this case it
      won't be because we never call __add_reloc_root() on the reloc_root.
      
      This exit condition simply isn't correct really.  During normal
      operation we can remove ourselves from the rb tree and then we're meant
      to clean up later at merge_reloc_roots() time, and this happens
      correctly.  During recovery we're depending on free_reloc_roots() to
      drop our references, but we're short-circuiting.
      
      Fix this by continuing to check if we're on the list and dropping
      ourselves from the reloc_control root list and dropping our reference
      appropriately.  Change the corresponding BUG_ON() to an ASSERT() that
      does the correct thing if we aren't in the rb tree.
      
      CC: stable@vger.kernel.org # 4.4+
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      c78a10ae
    • Nigel Christian's avatar
      btrfs: remove repeated word in struct member comment · 2e626e56
      Nigel Christian authored
      Comment for processed extent end of range has an unnecessary "in",
      remove it.
      Signed-off-by: default avatarNigel Christian <nigel.l.christian@gmail.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      2e626e56
    • Josef Bacik's avatar
      btrfs: account for new extents being deleted in total_bytes_pinned · 81e75ac7
      Josef Bacik authored
      My recent patch set "A variety of lock contention fixes", found here
      
      https://lore.kernel.org/linux-btrfs/cover.1608319304.git.josef@toxicpanda.com/
      (Tracked in https://github.com/btrfs/linux/issues/86)
      
      that reduce lock contention on the extent root by running delayed refs
      less often resulted in a regression in generic/371.  This test
      fallocate()'s the fs until it's full, deletes all the files, and then
      tries to fallocate() until full again.
      
      Before these patches we would run all of the delayed refs during
      flushing, and then would commit the transaction because we had plenty of
      pinned space to recover in order to allocate.  However my patches made
      it so we weren't running the delayed refs as aggressively, which meant
      that we appeared to have less pinned space when we were deciding to
      commit the transaction.
      
      We use the space_info->total_bytes_pinned to approximate how much space
      we have pinned.  It's approximate because if we remove a reference to an
      extent we may free it, but there may be more references to it than we
      know of at that point, but we account it as pinned at the creation time,
      and then it's properly accounted when the delayed ref runs.
      
      The way we account for pinned space is if the
      delayed_ref_head->total_ref_mod is < 0, because that is clearly a
      freeing option.  However there is another case, and that is where
      ->total_ref_mod == 0 && ->must_insert_reserved == 1.
      
      When we allocate a new extent, we have ->total_ref_mod == 1 and we have
      ->must_insert_reserved == 1.  This is used to indicate that it is a
      brand new extent and will need to have its extent entry added before we
      modify any references on the delayed ref head.  But if we subsequently
      remove that extent reference, our ->total_ref_mod will be 0, and that
      space will be pinned and freed.  Accounting for this case properly
      allows for generic/371 to pass with my delayed refs patches applied.
      
      It's important to note that this problem exists without the referenced
      patches, it just was uncovered by them.
      
      CC: stable@vger.kernel.org # 5.10
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      81e75ac7
    • Josef Bacik's avatar
      btrfs: handle space_info::total_bytes_pinned inside the delayed ref itself · 2187374f
      Josef Bacik authored
      Currently we pass things around to figure out if we maybe freeing data
      based on the state of the delayed refs head.  This makes the accounting
      sort of confusing and hard to follow, as it's distinctly separate from
      the delayed ref heads stuff, but also depends on it entirely.
      
      Fix this by explicitly adjusting the space_info->total_bytes_pinned in
      the delayed refs code.  We now have two places where we modify this
      counter, once where we create the delayed and destroy the delayed refs,
      and once when we pin and unpin the extents.  This means there is a
      slight overlap between delayed refs and the pin/unpin mechanisms, but
      this is simply used by the ENOSPC infrastructure to determine if we need
      to commit the transaction, so there's no adverse affect from this, we
      might simply commit thinking it will give us enough space when it might
      not.
      
      CC: stable@vger.kernel.org # 5.10
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      2187374f
    • Nikolay Borisov's avatar
      btrfs: enable W=1 checks for btrfs · e9aa7c28
      Nikolay Borisov authored
      Now that the btrfs' codebase is clean of almost all W=1 warnings let's
      enable those checks unconditionally for the entire fs/btrfs/ and its
      subdirectories to catch potential errors during development.
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      [ add some comments ]
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      e9aa7c28
    • Nikolay Borisov's avatar
      lib/zstd: convert constants to defines · 71c36788
      Nikolay Borisov authored
      These constants are really used internally by zstd and including
      linux/zstd.h into users results in the following warnings:
      
      In file included from fs/btrfs/zstd.c:19:
      ./include/linux/zstd.h:798:21: warning: ‘ZSTD_skippableHeaderSize’ defined but not used [-Wunused-const-variable=]
        798 | static const size_t ZSTD_skippableHeaderSize = 8;
            |                     ^~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/zstd.h:796:21: warning: ‘ZSTD_frameHeaderSize_max’ defined but not used [-Wunused-const-variable=]
        796 | static const size_t ZSTD_frameHeaderSize_max = ZSTD_FRAMEHEADERSIZE_MAX;
            |                     ^~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/zstd.h:795:21: warning: ‘ZSTD_frameHeaderSize_min’ defined but not used [-Wunused-const-variable=]
        795 | static const size_t ZSTD_frameHeaderSize_min = ZSTD_FRAMEHEADERSIZE_MIN;
            |                     ^~~~~~~~~~~~~~~~~~~~~~~~
      ./include/linux/zstd.h:794:21: warning: ‘ZSTD_frameHeaderSize_prefix’ defined but not used [-Wunused-const-variable=]
        794 | static const size_t ZSTD_frameHeaderSize_prefix = 5;
      
      So fix those warnings by turning the constants into defines.
      Reviewed-by: default avatarNick Terrell <terrelln@fb.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      71c36788
    • Nikolay Borisov's avatar
      btrfs: zoned: remove unused variable in btrfs_sb_log_location_bdev · 8c31a3db
      Nikolay Borisov authored
      This fixes warning:
      
      fs/btrfs/zoned.c:491:6: warning: variable ‘zone_size’ set but not used [-Wunused-but-set-variable]
        491 |  u64 zone_size;
      
      which got introduced in 12659251 ("btrfs: implement log-structured
      superblock for ZONED mode"). We'll enable the warning by default and
      want clean build until the relevant zoned patches land.
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      8c31a3db
    • Nikolay Borisov's avatar
      btrfs: fix parameter description for functions in extent_io.c · 3bed2da1
      Nikolay Borisov authored
      This makes the file W=1 clean and fixes the following warnings:
      
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'tree' not described in '__etree_search'
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'offset' not described in '__etree_search'
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'next_ret' not described in '__etree_search'
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'prev_ret' not described in '__etree_search'
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'p_ret' not described in '__etree_search'
      fs/btrfs/extent_io.c:414: warning: Function parameter or member 'parent_ret' not described in '__etree_search'
      fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'tree' not described in 'find_contiguous_extent_bit'
      fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'start' not described in 'find_contiguous_extent_bit'
      fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'start_ret' not described in 'find_contiguous_extent_bit'
      fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'end_ret' not described in 'find_contiguous_extent_bit'
      fs/btrfs/extent_io.c:1607: warning: Function parameter or member 'bits' not described in 'find_contiguous_extent_bit'
      fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'tree' not described in 'find_first_clear_extent_bit'
      fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'start' not described in 'find_first_clear_extent_bit'
      fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'start_ret' not described in 'find_first_clear_extent_bit'
      fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'end_ret' not described in 'find_first_clear_extent_bit'
      fs/btrfs/extent_io.c:1644: warning: Function parameter or member 'bits' not described in 'find_first_clear_extent_bit'
      fs/btrfs/extent_io.c:4187: warning: Function parameter or member 'epd' not described in 'extent_write_cache_pages'
      fs/btrfs/extent_io.c:4187: warning: Excess function parameter 'data' description in 'extent_write_cache_pages'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      3bed2da1
    • Nikolay Borisov's avatar
      btrfs: fix parameter description in space-info.c · d98b188e
      Nikolay Borisov authored
      With these fixes space-info.c is clear for W=1 warnings, namely the
      following ones are fixed:
      
      fs/btrfs/space-info.c:575: warning: Function parameter or member 'fs_info' not described in 'may_commit_transaction'
      fs/btrfs/space-info.c:575: warning: Function parameter or member 'space_info' not described in 'may_commit_transaction'
      fs/btrfs/space-info.c:1231: warning: Function parameter or member 'fs_info' not described in 'handle_reserve_ticket'
      fs/btrfs/space-info.c:1231: warning: Function parameter or member 'space_info' not described in 'handle_reserve_ticket'
      fs/btrfs/space-info.c:1231: warning: Function parameter or member 'ticket' not described in 'handle_reserve_ticket'
      fs/btrfs/space-info.c:1231: warning: Function parameter or member 'flush' not described in 'handle_reserve_ticket'
      fs/btrfs/space-info.c:1315: warning: Function parameter or member 'fs_info' not described in '__reserve_bytes'
      fs/btrfs/space-info.c:1315: warning: Function parameter or member 'space_info' not described in '__reserve_bytes'
      fs/btrfs/space-info.c:1315: warning: Function parameter or member 'orig_bytes' not described in '__reserve_bytes'
      fs/btrfs/space-info.c:1315: warning: Function parameter or member 'flush' not described in '__reserve_bytes'
      fs/btrfs/space-info.c:1427: warning: Function parameter or member 'root' not described in 'btrfs_reserve_metadata_bytes'
      fs/btrfs/space-info.c:1427: warning: Function parameter or member 'block_rsv' not described in 'btrfs_reserve_metadata_bytes'
      fs/btrfs/space-info.c:1427: warning: Function parameter or member 'orig_bytes' not described in 'btrfs_reserve_metadata_bytes'
      fs/btrfs/space-info.c:1427: warning: Function parameter or member 'flush' not described in 'btrfs_reserve_metadata_bytes'
      fs/btrfs/space-info.c:1462: warning: Function parameter or member 'fs_info' not described in 'btrfs_reserve_data_bytes'
      fs/btrfs/space-info.c:1462: warning: Function parameter or member 'bytes' not described in 'btrfs_reserve_data_bytes'
      fs/btrfs/space-info.c:1462: warning: Function parameter or member 'flush' not described in 'btrfs_reserve_data_bytes'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      d98b188e
    • Nikolay Borisov's avatar
      btrfs: fix parameter description of btrfs_inode_rsv_release/btrfs_delalloc_release_space · b762d1d0
      Nikolay Borisov authored
      Fixes following warnings:
      
      fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'inode' not described in 'btrfs_inode_rsv_release'
      fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_inode_rsv_release'
      fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'reserved' not described in 'btrfs_delalloc_release_space'
      fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_delalloc_release_space'
      fs/btrfs/delalloc-space.c:472: warning: Excess function parameter 'release_bytes' description in 'btrfs_delalloc_release_space'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      b762d1d0
    • Nikolay Borisov's avatar
      6e353e3b
    • Nikolay Borisov's avatar
      btrfs: fix description format of fs_info of btrfs_wait_on_delayed_iputs · 2639631d
      Nikolay Borisov authored
      Fixes fs/btrfs/inode.c:3101: warning: Function parameter or member 'fs_info' not described in 'btrfs_wait_on_delayed_iputs'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      2639631d
    • Nikolay Borisov's avatar
      btrfs: document fs_info in btrfs_rmap_block · 9ee9b979
      Nikolay Borisov authored
      Fixes fs/btrfs/block-group.c:1570: warning: Function parameter or member 'fs_info' not described in 'btrfs_rmap_block'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      9ee9b979
    • Nikolay Borisov's avatar
      btrfs: document now parameter of peek_discard_list · 92419695
      Nikolay Borisov authored
      Fixes fs/btrfs/discard.c:203: warning: Function parameter or member 'now' not described in 'peek_discard_list'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      92419695
    • Nikolay Borisov's avatar
      btrfs: improve parameter description for __btrfs_write_out_cache · f092cf3c
      Nikolay Borisov authored
      Fixes following W=1 warnings:
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'root' not described in '__btrfs_write_out_cache'
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'inode' not described in '__btrfs_write_out_cache'
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'ctl' not described in '__btrfs_write_out_cache'
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'block_group' not described in '__btrfs_write_out_cache'
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'io_ctl' not described in '__btrfs_write_out_cache'
      fs/btrfs/free-space-cache.c:1317: warning: Function parameter or member 'trans' not described in '__btrfs_write_out_cache'
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      f092cf3c
    • Nikolay Borisov's avatar
      btrfs: fix parameter description in delayed-ref.c functions · 696eb22b
      Nikolay Borisov authored
      This fixes the following warnings:
      
      fs/btrfs/delayed-ref.c:80: warning: Function parameter or member 'fs_info' not described in 'btrfs_delayed_refs_rsv_release'
      fs/btrfs/delayed-ref.c:80: warning: Function parameter or member 'nr' not described in 'btrfs_delayed_refs_rsv_release'
      fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'fs_info' not described in 'btrfs_migrate_to_delayed_refs_rsv'
      fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'src' not described in 'btrfs_migrate_to_delayed_refs_rsv'
      fs/btrfs/delayed-ref.c:128: warning: Function parameter or member 'num_bytes' not described in 'btrfs_migrate_to_delayed_refs_rsv'
      fs/btrfs/delayed-ref.c:174: warning: Function parameter or member 'fs_info' not described in 'btrfs_delayed_refs_rsv_refill'
      fs/btrfs/delayed-ref.c:174: warning: Function parameter or member 'flush' not described in 'btrfs_delayed_refs_rsv_refill'
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      696eb22b
    • Nikolay Borisov's avatar
      btrfs: fix function description formats in file-item.c · ca4207ae
      Nikolay Borisov authored
      This fixes following W=1 warnings:
      
      fs/btrfs/file-item.c:27: warning: Cannot understand  * @inode:  the inode we want to update the disk_i_size for
       on line 27 - I thought it was a doc line
      fs/btrfs/file-item.c:65: warning: Cannot understand  * @inode - the inode we're modifying
       on line 65 - I thought it was a doc line
      fs/btrfs/file-item.c:91: warning: Cannot understand  * @inode - the inode we're modifying
       on line 91 - I thought it was a doc line
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      ca4207ae
    • Nikolay Borisov's avatar
      btrfs: fix parameter description of btrfs_add_extent_mapping · 9ad37bb3
      Nikolay Borisov authored
      This fixes the following compiler warnings:
      
      fs/btrfs/extent_map.c:601: warning: Function parameter or member 'fs_info' not described in 'btrfs_add_extent_mapping'
      fs/btrfs/extent_map.c:601: warning: Function parameter or member 'em_tree' not described in 'btrfs_add_extent_mapping'
      fs/btrfs/extent_map.c:601: warning: Function parameter or member 'em_in' not described in 'btrfs_add_extent_mapping'
      fs/btrfs/extent_map.c:601: warning: Function parameter or member 'start' not described in 'btrfs_add_extent_mapping'
      fs/btrfs/extent_map.c:601: warning: Function parameter or member 'len' not described in 'btrfs_add_extent_mapping'
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      9ad37bb3
    • Nikolay Borisov's avatar
      btrfs: document modified parameter of add_extent_mapping · 401bd2dd
      Nikolay Borisov authored
      Fixes fs/btrfs/extent_map.c:399: warning: Function parameter or member
      'modified' not described in 'add_extent_mapping'
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      401bd2dd
    • Qu Wenruo's avatar
      btrfs: rework the order of btrfs_ordered_extent::flags · 3c198fe0
      Qu Wenruo authored
      [BUG]
      There is a long existing bug in the last parameter of
      btrfs_add_ordered_extent(), in commit 771ed689 ("Btrfs: Optimize
      compressed writeback and reads") back to 2008.
      
      In that ancient commit btrfs_add_ordered_extent() expects the @type
      parameter to be one of the following:
      
      - BTRFS_ORDERED_REGULAR
      - BTRFS_ORDERED_NOCOW
      - BTRFS_ORDERED_PREALLOC
      - BTRFS_ORDERED_COMPRESSED
      
      But we pass 0 in cow_file_range(), which means BTRFS_ORDERED_IO_DONE.
      
      Ironically extra check in __btrfs_add_ordered_extent() won't set the bit
      if we see (type == IO_DONE || type == IO_COMPLETE), and avoid any
      obvious bug.
      
      But this still leads to regular COW ordered extent having no bit to
      indicate its type in various trace events, rendering REGULAR bit
      useless.
      
      [FIX]
      Change the following aspects to avoid such problem:
      
      - Reorder btrfs_ordered_extent::flags
        Now the type bits go first (REGULAR/NOCOW/PREALLCO/COMPRESSED), then
        DIRECT bit, finally extra status bits like IO_DONE/COMPLETE/IOERR.
      
      - Add extra ASSERT() for btrfs_add_ordered_extent_*()
      
      - Remove @type parameter for btrfs_add_ordered_extent_compress()
        As the only valid @type here is BTRFS_ORDERED_COMPRESSED.
      
      - Remove the unnecessary special check for IO_DONE/COMPLETE in
        __btrfs_add_ordered_extent()
        This is just to make the code work, with extra ASSERT(), there are
        limited values can be passed in.
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      3c198fe0
    • Yang Li's avatar
      btrfs: remove redundant NULL check before kvfree · fe3b7bb0
      Yang Li authored
      Fix below warnings reported by coccicheck:
      ./fs/btrfs/raid56.c:237:2-8: WARNING: NULL check before some freeing
      functions is not needed.
      Reported-by: default avatarAbaci Robot <abaci@linux.alibaba.com>
      Reviewed-by: default avatarAnand Jain <anand.jain@oracle.com>
      Signed-off-by: default avatarYang Li <abaci-bugfix@linux.alibaba.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      fe3b7bb0
    • Josef Bacik's avatar
      btrfs: do not cleanup upper nodes in btrfs_backref_cleanup_node · 7e2a870a
      Josef Bacik authored
      Zygo reported the following panic when testing my error handling patches
      for relocation:
      
        kernel BUG at fs/btrfs/backref.c:2545!
        invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 3 PID: 8472 Comm: btrfs Tainted: G        W 14
        Hardware name: QEMU Standard PC (i440FX + PIIX,
      
        Call Trace:
         btrfs_backref_error_cleanup+0x4df/0x530
         build_backref_tree+0x1a5/0x700
         ? _raw_spin_unlock+0x22/0x30
         ? release_extent_buffer+0x225/0x280
         ? free_extent_buffer.part.52+0xd7/0x140
         relocate_tree_blocks+0x2a6/0xb60
         ? kasan_unpoison_shadow+0x35/0x50
         ? do_relocation+0xc10/0xc10
         ? kasan_kmalloc+0x9/0x10
         ? kmem_cache_alloc_trace+0x6a3/0xcb0
         ? free_extent_buffer.part.52+0xd7/0x140
         ? rb_insert_color+0x342/0x360
         ? add_tree_block.isra.36+0x236/0x2b0
         relocate_block_group+0x2eb/0x780
         ? merge_reloc_roots+0x470/0x470
         btrfs_relocate_block_group+0x26e/0x4c0
         btrfs_relocate_chunk+0x52/0x120
         btrfs_balance+0xe2e/0x18f0
         ? pvclock_clocksource_read+0xeb/0x190
         ? btrfs_relocate_chunk+0x120/0x120
         ? lock_contended+0x620/0x6e0
         ? do_raw_spin_lock+0x1e0/0x1e0
         ? do_raw_spin_unlock+0xa8/0x140
         btrfs_ioctl_balance+0x1f9/0x460
         btrfs_ioctl+0x24c8/0x4380
         ? __kasan_check_read+0x11/0x20
         ? check_chain_key+0x1f4/0x2f0
         ? __asan_loadN+0xf/0x20
         ? btrfs_ioctl_get_supported_features+0x30/0x30
         ? kvm_sched_clock_read+0x18/0x30
         ? check_chain_key+0x1f4/0x2f0
         ? lock_downgrade+0x3f0/0x3f0
         ? handle_mm_fault+0xad6/0x2150
         ? do_vfs_ioctl+0xfc/0x9d0
         ? ioctl_file_clone+0xe0/0xe0
         ? check_flags.part.50+0x6c/0x1e0
         ? check_flags.part.50+0x6c/0x1e0
         ? check_flags+0x26/0x30
         ? lock_is_held_type+0xc3/0xf0
         ? syscall_enter_from_user_mode+0x1b/0x60
         ? do_syscall_64+0x13/0x80
         ? rcu_read_lock_sched_held+0xa1/0xd0
         ? __kasan_check_read+0x11/0x20
         ? __fget_light+0xae/0x110
         __x64_sys_ioctl+0xc3/0x100
         do_syscall_64+0x37/0x80
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This occurs because of this check
      
        if (RB_EMPTY_NODE(&upper->rb_node))
      	  BUG_ON(!list_empty(&node->upper));
      
      As we are dropping the backref node, if we discover that our upper node
      in the edge we just cleaned up isn't linked into the cache that we are
      now done with this node, thus the BUG_ON().
      
      However this is an erroneous assumption, as we will look up all the
      references for a node first, and then process the pending edges.  All of
      the 'upper' nodes in our pending edges won't be in the cache's rb_tree
      yet, because they haven't been processed.  We could very well have many
      edges still left to cleanup on this node.
      
      The fact is we simply do not need this check, we can just process all of
      the edges only for this node, because below this check we do the
      following
      
        if (list_empty(&upper->lower)) {
      	  list_add_tail(&upper->lower, &cache->leaves);
      	  upper->lowest = 1;
        }
      
      If the upper node truly isn't used yet, then we add it to the
      cache->leaves list to be cleaned up later.  If it is still used then the
      last child node that has it linked into its node will add it to the
      leaves list and then it will be cleaned up.
      
      Fix this problem by dropping this logic altogether.  With this fix I no
      longer see the panic when testing with error injection in the backref
      code.
      
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      7e2a870a
    • Josef Bacik's avatar
      btrfs: keep track of the root owner for relocation reads · f7ba2d37
      Josef Bacik authored
      While testing the error paths in relocation, I hit the following lockdep
      splat:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        5.10.0-rc3+ #206 Not tainted
        ------------------------------------------------------
        btrfs-balance/1571 is trying to acquire lock:
        ffff8cdbcc8f77d0 (&head_ref->mutex){+.+.}-{3:3}, at: btrfs_lookup_extent_info+0x156/0x3b0
      
        but task is already holding lock:
        ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
      
        which lock already depends on the new lock.
      
        the existing dependency chain (in reverse order) is:
      
        -> #2 (btrfs-tree-00){++++}-{3:3}:
      	 down_write_nested+0x43/0x80
      	 __btrfs_tree_lock+0x27/0x100
      	 btrfs_search_slot+0x248/0x890
      	 relocate_tree_blocks+0x490/0x650
      	 relocate_block_group+0x1ba/0x5d0
      	 kretprobe_trampoline+0x0/0x50
      
        -> #1 (btrfs-csum-01){++++}-{3:3}:
      	 down_read_nested+0x43/0x130
      	 __btrfs_tree_read_lock+0x27/0x100
      	 btrfs_read_lock_root_node+0x31/0x40
      	 btrfs_search_slot+0x5ab/0x890
      	 btrfs_del_csums+0x10b/0x3c0
      	 __btrfs_free_extent+0x49d/0x8e0
      	 __btrfs_run_delayed_refs+0x283/0x11f0
      	 btrfs_run_delayed_refs+0x86/0x220
      	 btrfs_start_dirty_block_groups+0x2ba/0x520
      	 kretprobe_trampoline+0x0/0x50
      
        -> #0 (&head_ref->mutex){+.+.}-{3:3}:
      	 __lock_acquire+0x1167/0x2150
      	 lock_acquire+0x116/0x3e0
      	 __mutex_lock+0x7e/0x7b0
      	 btrfs_lookup_extent_info+0x156/0x3b0
      	 walk_down_proc+0x1c3/0x280
      	 walk_down_tree+0x64/0xe0
      	 btrfs_drop_subtree+0x182/0x260
      	 do_relocation+0x52e/0x660
      	 relocate_tree_blocks+0x2ae/0x650
      	 relocate_block_group+0x1ba/0x5d0
      	 kretprobe_trampoline+0x0/0x50
      
        other info that might help us debug this:
      
        Chain exists of:
          &head_ref->mutex --> btrfs-csum-01 --> btrfs-tree-00
      
         Possible unsafe locking scenario:
      
      	 CPU0                    CPU1
      	 ----                    ----
          lock(btrfs-tree-00);
      				 lock(btrfs-csum-01);
      				 lock(btrfs-tree-00);
          lock(&head_ref->mutex);
      
         *** DEADLOCK ***
      
        5 locks held by btrfs-balance/1571:
         #0: ffff8cdb89749ff8 (&fs_info->delete_unused_bgs_mutex){+.+.}-{3:3}, at: btrfs_balance+0x563/0xf40
         #1: ffff8cdb89748838 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x156/0x300
         #2: ffff8cdbc2c16650 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x413/0x5c0
         #3: ffff8cdbc135f538 (btrfs-treloc-01){+.+.}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
         #4: ffff8cdbc54adbf8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x27/0x100
      
        stack backtrace:
        CPU: 1 PID: 1571 Comm: btrfs-balance Not tainted 5.10.0-rc3+ #206
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
        Call Trace:
         dump_stack+0x8b/0xb0
         check_noncircular+0xcf/0xf0
         ? trace_call_bpf+0x139/0x260
         __lock_acquire+0x1167/0x2150
         lock_acquire+0x116/0x3e0
         ? btrfs_lookup_extent_info+0x156/0x3b0
         __mutex_lock+0x7e/0x7b0
         ? btrfs_lookup_extent_info+0x156/0x3b0
         ? btrfs_lookup_extent_info+0x156/0x3b0
         ? release_extent_buffer+0x124/0x170
         ? _raw_spin_unlock+0x1f/0x30
         ? release_extent_buffer+0x124/0x170
         btrfs_lookup_extent_info+0x156/0x3b0
         walk_down_proc+0x1c3/0x280
         walk_down_tree+0x64/0xe0
         btrfs_drop_subtree+0x182/0x260
         do_relocation+0x52e/0x660
         relocate_tree_blocks+0x2ae/0x650
         ? add_tree_block+0x149/0x1b0
         relocate_block_group+0x1ba/0x5d0
         elfcorehdr_read+0x40/0x40
         ? elfcorehdr_read+0x40/0x40
         ? btrfs_balance+0x796/0xf40
         ? __kthread_parkme+0x66/0x90
         ? btrfs_balance+0xf40/0xf40
         ? balance_kthread+0x37/0x50
         ? kthread+0x137/0x150
         ? __kthread_bind_mask+0x60/0x60
         ? ret_from_fork+0x1f/0x30
      
      As you can see this is bogus, we never take another tree's lock under
      the csum lock.  This happens because sometimes we have to read tree
      blocks from disk without knowing which root they belong to during
      relocation.  We defaulted to an owner of 0, which translates to an fs
      tree.  This is fine as all fs trees have the same class, but obviously
      isn't fine if the block belongs to a COW only tree.
      
      Thankfully COW only trees only have their owners root as a reference to
      them, and since we already look up the extent information during
      relocation, go ahead and check and see if this block might belong to a
      COW only tree, and if so save the owner in the tree_block struct.  This
      allows us to read_tree_block with the proper owner, which gets rid of
      this lockdep splat.
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      f7ba2d37
    • Qu Wenruo's avatar
      btrfs: introduce helper to grab an existing extent buffer from a page · c0f0a9e7
      Qu Wenruo authored
      This patch will extract the code to grab an extent buffer from a page
      into a helper, grab_extent_buffer_from_page().
      
      This reduces one indent level, and provides the work place for later
      expansion for subapge support.
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      c0f0a9e7
    • Qu Wenruo's avatar
      btrfs: update comment for btrfs_dirty_pages · c0fab480
      Qu Wenruo authored
      The original comment is from the initial merge, which has several
      problems:
      
      - No holes check any more
      - No inline decision is made
      
      Update the out-of-date comment with more correct one.
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      c0fab480
    • Qu Wenruo's avatar
      btrfs: refactor __extent_writepage_io() to improve readability · 6bc5636a
      Qu Wenruo authored
      The refactoring involves the following modifications:
      
      - iosize alignment
        In fact we don't really need to manually do alignment at all.
        All extent maps should already be aligned, thus basic ASSERT() check
        would be enough.
      
      - redundant variables
        We have extra variable like blocksize/pg_offset/end.
        They are all unnecessary.
      
        @blocksize can be replaced by sectorsize size directly, and it's only
        used to verify the em start/size is aligned.
      
        @pg_offset can be easily calculated using @cur and page_offset(page).
      
        @end is just assigned from @page_end and never modified, use
        "start + PAGE_SIZE - 1" directly and remove @page_end.
      
      - remove some BUG_ON()s
        The BUG_ON()s are for extent map, which we have tree-checker to check
        on-disk extent data item and runtime check.
        ASSERT() should be enough.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      6bc5636a
    • Qu Wenruo's avatar
      btrfs: rename parameter offset to disk_bytenr in submit_extent_page · 0c64c33c
      Qu Wenruo authored
      The parameter offset is confusing, it's supposed to be the disk bytenr
      of metadata/data.  Rename it to disk_bytenr and update the comment.
      
      Also rename each offset passed to submit_extent_page() as @disk_bytenr
      so they're consistent.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      0c64c33c
    • Qu Wenruo's avatar
      btrfs: refactor btrfs_dec_test_* functions for ordered extents · 58f74b22
      Qu Wenruo authored
      The refactoring involves the following modifications:
      
      - Return bool instead of int
      
      - Parameter update for @cached of btrfs_dec_test_first_ordered_pending()
        For btrfs_dec_test_first_ordered_pending(), @cached is only used to
        return the finished ordered extent.
        Rename it to @finished_ret.
      
      - Comment updates
      
        * Change one stale comment
          Which still refers to btrfs_dec_test_ordered_pending(), but the
          context is calling  btrfs_dec_test_first_ordered_pending().
        * Follow the common comment style for both functions
          Add more detailed descriptions for parameters and the return value
        * Move the reason why test_and_set_bit() is used into the call sites
      
      - Change how the return value is calculated
        The most anti-human part of the return value is:
      
          if (...)
      	ret = 1;
          ...
          return ret == 0;
      
        This means, when we set ret to 1, the function returns 0.
        Change the local variable name to @finished, and directly return the
        value of it.
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      58f74b22
    • Qu Wenruo's avatar
      btrfs: make btrfs_dio_private::bytes u32 · 523929f1
      Qu Wenruo authored
      btrfs_dio_private::bytes is only assigned from bio::bi_iter::bi_size,
      which is never larger than U32.
      Signed-off-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      523929f1
    • Nikolay Borisov's avatar
      btrfs: remove always true condition in btrfs_start_delalloc_roots · d7830b71
      Nikolay Borisov authored
      Following the rework in e076ab2a ("btrfs: shrink delalloc pages
      instead of full inodes") the nr variable is no longer passed by
      reference to start_delalloc_inodes hence it cannot change. Additionally
      we are always guaranteed for it to be positive number hence it's
      redundant to have it as a condition in the loop. Simply remove that
      usage.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      d7830b71
    • Nikolay Borisov's avatar
      btrfs: make btrfs_start_delalloc_root's nr argument a long · 9db4dc24
      Nikolay Borisov authored
      It's currently u64 which gets instantly translated either to LONG_MAX
      (if U64_MAX is passed) or cast to an unsigned long (which is in fact,
      wrong because writeback_control::nr_to_write is a signed, long type).
      
      Just convert the function's argument to be long time which obviates the
      need to manually convert u64 value to a long. Adjust all call sites
      which pass U64_MAX to pass LONG_MAX. Finally ensure that in
      shrink_delalloc the u64 is converted to a long without overflowing,
      resulting in a negative number.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      9db4dc24
    • Filipe Manana's avatar
      btrfs: send: remove stale code when checking for shared extents · 9c4a062a
      Filipe Manana authored
      After commit 040ee612 ("Btrfs: send, improve clone range") we do not
      use anymore the data_offset field of struct backref_ctx, as after that we
      do all the necessary checks for the data offset of file extent items at
      clone_range(). Since there are no more users of data_offset from that
      structure, remove it.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      9c4a062a
    • Nikolay Borisov's avatar
      btrfs: consolidate btrfs_previous_item ret val handling in btrfs_shrink_device · 7056bf69
      Nikolay Borisov authored
      Instead of having three 'if' to handle non-NULL return value consolidate
      this in one 'if (ret)'. That way the code is more obvious:
      
       - Always drop delete_unused_bgs_mutex if ret is not NULL
       - If ret is negative -> goto done
       - If it's 1 -> reset ret to 0, release the path and finish the loop.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      7056bf69
    • Josef Bacik's avatar
      btrfs: ref-verify: make sure owner is set for all refs · 1478143a
      Josef Bacik authored
      I noticed that shared ref entries in ref-verify didn't have the proper
      owner set, which caused me to think there was something seriously wrong.
      However the problem is if we have a parent we simply weren't filling out
      the owner part of the reference, even though we have it.
      
      Fix this by making sure we set all the proper fields when we modify a
      reference, this way we'll have the proper owner if a problem happens and
      we don't waste time thinking we're updating the wrong level.
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      1478143a
    • Josef Bacik's avatar
      btrfs: ref-verify: pass down tree block level when building refs · 0d73a11c
      Josef Bacik authored
      I noticed that sometimes I would have the wrong level printed out with
      ref-verify while testing some error injection related problems.  This is
      because we only get the level from the main extent item, but our
      references could go off the current leaf into another, and at that point
      we lose our level.
      
      Fix this by keeping track of the last tree block level that we found,
      the same way we keep track of our bytenr and num_bytes, in case we
      happen to wander into another leaf while still processing the references
      for a bytenr.
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      0d73a11c
    • Josef Bacik's avatar
      btrfs: noinline btrfs_should_cancel_balance · 1fec12a5
      Josef Bacik authored
      I was attempting to reproduce a problem that Zygo hit, but my error
      injection wasn't firing for a few of the common calls to
      btrfs_should_cancel_balance.  This is because the compiler decided to
      inline it at these spots.  Keep this from happening by explicitly
      marking the function as noinline so that error injection will always
      work.
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      1fec12a5
    • Josef Bacik's avatar
      btrfs: allow error injection for btrfs_search_slot and btrfs_cow_block · f75e2b79
      Josef Bacik authored
      The following patches are going to address error handling in relocation,
      in order to test those patches I need to be able to inject errors in
      btrfs_search_slot and btrfs_cow_block, as we call both of these pretty
      often in different cases during relocation.
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Reviewed-by: default avatarJohannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Signed-off-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      f75e2b79
    • Nikolay Borisov's avatar
      btrfs: remove new_dirid argument from btrfs_create_subvol_root · 69948022
      Nikolay Borisov authored
      It's no longer used. While at it also remove new_dirid in create_subvol
      as it's used in a single place and open code it. No functional changes.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      69948022
    • Nikolay Borisov's avatar
      btrfs: make btrfs_root::free_objectid hold the next available objectid · 23125104
      Nikolay Borisov authored
      Adjust the way free_objectid is being initialized, it now stores
      BTRFS_FIRST_FREE_OBJECTID rather than the, somewhat arbitrary,
      BTRFS_FIRST_FREE_OBJECTID - 1. This change also has the added benefit
      that now it becomes unnecessary to explicitly initialize free_objectid
      for a newly create fs root.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      23125104
    • Nikolay Borisov's avatar
      btrfs: rename btrfs_root::highest_objectid to free_objectid · 6b8fad57
      Nikolay Borisov authored
      This reflects the true purpose of the member as it's being used solely
      in context where a new objectid is being allocated. Future changes will
      also change the way it's being used to closely follow this semantics.
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      6b8fad57