    btrfs: fix false alert on bad tree level check · 1d854e4f
    Qu Wenruo authored
    [BUG]
    There is a bug report that on a RAID0 NVMe btrfs system, under heavy
    write load the filesystem can flip RO randomly.
    
    Extra debugging shows that some tree blocks fail their level checks,
    and if that happens in the critical path of a transaction, we abort
    the transaction:
    
      BTRFS error (device nvme0n1p3): level verify failed on logical 5446121209856 mirror 1 wanted 0 found 1
      BTRFS error (device nvme0n1p3: state A): Transaction aborted (error -5)
      BTRFS: error (device nvme0n1p3: state A) in btrfs_finish_ordered_io:3343: errno=-5 IO failure
      BTRFS info (device nvme0n1p3: state EA): forced readonly
    
    [CAUSE]
    The reporter has already bisected to commit 947a6299 ("btrfs: move
    tree block parentness check into validate_extent_buffer()").
    
    Extra debugging also shows that btrfs_tree_parent_check can be filled
    with all zeros in the following call trace:
    
      submit_one_bio+0xd4/0xe0
      submit_extent_page+0x142/0x550
      read_extent_buffer_pages+0x584/0x9c0
      ? __pfx_end_bio_extent_readpage+0x10/0x10
      ? folio_unlock+0x1d/0x50
      btrfs_read_extent_buffer+0x98/0x150
      read_tree_block+0x43/0xa0
      read_block_for_search+0x266/0x370
      btrfs_search_slot+0x351/0xd30
      ? lock_is_held_type+0xe8/0x140
      btrfs_lookup_csum+0x63/0x150
      btrfs_csum_file_blocks+0x197/0x6c0
      ? sched_clock_cpu+0x9f/0xc0
      ? lock_release+0x14b/0x440
      ? _raw_read_unlock+0x29/0x50
      btrfs_finish_ordered_io+0x441/0x860
      btrfs_work_helper+0xfe/0x400
      ? lock_is_held_type+0xe8/0x140
      process_one_work+0x294/0x5b0
      worker_thread+0x4f/0x3a0
      ? __pfx_worker_thread+0x10/0x10
      kthread+0xf5/0x120
      ? __pfx_kthread+0x10/0x10
      ret_from_fork+0x2c/0x50
    
    Currently we only copy the btrfs_tree_parent_check structure into bbio
    at read_extent_buffer_pages() after we have assembled the bbio.
    
    But as shown above, submit_extent_page() itself can already submit the
    bbio, leaving bbio->parent_check uninitialized and causing the false
    alert.
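
The ordering problem described above can be illustrated with a small userspace sketch. This is not kernel code: `toy_check`, `toy_bio`, `toy_add_page()` and the page limit that triggers the early submission are all hypothetical stand-ins, chosen only to show how a bio that is submitted before the check is copied ends up verified against zeros.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct btrfs_tree_parent_check. */
struct toy_check {
    int level; /* expected tree level of the block being read */
};

/* Hypothetical stand-in for a bbio that carries the check to endio. */
struct toy_bio {
    struct toy_check parent_check;
    struct toy_check seen_at_submit; /* what endio would verify against */
    int pages;
    bool submitted;
};

/* Assumption for this sketch: a bio is submitted once it holds 2 pages. */
#define TOY_BIO_MAX_PAGES 2

static void toy_submit(struct toy_bio *bio)
{
    /* Snapshot the check as it is at submission time. */
    bio->seen_at_submit = bio->parent_check;
    bio->submitted = true;
}

/* Adds one page and may submit early, as submit_extent_page() can. */
static void toy_add_page(struct toy_bio *bio)
{
    assert(!bio->submitted);
    bio->pages++;
    if (bio->pages == TOY_BIO_MAX_PAGES)
        toy_submit(bio);
}

/*
 * Old ordering: assemble first, copy the check afterwards.
 * Returns the level that endio would expect for the submitted bio.
 */
static int toy_read_old_ordering(const struct toy_check *check, int nr_pages)
{
    struct toy_bio bio;

    memset(&bio, 0, sizeof(bio));
    for (int i = 0; i < nr_pages; i++)
        toy_add_page(&bio);

    /* Too late if toy_add_page() already submitted the bio. */
    bio.parent_check = *check;
    if (!bio.submitted)
        toy_submit(&bio);

    return bio.seen_at_submit.level;
}

/* Mirrors the "wanted X found Y" comparison from the error message. */
static bool toy_level_ok(int wanted, int found)
{
    return wanted == found;
}
```

In this model a one-page read is verified against the real check, while a two-page read is submitted inside `toy_add_page()` and is verified against level 0, which matches the "wanted 0 found 1" message for a level 1 block.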
    
    [FIX]
    Instead of copying @check into bbio after bbio is assembled, we pass
    @check in btrfs_bio_ctrl::parent_check, and copy the content of
    parent_check in submit_one_bio() for metadata read.
    
    This way we should be able to pass the needed info for metadata endio
    verification and fix the false alert.
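
The idea of the fix can be sketched in the same simplified userspace style. Again this is not the actual patch: `toy_bio_ctrl`, `toy_submit_one_bio()` and the two-page submission limit are hypothetical. The only point is that the check travels with the control structure and is copied at the single place where a bio is submitted, so an early submission sees it as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct btrfs_tree_parent_check. */
struct toy_check {
    int level;
};

struct toy_bio {
    struct toy_check parent_check;
    int pages;
};

/* Hypothetical stand-in for btrfs_bio_ctrl with a parent_check member. */
struct toy_bio_ctrl {
    struct toy_bio bio;
    bool has_bio;
    const struct toy_check *parent_check; /* set before any page is added */
    int submitted_bios;
    int zeroed_bios; /* bios submitted with a level that is not the caller's */
};

/* Assumption for this sketch: a bio is submitted once it holds 2 pages. */
#define TOY_BIO_MAX_PAGES 2

/* Single submission point: the check is copied here, for every bio. */
static void toy_submit_one_bio(struct toy_bio_ctrl *ctrl)
{
    assert(ctrl->has_bio);
    assert(ctrl->parent_check != NULL);

    ctrl->bio.parent_check = *ctrl->parent_check;
    if (ctrl->bio.parent_check.level != ctrl->parent_check->level)
        ctrl->zeroed_bios++;
    ctrl->submitted_bios++;
    ctrl->has_bio = false;
}

/* Adds one page and may submit early, as submit_extent_page() can. */
static void toy_add_page(struct toy_bio_ctrl *ctrl)
{
    if (!ctrl->has_bio) {
        memset(&ctrl->bio, 0, sizeof(ctrl->bio));
        ctrl->has_bio = true;
    }
    ctrl->bio.pages++;
    if (ctrl->bio.pages == TOY_BIO_MAX_PAGES)
        toy_submit_one_bio(ctrl);
}

/* Returns the number of bios submitted; *zeroed counts bad ones. */
static int toy_read_fixed(const struct toy_check *check, int nr_pages,
                          int *zeroed)
{
    struct toy_bio_ctrl ctrl;

    memset(&ctrl, 0, sizeof(ctrl));
    ctrl.parent_check = check; /* passed in up front, not copied afterwards */

    for (int i = 0; i < nr_pages; i++)
        toy_add_page(&ctrl);
    if (ctrl.has_bio)
        toy_submit_one_bio(&ctrl);

    *zeroed = ctrl.zeroed_bios;
    return ctrl.submitted_bios;
}
```

With this ordering, a three-page read produces two bios in the model, one submitted early and one at the end, and both carry the caller's check.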
    
    Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
    Link: https://lore.kernel.org/linux-btrfs/CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com/
    Fixes: 947a6299 ("btrfs: move tree block parentness check into validate_extent_buffer()")
    Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
    Signed-off-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>