    btrfs: Fix error handling in btrfs_cleanup_ordered_extents · d1051d6e
    Nikolay Borisov authored
    Running btrfs/124 in a loop hung up on me sporadically with the
    following call trace:
    
    	btrfs           D    0  5760   5324 0x00000000
    	Call Trace:
    	 ? __schedule+0x243/0x800
    	 schedule+0x33/0x90
    	 btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
    	 ? wait_woken+0xa0/0xa0
    	 btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
    	 btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
    	 btrfs_relocate_chunk+0x49/0x100 [btrfs]
    	 btrfs_balance+0xbeb/0x1740 [btrfs]
    	 btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
    	 btrfs_ioctl+0x1691/0x3110 [btrfs]
    	 ? lockdep_hardirqs_on+0xed/0x180
    	 ? __handle_mm_fault+0x8e7/0xfb0
    	 ? _raw_spin_unlock+0x24/0x30
    	 ? __handle_mm_fault+0x8e7/0xfb0
    	 ? do_vfs_ioctl+0xa5/0x6e0
    	 ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
    	 do_vfs_ioctl+0xa5/0x6e0
    	 ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
    	 ksys_ioctl+0x3a/0x70
    	 __x64_sys_ioctl+0x16/0x20
    	 do_syscall_64+0x60/0x1b0
    	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
    
    This happens because during page writeback it's valid for
    writepage_delalloc to instantiate a delalloc range which doesn't belong
    to the page currently being written back.
    
    The reason this case is valid is due to find_lock_delalloc_range
    returning any available range after the passed delalloc_start and
    ignoring whether the page under writeback is within that range.
    
    In turn ordered extents (OE) are always created for the returned range
    from find_lock_delalloc_range. If, however, a failure occurs while OE
    are being created then the clean up code in btrfs_cleanup_ordered_extents
    will be called.
    
    Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
    the case of such a 'foreign' range being processed and instead always
    assumes that the range the OE are created for belongs to the page. As a
    result, the first page of such a foreign range is never cleaned up,
    since the current cleanup code deliberately skips it.
    
    Fix this by correctly checking whether the current page belongs to the
    range being instantiated and, if so, adjust the range parameters passed
    for cleaning up. If it doesn't, then just clean the whole OE range
    directly.
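
    The range check can be modelled in userspace as a rough sketch. This is
    not the kernel code: struct range, cleanup_range_old,
    cleanup_range_fixed and the 4 KiB PAGE_SIZE are hypothetical names and
    values chosen for illustration, and the sketch assumes that "adjusting"
    means skipping exactly one page at the start of the range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this model; the kernel uses the arch's PAGE_SIZE. */
#define PAGE_SIZE 4096ULL

/* Hypothetical helper type: the byte range handed to the OE cleanup. */
struct range {
	uint64_t offset;
	uint64_t bytes;
};

/*
 * Model of the behaviour described as buggy: the first page of the range
 * is always skipped, on the assumption that it is the page under writeback.
 */
static struct range cleanup_range_old(uint64_t offset, uint64_t bytes)
{
	struct range r = { offset + PAGE_SIZE, bytes - PAGE_SIZE };

	return r;
}

/*
 * Model of the fix: skip the first page only when the page under writeback
 * (starting at page_start) lies inside [offset, offset + bytes - 1].
 * For a 'foreign' range the whole range is returned for cleanup.
 */
static struct range cleanup_range_fixed(uint64_t page_start,
					uint64_t offset, uint64_t bytes)
{
	uint64_t page_end = page_start + PAGE_SIZE - 1;
	struct range r = { offset, bytes };

	assert(bytes >= PAGE_SIZE);

	if (page_start >= offset && page_end <= offset + bytes - 1) {
		r.offset += PAGE_SIZE;
		r.bytes -= PAGE_SIZE;
	}

	return r;
}
```

    With the page under writeback at offset 0 and a delalloc range starting
    at 8192, cleanup_range_fixed returns the range unchanged, whereas
    cleanup_range_old would drop its first page and leave it uncleaned.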
    
    Fixes: 52427260 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang")
    CC: stable@vger.kernel.org # 4.14+
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: Nikolay Borisov <nborisov@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>