Commit cc1d0d93 authored by Qu Wenruo, committed by David Sterba

btrfs: subpage: fix writeback which does not have ordered extent

[BUG]
When running fsstress with subpage RW support, there are random
BUG_ON()s triggered with the following trace:

 kernel BUG at fs/btrfs/file-item.c:667!
 Internal error: Oops - BUG: 0 [#1] SMP
 CPU: 1 PID: 3486 Comm: kworker/u13:2 5.11.0-rc4-custom+ #43
 Hardware name: Radxa ROCK Pi 4B (DT)
 Workqueue: btrfs-worker-high btrfs_work_helper [btrfs]
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
 pc : btrfs_csum_one_bio+0x420/0x4e0 [btrfs]
 lr : btrfs_csum_one_bio+0x400/0x4e0 [btrfs]
 Call trace:
  btrfs_csum_one_bio+0x420/0x4e0 [btrfs]
  btrfs_submit_bio_start+0x20/0x30 [btrfs]
  run_one_async_start+0x28/0x44 [btrfs]
  btrfs_work_helper+0x128/0x1b4 [btrfs]
  process_one_work+0x22c/0x430
  worker_thread+0x70/0x3a0
  kthread+0x13c/0x140
  ret_from_fork+0x10/0x30

[CAUSE]
The above BUG_ON() means there is some bio range which doesn't have an
ordered extent, which is indeed worth a BUG_ON().

Unlike the regular sectorsize == PAGE_SIZE case, in the subpage case we
have an extra subpage dirty bitmap to record which ranges are dirty and
should be written back.

This means, if we submit a bio for a subpage range, we not only need to
clear page dirty, but also need to clear the subpage dirty bits.
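
As a rough userspace illustration of the two levels of dirty tracking
described above (not the kernel implementation), the sketch below models a
page-level dirty flag plus a per-sector bitmap. The 64K page / 4K sector
geometry, the struct, and all model_* names are assumptions made for this
example only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry for illustration: 64K page, 4K sectors -> 16 bits. */
#define MODEL_PAGE_SIZE   65536u
#define MODEL_SECTORSIZE  4096u

/* Hypothetical stand-in for a page and its subpage dirty state. */
struct model_page {
	int      page_dirty;    /* models the page-level dirty flag */
	uint16_t dirty_bitmap;  /* one bit per sector inside the page */
};

/* Build the bitmask covering the byte range [offset, offset + len). */
static uint16_t range_to_mask(uint32_t offset, uint32_t len)
{
	uint32_t first = offset / MODEL_SECTORSIZE;
	uint32_t nbits = len / MODEL_SECTORSIZE;

	assert(offset % MODEL_SECTORSIZE == 0 && len % MODEL_SECTORSIZE == 0);
	assert(offset + len <= MODEL_PAGE_SIZE);
	return (uint16_t)(((1u << nbits) - 1u) << first);
}

/* Dirtying a range sets its bits and the page-level flag. */
static void model_set_dirty(struct model_page *p, uint32_t offset, uint32_t len)
{
	p->dirty_bitmap |= range_to_mask(offset, len);
	p->page_dirty = 1;
}

/*
 * Clearing a range clears its bits; in this model the page-level flag
 * drops only once the last subpage bit is gone.
 */
static void model_clear_dirty(struct model_page *p, uint32_t offset, uint32_t len)
{
	p->dirty_bitmap &= (uint16_t)~range_to_mask(offset, len);
	if (p->dirty_bitmap == 0)
		p->page_dirty = 0;
}
```

In this model, any path that handles a range without calling
model_clear_dirty() leaves its bits behind in dirty_bitmap.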

In __extent_writepage_io(), we call btrfs_page_clear_dirty() for any
range we submit a bio for.

But there is a loophole: if we hit a range which is beyond i_size, we
just call btrfs_writepage_endio_finish_ordered() to finish the ordered
io, then break out, without clearing the subpage dirty bits.

This means, if we hit the above branch, the subpage dirty bits are still
there. If another range of the page gets dirtied and we need to write
back that page again, we will submit a bio for the old range, leaving a
wild bio range which doesn't have an ordered extent.
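
The sequence above can be sketched as a small userspace simulation. This is
only a model under stated assumptions: a 64K page with 4K sectors, an
ordered_bitmap standing in for ordered extent coverage, and a scenario where
i_size grows between the two writeback passes so the stale sectors fall
inside i_size. All names here are hypothetical and none of this is the
kernel code path.

```c
#include <assert.h>
#include <stdint.h>

#define SECTORSIZE 4096u
#define NR_SECTORS 16u   /* assumed: 64K page / 4K sectors */

struct model_page {
	uint16_t dirty_bitmap;   /* subpage dirty bits */
	uint16_t ordered_bitmap; /* sectors covered by an ordered extent */
};

/* Dirty sectors [first, first + nr) and give them ordered extent coverage. */
static void model_dirty_range(struct model_page *p, uint32_t first, uint32_t nr)
{
	uint16_t mask = (uint16_t)(((1u << nr) - 1u) << first);

	p->dirty_bitmap |= mask;
	p->ordered_bitmap |= mask;
}

/*
 * Model one writeback pass. Returns how many sectors were submitted
 * without an ordered extent (the condition behind the BUG_ON()).
 */
static unsigned int model_writepage(struct model_page *p, uint32_t i_size,
				    int clear_beyond_isize)
{
	unsigned int wild = 0;

	for (uint32_t s = 0; s < NR_SECTORS; s++) {
		uint16_t bit = (uint16_t)(1u << s);

		if (!(p->dirty_bitmap & bit))
			continue;
		if (s * SECTORSIZE >= i_size) {
			/* Beyond i_size: finish ordered io for the rest, no bio. */
			uint16_t rest = (uint16_t)~(bit - 1u);

			p->ordered_bitmap &= (uint16_t)~rest;
			if (clear_beyond_isize)
				p->dirty_bitmap &= (uint16_t)~rest;
			break;
		}
		/* Normal path: submit a bio and clear the dirty bit. */
		if (!(p->ordered_bitmap & bit))
			wild++;
		p->dirty_bitmap &= (uint16_t)~bit;
		p->ordered_bitmap &= (uint16_t)~bit;
	}
	return wild;
}

/* Two passes: the first hits the i_size branch, the second runs after growth. */
static unsigned int model_scenario(int clear_beyond_isize)
{
	struct model_page p = { 0, 0 };
	unsigned int wild;

	model_dirty_range(&p, 0, 4);
	wild = model_writepage(&p, 2 * SECTORSIZE, clear_beyond_isize);
	model_dirty_range(&p, 8, 1);
	wild += model_writepage(&p, NR_SECTORS * SECTORSIZE, clear_beyond_isize);
	return wild;
}
```

Calling model_scenario(0) leaves sectors 2 and 3 dirty after the first pass,
so the second pass submits them with no ordered extent. With
model_scenario(1) the bits are cleared in the i_size branch and no such
sectors are submitted.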

[FIX]
Fix it by always calling btrfs_page_clear_dirty() in
__extent_writepage_io().

Also, to prevent such a problem from happening again, add a new assert,
btrfs_page_assert_not_dirty(), to make sure both page dirty and subpage
dirty bits are cleared before exiting __extent_writepage_io().
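
The condition the new assert enforces can be mirrored in userspace as a
plain predicate. This is a hedged sketch, not the kernel helper: the struct
and model_page_not_dirty() are hypothetical, and has_subpage is an assumed
flag standing in for the sectorsize != PAGE_SIZE case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a page and its optional subpage state. */
struct model_page {
	bool     page_dirty;    /* models the page-level dirty flag */
	bool     has_subpage;   /* false models sectorsize == PAGE_SIZE */
	uint16_t dirty_bitmap;  /* subpage dirty bits, if has_subpage */
};

/* Returns true when both levels of dirty tracking are clean. */
static bool model_page_not_dirty(const struct model_page *p)
{
	if (p->page_dirty)
		return false;
	if (!p->has_subpage)
		return true;    /* regular case: only the page flag matters */
	return p->dirty_bitmap == 0;
}
```

A page whose page-level flag is clear but whose bitmap still holds stale
bits fails this check, which is the state the unfixed i_size branch can
leave behind.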

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
parent c2832898
@@ -3864,6 +3864,15 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 		if (cur >= i_size) {
 			btrfs_writepage_endio_finish_ordered(inode, page, cur,
 							     end, true);
+			/*
+			 * This range is beyond i_size, thus we don't need to
+			 * bother writing back.
+			 * But we still need to clear the dirty subpage bit, or
+			 * the next time the page gets dirtied, we will try to
+			 * writeback the sectors with subpage dirty bits,
+			 * causing writeback without ordered extent.
+			 */
+			btrfs_page_clear_dirty(fs_info, page, cur, end + 1 - cur);
 			break;
 		}
@@ -3914,6 +3923,7 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 			else
 				btrfs_writepage_endio_finish_ordered(inode,
 						page, cur, cur + iosize - 1, true);
+			btrfs_page_clear_dirty(fs_info, page, cur, iosize);
 			cur += iosize;
 			continue;
 		}
@@ -3949,6 +3959,12 @@ static noinline_for_stack int __extent_writepage_io(struct btrfs_inode *inode,
 		cur += iosize;
 		nr++;
 	}
+	/*
+	 * If we finish without problem, we should not only clear page dirty,
+	 * but also empty subpage dirty bits
+	 */
+	if (!ret)
+		btrfs_page_assert_not_dirty(fs_info, page);
 	*nr_ret = nr;
 	return ret;
 }
......
@@ -559,3 +559,23 @@ IMPLEMENT_BTRFS_PAGE_OPS(writeback, set_page_writeback, end_page_writeback,
 			 PageWriteback);
 IMPLEMENT_BTRFS_PAGE_OPS(ordered, SetPageOrdered, ClearPageOrdered,
 			 PageOrdered);
+/*
+ * Make sure not only the page dirty bit is cleared, but also subpage dirty bit
+ * is cleared.
+ */
+void btrfs_page_assert_not_dirty(const struct btrfs_fs_info *fs_info,
+				 struct page *page)
+{
+	struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private;
+	if (!IS_ENABLED(CONFIG_BTRFS_ASSERT))
+		return;
+	ASSERT(!PageDirty(page));
+	if (fs_info->sectorsize == PAGE_SIZE)
+		return;
+	ASSERT(PagePrivate(page) && page->private);
+	ASSERT(subpage->dirty_bitmap == 0);
+}
@@ -126,4 +126,7 @@ DECLARE_BTRFS_SUBPAGE_OPS(ordered);
 bool btrfs_subpage_clear_and_test_dirty(const struct btrfs_fs_info *fs_info,
 		struct page *page, u64 start, u32 len);
+void btrfs_page_assert_not_dirty(const struct btrfs_fs_info *fs_info,
+				 struct page *page);
 #endif