ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate

Recently we notice that ext4 filesystem would occasionally fail to read metadata from disk and report error message, but the disk and block layer looks fine. After analyse, we lockon commit 88dbcbb3 ("blkdev: avoid migration stalls for blkdev pages"). It provide a migration method for the bdev, we could move page that has buffers without extra users now, but it lock the buffers on the page, which breaks the fragile metadata read operation on ext4 filesystem, ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the assumption of that locked buffer means it is under IO. So it just trylock the buffer and skip submit IO if it lock failed, after wait_on_buffer() we conclude IO error because the buffer is not uptodate. This issue could be easily reproduced by add some delay just after buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do fsstress on ext4 filesystem. EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193: comm fsstress: reading directory lblock 0 EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334: comm fsstress: reading directory lblock 0 Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock the buffer and submit IO if it's not uptodate, and also leave over readahead helper. Cc: stable@kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220831074629.3755110-1-yi.zhang@huawei.comSigned-off-by: Theodore Ts'o <tytso@mit.edu>

ext4: ext4_read_bh_lock() should submit IO if the buffer isn't uptodate
Recently we notice that ext4 filesystem would occasionally fail to read metadata from disk and report error message, but the disk and block layer looks fine. After analyse, we lockon commit 88dbcbb3 ("blkdev: avoid migration stalls for blkdev pages"). It provide a migration method for the bdev, we could move page that has buffers without extra users now, but it lock the buffers on the page, which breaks the fragile metadata read operation on ext4 filesystem, ext4_read_bh_lock() was copied from ll_rw_block(), it depends on the assumption of that locked buffer means it is under IO. So it just trylock the buffer and skip submit IO if it lock failed, after wait_on_buffer() we conclude IO error because the buffer is not uptodate. This issue could be easily reproduced by add some delay just after buffer_migrate_lock_buffers() in __buffer_migrate_folio() and do fsstress on ext4 filesystem. EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #73193: comm fsstress: reading directory lblock 0 EXT4-fs error (device pmem1): __ext4_find_entry:1658: inode #75334: comm fsstress: reading directory lblock 0 Fix it by removing the trylock logic in ext4_read_bh_lock(), just lock the buffer and submit IO if it's not uptodate, and also leave over readahead helper. Cc: stable@kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220831074629.3755110-1-yi.zhang@huawei.comSigned-off-by: Theodore Ts'o <tytso@mit.edu>
0b73284c · Zhang Yi · Theodore Ts'o · 1ff20307 · 0b73284c
Commit 0b73284c authored Aug 31, 2022 by Zhang Yi Committed by Theodore Ts'o Sep 30, 2022
Hide whitespace changes
Inline Side-by-side

Showing with 5 additions and 11 deletions

fs/ext4/super.c fs/ext4/super.c +5 -11

No files found.
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -205,19 +205,12 @@ int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io

 int ext4_read_bh_lock(struct buffer_head *bh, blk_opf_t op_flags, bool wait)
 {
-	if (trylock_buffer(bh)) {
-		if (wait)
-			return ext4_read_bh(bh, op_flags, NULL);
+	lock_buffer(bh);
+	if (!wait) {
 		ext4_read_bh_nowait(bh, op_flags, NULL);
 		return 0;
 	}
-	if (wait) {
-		wait_on_buffer(bh);
-		if (buffer_uptodate(bh))
-			return 0;
-		return -EIO;
-	}
-	return 0;
+	return ext4_read_bh(bh, op_flags, NULL);
 }

 /*
@@ -264,7 +257,8 @@ void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block)
 	struct buffer_head *bh = sb_getblk_gfp(sb, block, 0);

 	if (likely(bh)) {
-		ext4_read_bh_lock(bh, REQ_RAHEAD, false);
+		if (trylock_buffer(bh))
+			ext4_read_bh_nowait(bh, REQ_RAHEAD, NULL);
 		brelse(bh);
 	}
 }