    ext4: fix bug_on ext4_mb_use_inode_pa · a08f789d
    Baokun Li authored
    Hulk Robot reported a BUG_ON:
    ==================================================================
    kernel BUG at fs/ext4/mballoc.c:3211!
    [...]
    RIP: 0010:ext4_mb_mark_diskspace_used.cold+0x85/0x136f
    [...]
    Call Trace:
     ext4_mb_new_blocks+0x9df/0x5d30
     ext4_ext_map_blocks+0x1803/0x4d80
     ext4_map_blocks+0x3a4/0x1a10
     ext4_writepages+0x126d/0x2c30
     do_writepages+0x7f/0x1b0
     __filemap_fdatawrite_range+0x285/0x3b0
     file_write_and_wait_range+0xb1/0x140
     ext4_sync_file+0x1aa/0xca0
     vfs_fsync_range+0xfb/0x260
     do_fsync+0x48/0xa0
    [...]
    ==================================================================
    
    The above issue may happen as follows:
    -------------------------------------
    do_fsync
     vfs_fsync_range
      ext4_sync_file
       file_write_and_wait_range
        __filemap_fdatawrite_range
         do_writepages
          ext4_writepages
           mpage_map_and_submit_extent
            mpage_map_one_extent
             ext4_map_blocks
              ext4_mb_new_blocks
               ext4_mb_normalize_request
                >>> start + size <= ac->ac_o_ex.fe_logical
               ext4_mb_regular_allocator
                ext4_mb_simple_scan_group
                 ext4_mb_use_best_found
                  ext4_mb_new_preallocation
                   ext4_mb_new_inode_pa
                    ext4_mb_use_inode_pa
                     >>> set ac->ac_b_ex.fe_len <= 0
               ext4_mb_mark_diskspace_used
                >>> BUG_ON(ac->ac_b_ex.fe_len <= 0);
    
    We can easily reproduce this problem with the following commands:
    	`fallocate -l100M disk`
    	`mkfs.ext4 -b 1024 -g 256 disk`
    	`mount disk /mnt`
    	`fsstress -d /mnt -l 0 -n 1000 -p 1`
    
    The size must be smaller than or equal to EXT4_BLOCKS_PER_GROUP, so
    it is truncated when it exceeds that limit. After truncation,
    "start + size <= ac->ac_o_ex.fe_logical" may hold, which means the
    normalized range no longer covers the requested logical block.
    So start should be the start position of the group where
    ac_o_ex.fe_logical is located after alignment. In addition, when the
    value of fe_logical or EXT4_BLOCKS_PER_GROUP is very large, the value
    calculated from start_off is more accurate.
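
The arithmetic behind this can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel patch itself: `lblk_t`, `round_down_to`, `covers`, and `clamp_start` are hypothetical names, and the numbers in the usage note are made up apart from the 256 blocks per group taken from the `mkfs.ext4 -g 256` reproducer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit logical block number. */
typedef uint32_t lblk_t;

/* Round x down to the nearest multiple of step. */
static lblk_t round_down_to(lblk_t x, lblk_t step)
{
	assert(step > 0);
	return x - (x % step);
}

/*
 * Return 1 if the range [start, start + size) contains logical.
 * The sum is done in 64 bits so it cannot wrap around.
 */
static int covers(lblk_t start, lblk_t size, lblk_t logical)
{
	return logical >= start && (uint64_t)start + size > logical;
}

/*
 * Assumed form of the correction: never let start fall below the
 * beginning of the group-sized window that holds the logical block.
 */
static lblk_t clamp_start(lblk_t start, lblk_t logical, lblk_t blocks_per_group)
{
	lblk_t group_start = round_down_to(logical, blocks_per_group);

	return start > group_start ? start : group_start;
}
```

For example, with 256 blocks per group, a requested logical block of 1000, and a normalized start of 0 whose size has been truncated to 256, `covers(0, 256, 1000)` is false because 0 + 256 <= 1000. `clamp_start(0, 1000, 256)` moves start to 768, and the range [768, 1024) contains block 1000 again.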
    
    Cc: stable@kernel.org
    Fixes: cd648b8a ("ext4: trim allocation requests to group size")
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Baokun Li <libaokun1@huawei.com>
    Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
    Link: https://lore.kernel.org/r/20220528110017.354175-2-libaokun1@huawei.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
mballoc.c 184 KB