1. 01 Oct, 2022 23 commits
  2. 30 Sep, 2022 8 commits
  3. 29 Sep, 2022 4 commits
  4. 27 Sep, 2022 2 commits
    • ext4: minor defrag code improvements · d412df53
      Eric Whitney authored
      Modify the error returns for two file types that can't be defragged to
      more clearly communicate those restrictions to a caller.  When the
      defrag code is applied to swap files, return -ETXTBSY, and when applied
      to quota files, return -EOPNOTSUPP.  Move an extent tree search whose
      results are only occasionally required to the site always requiring them
      for improved efficiency.  Address a few typos.
      Signed-off-by: Eric Whitney <enwlinux@gmail.com>
      Link: https://lore.kernel.org/r/20220722163910.268564-1-enwlinux@gmail.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
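The error-return change described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct demo_inode` and `demo_defrag_check()` are hypothetical stand-ins, and only the two errno values (-ETXTBSY for swap files, -EOPNOTSUPP for quota files) come from the commit message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the inode state the check needs. */
struct demo_inode {
    bool is_swapfile;   /* file is in use as swap space */
    bool is_quota_file; /* file holds quota data */
};

/* Returns 0 if the file may be defragmented, otherwise a negative errno. */
int demo_defrag_check(const struct demo_inode *inode)
{
    if (inode->is_swapfile)
        return -ETXTBSY;    /* swap files: busy, cannot be moved */
    if (inode->is_quota_file)
        return -EOPNOTSUPP; /* quota files: operation not supported */
    return 0;
}
```

A caller can then tell the two refusals apart instead of seeing one generic error for both file types.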
    • ext4: continue to expand file system when the target size doesn't reach · df3cb754
      Jerry Lee 李修賢 authored
      When expanding a file system from (16TiB-2MiB) to 18TiB, the operation
      exits early, which leaves resize2fs and the ext4 kernel driver reporting
      inconsistent results.
      
      === before ===
      ○ → resize2fs /dev/mapper/thin
      resize2fs 1.45.5 (07-Jan-2020)
      Filesystem at /dev/mapper/thin is mounted on /mnt/test; on-line resizing required
      old_desc_blocks = 2048, new_desc_blocks = 2304
      The filesystem on /dev/mapper/thin is now 4831837696 (4k) blocks long.
      
      [  865.186308] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
      [  912.091502] dm-4: detected capacity change from 34359738368 to 38654705664
      [  970.030550] dm-5: detected capacity change from 34359734272 to 38654701568
      [ 1000.012751] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
      [ 1000.012878] EXT4-fs (dm-5): resized filesystem to 4294967296
      
      === after ===
      [  129.104898] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
      [  143.773630] dm-4: detected capacity change from 34359738368 to 38654705664
      [  198.203246] dm-5: detected capacity change from 34359734272 to 38654701568
      [  207.918603] EXT4-fs (dm-5): resizing filesystem from 4294966784 to 4831837696 blocks
      [  207.918754] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
      [  207.918758] EXT4-fs (dm-5): Converting file system to meta_bg
      [  207.918790] EXT4-fs (dm-5): resizing filesystem from 4294967296 to 4831837696 blocks
      [  221.454050] EXT4-fs (dm-5): resized to 4658298880 blocks
      [  227.634613] EXT4-fs (dm-5): resized filesystem to 4831837696
      Signed-off-by: Jerry Lee <jerrylee@qnap.com>
      Link: https://lore.kernel.org/r/PU1PR04MB22635E739BD21150DC182AC6A18C9@PU1PR04MB2263.apcprd04.prod.outlook.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
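The block counts in the two logs are easier to compare once converted to bytes. The sketch below is plain arithmetic on the numbers printed above, assuming the 4 KiB block size that resize2fs reports; the `demo_` helpers are hypothetical names, not kernel or e2fsprogs functions.

```c
#include <stdint.h>

#define DEMO_BLOCK_SIZE 4096ULL          /* "(4k) blocks" per the resize2fs output */
#define DEMO_MIB (1024ULL * 1024ULL)
#define DEMO_TIB (DEMO_MIB * DEMO_MIB)

/* Convert a count of 4 KiB blocks into bytes. */
uint64_t demo_blocks_to_bytes(uint64_t blocks)
{
    return blocks * DEMO_BLOCK_SIZE;
}

/* Bytes still missing to reach the target size; 0 once the target is met. */
uint64_t demo_shortfall_bytes(uint64_t reached_blocks, uint64_t target_blocks)
{
    if (reached_blocks >= target_blocks)
        return 0;
    return demo_blocks_to_bytes(target_blocks - reached_blocks);
}
```

With these helpers, the starting size of 4294966784 blocks is 16 TiB minus 2 MiB, and the "before" run stops at 4294967296 blocks, which is exactly 16 TiB (2^32 blocks). The requested 4831837696 blocks works out to 2 MiB short of 18 TiB, so the "before" kernel stopped just under 2 TiB short of it, while the "after" run reaches the target.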
  5. 26 Sep, 2022 1 commit
  6. 22 Sep, 2022 2 commits
    • ext4: limit the number of retries after discarding preallocations blocks · 80fa46d6
      Theodore Ts'o authored
      This patch avoids threads live-locking for hours when a large number
      of threads are competing over the last few free extents as the blocks
      get added to and removed from preallocation pools.  From our bug
      reporter:
      
         A reliable way for triggering this has multiple writers
         continuously write() to files when the filesystem is full, while
         small amounts of space are freed (e.g. by truncating a large file
         -1MiB at a time). In the local filesystem, this can be done by
         simply not checking the return code of write (0) and/or the error
         (ENOSPACE) that is set. Over NFS with an async mount, even clients
         with proper error checking will behave this way since the linux NFS
         client implementation will not propagate the server errors [the
         write syscalls immediately return success] until the file handle is
         closed. This leads to a situation where NFS clients send a
         continuous stream of WRITE rpcs which result in ERRNOSPACE -- but
         since the client isn't seeing this, the stream of writes continues
         at maximum network speed.
      
         When some space does appear, multiple writers will all attempt to
         claim it for their current write. For NFS, we may see dozens to
         hundreds of threads that do this.
      
         The real-world scenario of this is database backup tooling (in
         particular, github.com/mdkent/percona-xtrabackup) which may write
         large files (>1TiB) to NFS for safe keeping. Some temporary files
         are written, rewound, and read back -- all before closing the file
         handle (the temp file is actually unlinked, to trigger automatic
         deletion on close/crash.) An application like this operating on an
         async NFS mount will not see an error code until TiB have been
         written/read.
      
         The lockup was observed when running this database backup on large
         filesystems (64 TiB in this case) with a high number of block
         groups and no free space. Fragmentation is generally not a factor
         in this filesystem (~thousands of large files, mostly contiguous
         except for the parts written while the filesystem is at capacity.)
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
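The idea of capping retries can be shown with a minimal user-space sketch. It is not the ext4 allocator: `struct demo_ctx`, `demo_alloc()` and the limit `DEMO_MAX_RETRIES` are all hypothetical, and the commit text above does not state the actual limit. The sketch models the worst case from the report, where every discard of preallocated blocks frees a little space that another thread claims first.

```c
#include <errno.h>

#define DEMO_MAX_RETRIES 3 /* hypothetical value; not taken from the commit text */

/* Hypothetical test context: counts attempts and says when one succeeds. */
struct demo_ctx {
    int attempts;   /* allocation attempts made so far */
    int succeed_on; /* attempt number that succeeds; 0 means never */
};

/* Returns 0 on success, or -ENOSPC once the retry budget is exhausted. */
int demo_alloc(struct demo_ctx *c)
{
    int retries = 0;

    for (;;) {
        c->attempts++;
        if (c->succeed_on > 0 && c->attempts >= c->succeed_on)
            return 0;

        /*
         * Allocation failed. Pretend preallocations were discarded and
         * the freed blocks were taken by a competing thread. Without a
         * cap, this loop could spin indefinitely.
         */
        if (++retries > DEMO_MAX_RETRIES)
            return -ENOSPC;
    }
}
```

With the cap in place, a thread that keeps losing the race gives up after one initial attempt plus `DEMO_MAX_RETRIES` retries and reports that no space is available, rather than looping until space happens to appear.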
    • ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0 · 29a5b8a1
      Luís Henriques authored
      When walking through an inode's extents, the ext4_ext_binsearch_idx() function
      assumes that the extent header has been previously validated.  However, there
      are no checks that verify that the number of entries (eh->eh_entries) is
      non-zero when depth is > 0.  This leads to problems because
      EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:
      
      [  135.245946] ------------[ cut here ]------------
      [  135.247579] kernel BUG at fs/ext4/extents.c:2258!
      [  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
      [  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
      [  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
      [  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
      [  135.256475] Code:
      [  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
      [  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
      [  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
      [  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
      [  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
      [  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
      [  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
      [  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
      [  135.277952] Call Trace:
      [  135.278635]  <TASK>
      [  135.279247]  ? preempt_count_add+0x6d/0xa0
      [  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
      [  135.281612]  ? _raw_read_unlock+0x18/0x30
      [  135.282704]  ext4_map_blocks+0x294/0x5a0
      [  135.283745]  ? xa_load+0x6f/0xa0
      [  135.284562]  ext4_mpage_readpages+0x3d6/0x770
      [  135.285646]  read_pages+0x67/0x1d0
      [  135.286492]  ? folio_add_lru+0x51/0x80
      [  135.287441]  page_cache_ra_unbounded+0x124/0x170
      [  135.288510]  filemap_get_pages+0x23d/0x5a0
      [  135.289457]  ? path_openat+0xa72/0xdd0
      [  135.290332]  filemap_read+0xbf/0x300
      [  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
      [  135.292192]  new_sync_read+0x103/0x170
      [  135.293014]  vfs_read+0x15d/0x180
      [  135.293745]  ksys_read+0xa1/0xe0
      [  135.294461]  do_syscall_64+0x3c/0x80
      [  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      This patch simply adds an extra check in __ext4_ext_check(), verifying that
      eh_entries is not 0 when eh_depth is > 0.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
      Cc: Baokun Li <libaokun1@huawei.com>
      Cc: stable@kernel.org
      Signed-off-by: Luís Henriques <lhenriques@suse.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Baokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
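The extra validation described in this commit can be sketched in user space as follows. This is a simplified illustration, not the code of __ext4_ext_check(): `struct demo_extent_header` carries only the two fields named in the commit message, and `demo_header_is_sane()` is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in holding only the fields the commit message names. */
struct demo_extent_header {
    uint16_t eh_entries; /* number of valid entries in this node */
    uint16_t eh_depth;   /* 0 for a leaf, > 0 for an index node */
};

/*
 * An index node (depth > 0) with no entries has nothing to descend into,
 * so any first/last index pointers derived from it would be meaningless.
 * Such a header is rejected before the tree is searched.
 */
bool demo_header_is_sane(const struct demo_extent_header *eh)
{
    if (eh->eh_entries == 0 && eh->eh_depth > 0)
        return false;
    return true;
}
```

An empty leaf (both fields zero) is still accepted here, since the commit only describes rejecting the combination of zero entries with a non-zero depth.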