1. 26 Jun, 2023 11 commits
  2. 12 Jun, 2023 16 commits
    • f2fs: avoid dead loop in f2fs_issue_checkpoint() · 5079e1c0
      Chao Yu authored
      
      generic/082 reports a bug as below:
      
      __schedule+0x332/0xf60
      schedule+0x6f/0xf0
      schedule_timeout+0x23b/0x2a0
      wait_for_completion+0x8f/0x140
      f2fs_issue_checkpoint+0xfe/0x1b0
      f2fs_sync_fs+0x9d/0xb0
      sync_filesystem+0x87/0xb0
      dquot_load_quota_sb+0x41b/0x460
      dquot_load_quota_inode+0xa5/0x130
      dquot_quota_on+0x4b/0x60
      f2fs_quota_on+0xe3/0x1b0
      do_quotactl+0x483/0x700
      __x64_sys_quotactl+0x15c/0x310
      do_syscall_64+0x3f/0x90
      entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      The root cause is the race below:
      
      Thread A			Kworker			IRQ
      - write()
      : write data to quota.user file
      
      				- writepages
      				 - f2fs_submit_page_write
      				  - __is_cp_guaranteed return false
      				  - inc_page_count(F2FS_WB_DATA)
      				 - submit_bio
      - quotactl(Q_QUOTAON)
       - f2fs_quota_on
        - dquot_quota_on
         - dquot_load_quota_inode
          - vfs_setup_quota_inode
          : inode->i_flags |= S_NOQUOTA
      							- f2fs_write_end_io
      							 - __is_cp_guaranteed return true
      							 - dec_page_count(F2FS_WB_CP_DATA)
          - dquot_load_quota_sb
           - f2fs_sync_fs
            - f2fs_issue_checkpoint
             - do_checkpoint
              - f2fs_wait_on_all_pages(F2FS_WB_CP_DATA)
              : loops because the F2FS_WB_CP_DATA count is negative
      
      Call filemap_fdatawrite() and filemap_fdatawait() to keep all data
      clean before quota file setup.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      5079e1c0
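The race in the diagram above comes down to choosing the writeback counter twice, once at submit time and once at completion time, from an inode flag that can change in between. The following is a minimal user-space sketch of that mismatch, not the f2fs code: `struct counters`, `wb_type()`, `submit_write()` and `end_write()` are hypothetical stand-ins, and the `S_NOQUOTA` value is illustrative.

```c
/* Illustrative flag value; the real S_NOQUOTA lives in the kernel headers. */
#define S_NOQUOTA 0x1u

/* Hypothetical stand-ins for the F2FS_WB_DATA / F2FS_WB_CP_DATA counters. */
enum count_type { WB_DATA, WB_CP_DATA, NR_COUNT_TYPE };

struct counters {
	int nr[NR_COUNT_TYPE];
};

/* Pick the counter from the inode flags as they are right now. */
static enum count_type wb_type(unsigned int i_flags)
{
	return (i_flags & S_NOQUOTA) ? WB_CP_DATA : WB_DATA;
}

/* Submit path: increments whichever counter the current flags select. */
static void submit_write(struct counters *c, unsigned int i_flags)
{
	c->nr[wb_type(i_flags)]++;
}

/* Completion path: decrements whichever counter the current flags select. */
static void end_write(struct counters *c, unsigned int i_flags)
{
	c->nr[wb_type(i_flags)]--;
}
```

If the flag is set between `submit_write()` and `end_write()`, one counter is left at +1 and the other at -1, which is the state a waiter on the checkpoint counter can never see reach zero.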
    • f2fs: fix args passed to trace_f2fs_lookup_end · cadfc2f9
      Wu Bo authored
      
      A NULL return from 'd_splice_alias' doesn't mean error. Thus the
      successful case will also return NULL, which makes the tracepoint always
      print 'err=-ENOENT'.

      The different cases of 'new' & 'err' are listed below:
      1) dentry exists: err(0) with new(NULL) --> dentry, err=0
      2) dentry exists: err(0) with new(VALID) --> new, err=0
      3) dentry exists: err(0) with new(ERR) --> dentry, err=ERR
      4) no dentry exists: err(-ENOENT) with new(NULL) --> dentry, err=-ENOENT
      5) no dentry exists: err(-ENOENT) with new(VALID) --> new, err=-ENOENT
      6) no dentry exists: err(-ENOENT) with new(ERR) --> dentry, err=ERR
      Signed-off-by: Wu Bo <bo.wu@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      cadfc2f9
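The six cases listed above reduce to one selection rule: report 'new' when it is a valid pointer, otherwise report 'dentry'; report the error encoded in 'new' when it is an error pointer, otherwise report 'err'. The sketch below models that rule in user space. It is an assumption-laden illustration rather than the patch itself: `is_err()`, `ptr_err()`, `err_ptr()`, `struct trace_args` and `lookup_end_args()` are hypothetical helpers that imitate the kernel's error-pointer convention.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Hypothetical placeholder for the kernel's struct dentry. */
struct dentry { int unused; };

/* User-space imitations of the kernel error-pointer helpers. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline void *err_ptr(long e)
{
	return (void *)(intptr_t)e;
}

/* The two values the tracepoint would be given. */
struct trace_args {
	struct dentry *dentry;
	long err;
};

/* Select the dentry and error to report, following the six listed cases. */
static struct trace_args lookup_end_args(struct dentry *dentry,
					 struct dentry *new, long err)
{
	struct trace_args a;

	if (is_err(new)) {		/* cases 3 and 6 */
		a.dentry = dentry;
		a.err = ptr_err(new);
	} else if (new) {		/* cases 2 and 5 */
		a.dentry = new;
		a.err = err;
	} else {			/* cases 1 and 4 */
		a.dentry = dentry;
		a.err = err;
	}
	return a;
}
```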
    • f2fs: flag as supporting buffered async reads · 38b57833
      Yangtao Li authored
      f2fs uses generic_file_buffered_read(), which supports buffered async
      reads since commit 1a0a7853 ("mm: support async buffered reads in
      generic_file_buffered_read()").
      
      Let's enable it to match other file-systems. Read performance under
      io_uring improves significantly:

          167M/s -> 234M/s, an increase of about 40%
      
      Test w/:
          ./fio --name=onessd --filename=/data/test/local/io_uring_test
          --size=256M --rw=randread --bs=4k --direct=0 --overwrite=0
          --numjobs=1 --iodepth=1 --time_based=0 --runtime=10
          --ioengine=io_uring --registerfiles --fixedbufs
          --gtod_reduce=1 --group_reporting --sqthread_poll=1
      Signed-off-by: Lu Hongfei <luhongfei@vivo.com>
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      38b57833
    • f2fs: fix to drop all dirty meta/node pages during umount() · 20872584
      Chao Yu authored
      
      In the cp error case, dirty meta/node pages remain after
      f2fs_write_checkpoint() in f2fs_put_super(). Drop them explicitly, and
      sanity-check the reference counts of dirty pages and inflight IOs.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      20872584
    • f2fs: Detect looped node chain efficiently · 38a4a330
      Chunhai Guo authored
      
      find_fsync_dnodes() detects a looped node chain by comparing the loop
      counter with the number of free blocks. However, it may take tens of
      seconds to quit when there are many free blocks. We can use Floyd's
      cycle detection algorithm to make the detection more efficient.
      Signed-off-by: Chunhai Guo <guochunhai@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      38a4a330
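Floyd's cycle detection walks the chain with two cursors, one advancing a single step and one advancing two steps per iteration; the cursors meet if and only if the chain loops. The following is a generic sketch of the algorithm on an in-memory linked chain, not the f2fs recovery code: `struct chain_node` and `chain_has_loop()` are hypothetical names, and a real node chain would be followed by reading on-disk block addresses rather than pointers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a node block that points at the next block. */
struct chain_node {
	struct chain_node *next;
};

/* Return true if following ->next from head ever revisits a node. */
static bool chain_has_loop(const struct chain_node *head)
{
	const struct chain_node *slow = head;
	const struct chain_node *fast = head;

	while (fast && fast->next) {
		slow = slow->next;		/* advance one step */
		fast = fast->next->next;	/* advance two steps */
		if (slow == fast)
			return true;		/* cursors met inside a loop */
	}
	return false;				/* fast cursor hit the end */
}
```

The check uses constant extra memory and finishes in a number of steps proportional to the chain length, independent of how many free blocks the filesystem has.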
    • f2fs: add async reset zone command support · 25f90805
      Daejun Park authored
      
      This patch enables submitting the reset zone command asynchronously. It
      helps decrease the average latency of write IOs in high utilization
      scenarios through faster checkpointing.
      Signed-off-by: Daejun Park <daejun7.park@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      25f90805
    • f2fs: flush error flags in workqueue · 901c12d1
      Chao Yu authored
      In IRQ context, it wakes up workqueue to record errors into on-disk
      superblock fields rather than in-memory fields.
      
      Fixes: 1aa161e4 ("f2fs: fix scheduling while atomic in decompression path")
      Fixes: 95fa90c9 ("f2fs: support recording errors into superblock")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      901c12d1
    • f2fs: don't reset unchangable mount option in f2fs_remount() · 458c15df
      Chao Yu authored
      syzbot reports a bug as below:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000009: 0000 [#1] PREEMPT SMP KASAN
      RIP: 0010:__lock_acquire+0x69/0x2000 kernel/locking/lockdep.c:4942
      Call Trace:
       lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5691
       __raw_write_lock include/linux/rwlock_api_smp.h:209 [inline]
       _raw_write_lock+0x2e/0x40 kernel/locking/spinlock.c:300
       __drop_extent_tree+0x3ac/0x660 fs/f2fs/extent_cache.c:1100
       f2fs_drop_extent_tree+0x17/0x30 fs/f2fs/extent_cache.c:1116
       f2fs_insert_range+0x2d5/0x3c0 fs/f2fs/file.c:1664
       f2fs_fallocate+0x4e4/0x6d0 fs/f2fs/file.c:1838
       vfs_fallocate+0x54b/0x6b0 fs/open.c:324
       ksys_fallocate fs/open.c:347 [inline]
       __do_sys_fallocate fs/open.c:355 [inline]
       __se_sys_fallocate fs/open.c:353 [inline]
       __x64_sys_fallocate+0xbd/0x100 fs/open.c:353
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The root cause is a race condition, as below:
      - since it tries to remount a rw filesystem, do_remount won't call
      sb_prepare_remount_readonly to block fallocate, so there may be a race
      condition between remount and fallocate.
      - in f2fs_remount(), default_options() resets mount options to their
      defaults and then updates them based on the result of parse_options(),
      so there is a window in which the race condition can happen.
      
      Thread A			Thread B
      - f2fs_fill_super
       - parse_options
        - clear_opt(READ_EXTENT_CACHE)
      
      - f2fs_remount
       - default_options
        - set_opt(READ_EXTENT_CACHE)
      				- f2fs_fallocate
      				 - f2fs_insert_range
      				  - f2fs_drop_extent_tree
      				   - __drop_extent_tree
      				    - __may_extent_tree
      				     - test_opt(READ_EXTENT_CACHE) return true
      				    - write_lock(&et->lock) access NULL pointer
       - parse_options
        - clear_opt(READ_EXTENT_CACHE)
      
      Cc: <stable@vger.kernel.org>
      Reported-by: syzbot+d015b6c2fbb5c383bf08@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/20230522124203.3838360-1-chao@kernel.org
      
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      458c15df
    • f2fs: fix to avoid NULL pointer dereference f2fs_write_end_io() · d8189834
      Chao Yu authored
      butt3rflyh4ck reports a bug as below:
      
      When a thread keeps calling F2FS_IOC_RESIZE_FS to resize the fs and the
      resize fails, the f2fs kernel thread invokes a callback function to
      update f2fs io info; it calls f2fs_write_end_io and may trigger a
      null-ptr-deref in NODE_MAPPING.
      
      general protection fault, probably for non-canonical address
      KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
      RIP: 0010:NODE_MAPPING fs/f2fs/f2fs.h:1972 [inline]
      RIP: 0010:f2fs_write_end_io+0x727/0x1050 fs/f2fs/data.c:370
       <TASK>
       bio_endio+0x5af/0x6c0 block/bio.c:1608
       req_bio_endio block/blk-mq.c:761 [inline]
       blk_update_request+0x5cc/0x1690 block/blk-mq.c:906
       blk_mq_end_request+0x59/0x4c0 block/blk-mq.c:1023
       lo_complete_rq+0x1c6/0x280 drivers/block/loop.c:370
       blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1101
       __do_softirq+0x1d4/0x8ef kernel/softirq.c:571
       run_ksoftirqd kernel/softirq.c:939 [inline]
       run_ksoftirqd+0x31/0x60 kernel/softirq.c:931
       smpboot_thread_fn+0x659/0x9e0 kernel/smpboot.c:164
       kthread+0x33e/0x440 kernel/kthread.c:379
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      The root cause is that the race below can leave dirty metadata
      in f2fs after the filesystem is remounted as ro:
      
      Thread A				Thread B
      - f2fs_ioc_resize_fs
       - f2fs_readonly   --- return false
       - f2fs_resize_fs
      					- f2fs_remount
      					 - write_checkpoint
      					 - set f2fs as ro
        - free_segment_range
         - update meta_inode's data
      
      Then, f2fs_put_super() fails to write_checkpoint due to the readonly
      status, and meta_inode's dirty data is written back after node_inode
      is put; finally, f2fs_write_end_io accesses a NULL pointer on
      sbi->node_inode.
      
      Thread A				IRQ context
      - f2fs_put_super
       - write_checkpoint fails
       - iput(node_inode)
       - node_inode = NULL
       - iput(meta_inode)
        - write_inode_now
         - f2fs_write_meta_page
      					- f2fs_write_end_io
      					 - NODE_MAPPING(sbi)
      					 : access NULL pointer on node_inode
      
      Fixes: b4b10061 ("f2fs: refactor resize_fs to avoid meta updates in progress")
      Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Closes: https://lore.kernel.org/r/1684480657-2375-1-git-send-email-yangtiezhu@loongson.cn
      
      Tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      d8189834
    • f2fs: clean up w/ sbi->log_sectors_per_block · bfd47662
      Chao Yu authored
      
      Use sbi->log_sectors_per_block to replace the locally calculated value below:
      
      unsigned int log_sectors_per_block = sbi->log_blocksize - SECTOR_SHIFT;
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      bfd47662
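The value being cleaned up is the shift that converts a block address into a sector address. The sketch below shows the arithmetic, assuming 512-byte sectors (`SECTOR_SHIFT` of 9); `struct sbi_sketch`, `sbi_init()` and `block_to_sector()` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical subset of the superblock info that caches the shift once. */
struct sbi_sketch {
	unsigned int log_blocksize;
	unsigned int log_sectors_per_block;
};

/* Compute the shift once at setup instead of at every call site. */
static void sbi_init(struct sbi_sketch *sbi, unsigned int log_blocksize)
{
	sbi->log_blocksize = log_blocksize;
	sbi->log_sectors_per_block = log_blocksize - SECTOR_SHIFT;
}

/* Convert a block address into the address of its first sector. */
static uint64_t block_to_sector(const struct sbi_sketch *sbi, uint64_t blkaddr)
{
	return blkaddr << sbi->log_sectors_per_block;
}
```

With 4096-byte blocks, log_blocksize is 12, so the cached shift is 3 and each block spans 8 sectors.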
    • f2fs: fix to set noatime and immutable flag for quota file · 90b7c4b7
      Chao Yu authored
      
      We should set the noatime bit for quota files, since no one cares about
      the atime of a quota file, and we should set the immutable bit as well,
      since nobody should write to the file through exported interfaces.

      Meanwhile, this patch uses inode_lock to avoid a race condition during
      inode->i_flags and f2fs_inode->i_flags updates.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      90b7c4b7
    • f2fs: renew value of F2FS_FEATURE_* · 77e820ea
      Chao Yu authored
      
      Define F2FS_FEATURE_* macros w/ 32-bit values rather than 16-bit values.

      No logic changes.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      77e820ea
    • f2fs: renew value of F2FS_MOUNT_* · 478d7100
      Chao Yu authored
      
      Then we can just define a newly introduced mount option w/ the latest
      free number rather than a random free one.

      Just cleanup, no logic changes.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      478d7100
    • f2fs: fix potential deadlock due to unpaired node_write lock use · f082c6b2
      Chao Yu authored
      If S_NOQUOTA is cleared from the inode during data page writeback of a
      quota file, it may miss unlocking the node_write lock, resulting in a
      potential deadlock. Fix it by using the lock in pairs.
      
      Kworker					Thread
      - writepage
       if (IS_NOQUOTA())
         f2fs_down_read(&sbi->node_write);
      					- vfs_cleanup_quota_inode
      					 - inode->i_flags &= ~S_NOQUOTA;
       if (IS_NOQUOTA())
         f2fs_up_read(&sbi->node_write);
      
      Fixes: 79963d96 ("f2fs: shrink node_write lock coverage")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f082c6b2
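One common way to keep a conditional lock and unlock paired is to sample the condition once and reuse the sampled value for both. The sketch below illustrates that pattern in user space and makes no claim about the exact shape of the patch: `struct fake_inode`, `struct fake_rwsem`, the stub lock helpers, `write_quota_page()` and `clear_noquota()` are all hypothetical, and the `S_NOQUOTA` value is illustrative.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; the real S_NOQUOTA lives in the kernel headers. */
#define S_NOQUOTA 0x1u

/* Hypothetical stand-ins for the inode and the node_write rwsem. */
struct fake_inode { unsigned int i_flags; };
struct fake_rwsem { int readers; };

static void down_read_stub(struct fake_rwsem *s) { s->readers++; }
static void up_read_stub(struct fake_rwsem *s) { s->readers--; }

/* Simulates vfs_cleanup_quota_inode() clearing the flag in mid-write. */
static void clear_noquota(struct fake_inode *inode)
{
	inode->i_flags &= ~S_NOQUOTA;
}

/*
 * Sample the flag once, then use the sampled value for both the lock and
 * the unlock, so a concurrent flag change cannot unbalance them.
 */
static void write_quota_page(struct fake_inode *inode,
			     struct fake_rwsem *node_write,
			     void (*do_write)(struct fake_inode *))
{
	const bool quota_inode = inode->i_flags & S_NOQUOTA;

	if (quota_inode)
		down_read_stub(node_write);

	if (do_write)
		do_write(inode);	/* the flag may change in here */

	if (quota_inode)
		up_read_stub(node_write);
}
```

Passing `clear_noquota` as the write callback reproduces the interleaving in the diagram above; the reader count still returns to zero because the unlock decision no longer re-reads the flag.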
    • f2fs: Fix over-estimating free section during FG GC · 36ded4c1
      Yonggil Song authored
      
      There was a bug where FG GC finished unconditionally because free
      sections were over-estimated after a checkpoint in FG GC.
      This patch initializes sec_freed at every checkpoint in FG GC.
      Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      36ded4c1
    • f2fs: close unused open zones while mounting · 04abeb69
      Daeho Jeong authored
      
      Zoned UFS allows only 6 open zones at the same time, so we need to take
      care of the count of open zones while mounting.
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      04abeb69
  3. 24 May, 2023 2 commits
  4. 08 May, 2023 5 commits
  5. 07 May, 2023 6 commits
    • Linux 6.4-rc1 · ac9a7868
      Linus Torvalds authored
      ac9a7868
    • Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of... · f085df1b
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tool updates from Arnaldo Carvalho de Melo:
       "Third version of perf tool updates, with the build problems with
        using a 'vmlinux.h' generated from the main build fixed, and the bpf
        skeleton build disabled by default.
      
        Build:
      
         - Require libtraceevent to build, one can disable it using
           NO_LIBTRACEEVENT=1.
      
           It is required for tools like 'perf sched', 'perf kvm', 'perf
           trace', etc.
      
           libtraceevent is available in most distros so installing
           'libtraceevent-devel' should be a one-time event to continue
           building perf as usual.
      
           Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
           sufficient for lots of users not interested in those libtraceevent
           dependent features.
      
         - Allow Python support in 'perf script' when libtraceevent isn't
           linked, as not all features require it, for instance Intel PT does
           not use tracepoints.
      
       ...
      f085df1b
    • Merge tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 17784de6
      Linus Torvalds authored
      Pull debugobjects fix from Thomas Gleixner:
       "A single fix for debugobjects:
      
        The recent fix to ensure atomicity of lookup and allocation
        inadvertently broke the pool refill mechanism, so that debugobject
        OOMs now in certain situations. The reason is that the functions which
        got updated no longer invoke debug_objecs_init(), which is now the
        only place to care about refilling the tracking object pool.
      
        Restore the original behaviour by adding explicit refill opportunities
        to those places"
      
      * tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debugobject: Ensure pool refill (again)
      17784de6
    • Merge tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 6f69c981
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - A long-standing bug in crypto_engine
      
       - A buggy but harmless check in the sun8i-ss driver
      
       - A regression in the CRYPTO_USER interface
      
      * tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: api - Fix CRYPTO_USER checks for report function
        crypto: engine - fix crypto_queue backlog handling
        crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()
      6f69c981
    • Merge tag '6.4-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 · 63342b1d
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "smb3 client fixes, mostly DFS or reconnect related:
      
         - Two DFS connection sharing fixes
      
         - DFS refresh fix
      
         - Reconnect fix
      
         - Two potential use after free fixes
      
         - Also print prefix patch in mount debug msg
      
         - Two small cleanup fixes"
      
      * tag '6.4-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Remove unneeded semicolon
        cifs: fix sharing of DFS connections
        cifs: avoid potential races when handling multiple dfs tcons
        cifs: protect access of TCP_Server_Info::{origin,leaf}_fullpath
        cifs: fix potential race when tree connecting ipc
        cifs: fix potential use-after-free bugs in TCP_Server_Info::hostname
        cifs: print smb3_fs_context::source when mounting
        cifs: protect session status check in smb2_reconnect()
        SMB3.1.1: correct definition for app_instance_id create contexts
      63342b1d
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · d6b8a8c4
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A couple more patches that would be good to get into -rc1:
      
         - Revert an i.MX patch that's causing video failures because division
           math goes sideways
      
         - Fix a clang + W=1 build issue where FIELD_PREP() is taking a 32-bit
           variable instead of the usual u64 type
      
         - Fix a Kconfig bug in the StarFive JH7110 clk config that selects a
           reset controller when it can't be selected"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: starfive: Fix RESET_STARFIVE_JH7110 can't be selected in a specified case
        clk: sp7021: Adjust width of _m in HWM_FIELD_PREP()
        Revert "clk: imx: composite-8m: Add support to determine_rate"
      d6b8a8c4