1. 28 Nov, 2022 6 commits
  2. 11 Nov, 2022 7 commits
    • f2fs: fix to set flush_merge opt and show noflush_merge · 967eaad1
      Yangtao Li authored
      Some minor modifications to flush_merge and related parameters:
      
        1. The FLUSH_MERGE opt is set by default only in non-ro mode.
        2. When ro and flush_merge are set at the same time, an error is reported.
        3. Display the noflush_merge mount opt.
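
The three rules above can be sketched in a few lines of C. This is only an illustrative model, not the f2fs code: `demo_mount`, `demo_resolve_flush_merge()` and `demo_show_option()` are hypothetical names, and the tri-state option is an assumption made to distinguish "not given" from an explicit setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tri-state: was the option given at mount time, and how? */
enum demo_opt { DEMO_OPT_UNSET, DEMO_OPT_ON, DEMO_OPT_OFF };

/* Hypothetical stand-in for the parsed mount options. */
struct demo_mount {
    bool read_only;               /* mounted with "ro" */
    enum demo_opt flush_merge;    /* "flush_merge" / "noflush_merge" / absent */
};

/*
 * Resolve the effective setting.
 * Returns 0 and stores the result in *enabled, or -1 when "ro" and
 * "flush_merge" are requested together (rule 2).
 */
static int demo_resolve_flush_merge(const struct demo_mount *m, bool *enabled)
{
    assert(m != NULL && enabled != NULL);

    if (m->read_only && m->flush_merge == DEMO_OPT_ON)
        return -1;                         /* rule 2: conflicting options */

    if (m->flush_merge == DEMO_OPT_UNSET)
        *enabled = !m->read_only;          /* rule 1: default only when not ro */
    else
        *enabled = (m->flush_merge == DEMO_OPT_ON);
    return 0;
}

/* Rule 3: always report one of the two spellings. */
static const char *demo_show_option(bool enabled)
{
    return enabled ? "flush_merge" : "noflush_merge";
}
```

A caller would reject the mount when the function returns -1 and otherwise print the string from `demo_show_option()` in the mount-option listing.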
      Suggested-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: initialize locks earlier in f2fs_fill_super() · 92b4cf5b
      Tetsuo Handa authored
      syzbot reports a lockdep warning in f2fs_handle_error() [1] because
      spin_lock(&sbi->error_lock) is called before spin_lock_init().
      For safe locking in error handling, move the initialization of locks
      (and other obvious structures) in f2fs_fill_super() to immediately
      after memory allocation.
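
The ordering problem can be illustrated with a small user-space model. It is a sketch under stated assumptions, not kernel code: `demo_lock`, `demo_sb_info`, `demo_record_error()` and `demo_fill_super()` are hypothetical names, and the `initialized` flag stands in for the state that lockdep checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical lock that remembers whether it was initialized. */
struct demo_lock {
    bool initialized;
    bool held;
};

/* Hypothetical stand-in for the per-superblock info structure. */
struct demo_sb_info {
    struct demo_lock error_lock;
    unsigned int errors;
};

static void demo_lock_init(struct demo_lock *l)
{
    assert(l != NULL);
    l->initialized = true;
    l->held = false;
}

/* Error path: returns false when the lock is used before initialization. */
static bool demo_record_error(struct demo_sb_info *sbi)
{
    if (!sbi->error_lock.initialized)
        return false;              /* the situation lockdep warns about */
    sbi->error_lock.held = true;
    sbi->errors++;
    sbi->error_lock.held = false;
    return true;
}

/*
 * Model of a fill_super-style function. With init_early set, locks are
 * initialized right after allocation, so an error raised midway through
 * setup can take the lock safely.
 */
static bool demo_fill_super(bool init_early, bool fail_midway)
{
    struct demo_sb_info *sbi = calloc(1, sizeof(*sbi));
    bool ok = true;

    if (sbi == NULL)
        return false;
    if (init_early)
        demo_lock_init(&sbi->error_lock);
    if (fail_midway)
        ok = demo_record_error(sbi);   /* early failure during setup */
    if (!init_early)
        demo_lock_init(&sbi->error_lock);
    free(sbi);
    return ok;
}
```

With late initialization the model only works when nothing fails before the init call; with early initialization every error path is safe.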
      
      Link: https://syzkaller.appspot.com/bug?extid=40642be9b7e0bb28e0df [1]
      Reported-by: syzbot <syzbot+40642be9b7e0bb28e0df@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Tested-by: syzbot <syzbot+40642be9b7e0bb28e0df@syzkaller.appspotmail.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: optimize iteration over sparse directories · 59237a21
      Chao Yu authored
      Wei Chen reports a kernel bug as below:
      
      INFO: task syz-executor.0:29056 blocked for more than 143 seconds.
            Not tainted 5.15.0-rc5 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:syz-executor.0  state:D stack:14632 pid:29056 ppid:  6574 flags:0x00000004
      Call Trace:
       __schedule+0x4a1/0x1720
       schedule+0x36/0xe0
       rwsem_down_write_slowpath+0x322/0x7a0
       fscrypt_ioctl_set_policy+0x11f/0x2a0
       __f2fs_ioctl+0x1a9f/0x5780
       f2fs_ioctl+0x89/0x3a0
       __x64_sys_ioctl+0xe8/0x140
       do_syscall_64+0x34/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Eric investigated this issue; quoting from his reply:
      
      "Well, the quality of this bug report has a lot to be desired (not on
      upstream kernel, reproducer is full of totally irrelevant stuff, not
      sent to the mailing list of the filesystem whose disk image is being
      fuzzed, etc.).  But what is going on is that f2fs_empty_dir() doesn't
      consider the case of a directory with an extremely large i_size on a
      malicious disk image.
      
      Specifically, the reproducer mounts an f2fs image with a directory
      that has an i_size of 14814520042850357248, then calls
      FS_IOC_SET_ENCRYPTION_POLICY on it.
      
      That results in a call to f2fs_empty_dir() to check whether the
      directory is empty.  f2fs_empty_dir() then iterates through all
      3616826182336513 blocks the directory allegedly contains to check
      whether any contain anything.  i_rwsem is held during this, so
      anything else that tries to take it will hang."
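
The two numbers in the quoted reply are consistent with each other if the directory is measured in 4096-byte blocks. The sketch below checks that arithmetic; the 4096-byte block size is an assumption made here, and `demo_dir_blocks()` is a hypothetical helper, not an f2fs function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of blocks needed to cover i_size bytes, rounding up.
 * Written as divide-plus-remainder so that it cannot overflow even
 * for sizes close to UINT64_MAX.
 */
static uint64_t demo_dir_blocks(uint64_t i_size, uint32_t block_size)
{
    assert(block_size != 0);
    return i_size / block_size + (i_size % block_size != 0);
}
```

Calling `demo_dir_blocks(14814520042850357248ULL, 4096)` yields 3616826182336513, the block count mentioned in the report.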
      
      To solve this issue, use f2fs_get_next_page_offset() to speed up
      iteration by skipping holes in all of the following functions:
      - f2fs_empty_dir
      - f2fs_readdir
      - find_in_level
      
      How this speeds up iteration is described in
      'commit 3cf45747 ("f2fs: introduce get_next_page_offset to speed
      up SEEK_DATA")'.
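
The general idea of skipping holes can be sketched with a toy model. It is not how f2fs_get_next_page_offset() is implemented: here a sparse directory is assumed to be a sorted array of allocated block indices, and `sparse_dir`, `next_present_block()` and `count_visits_skipping_holes()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a sparse directory. */
struct sparse_dir {
    const uint64_t *present;   /* sorted indices of allocated blocks */
    size_t n_present;          /* number of allocated blocks */
    uint64_t n_blocks;         /* block count implied by i_size */
};

/* First allocated block index >= from, or n_blocks if there is none. */
static uint64_t next_present_block(const struct sparse_dir *d, uint64_t from)
{
    size_t lo = 0, hi = d->n_present;

    assert(d != NULL);
    while (lo < hi) {                      /* binary search, lower bound */
        size_t mid = lo + (hi - lo) / 2;
        if (d->present[mid] < from)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < d->n_present ? d->present[lo] : d->n_blocks;
}

/* Visit only allocated blocks and return how many iterations that took. */
static uint64_t count_visits_skipping_holes(const struct sparse_dir *d)
{
    uint64_t visits = 0;
    uint64_t b = next_present_block(d, 0);

    while (b < d->n_blocks) {
        visits++;                          /* a real walk would scan dentries here */
        b = next_present_block(d, b + 1);  /* jump over the following hole */
    }
    return visits;
}
```

In this model the loop count depends on the number of allocated blocks rather than on the block count implied by i_size, which is what keeps a huge, mostly empty directory from stalling the caller.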
      
      Meanwhile, in f2fs_empty_dir(), use f2fs_find_data_page() instead of
      f2fs_get_lock_data_page(). Because i_rwsem is held by the caller of
      f2fs_empty_dir(), there shouldn't be any races, so it's fine not to
      lock the dentry page while looking up dirents in it.
      
      Link: https://lore.kernel.org/lkml/536944df-a0ae-1dd8-148f-510b476e1347@kernel.org/T/
      Reported-by: Wei Chen <harperchen1110@gmail.com>
      Cc: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to avoid accessing uninitialized spinlock · cc249e4c
      Chao Yu authored
      syzbot reports a kernel bug:
      
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
       assign_lock_key+0x22a/0x240 kernel/locking/lockdep.c:981
       register_lock_class+0x287/0x9b0 kernel/locking/lockdep.c:1294
       __lock_acquire+0xe4/0x1f60 kernel/locking/lockdep.c:4934
       lock_acquire+0x1a7/0x400 kernel/locking/lockdep.c:5668
       __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
       _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
       spin_lock include/linux/spinlock.h:350 [inline]
       f2fs_save_errors fs/f2fs/super.c:3868 [inline]
       f2fs_handle_error+0x29/0x230 fs/f2fs/super.c:3896
       f2fs_iget+0x215/0x4bb0 fs/f2fs/inode.c:516
       f2fs_fill_super+0x47d3/0x7b50 fs/f2fs/super.c:4222
       mount_bdev+0x26c/0x3a0 fs/super.c:1401
       legacy_get_tree+0xea/0x180 fs/fs_context.c:610
       vfs_get_tree+0x88/0x270 fs/super.c:1531
       do_new_mount+0x289/0xad0 fs/namespace.c:3040
       do_mount fs/namespace.c:3383 [inline]
       __do_sys_mount fs/namespace.c:3591 [inline]
       __se_sys_mount+0x2e3/0x3d0 fs/namespace.c:3568
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      F2FS-fs (loop1): Failed to read F2FS meta data inode
      
      The root cause is that sbi->error_lock may be accessed before
      it is initialized; fix it.
      
      Link: https://lore.kernel.org/linux-f2fs-devel/0000000000007edb6605ecbb6442@google.com/T/#u
      Reported-by: syzbot+40642be9b7e0bb28e0df@syzkaller.appspotmail.com
      Fixes: 95fa90c9 ("f2fs: support recording errors into superblock")
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: correct i_size change for atomic writes · 4d8d45df
      Daeho Jeong authored
      We need to make sure i_size doesn't change until the atomic write
      commit is successful, and restore it when the commit fails.
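
One way to picture the intended behaviour is the small model below. It is a sketch, not the f2fs implementation: `demo_inode` and the `demo_atomic_*()` helpers are hypothetical, and the `commit_ok` flag simply simulates whether the commit succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inode with just enough state for the example. */
struct demo_inode {
    uint64_t i_size;        /* size visible to readers */
    uint64_t orig_i_size;   /* size saved when the atomic write started */
    uint64_t pending_size;  /* size the file will have after commit */
    bool atomic;            /* an atomic write is in progress */
};

static void demo_atomic_begin(struct demo_inode *inode)
{
    assert(!inode->atomic);
    inode->atomic = true;
    inode->orig_i_size = inode->i_size;
    inode->pending_size = inode->i_size;
}

/* Record that data was written up to end_offset; i_size stays untouched. */
static void demo_atomic_write(struct demo_inode *inode, uint64_t end_offset)
{
    assert(inode->atomic);
    if (end_offset > inode->pending_size)
        inode->pending_size = end_offset;
}

/* Publish the new size on success, restore the saved size on failure. */
static bool demo_atomic_commit(struct demo_inode *inode, bool commit_ok)
{
    assert(inode->atomic);
    inode->atomic = false;
    inode->i_size = inode->pending_size;       /* tentative update */
    if (!commit_ok) {
        inode->i_size = inode->orig_i_size;    /* roll back on failure */
        return false;
    }
    return true;
}
```

Readers of `i_size` in this model see the old size for the whole duration of the atomic write, and again after a failed commit.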
      Signed-off-by: Daeho Jeong <daehojeong@google.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: add proc entry to show discard_plist info · 225d6795
      Yangtao Li authored
      This patch adds a new proc entry that shows discard_plist
      information in more detail, which makes the per-entry counts
      of the discard pend list easy to see.

      For example:
      
      Discard pend list(Show diacrd_cmd count on each entry, .:not exist):
        0       390     156      85      67      46      37      26      14
        8        17      12       9       9       6      12      11      10
        16        5       9       2       4       8       3       4       1
        24        3       2       2       5       2       4       5       4
        32        3       3       2       3       .       3       3       1
        40        .       4       1       3       2       1       2       1
        48        1       .       1       1       .       1       1       .
        56        .       1       1       1       .       2       .       1
        64        1       2       .       .       .       .       .       .
        72        .       1       .       .       .       .       .       .
        80        3       1       .       .       1       1       .       .
        88        1       .       .       .       1       .       .       1
      ......
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: allow to read node block after shutdown · e6ecb142
      Jaegeuk Kim authored
      If block address is still alive, we should give a valid node block even after
      shutdown. Otherwise, we can see zero data when reading out a file.
      
      Cc: stable@vger.kernel.org
      Fixes: 83a3bfdb ("f2fs: indicate shutdown f2fs to allow unmount successfully")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  3. 02 Nov, 2022 19 commits
  4. 28 Oct, 2022 1 commit
  5. 25 Oct, 2022 2 commits
  6. 17 Oct, 2022 1 commit
  7. 16 Oct, 2022 4 commits
    • Linux 6.1-rc1 · 9abf2313
      Linus Torvalds authored
    • Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random · f1947d7c
      Linus Torvalds authored
      Pull more random number generator updates from Jason Donenfeld:
       "This time with some large scale treewide cleanups.
      
        The intent of this pull is to clean up the way callers fetch random
        integers. The current rules for doing this right are:
      
         - If you want a secure or an insecure random u64, use get_random_u64()
      
         - If you want a secure or an insecure random u32, use get_random_u32()
      
           The old function prandom_u32() has been deprecated for a while
           now and is just a wrapper around get_random_u32(). Same for
           get_random_int().
      
         - If you want a secure or an insecure random u16, use get_random_u16()
      
         - If you want a secure or an insecure random u8, use get_random_u8()
      
         - If you want secure or insecure random bytes, use get_random_bytes().
      
           The old function prandom_bytes() has been deprecated for a while
           now and has long been a wrapper around get_random_bytes()
      
         - If you want a non-uniform random u32, u16, or u8 bounded by a
           certain open interval maximum, use prandom_u32_max()
      
           I say "non-uniform", because it doesn't do any rejection sampling
           or divisions. Hence, it stays within the prandom_*() namespace, not
           the get_random_*() namespace.
      
           I'm currently investigating a "uniform" function for 6.2. We'll see
           what comes of that.
      
        By applying these rules uniformly, we get several benefits:
      
         - By using prandom_u32_max() with an upper-bound that the compiler
           can prove at compile-time is ≤65536 or ≤256, internally
           get_random_u16() or get_random_u8() is used, which wastes fewer
           batched random bytes, and hence has higher throughput.
      
         - By using prandom_u32_max() instead of %, when the upper-bound is
           not a constant, division is still avoided, because
           prandom_u32_max() uses a faster multiplication-based trick instead.
      
         - By using get_random_u16() or get_random_u8() in cases where the
           return value is intended to indeed be a u16 or a u8, we waste fewer
           batched random bytes, and hence have higher throughput.
      
        This series was originally done by hand while I was on an airplane
        without Internet. Later, Kees and I worked on retroactively figuring
        out what could be done with Coccinelle and what had to be done
        manually, and then we split things up based on that.
      
        So while this touches a lot of files, the actual amount of code that's
        hand fiddled is comfortably small"
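
The "multiplication-based trick" mentioned in the pull message can be sketched as follows. The helper name `demo_bounded()` is hypothetical, and the random input is passed in as a parameter so the example stays deterministic; in practice it would come from a random source such as get_random_u32().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a 32-bit random value into [0, ceil) without division:
 * multiply into 64 bits and keep the upper 32 bits.
 * As the pull message notes, the result is not perfectly uniform,
 * because no rejection sampling is done.
 */
static uint32_t demo_bounded(uint32_t rnd, uint32_t ceil)
{
    assert(ceil != 0);   /* [0, 0) is empty, so a zero bound makes no sense */
    return (uint32_t)(((uint64_t)rnd * ceil) >> 32);
}
```

For a bound of 10, an input of 0 maps to 0, 0x80000000 maps to 5, and 0xFFFFFFFF maps to 9, so the output always stays below the bound.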
      
      * tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
        prandom: remove unused functions
        treewide: use get_random_bytes() when possible
        treewide: use get_random_u32() when possible
        treewide: use get_random_{u8,u16}() when possible, part 2
        treewide: use get_random_{u8,u16}() when possible, part 1
        treewide: use prandom_u32_max() when possible, part 2
        treewide: use prandom_u32_max() when possible, part 1
    • Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · 8636df94
      Linus Torvalds authored
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
      
       - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
         when using bperf (perf BPF based counters) with cgroups.
      
       - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
         monitors bandwidth, latency, bus utilization and buffer occupancy.
      
         Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.
      
       - User space tasks can migrate between CPUs, so when tracing selected
         CPUs, system-wide sideband is still needed, fix it in the setup of
         Intel PT on hybrid systems.
      
       - Fix metricgroups title message in 'perf list', it should state that
         the metrics groups are to be used with the '-M' option, not '-e'.
      
       - Sync the msr-index.h copy with the kernel sources, adding support for
         using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
         as decoding it when printing the MSR tracepoint arguments.
      
       - Fix program header size and alignment when generating a JIT ELF in
         'perf inject'.
      
       - Add multiple new Intel PT 'perf test' entries, including a jitdump
         one.
      
       - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
         running on PowerPC due to an invalid topology number in that arch.
      
       - Fix the 'perf test' for arm_coresight failures on the ARM Juno
         system.
      
       - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
         option to the or expression expected in the intercepted
         perf_event_open() syscall.
      
       - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
         'perf annotate' asm parser.
      
       - Fix 'perf mem record -C' option processing, it was being chopped up
         when preparing the underlying 'perf record -e mem-events' and thus
         being ignored, requiring using '-- -C CPUs' as a workaround.
      
       - Improvements and tidy ups for 'perf test' shell infra.
      
       - Fix Intel PT information printing segfault in uClibc, where a NULL
         format was being passed to fprintf.
      
      * tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
        perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
        perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
        perf tests stat+json_output: Include sanity check for topology
        perf tests stat+csv_output: Include sanity check for topology
        perf intel-pt: Fix system_wide dummy event for hybrid
        perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
        perf test: Fix attr tests for PERF_FORMAT_LOST
        perf test: test_intel_pt.sh: Add 9 tests
        perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
        perf test: test_intel_pt.sh: Add jitdump test
        perf test: test_intel_pt.sh: Tidy some alignment
        perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
        perf test: test_intel_pt.sh: Tidy some perf record options
        perf test: test_intel_pt.sh: Fix return checking again
        perf: Skip and warn on unknown format 'configN' attrs
        perf list: Fix metricgroups title message
        perf mem: Fix -C option behavior for perf mem record
        perf annotate: Add missing condition flags for arm64
        ...
    • Merge tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 2df76606
      Linus Torvalds authored
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y compile error for the
         combination of Clang >= 14 and GAS <= 2.35.
      
       - Drop vmlinux.bz2 from the rpm package as it just annoyingly increased
         the package size.
      
       - Fix modpost error under build environments using musl.
      
       - Make *.ll files keep value names for easier debugging
      
       - Fix single directory build
      
       - Prevent RISC-V from selecting the broken DWARF5 support when Clang
         and GAS are used together.
      
      * tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
        kbuild: fix single directory build
        kbuild: add -fno-discard-value-names to cmd_cc_ll_c
        scripts/clang-tools: Convert clang-tidy args to list
        modpost: put modpost options before argument
        kbuild: Stop including vmlinux.bz2 in the rpm's
        Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
        Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5