1. 10 Jul, 2024 5 commits
    • Sunmin Jeong's avatar
      f2fs: use meta inode for GC of atomic file · b40a2b00
      Sunmin Jeong authored
      The page cache of the atomic file keeps new data pages which will be
      stored in the COW file. It can also keep old data pages when GCing the
      atomic file. In this case, new data can be overwritten by old data if a
      GC thread sets the old data page as dirty after new data page was
      evicted.
      
      Also, since all writes to the atomic file are redirected to COW inodes,
      GC for the atomic file is not working well as below.
      
      f2fs_gc(gc_type=FG_GC)
        - select A as a victim segment
        do_garbage_collect
          - iget atomic file's inode for block B
          move_data_page
            f2fs_do_write_data_page
              - use dn of cow inode
              - set fio->old_blkaddr from cow inode
          - seg_freed is 0 since block B is still valid
        - goto gc_more and A is selected as victim again
      
      To solve the problem, let's separate GC writes and updates in the atomic
      file by using the meta inode for GC writes.
      
      Fixes: 3db1de0e ("f2fs: change the current atomic write way")
      Cc: stable@vger.kernel.org #v5.19+
      Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
      Reviewed-by: Yeongjin Gil <youngjin.gil@samsung.com>
      Signed-off-by: Sunmin Jeong <s_min.jeong@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      b40a2b00
    • Sheng Yong's avatar
      f2fs: only fragment segment in the same section · e3a19972
      Sheng Yong authored
      When new_curseg() is allocating a new segment, if mode=fragment:xxx is
      switched on in large section scenario, __get_next_segno() will select
      the next segno randomly in the range of [0, maxsegno] in order to
      fragment segments.
      
      If the candidate segno is free, get_new_segment() will use it directly
      as the new segment.
      
      However, if the section of the candidate is not empty, and some of its
      other segments are already in use with a different type (e.g. NODE)
      from the candidate (e.g. DATA), GC will later complain about an
      inconsistent segment type.
      
      This could be reproduced by the following steps:
      
        dd if=/dev/zero of=test.img bs=1M count=10240
        mkfs.f2fs -s 128 test.img
        mount -t f2fs test.img /mnt -o mode=fragment:block
        echo 1 > /sys/fs/f2fs/loop0/max_fragment_chunk
        echo 512 > /sys/fs/f2fs/loop0/max_fragment_hole
        dd if=/dev/zero of=/mnt/testfile bs=4K count=100
        umount /mnt
      
        F2FS-fs (loop0): Inconsistent segment (4377) type [0, 1] in SSA and SIT
      
      In order to allow simulating segment fragmentation in large section
      scenario, this patch reduces the candidate range:
       * if curseg is the last segment in the section, return curseg->segno
         to make get_new_segment() itself find the next free segment.
       * if curseg is in the middle of the section, select a candidate randomly
         in the range of [curseg + 1, last_seg_in_the_same_section] to keep
         type consistent.
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Sheng Yong <shengyong@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      e3a19972
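The narrowed candidate range described in the commit above can be sketched as follows. This is a minimal Python model, not the kernel code: the function name `next_fragment_segno`, the `segs_per_sec` parameter, and the assumption that each section covers an aligned run of `segs_per_sec` consecutive segment numbers are all illustrative.

```python
import random


def next_fragment_segno(curseg_segno, segs_per_sec, rng=random):
    """Pick the next candidate segment number in fragment mode.

    Hypothetical model: sections are assumed to be aligned groups of
    `segs_per_sec` consecutive segment numbers.
    """
    # First and last segment numbers of the section that holds curseg.
    sec_start = (curseg_segno // segs_per_sec) * segs_per_sec
    last_in_sec = sec_start + segs_per_sec - 1

    # Case 1: curseg is the last segment in its section. Return it as-is
    # so that the caller's own search finds the next free segment.
    if curseg_segno == last_in_sec:
        return curseg_segno

    # Case 2: curseg is in the middle of the section. Stay inside the
    # same section so that all of its segments keep one type.
    return rng.randint(curseg_segno + 1, last_in_sec)
```

With 128 segments per section, as in the reproducer (`mkfs.f2fs -s 128`), a curseg at segment 5 yields a candidate in [6, 127], while a curseg at segment 127 is returned unchanged.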
    • Chao Yu's avatar
      f2fs: fix to update user block counts in block_operations() · f06c0f82
      Chao Yu authored
      Commit 59c9081b ("f2fs: allow write page cache when writting cp")
      allows write() to write data to page cache during checkpoint, so block
      count fields like .total_valid_block_count, .alloc_valid_block_count
      and .rf_node_block_count may encounter race condition as below:
      
      CP				Thread A
      - write_checkpoint
       - block_operations
        - f2fs_down_write(&sbi->node_change)
        - __prepare_cp_block
        : ckpt->valid_block_count = .total_valid_block_count
        - f2fs_up_write(&sbi->node_change)
      				- write
      				 - f2fs_preallocate_blocks
      				  - f2fs_map_blocks(,F2FS_GET_BLOCK_PRE_AIO)
      				   - f2fs_map_lock
      				    - f2fs_down_read(&sbi->node_change)
      				   - f2fs_reserve_new_blocks
      				    - inc_valid_block_count
      				    : percpu_counter_add(&sbi->alloc_valid_block_count, count)
      				    : sbi->total_valid_block_count += count
      				    - f2fs_up_read(&sbi->node_change)
       - do_checkpoint
       : sbi->last_valid_block_count = sbi->total_valid_block_count
       : percpu_counter_set(&sbi->alloc_valid_block_count, 0)
       : percpu_counter_set(&sbi->rf_node_block_count, 0)
      				- fsync
      				 - need_do_checkpoint
      				  - f2fs_space_for_roll_forward
      				  : alloc_valid_block_count was reset to zero,
      				    so it may miss the last data during checkpoint
      
      Let's update .total_valid_block_count, .alloc_valid_block_count and
      .rf_node_block_count in block_operations() instead, so that access to
      them is protected by the .node_change and .cp_rwsem locks, which avoids
      the race condition above.
      
      Fixes: 59c9081b ("f2fs: allow write page cache when writting cp")
      Cc: Yunlei He <heyunlei@oppo.com>
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      f06c0f82
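The interleaving in the commit above can be replayed deterministically with a toy model. This is only a sketch under stated assumptions: the `Counters` class and the `racy_checkpoint` / `fixed_checkpoint` helpers are hypothetical, locks are modeled purely by the order of the statements, and only two of the counters are tracked.

```python
class Counters:
    """Toy stand-in for the block counters kept in the superblock info."""

    def __init__(self, total):
        self.total_valid_block_count = total
        self.last_valid_block_count = total
        self.alloc_valid_block_count = 0


def reserve_blocks(sbi, count):
    # Models the writer side: both counters grow by `count`.
    sbi.alloc_valid_block_count += count
    sbi.total_valid_block_count += count


def racy_checkpoint(sbi, writer):
    # Snapshot taken while the lock is held.
    ckpt_valid = sbi.total_valid_block_count
    # Lock released: a concurrent write() slips in here.
    writer(sbi)
    # Counters are only reset later, after the writer already ran.
    sbi.last_valid_block_count = sbi.total_valid_block_count
    sbi.alloc_valid_block_count = 0
    return ckpt_valid


def fixed_checkpoint(sbi, writer):
    # Snapshot and counter reset happen together under the lock.
    ckpt_valid = sbi.total_valid_block_count
    sbi.last_valid_block_count = sbi.total_valid_block_count
    sbi.alloc_valid_block_count = 0
    # The writer can only run after the lock is released.
    writer(sbi)
    return ckpt_valid
```

Starting from 100 valid blocks with a writer that reserves 8, the racy ordering leaves the checkpoint snapshot at 100 while `last_valid_block_count` is 108 and `alloc_valid_block_count` is 0, so the 8 new blocks are invisible to later accounting. The fixed ordering keeps the snapshot and `last_valid_block_count` both at 100 and leaves `alloc_valid_block_count` at 8.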
    • Eric Sandeen's avatar
      f2fs: remove unreachable lazytime mount option parsing · 54f43a10
      Eric Sandeen authored
      The lazytime/nolazytime options are now handled in the VFS, and are
      never seen in filesystem parsers, so remove handling of these
      options from f2fs.
      
      Note: when lazytime support was added in 6d94c74a it made
      lazytime the default in default_options() - as a result, lazytime
      cannot be disabled (because Opt_nolazytime is never seen in f2fs
      parsing).
      
      If lazytime is desired to be configurable, and default off is OK,
      default_options() could be updated to stop setting it by default
      and allow mount option control.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      54f43a10
    • Daejun Park's avatar
      f2fs: fix null reference error when checking end of zone · c82bc1ab
      Daejun Park authored
      This patch fixes a potential NULL pointer dereference in
      is_end_zone_blkaddr(), which checks the last block of a zone,
      when f2fs is mounted as a single device.
      
      Fixes: e067dc3c ("f2fs: maintain six open zones for zoned devices")
      Signed-off-by: Daejun Park <daejun7.park@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Reviewed-by: Daeho Jeong <daehojeong@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c82bc1ab
  2. 09 Jul, 2024 1 commit
  3. 24 Jun, 2024 1 commit
  4. 21 Jun, 2024 1 commit
    • Jaegeuk Kim's avatar
      f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid · 8cb1f408
      Jaegeuk Kim authored
      mkdir /mnt/test/comp
      f2fs_io setflags compression /mnt/test/comp
      dd if=/dev/zero of=/mnt/test/comp/testfile bs=16k count=1
      truncate --size 13 /mnt/test/comp/testfile
      
      In the above scenario, we can get a BUG_ON.
       kernel BUG at fs/f2fs/segment.c:3589!
       Call Trace:
        do_write_page+0x78/0x390 [f2fs]
        f2fs_outplace_write_data+0x62/0xb0 [f2fs]
        f2fs_do_write_data_page+0x275/0x740 [f2fs]
        f2fs_write_single_data_page+0x1dc/0x8f0 [f2fs]
        f2fs_write_multi_pages+0x1e5/0xae0 [f2fs]
        f2fs_write_cache_pages+0xab1/0xc60 [f2fs]
        f2fs_write_data_pages+0x2d8/0x330 [f2fs]
        do_writepages+0xcf/0x270
        __writeback_single_inode+0x44/0x350
        writeback_sb_inodes+0x242/0x530
        __writeback_inodes_wb+0x54/0xf0
        wb_writeback+0x192/0x310
        wb_workfn+0x30d/0x400
      
      The reason is that we assigned CURSEG_ALL_DATA_ATGC to COMPR_ADDR,
      where the page had its gcing flag set by set_cluster_dirty().
      
      Cc: stable@vger.kernel.org
      Fixes: 4961acdd ("f2fs: fix to tag gcing flag on page during block migration")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Tested-by: Will McVicker <willmcvicker@google.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      8cb1f408
  5. 18 Jun, 2024 3 commits
  6. 12 Jun, 2024 12 commits
    • Chao Yu's avatar
      f2fs: fix to truncate preallocated blocks in f2fs_file_open() · 298b1e41
      Chao Yu authored
      chenyuwen reports an f2fs bug as below:
      
      Unable to handle kernel NULL pointer dereference at virtual address 0000000000000011
       fscrypt_set_bio_crypt_ctx+0x78/0x1e8
       f2fs_grab_read_bio+0x78/0x208
       f2fs_submit_page_read+0x44/0x154
       f2fs_get_read_data_page+0x288/0x5f4
       f2fs_get_lock_data_page+0x60/0x190
       truncate_partial_data_page+0x108/0x4fc
       f2fs_do_truncate_blocks+0x344/0x5f0
       f2fs_truncate_blocks+0x6c/0x134
       f2fs_truncate+0xd8/0x200
       f2fs_iget+0x20c/0x5ac
       do_garbage_collect+0x5d0/0xf6c
       f2fs_gc+0x22c/0x6a4
       f2fs_disable_checkpoint+0xc8/0x310
       f2fs_fill_super+0x14bc/0x1764
       mount_bdev+0x1b4/0x21c
       f2fs_mount+0x20/0x30
       legacy_get_tree+0x50/0xbc
       vfs_get_tree+0x5c/0x1b0
       do_new_mount+0x298/0x4cc
       path_mount+0x33c/0x5fc
       __arm64_sys_mount+0xcc/0x15c
       invoke_syscall+0x60/0x150
       el0_svc_common+0xb8/0xf8
       do_el0_svc+0x28/0xa0
       el0_svc+0x24/0x84
       el0t_64_sync_handler+0x88/0xec
      
      This is because inode.i_crypt_info is not initialized in the path below:
      - mount
       - f2fs_fill_super
        - f2fs_disable_checkpoint
         - f2fs_gc
          - f2fs_iget
           - f2fs_truncate
      
      So, let's relocate truncation of preallocated blocks to f2fs_file_open(),
      after fscrypt_file_open().
      
      Fixes: d4dd19ec ("f2fs: do not expose unwritten blocks to user by DIO")
      Reported-by: chenyuwen <yuwen.chen@xjmz.com>
      Closes: https://lore.kernel.org/linux-kernel/20240517085327.1188515-1-yuwen.chen@xjmz.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      298b1e41
    • Chao Yu's avatar
      f2fs: fix to cover read extent cache access with lock · d7409b05
      Chao Yu authored
      syzbot reports an f2fs bug as below:
      
      BUG: KASAN: slab-use-after-free in sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
      Read of size 4 at addr ffff8880739ab220 by task syz-executor200/5097
      
      CPU: 0 PID: 5097 Comm: syz-executor200 Not tainted 6.9.0-rc6-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
       print_address_description mm/kasan/report.c:377 [inline]
       print_report+0x169/0x550 mm/kasan/report.c:488
       kasan_report+0x143/0x180 mm/kasan/report.c:601
       sanity_check_extent_cache+0x370/0x410 fs/f2fs/extent_cache.c:46
       do_read_inode fs/f2fs/inode.c:509 [inline]
       f2fs_iget+0x33e1/0x46e0 fs/f2fs/inode.c:560
       f2fs_nfs_get_inode+0x74/0x100 fs/f2fs/super.c:3237
       generic_fh_to_dentry+0x9f/0xf0 fs/libfs.c:1413
       exportfs_decode_fh_raw+0x152/0x5f0 fs/exportfs/expfs.c:444
       exportfs_decode_fh+0x3c/0x80 fs/exportfs/expfs.c:584
       do_handle_to_path fs/fhandle.c:155 [inline]
       handle_to_path fs/fhandle.c:210 [inline]
       do_handle_open+0x495/0x650 fs/fhandle.c:226
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      We failed to cover sanity_check_extent_cache() with the extent cache
      lock, so the race below may happen and result in a use-after-free issue.
      
      - f2fs_iget
       - do_read_inode
        - f2fs_init_read_extent_tree
        : add largest extent entry in to cache
      					- shrink
      					 - f2fs_shrink_read_extent_tree
      					  - __shrink_extent_tree
      					   - __detach_extent_node
      					   : drop largest extent entry
        - sanity_check_extent_cache
        : access et->largest w/o lock
      
      Let's refactor sanity_check_extent_cache() to avoid extent cache access
      and call it before f2fs_init_read_extent_tree() to fix this issue.
      
      Reported-by: syzbot+74ebe2104433e9dc610d@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/00000000000009beea061740a531@google.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      d7409b05
    • Chao Yu's avatar
      f2fs: fix return value of f2fs_convert_inline_inode() · a8eb3de2
      Chao Yu authored
      If the device is readonly, make f2fs_convert_inline_inode()
      return EROFS instead of zero; otherwise it may trigger a
      panic during writeback of the inline inode's dirty page as
      below:
      
       f2fs_write_single_data_page+0xbb6/0x1e90 fs/f2fs/data.c:2888
       f2fs_write_cache_pages fs/f2fs/data.c:3187 [inline]
       __f2fs_write_data_pages fs/f2fs/data.c:3342 [inline]
       f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3369
       do_writepages+0x359/0x870 mm/page-writeback.c:2634
       filemap_fdatawrite_wbc+0x125/0x180 mm/filemap.c:397
       __filemap_fdatawrite_range mm/filemap.c:430 [inline]
       file_write_and_wait_range+0x1aa/0x290 mm/filemap.c:788
       f2fs_do_sync_file+0x68a/0x1ae0 fs/f2fs/file.c:276
       generic_write_sync include/linux/fs.h:2806 [inline]
       f2fs_file_write_iter+0x7bd/0x24e0 fs/f2fs/file.c:4977
       call_write_iter include/linux/fs.h:2114 [inline]
       new_sync_write fs/read_write.c:497 [inline]
       vfs_write+0xa72/0xc90 fs/read_write.c:590
       ksys_write+0x1a0/0x2c0 fs/read_write.c:643
       do_syscall_x64 arch/x86/entry/common.c:52 [inline]
       do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+848062ba19c8782ca5c8@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000d103ce06174d7ec3@google.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      a8eb3de2
    • Zhiguo Niu's avatar
      f2fs: use new ioprio Macro to get ckpt thread ioprio level · 270b0931
      Zhiguo Niu authored
      IOPRIO_PRIO_DATA in newer kernel versions includes both level and hint,
      so the IOPRIO_PRIO_LEVEL macro is more accurate for getting the ckpt
      thread ioprio level, and it is also consistent with the way the ckpt
      thread ioprio is set by IOPRIO_PRIO_VALUE(class, data/level).
      
      Besides, rename the variable from "data" to "level" for readability.
      Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      270b0931
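The difference between the "data" and "level" fields mentioned in the commit above can be illustrated with plain bit arithmetic. The constants below mirror my understanding of the layout in the kernel's uapi `linux/ioprio.h` (class in the top bits, a 3-bit level, a 10-bit hint above the level); treat them as assumptions and check them against the header in your tree. The Python function names are illustrative stand-ins for the C macros.

```python
# Assumed bit layout of an ioprio value (verify against linux/ioprio.h).
IOPRIO_CLASS_SHIFT = 13
IOPRIO_PRIO_MASK = (1 << IOPRIO_CLASS_SHIFT) - 1   # class-independent bits
IOPRIO_LEVEL_NR_BITS = 3
IOPRIO_LEVEL_MASK = (1 << IOPRIO_LEVEL_NR_BITS) - 1
IOPRIO_HINT_SHIFT = IOPRIO_LEVEL_NR_BITS
IOPRIO_HINT_NR_BITS = 10
IOPRIO_HINT_MASK = (1 << IOPRIO_HINT_NR_BITS) - 1
IOPRIO_CLASS_BE = 2


def ioprio_prio_value(prio_class, data):
    """Pack a class and its data bits into one ioprio value."""
    return (prio_class << IOPRIO_CLASS_SHIFT) | (data & IOPRIO_PRIO_MASK)


def ioprio_prio_class(ioprio):
    return ioprio >> IOPRIO_CLASS_SHIFT


def ioprio_prio_data(ioprio):
    """Everything below the class bits: level AND hint together."""
    return ioprio & IOPRIO_PRIO_MASK


def ioprio_prio_level(ioprio):
    """Only the priority level, with the hint bits stripped."""
    return ioprio & IOPRIO_LEVEL_MASK


def ioprio_prio_hint(ioprio):
    return (ioprio >> IOPRIO_HINT_SHIFT) & IOPRIO_HINT_MASK
```

For a best-effort value built from level 3 and hint 1, the data field reads back as 11 while the level reads back as 3, which is why reading the level directly is the more accurate choice when only the level is wanted.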
    • Chao Yu's avatar
      f2fs: fix to don't dirty inode for readonly filesystem · 192b8fb8
      Chao Yu authored
      syzbot reports an f2fs bug as below:
      
      kernel BUG at fs/f2fs/inode.c:933!
      RIP: 0010:f2fs_evict_inode+0x1576/0x1590 fs/f2fs/inode.c:933
      Call Trace:
       evict+0x2a4/0x620 fs/inode.c:664
       dispose_list fs/inode.c:697 [inline]
       evict_inodes+0x5f8/0x690 fs/inode.c:747
       generic_shutdown_super+0x9d/0x2c0 fs/super.c:675
       kill_block_super+0x44/0x90 fs/super.c:1667
       kill_f2fs_super+0x303/0x3b0 fs/f2fs/super.c:4894
       deactivate_locked_super+0xc1/0x130 fs/super.c:484
       cleanup_mnt+0x426/0x4c0 fs/namespace.c:1256
       task_work_run+0x24a/0x300 kernel/task_work.c:180
       ptrace_notify+0x2cd/0x380 kernel/signal.c:2399
       ptrace_report_syscall include/linux/ptrace.h:411 [inline]
       ptrace_report_syscall_exit include/linux/ptrace.h:473 [inline]
       syscall_exit_work kernel/entry/common.c:251 [inline]
       syscall_exit_to_user_mode_prepare kernel/entry/common.c:278 [inline]
       __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
       syscall_exit_to_user_mode+0x15c/0x280 kernel/entry/common.c:296
       do_syscall_64+0x50/0x110 arch/x86/entry/common.c:88
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The root cause is:
      - do_sys_open
       - f2fs_lookup
        - __f2fs_find_entry
         - f2fs_i_depth_write
          - f2fs_mark_inode_dirty_sync
           - f2fs_dirty_inode
            - set_inode_flag(inode, FI_DIRTY_INODE)
      
      - umount
       - kill_f2fs_super
        - kill_block_super
         - generic_shutdown_super
          - sync_filesystem
          : sb is readonly, skip sync_filesystem()
          - evict_inodes
           - iput
            - f2fs_evict_inode
             - f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE))
             : trigger kernel panic
      
      When we try to repair i_current_depth in a readonly filesystem, let's
      skip dirtying the inode to avoid a panic in the later f2fs_evict_inode().
      
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+31e4659a3fe953aec2f4@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000e890bc0609a55cff@google.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      192b8fb8
    • Zhiguo Niu's avatar
      f2fs: fix to avoid use SSR allocate when do defragment · 21327a04
      Zhiguo Niu authored
      SSR allocate mode will be used during file defragmentation if ATGC is
      working at the same time. This is because set_page_private_gcing may
      cause the CURSEG_ALL_DATA_ATGC segment type to be chosen in
      f2fs_allocate_data_block when the defragmented page is written back,
      which may make file fragmentation worse.
      
      A file with 2 fragments changes as follows after defragmentation:
      
      ----------------file info-------------------
      sensorsdata :
      --------------------------------------------
      dev       [254:48]
      ino       [0x    3029 : 12329]
      mode      [0x    81b0 : 33200]
      nlink     [0x       1 : 1]
      uid       [0x    27e6 : 10214]
      gid       [0x    27e6 : 10214]
      size      [0x  242000 : 2367488]
      blksize   [0x    1000 : 4096]
      blocks    [0x    1210 : 4624]
      --------------------------------------------
      
      file_pos   start_blk     end_blk        blks
             0    11361121    11361207          87
        356352    11361215    11361216           2
        364544    11361218    11361218           1
        368640    11361220    11361221           2
        376832    11361224    11361225           2
        385024    11361227    11361238          12
        434176    11361240    11361252          13
        487424    11361254    11361254           1
        491520    11361271    11361279           9
        528384     3681794     3681795           2
        536576     3681797     3681797           1
        540672     3681799     3681799           1
        544768     3681803     3681803           1
        548864     3681805     3681805           1
        552960     3681807     3681807           1
        557056     3681809     3681809           1
      Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      21327a04
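The extent listing in the commit above can be checked mechanically. The sketch below copies the sixteen rows shown there and verifies that each `blks` value matches its block range and that the file offsets advance by 4096 bytes per block (the `blksize` in the file info). The `summarize_extents` helper is hypothetical, written only for this illustration.

```python
BLKSIZE = 4096

# (file_pos, start_blk, end_blk, blks), copied from the listing above.
EXTENTS = [
    (0, 11361121, 11361207, 87),
    (356352, 11361215, 11361216, 2),
    (364544, 11361218, 11361218, 1),
    (368640, 11361220, 11361221, 2),
    (376832, 11361224, 11361225, 2),
    (385024, 11361227, 11361238, 12),
    (434176, 11361240, 11361252, 13),
    (487424, 11361254, 11361254, 1),
    (491520, 11361271, 11361279, 9),
    (528384, 3681794, 3681795, 2),
    (536576, 3681797, 3681797, 1),
    (540672, 3681799, 3681799, 1),
    (544768, 3681803, 3681803, 1),
    (548864, 3681805, 3681805, 1),
    (552960, 3681807, 3681807, 1),
    (557056, 3681809, 3681809, 1),
]


def summarize_extents(extents, blksize=BLKSIZE):
    """Validate the rows and return (fragments, total_blocks, end_offset)."""
    fragments = 0
    total_blocks = 0
    expected_pos = extents[0][0]
    prev_end = None
    for file_pos, start_blk, end_blk, blks in extents:
        if end_blk - start_blk + 1 != blks:
            raise ValueError(f"block count mismatch at offset {file_pos}")
        if file_pos != expected_pos:
            raise ValueError(f"unexpected file offset {file_pos}")
        # A new fragment starts whenever the physical run is not contiguous.
        if prev_end is None or start_blk != prev_end + 1:
            fragments += 1
        total_blocks += blks
        expected_pos += blks * blksize
        prev_end = end_blk
    return fragments, total_blocks, expected_pos
```

Calling `summarize_extents(EXTENTS)` reports 16 physically discontiguous fragments covering 137 blocks, ending at byte offset 561152, so the rows shown cover only the first part of the 2367488-byte file.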
    • Chao Yu's avatar
      f2fs: fix to force buffered IO on inline_data inode · 5c8764f8
      Chao Yu authored
      DIO reads from an inline_data inode return all-zero data, because
      f2fs_iomap_begin() incorrectly assigns IOMAP_HOLE to iomap->type in
      this case.
      
      We could let the iomap framework handle inline data by assigning
      iomap->type and iomap->inline_data correctly; however, handling the
      race between direct IO and buffered IO would be a little complicated.
      
      So, let's force buffered IO to fix this issue.
      
      Cc: stable@vger.kernel.org
      Reported-by: Barry Song <v-songbaohua@oppo.com>
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      5c8764f8
    • Zhiguo Niu's avatar
      f2fs: fix to remove redundant SBI_NEED_FSCK flag set · 6924c8b6
      Zhiguo Niu authored
      The subsequent f2fs_stop_checkpoint will set cp_err, so setting the
      SBI_NEED_FSCK flag here is redundant.
      Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      6924c8b6
    • Sheng Yong's avatar
      f2fs: alloc new section if curseg is not the first seg in its zone · 76da333f
      Sheng Yong authored
      If curseg is not the first segment in its zone, the zone is not empty.
      A new section should be allocated, and resetting the old zone should be
      avoided.
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Sheng Yong <shengyong@oppo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      76da333f
    • Chao Yu's avatar
      f2fs: add support for FS_IOC_GETFSSYSFSPATH · cc260b66
      Chao Yu authored
      The FS_IOC_GETFSSYSFSPATH ioctl expects the sysfs sub-path of a
      filesystem, in the format "$FSTYP/$SYSFS_IDENTIFIER" under /sys/fs. It
      helps to standardize exporting sysfs data across filesystems.
      
      This patch wires up FS_IOC_GETFSSYSFSPATH for f2fs; it will output
      "f2fs/<dev>".
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      cc260b66
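A userspace caller of the ioctl from the commit above might look like the sketch below. Several things here are assumptions rather than facts taken from the commit: that the request is defined as `_IOR(0x15, 1, struct fs_sysfs_path)`, that the struct is a one-byte length followed by a 128-byte name, and that the generic `_IOC` bit layout applies (it differs on some architectures). The helper names are hypothetical; check the values against `linux/fs.h` before relying on them.

```python
import fcntl
import os

# Generic _IOC encoding (assumed; some architectures use other layouts).
_IOC_READ = 2


def _ior(ioc_type, nr, size):
    return (_IOC_READ << 30) | (size << 16) | (ioc_type << 8) | nr


# Assumed layout of struct fs_sysfs_path: u8 len + u8 name[128].
FS_SYSFS_PATH_SIZE = 1 + 128
FS_IOC_GETFSSYSFSPATH = _ior(0x15, 1, FS_SYSFS_PATH_SIZE)


def decode_fs_sysfs_path(buf):
    """Decode a raw fs_sysfs_path buffer into a string."""
    length = buf[0]
    return bytes(buf[1:1 + length]).decode()


def get_fs_sysfs_path(path):
    """Return the sysfs sub-path for the filesystem holding `path`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(FS_SYSFS_PATH_SIZE)
        # The kernel fills the mutable buffer in place.
        fcntl.ioctl(fd, FS_IOC_GETFSSYSFSPATH, buf)
        return decode_fs_sysfs_path(buf)
    finally:
        os.close(fd)
```

On a kernel with this patch, `get_fs_sysfs_path("/mnt")` for an f2fs mount would be expected to return a string of the form "f2fs/<dev>"; on filesystems or kernels without support, `fcntl.ioctl` raises `OSError`.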
    • Chao Yu's avatar
      f2fs: fix to do sanity check on blocks for inline_data inode · c240c87b
      Chao Yu authored
      An inode can be fuzzed so that it has both the F2FS_INLINE_DATA flag
      and a valid i_blocks/i_nid value. This patch adds an extra sanity check
      to detect such a corrupted state.
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      c240c87b
    • Chao Yu's avatar
      f2fs: fix to do sanity check on F2FS_INLINE_DATA flag in inode during GC · fc01008c
      Chao Yu authored
      syzbot reports an f2fs bug as below:
      
      ------------[ cut here ]------------
      kernel BUG at fs/f2fs/inline.c:258!
      CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 6.9.0-rc6-syzkaller-00012-g9e4bc4bc #0
      RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
      Call Trace:
       f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
       f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
       __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
       f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
       do_writepages+0x35b/0x870 mm/page-writeback.c:2612
       __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
       writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
       wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
       wb_do_writeback fs/fs-writeback.c:2264 [inline]
       wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
       process_one_work kernel/workqueue.c:3254 [inline]
       process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
       worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
       kthread+0x2f2/0x390 kernel/kthread.c:388
       ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
      The root cause is that an inline_data inode can be fuzzed so that there
      is a valid blkaddr in its direct node. Once f2fs triggers background GC
      to migrate the block, it will hit f2fs_bug_on() during dirty page
      writeback.
      
      Let's add a sanity check on the F2FS_INLINE_DATA flag in the inode
      during GC, so that migrating an inline_data inode's data block is
      forbidden.
      
      Reported-by: syzbot+848062ba19c8782ca5c8@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000d103ce06174d7ec3@google.com
      Signed-off-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      fc01008c
  7. 11 Jun, 2024 1 commit
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 2ef5971f
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
       "Misc:
         - Restore debugfs behavior of ignoring unknown mount options
         - Fix kernel doc for netfs_wait_for_outstanding_io()
         - Fix struct statx comment after new addition for this cycle
         - Fix a check in find_next_fd()
      
        iomap:
         - Fix data zeroing behavior when an extent spans the block that
           contains i_size
         - Restore i_size increasing in iomap_write_end() for now to avoid
           stale data exposure on xfs with a realtime device
      
        Cachefiles:
         - Remove unneeded fdtable.h include
         - Improve trace output for cachefiles_obj_{get,put}_ondemand_fd()
         - Remove requests from the request list to prevent accessing already
           freed requests
         - Fix UAF when issuing restore command while the daemon is still
           alive by adding an additional reference count to requests
         - Fix UAF by grabbing a reference during xarray lookup with xa_lock()
           held
         - Simplify error handling in cachefiles_ondemand_daemon_read()
         - Add consistency checks for read and open requests to avoid crashes
         - Add a spinlock to protect ondemand_id variable which is used to
           determine whether an anonymous cachefiles fd has already been
           closed
         - Make on-demand reads killable allowing to handle broken cachefiles
           daemon better
         - Flush all requests after the kernel has been marked dead via
           CACHEFILES_DEAD to avoid hung-tasks
         - Ensure that closed requests are marked as such to avoid reusing
           them with a reopen request
         - Defer fd_install() until after copy_to_user() succeeded and thereby
           get rid of having to use close_fd()
         - Ensure that anonymous cachefiles on-demand fds are reused while
           they are valid to avoid pinning already freed cookies"
      
      * tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        iomap: Fix iomap_adjust_read_range for plen calculation
        iomap: keep on increasing i_size in iomap_write_end()
        cachefiles: remove unneeded include of <linux/fdtable.h>
        fs/file: fix the check in find_next_fd()
        cachefiles: make on-demand read killable
        cachefiles: flush all requests after setting CACHEFILES_DEAD
        cachefiles: Set object to close if ondemand_id < 0 in copen
        cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
        cachefiles: never get a new anonymous fd if ondemand_id is valid
        cachefiles: add spin_lock for cachefiles_ondemand_info
        cachefiles: add consistency check for copen/cread
        cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
        cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
        cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
        cachefiles: remove requests from xarray during flushing requests
        cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
        statx: Update offset commentary for struct statx
        netfs: fix kernel doc for nets_wait_for_outstanding_io()
        debugfs: continue to ignore unknown mount options
      2ef5971f
  8. 09 Jun, 2024 5 commits
    • Linus Torvalds's avatar
      Linux 6.10-rc3 · 83a7eefe
      Linus Torvalds authored
      83a7eefe
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of... · b8481381
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Update copies of kernel headers, which resulted in support for the
         new 'mseal' syscall, SUBVOL statx return mask bit, RISC-V and PPC
         prctls, fcntl's DUPFD_QUERY, POSTED_MSI_NOTIFICATION IRQ vector,
         'map_shadow_stack' syscall for x86-32.
      
       - Revert perf.data record memory allocation optimization that ended up
         causing a regression, work is being done to re-introduce it in the
         next merge window.
      
       - Fix handling of minimal vmlinux.h file used with BPF's CO-RE when
         interrupting the build.
      
      * tag 'perf-tools-fixes-for-v6.10-2-2024-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build
        Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"
        tools headers arm64: Sync arm64's cputype.h with the kernel sources
        tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_SUBVOL
        tools headers UAPI: Update i915_drm.h with the kernel sources
        tools headers UAPI: Sync kvm headers with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall
        perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION
        perf beauty: Update copy of linux/socket.h with the kernel sources
        tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools include UAPI: Sync linux/stat.h with the kernel sources
      b8481381
    • Linus Torvalds's avatar
      Merge tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 637c2dfc
      Linus Torvalds authored
      Pull EDAC fixes from Borislav Petkov:
      
       - Convert PCI core error codes to proper error numbers since latter get
         propagated all the way up to the module loading functions
      
      * tag 'edac_urgent_for_v6.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
        EDAC/igen6: Convert PCIBIOS_* return codes to errnos
        EDAC/amd64: Convert PCIBIOS_* return codes to errnos
      637c2dfc
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 771ed661
      Linus Torvalds authored
      Pull clk fix from Stephen Boyd:
       "One fix for the SiFive PRCI clocks so that the device boots again.
      
        This driver was registering clkdev lookups that were always going to
        be useless. This wasn't a problem until clkdev started returning an
        error in these cases, causing this driver to fail probe, and thus boot
        to fail because clks are essential for most drivers. The fix is
        simple, don't use clkdev because this is a DT based system where
        clkdev isn't used"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sifive: Do not register clkdevs for PRCI clocks
      771ed661
    • Merge tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · c5dbc2ed
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Two small smb3 client fixes:
      
         - fix deadlock in umount
      
         - minor cleanup due to netfs change"
      
      * tag '6.10-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: Don't advance the I/O iterator before terminating subrequest
        smb: client: fix deadlock in smb2_find_smb_tcon()
      c5dbc2ed
  9. 08 Jun, 2024 8 commits
    • Merge tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 061d1af7
      Linus Torvalds authored
      Pull HID fixes from Benjamin Tissoires:
      
       - fix potential read out of bounds in hid-asus (Andrew Ballance)
      
       - fix endian-conversion on little endian systems in intel-ish-hid (Arnd
         Bergmann)
      
       - A couple of new input event codes (Aseda Aboagye)
      
       - error handling fixes in hid-nvidia-shield (Chen Ni), hid-nintendo
         (Christophe JAILLET), hid-logitech-dj (José Expósito)
      
       - current leakage fix while the device is in suspend on an i2c-hid
         laptop (Johan Hovold)
      
       - other assorted smaller fixes and device ID / quirk entry additions
      
      * tag 'for-linus-2024060801' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: Ignore battery for ELAN touchscreens 2F2C and 4116
        HID: i2c-hid: elan: fix reset suspend current leakage
        dt-bindings: HID: i2c-hid: elan: add 'no-reset-on-power-off' property
        dt-bindings: HID: i2c-hid: elan: add Elan eKTH5015M
        dt-bindings: HID: i2c-hid: add dedicated Ilitek ILI2901 schema
        input: Add support for "Do Not Disturb"
        input: Add event code for accessibility key
        hid: asus: asus_report_fixup: fix potential read out of bounds
        HID: logitech-hidpp: add missing MODULE_DESCRIPTION() macro
        HID: intel-ish-hid: fix endian-conversion
        HID: nintendo: Fix an error handling path in nintendo_hid_probe()
        HID: logitech-dj: Fix memory leak in logi_dj_recv_switch_to_dj_mode()
        HID: core: remove unnecessary WARN_ON() in implement()
        HID: nvidia-shield: Add missing check for input_ff_create_memless
        HID: intel-ish-hid: Fix build error for COMPILE_TEST
      061d1af7
    • Merge tag 'kbuild-fixes-v6.10-2' of... · 329f70c5
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix the initial state of the save button in 'make gconfig'
      
       - Improve the Kconfig documentation
      
       - Fix a Kconfig bug regarding property visibility
      
       - Fix build breakage for systems where 'sed' is not installed in /bin
      
       - Fix a false warning about missing MODULE_DESCRIPTION()
      
      * tag 'kbuild-fixes-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        modpost: do not warn about missing MODULE_DESCRIPTION() for vmlinux.o
        kbuild: explicitly run mksysmap as sed script from link-vmlinux.sh
        kconfig: remove wrong expr_trans_bool()
        kconfig: doc: document behavior of 'select' and 'imply' followed by 'if'
        kconfig: doc: fix a typo in the note about 'imply'
        kconfig: gconf: give a proper initial state to the Save button
        kconfig: remove unneeded code for user-supplied values being out of range
      329f70c5
    • Merge tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 1e7ccdd3
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
      
       - fixes for the new ipu6 driver (and related fixes to mei csi driver)
      
       - fix double debugfs removal logic in the mgb4 driver
      
       - a documentation fix
      
      * tag 'media/v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: intel/ipu6: add csi2 port sanity check in notifier bound
        media: intel/ipu6: update the maximum supported csi2 port number to 6
        media: mei: csi: Warn less verbosely of a missing device fwnode
        media: mei: csi: Put the IPU device reference
        media: intel/ipu6: fix the buffer flags caused by wrong parentheses
        media: intel/ipu6: Fix an error handling path in isys_probe()
        media: intel/ipu6: Move isys_remove() close to isys_probe()
        media: intel/ipu6: Fix some redundant resources freeing in ipu6_pci_remove()
        media: Documentation: v4l: Fix ACTIVE route flag
        media: mgb4: Fix double debugfs remove
      1e7ccdd3
    • Merge tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 36714d69
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
      
       - Fix a possible memory leak on riscv-intc irqchip driver load failures
      
       - Fix boot crash in the sifive-plic irqchip driver caused by recently
         changed boot initialization order
      
       - Fix race condition in the gic-v3-its irqchip driver
      
      * tag 'irq-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()
        irqchip/sifive-plic: Chain to parent IRQ after handlers are ready
        irqchip/riscv-intc: Prevent memory leak when riscv_intc_init_common() fails
      36714d69
    • Merge tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cedb020
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Miscellaneous fixes:
      
         - Fix kexec() crash if call depth tracking is enabled
      
         - Fix SMN reads on inaccessible registers on certain AMD systems"
      
      * tag 'x86-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/amd_nb: Check for invalid SMN reads
        x86/kexec: Fix bug with call depth tracking
      7cedb020
    • Merge tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7cec2e16
      Linus Torvalds authored
      Pull perf event fix from Ingo Molnar:
       "Fix race between perf_event_free_task() and perf_event_release_kernel()
        that can result in missed wakeups and hung tasks"
      
      * tag 'perf-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Fix missing wakeup when waiting for context reference
      7cec2e16
    • Merge tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bbc5332b
      Linus Torvalds authored
      Pull locking doc fix from Ingo Molnar:
       "Fix typos in the kerneldoc of some of the atomic APIs"
      
      * tag 'locking-urgent-2024-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldoc
      bbc5332b
    • Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of... · dc772f82
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "14 hotfixes, 6 of which are cc:stable.
      
        All except the nilfs2 fix affect MM and all are singletons - see the
        changelogs for details"
      
      * tag 'mm-hotfixes-stable-2024-06-07-15-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        nilfs2: fix nilfs_empty_dir() misjudgment and long loop on I/O errors
        mm: fix xyz_noprof functions calling profiled functions
        codetag: avoid race at alloc_slab_obj_exts
        mm/hugetlb: do not call vma_add_reservation upon ENOMEM
        mm/ksm: fix ksm_zero_pages accounting
        mm/ksm: fix ksm_pages_scanned accounting
        kmsan: do not wipe out origin when doing partial unpoisoning
        vmalloc: check CONFIG_EXECMEM in is_vmalloc_or_module_addr()
        mm: page_alloc: fix highatomic typing in multi-block buddies
        nilfs2: fix potential kernel bug due to lack of writeback flag waiting
        memcg: remove the lockdep assert from __mod_objcg_mlstate()
        mm: arm64: fix the out-of-bounds issue in contpte_clear_young_dirty_ptes
        mm: huge_mm: fix undefined reference to `mthp_stats' for CONFIG_SYSFS=n
        mm: drop the 'anon_' prefix for swap-out mTHP counters
      dc772f82
  10. 07 Jun, 2024 3 commits
    • Merge tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · e60721bf
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - interrupt handling and Kconfig fixes for gpio-tqmx86
      
       - add a buffer for storing output values in gpio-tqmx86 as reading back
         the registers always returns the input values
      
       - add missing MODULE_DESCRIPTION()s to several GPIO drivers
      
      * tag 'gpio-fixes-for-v6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: add missing MODULE_DESCRIPTION() macros
        gpio: tqmx86: fix broken IRQ_TYPE_EDGE_BOTH interrupt type
        gpio: tqmx86: store IRQ trigger type and unmask status separately
        gpio: tqmx86: introduce shadow register for GPIO output value
        gpio: tqmx86: fix typo in Kconfig label
      e60721bf
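
The shadow-register idea mentioned for gpio-tqmx86 can be sketched as follows. This is a minimal userspace model under stated assumptions, not the driver's code: `struct fake_gpio`, its fields, and `fake_gpio_set()` are hypothetical names, and the "hardware" is simulated with plain struct members. The only behavior taken from the pull message is that reading the data register returns input values, so the driver has to remember the output value itself.

```c
#include <stdint.h>

/*
 * Hypothetical device model. Reading the data register yields the pin
 * input levels, never the value last written as output.
 */
struct fake_gpio {
    uint8_t input_levels;  /* what a register read would return */
    uint8_t hw_output;     /* value last written to the output register */
    uint8_t output_shadow; /* driver-side copy of the output value */
};

/*
 * Set or clear one output bit. The read-modify-write is done on the
 * shadow copy, not on a register read, so previously set output bits
 * are preserved even though the hardware cannot report them.
 */
void fake_gpio_set(struct fake_gpio *g, unsigned int offset, int value)
{
    uint8_t mask = (uint8_t)(1u << offset);

    if (value)
        g->output_shadow |= mask;
    else
        g->output_shadow &= (uint8_t)~mask;

    /* Push the complete, remembered output value to the hardware. */
    g->hw_output = g->output_shadow;
}
```

If the update were based on `input_levels` instead of `output_shadow`, setting bit 1 after bit 0 would drop bit 0 whenever the corresponding input reads low. A real driver would also need locking around the shadow update, which this sketch leaves out.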
    • Merge tag 'block-6.10-20240607' of git://git.kernel.dk/linux · 602079a0
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix for null_blk block size validation (Andreas)
      
       - NVMe pull request via Keith:
            - Use reserved tags for special fabrics operations (Chunguang)
            - Persistent Reservation status masking fix (Weiwen)
      
      * tag 'block-6.10-20240607' of git://git.kernel.dk/linux:
        null_blk: fix validation of block size
        nvme: fix nvme_pr_* status code parsing
        nvme-fabrics: use reserved tag for reg read/write command
      602079a0
    • Merge tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux · e3391589
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix a locking order issue with setting max async thread workers
         (Hagar)
      
       - Fix for a NULL pointer dereference for failed async flagged requests
         using ring provided buffers. This doesn't affect the current kernel,
         but it does affect older kernels, and is being queued up for 6.10
         just to make the stable process easier (me)
      
       - Fix for NAPI timeout calculations for how long to busy poll, and
         subsequently how much to sleep post that if a wait timeout is passed
         in (me)
      
       - Fix for a regression in this release cycle, where we could end up
         using a partially uninitialized match value for io-wq (Su)
      
      * tag 'io_uring-6.10-20240607' of git://git.kernel.dk/linux:
        io_uring: fix possible deadlock in io_register_iowq_max_workers()
        io_uring/io-wq: avoid garbage value of 'match' in io_wq_enqueue()
        io_uring/napi: fix timeout calculation
        io_uring: check for non-NULL file pointer in io_file_can_poll()
      e3391589