1. 10 Apr, 2023 2 commits
    • f2fs: Fix system crash due to lack of free space in LFS · d11cef14
      Yonggil Song authored
      When f2fs tries to checkpoint during foreground gc in LFS mode, a system
      crash occurs due to lack of free space if the amount of dirty node and
      dentry pages generated by data migration exceeds the free space.
      The reproduction sequence is as follows.
      
       - 20GiB capacity block device (null_blk)
       - format and mount with LFS mode
       - create a file and write 20,000MiB
       - 4k random write on full range of the file
      
       RIP: 0010:new_curseg+0x48a/0x510 [f2fs]
       Code: 55 e7 f5 89 c0 48 0f af c3 48 8b 5d c0 48 c1 e8 20 83 c0 01 89 43 6c 48 83 c4 28 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc <0f> 0b f0 41 80 4f 48 04 45 85 f6 0f 84 ba fd ff ff e9 ef fe ff ff
       RSP: 0018:ffff977bc397b218 EFLAGS: 00010246
       RAX: 00000000000027b9 RBX: 0000000000000000 RCX: 00000000000027c0
       RDX: 0000000000000000 RSI: 00000000000027b9 RDI: ffff8c25ab4e74f8
       RBP: ffff977bc397b268 R08: 00000000000027b9 R09: ffff8c29e4a34b40
       R10: 0000000000000001 R11: ffff977bc397b0d8 R12: 0000000000000000
       R13: ffff8c25b4dd81a0 R14: 0000000000000000 R15: ffff8c2f667f9000
       FS: 0000000000000000(0000) GS:ffff8c344ec80000(0000) knlGS:0000000000000000
       CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 000000c00055d000 CR3: 0000000e30810003 CR4: 00000000003706e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
       <TASK>
       allocate_segment_by_default+0x9c/0x110 [f2fs]
       f2fs_allocate_data_block+0x243/0xa30 [f2fs]
       ? __mod_lruvec_page_state+0xa0/0x150
       do_write_page+0x80/0x160 [f2fs]
       f2fs_do_write_node_page+0x32/0x50 [f2fs]
       __write_node_page+0x339/0x730 [f2fs]
       f2fs_sync_node_pages+0x5a6/0x780 [f2fs]
       block_operations+0x257/0x340 [f2fs]
       f2fs_write_checkpoint+0x102/0x1050 [f2fs]
       f2fs_gc+0x27c/0x630 [f2fs]
       ? folio_mark_dirty+0x36/0x70
       f2fs_balance_fs+0x16f/0x180 [f2fs]
      
      This patch adds a check that enough free sections are available before
      checkpointing during gc.
      Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
      [Jaegeuk Kim: code clean-up]
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
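The idea of checking free space before a checkpoint during gc can be sketched roughly as below. This is a minimal model, not the f2fs code: `struct space_state`, `can_checkpoint()`, `next_gc_action()` and the "reclaim more" fallback are all hypothetical names and assumptions chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified space accounting; not the f2fs structures. */
struct space_state {
    unsigned int free_sections;     /* sections available for new writes */
    unsigned int dirty_node_secs;   /* sections needed to flush dirty node pages */
    unsigned int dirty_dentry_secs; /* sections needed to flush dirty dentry pages */
};

/* A checkpoint flushes dirty node and dentry pages, so it needs room for both. */
static bool can_checkpoint(const struct space_state *s)
{
    unsigned int needed = s->dirty_node_secs + s->dirty_dentry_secs;

    return s->free_sections >= needed;
}

enum gc_action { GC_CHECKPOINT, GC_RECLAIM_MORE };

/*
 * Decide what the gc loop does next. Falling back to "reclaim more" when
 * space is short is an assumption of this sketch.
 */
static enum gc_action next_gc_action(const struct space_state *s)
{
    return can_checkpoint(s) ? GC_CHECKPOINT : GC_RECLAIM_MORE;
}
```

With 4 free sections and 3 sections of dirty pages the sketch chooses to checkpoint; with 2 free sections it does not, which is the situation that previously ended in the `new_curseg` crash shown in the trace.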
    • f2fs: remove struct victim_selection default_v_ops · 19e0e21a
      Yangtao Li authored
      There is only a single instance of these ops, and Jaegeuk pointed out that:
      
          Originally this was intended to give a chance to provide other
          allocation option. Anyway, it seems quite hard to do it anymore.
      
      So remove the indirection and call f2fs_get_victim() directly.
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
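Removing a one-entry ops table in favor of a direct call looks roughly like the sketch below. The function bodies and the `int gc_type` signature are hypothetical stand-ins; only the names `victim_selection` and `default_v_ops` are taken from the commit, and the real `f2fs_get_victim()` takes different arguments.

```c
/* Hypothetical stand-in for the victim selector; returns a segment number. */
static int get_victim(int gc_type)
{
    return gc_type == 0 ? 7 : 3;
}

/* Before: a single-instance ops table reached through a function pointer. */
struct victim_selection {
    int (*get_victim)(int gc_type);
};

static const struct victim_selection default_v_ops = {
    .get_victim = get_victim,
};

static int select_victim_indirect(const struct victim_selection *ops, int gc_type)
{
    return ops->get_victim(gc_type);
}

/* After: the caller invokes the only implementation directly. */
static int select_victim_direct(int gc_type)
{
    return get_victim(gc_type);
}
```

Both paths return the same value for any input, so with only one implementation the table adds a pointer load and an indirect call without adding flexibility.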
  2. 04 Apr, 2023 5 commits
  3. 29 Mar, 2023 18 commits
    • f2fs: fix scheduling while atomic in decompression path · 1aa161e4
      Jaegeuk Kim authored
      [   16.945668][    C0] Call trace:
      [   16.945678][    C0]  dump_backtrace+0x110/0x204
      [   16.945706][    C0]  dump_stack_lvl+0x84/0xbc
      [   16.945735][    C0]  __schedule_bug+0xb8/0x1ac
      [   16.945756][    C0]  __schedule+0x724/0xbdc
      [   16.945778][    C0]  schedule+0x154/0x258
      [   16.945793][    C0]  bit_wait_io+0x48/0xa4
      [   16.945808][    C0]  out_of_line_wait_on_bit+0x114/0x198
      [   16.945824][    C0]  __sync_dirty_buffer+0x1f8/0x2e8
      [   16.945853][    C0]  __f2fs_commit_super+0x140/0x1f4
      [   16.945881][    C0]  f2fs_commit_super+0x110/0x28c
      [   16.945898][    C0]  f2fs_handle_error+0x1f4/0x2f4
      [   16.945917][    C0]  f2fs_decompress_cluster+0xc4/0x450
      [   16.945942][    C0]  f2fs_end_read_compressed_page+0xc0/0xfc
      [   16.945959][    C0]  f2fs_handle_step_decompress+0x118/0x1cc
      [   16.945978][    C0]  f2fs_read_end_io+0x168/0x2b0
      [   16.945993][    C0]  bio_endio+0x25c/0x2c8
      [   16.946015][    C0]  dm_io_dec_pending+0x3e8/0x57c
      [   16.946052][    C0]  clone_endio+0x134/0x254
      [   16.946069][    C0]  bio_endio+0x25c/0x2c8
      [   16.946084][    C0]  blk_update_request+0x1d4/0x478
      [   16.946103][    C0]  scsi_end_request+0x38/0x4cc
      [   16.946129][    C0]  scsi_io_completion+0x94/0x184
      [   16.946147][    C0]  scsi_finish_command+0xe8/0x154
      [   16.946164][    C0]  scsi_complete+0x90/0x1d8
      [   16.946181][    C0]  blk_done_softirq+0xa4/0x11c
      [   16.946198][    C0]  _stext+0x184/0x614
      [   16.946214][    C0]  __irq_exit_rcu+0x78/0x144
      [   16.946234][    C0]  handle_domain_irq+0xd4/0x154
      [   16.946260][    C0]  gic_handle_irq.33881+0x5c/0x27c
      [   16.946281][    C0]  call_on_irq_stack+0x40/0x70
      [   16.946298][    C0]  do_interrupt_handler+0x48/0xa4
      [   16.946313][    C0]  el1_interrupt+0x38/0x68
      [   16.946346][    C0]  el1h_64_irq_handler+0x20/0x30
      [   16.946362][    C0]  el1h_64_irq+0x78/0x7c
      [   16.946377][    C0]  finish_task_switch+0xc8/0x3d8
      [   16.946394][    C0]  __schedule+0x600/0xbdc
      [   16.946408][    C0]  preempt_schedule_common+0x34/0x5c
      [   16.946423][    C0]  preempt_schedule+0x44/0x48
      [   16.946438][    C0]  process_one_work+0x30c/0x550
      [   16.946456][    C0]  worker_thread+0x414/0x8bc
      [   16.946472][    C0]  kthread+0x16c/0x1e0
      [   16.946486][    C0]  ret_from_fork+0x10/0x20
      
      Fixes: bff139b4 ("f2fs: handle decompress only post processing in softirq")
      Fixes: 95fa90c9 ("f2fs: support recording errors into superblock")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: preserve direct write semantics when buffering is forced · 92318f20
      Hans Holmberg authored
      In some cases, e.g. for zoned block devices, direct writes are
      forced into buffered writes that will populate the page cache
      and be written out just like buffered io.
      
      Direct reads, on the other hand, are supported for the zoned
      block device case. This has the effect that applications
      built for direct io will fill up the page cache with data
      that will never be read, and that is a waste of resources.
      
      If we agree that this is a problem, how do we fix it?
      
      A) Supporting proper direct writes for zoned block devices would
      be the best, but it is currently not supported (probably for
      a good but non-obvious reason). Would it be feasible to
      implement proper direct IO?
      
      B) Avoid the cost of keeping unwanted data by syncing and throwing
      out the cached pages for buffered O_DIRECT writes before completion.
      
      This patch implements B) by reusing the code for how partial
      block writes are flushed out on the "normal" direct write path.
      
      Note that this changes the performance characteristics of f2fs
      quite a bit.
      
      Direct IO performance for zoned block devices is lower for
      small writes after this patch, but this should be expected
      with direct IO and in line with how f2fs behaves on top of
      conventional block devices.
      
      Another open question is if the flushing should be done for
      all cases where buffered writes are forced.
      Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
      Reviewed-by: Yonggil Song <yonggil.song@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: compress: fix to call f2fs_wait_on_page_writeback() in f2fs_write_raw_pages() · babedcba
      Yangtao Li authored
      BUG_ON() will be triggered when writing files concurrently,
      because the same page is written back multiple times.
      
      1597 void folio_end_writeback(struct folio *folio)
      1598 {
      		......
      1618     if (!__folio_end_writeback(folio))
      1619         BUG();
      		......
      1625 }
      
      kernel BUG at mm/filemap.c:1619!
      Call Trace:
       <TASK>
       f2fs_write_end_io+0x1a0/0x370
       blk_update_request+0x6c/0x410
       blk_mq_end_request+0x15/0x130
       blk_complete_reqs+0x3c/0x50
       __do_softirq+0xb8/0x29b
       ? sort_range+0x20/0x20
       run_ksoftirqd+0x19/0x20
       smpboot_thread_fn+0x10b/0x1d0
       kthread+0xde/0x110
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork+0x22/0x30
       </TASK>
      
      Below is the concurrency scenario:
      
      [Process A]		[Process B]		[Process C]
      f2fs_write_raw_pages()
        - redirty_page_for_writepage()
        - unlock_page()
      			f2fs_do_write_data_page()
      			  - lock_page()
      			  - clear_page_dirty_for_io()
      			  - set_page_writeback() [1st writeback]
      			    .....
      			    - unlock_page()
      
      						generic_perform_write()
      						  - f2fs_write_begin()
      						    - wait_for_stable_page()
      
      						  - f2fs_write_end()
      						    - set_page_dirty()
      
        - lock_page()
          - f2fs_do_write_data_page()
            - set_page_writeback() [2nd writeback]
      
      This problem was introduced by the previous commit 7377e853 ("f2fs:
      compress: fix potential deadlock of compress file"). All page locks were
      released in f2fs_write_raw_pages(), but the subsequent write path did not
      check whether the page was still under writeback.
      Let's fix it by waiting for page writeback to complete before writing.
      
      Cc: Christoph Hellwig <hch@lst.de>
      Fixes: 4c8ff709 ("f2fs: support data compression")
      Fixes: 7377e853 ("f2fs: compress: fix potential deadlock of compress file")
      Signed-off-by: Qi Han <hanqi@vivo.com>
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: remove else in f2fs_write_cache_pages() · c948be79
      Yangtao Li authored
      As Christoph Hellwig pointed out:
      
      	Please avoid the else by doing the goto in the branch.
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
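The review comment about doing the goto in the branch can be illustrated with a small sketch. The two functions below are hypothetical and unrelated to the real body of f2fs_write_cache_pages(); a negative array entry stands in for a failed write.

```c
/* Before: the success path lives in an else branch. */
static int count_written_with_else(const int *items, int n)
{
    int nwritten = 0;

    for (int i = 0; i < n; i++) {
        if (items[i] < 0) {
            /* error: nothing to account for this item */
        } else {
            nwritten += items[i];
        }
    }
    return nwritten;
}

/* After: the error branch jumps away, so the success path needs no else. */
static int count_written_with_goto(const int *items, int n)
{
    int nwritten = 0;

    for (int i = 0; i < n; i++) {
        if (items[i] < 0)
            goto next;
        nwritten += items[i];
next:
        ; /* common per-item cleanup would go here */
    }
    return nwritten;
}
```

Both versions compute the same result; the second keeps the main path at one indentation level, which is the usual motivation for this style in kernel code.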
    • f2fs: apply zone capacity to all zone type · 0b37ed21
      Jaegeuk Kim authored
      If we manage the zone capacity per zone type, it'll break the GC assumption,
      and the current logic complains about a valid block count mismatch.
      Let's apply the zone capacity to all zone types, if specified.
      
      Fixes: de881df9 ("f2fs: support zone capacity less than zone size")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: fix to handle filemap_fdatawrite() error in f2fs_ioc_decompress_file/f2fs_ioc_compress_file · b822dc91
      Yangtao Li authored
      It seems inappropriate that the current logic does not handle
      filemap_fdatawrite() errors, so let's fix it.
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: convert to MAX_SBI_FLAG instead of 32 in stat_show() · 5bb9c111
      Yangtao Li authored
      BTW, reduce the s_flag array size and make s_flag constant.
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
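Sizing a name table and its loop by an enum sentinel instead of a literal 32 can be sketched as follows. The enum values and strings are hypothetical; `DEMO_MAX_FLAG` plays the role that MAX_SBI_FLAG plays in the commit, and `s_flag` only borrows the array name.

```c
#include <stddef.h>

/* Hypothetical flag list; the last enumerator is the sentinel. */
enum demo_flag {
    DEMO_DIRTY,
    DEMO_CLOSING,
    DEMO_NEED_FSCK,
    DEMO_MAX_FLAG
};

/* Constant table sized by the sentinel, so it cannot drift from the enum. */
static const char *const s_flag[DEMO_MAX_FLAG] = {
    [DEMO_DIRTY]     = "dirty",
    [DEMO_CLOSING]   = "closing",
    [DEMO_NEED_FSCK] = "need_fsck",
};

/* Returns the flag name, or NULL when the index is outside the table. */
static const char *flag_name(int i)
{
    if (i < 0 || i >= DEMO_MAX_FLAG)
        return NULL;
    return s_flag[i];
}

/* Counts set bits that correspond to a named flag; higher bits are ignored. */
static int count_named_flags(unsigned long flags)
{
    int n = 0;

    for (int i = 0; i < DEMO_MAX_FLAG; i++)
        if (flags & (1UL << i))
            n++;
    return n;
}
```

Looping to the sentinel rather than to 32 means the loop never indexes past the end of the table when fewer than 32 flags are defined.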
    • f2fs: Fix discard bug on zoned block devices with 2MiB zone size · 6797ebc4
      Yonggil Song authored
      When using f2fs on a zoned block device with 2MiB zone size, IO errors
      occur because f2fs tries to write data to a zone that has not been reset.
      
      The cause is that f2fs tries to discard multiple zones at once. This is
      caused by a condition in f2fs_clear_prefree_segments that does not check
      for zoned block devices when setting the discard range. This leads to
      invalid reset commands and write pointer mismatches.
      
      This patch makes f2fs reset one zone at a time on zoned block devices
      with a 2MiB zone size.
      Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
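Resetting one zone at a time, rather than passing a multi-zone range to a single command, can be sketched as below. This is a simplified model: `reset_zones_one_by_one()`, the callback type and the rejection of misaligned ranges are hypothetical and do not reflect the logic in f2fs_clear_prefree_segments.

```c
#include <stddef.h>

/* Hypothetical callback that issues one zone reset for [start, start + len). */
typedef void (*reset_fn)(unsigned long start, unsigned long len, void *ctx);

/*
 * Split a zone-aligned block range into per-zone resets.
 * Returns the number of resets issued, or 0 if the range is not zone-aligned.
 */
static unsigned int reset_zones_one_by_one(unsigned long start, unsigned long len,
                                           unsigned long zone_blocks,
                                           reset_fn reset, void *ctx)
{
    unsigned int issued = 0;

    if (zone_blocks == 0 || start % zone_blocks || len % zone_blocks)
        return 0;

    for (unsigned long off = 0; off < len; off += zone_blocks) {
        if (reset)
            reset(start + off, zone_blocks, ctx); /* exactly one zone per command */
        issued++;
    }
    return issued;
}
```

Assuming 4KiB blocks, a 2MiB zone is 512 blocks, so a 1536-block range yields three separate resets instead of one command spanning three zones.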
    • f2fs: remove entire rb_entry sharing · bf21acf9
      Jaegeuk Kim authored
      This is the last part to remove the memory sharing for rb_tree in extent_cache.
      
      This should also fix the arm32 memory alignment issue.
      
      [struct extent_node]               [struct rb_entry]
      [0] struct rb_node rb_node;        [0] struct rb_node rb_node;
        union {                              union {
          struct {                             struct {
      [16]  unsigned int fofs;           [12]    unsigned int ofs;
            unsigned int len;                    unsigned int len;
                                               };
                                               unsigned long long key;
                                             } __packed;
      
      Cc: <stable@vger.kernel.org>
      Fixes: 13054c54 ("f2fs: introduce infra macro and data structure of rb-tree extent cache")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: factor out discard_cmd usage from general rb_tree use · f69475dd
      Jaegeuk Kim authored
      This is the second part to remove the mixed use of rb_tree in discard_cmd from
      extent_cache.
      
      This should also fix the arm32 memory alignment issue caused by shared rb_entry.
      
      [struct discard_cmd]               [struct rb_entry]
      [0] struct rb_node rb_node;        [0] struct rb_node rb_node;
        union {                              union {
          struct {                             struct {
      [16]  block_t lstart;              [12]    unsigned int ofs;
            block_t len;                         unsigned int len;
                                               };
                                               unsigned long long key;
                                             } __packed;
      
      Cc: <stable@vger.kernel.org>
      Fixes: 004b6862 ("f2fs: use rb-tree to track pending discard commands")
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: factor out victim_entry usage from general rb_tree use · 043d2d00
      Jaegeuk Kim authored
      Let's reduce the complexity of mixed use of rb_tree in victim_entry from
      extent_cache and discard_cmd.
      
      This should fix arm32 memory alignment issue caused by shared rb_entry.
      
      [struct victim_entry]              [struct rb_entry]
      [0] struct rb_node rb_node;        [0] struct rb_node rb_node;
                                             union {
                                               struct {
                                                 unsigned int ofs;
                                                 unsigned int len;
                                               };
      [16] unsigned long long mtime;     [12] unsigned long long key;
                                             } __packed;
      
      Cc: <stable@vger.kernel.org>
      Fixes: 093749e2 ("f2fs: support age threshold based garbage collection")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
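The [16] versus [12] offsets in the layout diagram can be reproduced on any host with a small sketch. The struct names here are hypothetical: `node32` imitates a 12-byte rb_node as found on a 32-bit build, `shared_entry` imitates the packed rb_entry view, and `concrete_entry` imitates a user such as victim_entry. The `packed` attribute assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 32-bit struct rb_node: three 4-byte words, 12 bytes total. */
struct node32 {
    uint32_t parent_color;
    uint32_t right;
    uint32_t left;
};

/* Shared view: a packed union placed directly after the node. */
struct shared_entry {
    struct node32 node;
    union {
        struct {
            uint32_t ofs;
            uint32_t len;
        };
        uint64_t key;
    } __attribute__((packed));
};

/* Concrete user: a naturally aligned 64-bit field after the node. */
struct concrete_entry {
    struct node32 node;
    uint64_t mtime;
};

/* Offset at which code using the shared view reads the key. */
static size_t shared_key_offset(void)
{
    return offsetof(struct shared_entry, key);
}

/* Offset at which the concrete struct actually stores its key. */
static size_t concrete_key_offset(void)
{
    return offsetof(struct concrete_entry, mtime);
}
```

The packed view always reports offset 12. On ABIs that align 64-bit integers to 8 bytes, the concrete struct reports 16, so code casting one type to the other would read the key from the wrong place, which is the mismatch the diagram describes.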
    • f2fs: fix uninitialized skipped_gc_rwsem · c17caf0b
      Yonggil Song authored
      When f2fs skipped a gc round during victim migration, a bug caused all
      upcoming gc rounds to be skipped unconditionally because skipped_gc_rwsem
      was not initialized. This patch fixes the bug by correctly initializing
      skipped_gc_rwsem inside the gc loop.
      
      Fixes: 6f8d4455 ("f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc")
      Signed-off-by: Yonggil Song <yonggil.song@samsung.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
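The effect of a skip flag that is never reset between rounds can be modeled with a short sketch. The function and its parameters are hypothetical; the `skipped` variable only imitates the role of skipped_gc_rwsem, and `round_skips[i]` marks rounds that would hit lock contention.

```c
#include <stdbool.h>

/*
 * Returns how many rounds actually migrate data.
 * reset_each_round selects between the fixed behavior (flag cleared at the
 * top of every round) and the buggy one (flag stays set once raised).
 */
static int rounds_migrated(const bool *round_skips, int rounds, bool reset_each_round)
{
    bool skipped = false;
    int migrated = 0;

    for (int i = 0; i < rounds; i++) {
        if (reset_each_round)
            skipped = false; /* fixed behavior: start every round clean */
        if (round_skips[i])
            skipped = true;
        if (skipped)
            continue; /* a stale flag also skips every later round */
        migrated++;
    }
    return migrated;
}
```

With the first of three rounds skipped, the version without the reset migrates nothing, while the version that resets the flag in the loop migrates in the remaining two rounds.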
    • f2fs: handle dqget error in f2fs_transfer_project_quota() · 8051692f
      Yangtao Li authored
      We should set the error code when dqget() fails.
      
      Fixes: 2c1d0305 ("f2fs: support F2FS_IOC_FS{GET,SET}XATTR")
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: convert to use bitmap API · 447286eb
      Yangtao Li authored
      Let's use BIT() and GENMASK() instead of open-coding them.
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
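The difference between open-coded shifts and the BIT() and GENMASK() helpers can be shown in user space. The macro definitions below are simplified local re-definitions written for this sketch, not copies of the kernel headers, and the four helper functions are hypothetical.

```c
#include <limits.h>

/* Simplified local versions of the kernel helpers (assumption of this sketch). */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BIT(nr)       (1UL << (nr))
#define GENMASK(h, l) (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* Open-coded forms, as they might appear before a conversion. */
static unsigned long open_coded_flag(void)
{
    return 1UL << 3;
}

static unsigned long open_coded_mask(void)
{
    return 0xf0UL; /* bits 7..4 */
}

/* Same values expressed with the helpers. */
static unsigned long macro_flag(void)
{
    return BIT(3);
}

static unsigned long macro_mask(void)
{
    return GENMASK(7, 4);
}
```

The helper forms state the bit positions directly, so a reader does not have to decode a hex constant to learn which bits a mask covers.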
    • f2fs: export compress_percent and compress_watermark entries · 960fa2c8
      Yangtao Li authored
      This patch exports the sysfs entries below for better control over the
      cached compressed page count.
      
      /sys/fs/f2fs/<disk>/compress_watermark
      /sys/fs/f2fs/<disk>/compress_percent
      Signed-off-by: Yangtao Li <frank.li@vivo.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • f2fs: make f2fs_sync_inode_meta() static · 60630375
      Li Zetao authored
      After commit 26b5a079 ("f2fs: cleanup dirty pages if recover failed"),
      f2fs_sync_inode_meta() is only used in checkpoint.c, so
      f2fs_sync_inode_meta() should only be visible inside. Delete the
      declaration in the header file and change f2fs_sync_inode_meta()
      to static.
      Signed-off-by: Li Zetao <lizetao1@huawei.com>
      Reviewed-by: Chao Yu <chao@kernel.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    • Merge tag 'xtensa-20230327' of https://github.com/jcmvbkbc/linux-xtensa · ffe78bbd
      Linus Torvalds authored
      Pull xtensa fixes from Max Filippov:
      
       - fix KASAN report in show_stack
      
       - drop linux-xtensa mailing list from the MAINTAINERS file
      
      * tag 'xtensa-20230327' of https://github.com/jcmvbkbc/linux-xtensa:
        MAINTAINERS: xtensa: drop linux-xtensa@linux-xtensa.org mailing list
        xtensa: fix KASAN report for show_stack
    • Merge tag 'f2fs-fix-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 3577a4d3
      Linus Torvalds authored
      Pull f2fs fix from Jaegeuk Kim:
       "This fixes a tracepoint field size in f2fs in preparation for stricter
        rules for tracing fields"
      
      * tag 'f2fs-fix-6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
        f2fs: Fix f2fs_truncate_partial_nodes ftrace event
  4. 28 Mar, 2023 3 commits
  5. 27 Mar, 2023 9 commits
  6. 26 Mar, 2023 3 commits
    • Linux 6.3-rc4 · 197b6b60
      Linus Torvalds authored
    • Merge tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0ec57cfa
      Linus Torvalds authored
      Pull USB / Thunderbolt driver fixes from Greg KH:
       "Here are a small set of USB and Thunderbolt driver fixes for reported
        problems and a documentation update, for 6.3-rc4.
      
        Included in here are:
      
         - documentation update for uvc gadget driver
      
         - small thunderbolt driver fixes
      
         - cdns3 driver fixes
      
         - dwc3 driver fixes
      
         - dwc2 driver fixes
      
         - chipidea driver fixes
      
         - typec driver fixes
      
         - onboard_usb_hub device id updates
      
         - quirk updates
      
        All of these have been in linux-next with no reported problems"
      
      * tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
        usb: dwc2: fix a race, don't power off/on phy for dual-role mode
        usb: dwc2: fix a devres leak in hw_enable upon suspend resume
        usb: chipidea: core: fix possible concurrent when switch role
        usb: chipdea: core: fix return -EINVAL if request role is the same with current role
        thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
        thunderbolt: Disable interrupt auto clear for rings
        thunderbolt: Use const qualifier for `ring_interrupt_index`
        usb: gadget: Use correct endianness of the wLength field for WebUSB
        uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
        usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
        usb: cdns3: Fix issue with using incorrect PCI device function
        usb: cdnsp: Fixes issue with redundant Status Stage
        MAINTAINERS: make me a reviewer of USB/IP
        thunderbolt: Use scale field when allocating USB3 bandwidth
        thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
        thunderbolt: Call tb_check_quirks() after initializing adapters
        thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
        thunderbolt: Fix memory leak in margining
        usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
        docs: usb: Add documentation for the UVC Gadget
        ...
    • Merge tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 18940c88
      Linus Torvalds authored
      Pull scheduler fix from Borislav Petkov:
      
       - Fix a corner case where vruntime of a task is not being sanitized
      
      * tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Sanitize vruntime of entity being migrated