1. 15 Jun, 2023 4 commits
    • dm: use op specific max_sectors when splitting abnormal io · be04c14a
      Mike Snitzer authored
      Split abnormal IO in terms of the corresponding operation specific
      max_sectors (max_discard_sectors, max_secure_erase_sectors or
      max_write_zeroes_sectors).
      
      This fixes a significant dm-thinp discard performance regression that
      was introduced with commit e2dd8aca ("dm bio prison v1: improve
      concurrent IO performance"). For discards, max_discard_sectors is now
      used instead of max_sectors, which fixes excessive discard splitting
      (e.g. max_sectors=128K vs max_discard_sectors=64M).
      
      Tested by discarding a 1 Petabyte dm-thin device:
      lvcreate -V 1125899906842624B -T test/pool -n thin
      time blkdiscard /dev/test/thin
      
      Before this fix (splitting discards every 128K): ~116m
       After this fix (splitting discards every 64M) : 0m33.460s
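
As a rough illustration of the arithmetic behind these timings, the C sketch below picks a per-operation split limit and counts how many pieces a request would be split into. The enum, struct and function names are hypothetical and are not the dm core's API; the limits are assumed to be expressed in 512-byte sectors, so 128K is 256 sectors and 64M is 131072 sectors.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the block layer's "abnormal" operation types. */
enum abnormal_op { OP_DISCARD, OP_SECURE_ERASE, OP_WRITE_ZEROES, OP_OTHER };

/* Hypothetical subset of queue limits, all in 512-byte sectors. */
struct limits {
	uint64_t max_sectors;
	uint64_t max_discard_sectors;
	uint64_t max_secure_erase_sectors;
	uint64_t max_write_zeroes_sectors;
};

/* Pick the limit that matches the operation instead of plain max_sectors. */
static uint64_t op_max_sectors(const struct limits *l, enum abnormal_op op)
{
	switch (op) {
	case OP_DISCARD:
		return l->max_discard_sectors;
	case OP_SECURE_ERASE:
		return l->max_secure_erase_sectors;
	case OP_WRITE_ZEROES:
		return l->max_write_zeroes_sectors;
	default:
		return l->max_sectors;
	}
}

/* Number of pieces needed to cover nr_sectors, rounding up. */
static uint64_t count_splits(uint64_t nr_sectors, uint64_t max)
{
	assert(max > 0);
	return (nr_sectors + max - 1) / max;
}
```

Under these assumptions, a 2^50-byte device is 2^41 sectors: splitting every 256 sectors gives 2^33 pieces, while splitting every 131072 sectors gives 2^24, a factor of 512 fewer.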
      Reported-by: Zorro Lang <zlang@redhat.com>
      Fixes: 06961c48 ("dm: split discards further if target sets max_discard_granularity")
      Requires: 13f6facf ("dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE")
      Fixes: e2dd8aca ("dm bio prison v1: improve concurrent IO performance")
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    • dm thin: fix issue_discard to pass GFP_NOIO to __blkdev_issue_discard · 722d9082
      Mike Snitzer authored
      issue_discard() passes GFP_NOWAIT to __blkdev_issue_discard() despite
      its code assuming bio_alloc() always succeeds.
      
      Commit 3dba53a9 ("dm thin: use __blkdev_issue_discard for async
      discard support") clearly shows where things went bad:
      
      Before commit 3dba53a9, dm-thin.c's open-coded
      __blkdev_issue_discard_async() properly handled using GFP_NOWAIT.
      Unfortunately __blkdev_issue_discard() doesn't and it was missed
      during review.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    • dm thin metadata: check fail_io before using data_sm · cb65b282
      Li Lingfeng authored
      Must check pmd->fail_io before using pmd->data_sm since
      pmd->data_sm may be destroyed by other processes.
      
             P1(kworker)                             P2(message)
      do_worker
       process_prepared
        process_prepared_discard_passdown_pt2
         dm_pool_dec_data_range
                                          pool_message
                                           commit
                                            dm_pool_commit_metadata
                                              ↓
                                             // commit failed
                                            metadata_operation_failed
                                             abort_transaction
                                              dm_pool_abort_metadata
                                               __open_or_format_metadata
                                                 ↓
                                                dm_sm_disk_open
                                                  ↓
                                                 // open failed
                                                 // pmd->data_sm is NULL
          dm_sm_dec_blocks
            ↓
           // try to access pmd->data_sm --> UAF
      
      As shown above, if both dm_pool_commit_metadata() and
      dm_pool_abort_metadata() fail in the pool_message process, the kworker
      may trigger a UAF.
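
The guard described here can be sketched in C as follows. This is a minimal illustration with hypothetical, heavily reduced structures, not the dm-thin metadata code. It also omits locking, which real code would need so that the check cannot race with the abort path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a space map; only tracks a free-block count. */
struct space_map {
	unsigned long nr_free;
};

/* Hypothetical reduction of the pool metadata to the two relevant fields. */
struct pool_metadata {
	bool fail_io;              /* set once a metadata operation has failed */
	struct space_map *data_sm; /* may be gone after a failed abort */
};

/* Decrement path: refuse to touch data_sm once fail_io is set. */
static int dec_data_range(struct pool_metadata *pmd, unsigned long nr_blocks)
{
	if (pmd->fail_io)
		return -EINVAL;

	assert(pmd->data_sm != NULL);
	pmd->data_sm->nr_free += nr_blocks; /* freed blocks become available */
	return 0;
}
```

Checking the flag first means the caller gets an error instead of dereferencing a space map that the failed abort may already have torn down.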
      
      Fixes: be500ed7 ("dm space maps: improve performance with inc/dec on ranges of blocks")
      Cc: stable@vger.kernel.org
      Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    • dm: don't lock fs when the map is NULL during suspend or resume · 2760904d
      Li Lingfeng authored
      As described in commit 38d11da5 ("dm: don't lock fs when the map is
      NULL in process of resume"), a deadlock may be triggered between
      do_resume() and do_mount().
      
      This commit preserves the fix from commit 38d11da5 but moves it to
      where it also serves to fix a similar deadlock between do_suspend()
      and do_mount().  It does so, if the active map is NULL, by clearing
      DM_SUSPEND_LOCKFS_FLAG in dm_suspend() which is called by both
      do_suspend() and do_resume().
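
A minimal C sketch of the flag handling described above, assuming hypothetical flag values and a placeholder table type (the kernel's definitions differ): when there is no active map, the lock-fs request is dropped before suspending.

```c
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_SUSPEND_LOCKFS_FLAG  (1u << 0)
#define SKETCH_SUSPEND_NOFLUSH_FLAG (1u << 1)

/* Hypothetical placeholder for the active mapping table. */
struct table {
	int num_targets;
};

/* Return the flags to suspend with; never lock the fs without a map. */
static unsigned int effective_suspend_flags(const struct table *map,
					    unsigned int flags)
{
	if (map == NULL)
		flags &= ~SKETCH_SUSPEND_LOCKFS_FLAG;
	return flags;
}
```

Placing the check in the one function that both callers go through is what lets a single change cover the suspend and resume paths.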
      
      Fixes: 38d11da5 ("dm: don't lock fs when the map is NULL in process of resume")
      Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@kernel.org>
  2. 14 May, 2023 13 commits
  3. 13 May, 2023 17 commits
  4. 12 May, 2023 6 commits
    • x86/retbleed: Fix return thunk alignment · 9a48d604
      Borislav Petkov (AMD) authored
      SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout
      (leaving only the last 2 bytes of the address):
      
        3bff <zen_untrain_ret>:
        3bff:       f3 0f 1e fa             endbr64
        3c03:       f6                      test   $0xcc,%bl
      
        3c04 <__x86_return_thunk>:
        3c04:       c3                      ret
        3c05:       cc                      int3
        3c06:       0f ae e8                lfence
      
      However, "the RET at __x86_return_thunk must be on a 64 byte boundary,
      for alignment within the BTB."
      
      Use SYM_START instead.
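
The misalignment in the listing above can be checked with a little arithmetic. The C sketch below uses hypothetical helper names. It assumes that the low bits shown in the disassembly reflect the final placement, that is, that the containing section is itself at least 64-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of addr within its alignment block; align must be a power of two. */
static uint64_t offset_in_block(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & (align - 1);
}

/* True if addr sits exactly on an align-byte boundary. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	return offset_in_block(addr, align) == 0;
}
```

For the addresses shown, 0x3c04 lies 4 bytes past the 64-byte boundary at 0x3c00. That 4-byte shift matches the length of the endbr64 instruction.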
      Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 76c7f887
      Linus Torvalds authored
      Pull more btrfs fixes from David Sterba:
      
       - fix incorrect number of bitmap entries for space cache if loading is
         interrupted by some error
      
       - fix backref walking; the bug breaks a mode of the LOGICAL_INO_V2
         ioctl that is used in deduplication tools
      
       - zoned mode fixes:
            - properly finish zone reserved for relocation
            - correctly calculate super block zone end on ZNS
            - properly initialize new extent buffer for redirty
      
       - make mount option clear_cache work with block-group-tree, to rebuild
         free-space-tree instead of temporarily disabling it, which would
         lead to a forced read-only mount
      
       - fix alignment check for offset when printing extent item
      
      * tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: make clear_cache mount option to rebuild FST without disabling it
        btrfs: zero the buffer before marking it dirty in btrfs_redirty_list_add
        btrfs: zoned: fix full zone super block reading on ZNS
        btrfs: zoned: zone finish data relocation BG with last IO
        btrfs: fix backref walking not returning all inode refs
        btrfs: fix space cache inconsistency after error loading it from disk
        btrfs: print-tree: parent bytenr must be aligned to sector size
    • Merge tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · fd88f147
      Linus Torvalds authored
      Pull cifs client fixes from Steve French:
      
       - fix for copy_file_range bug for very large files that are multiples
         of rsize
      
       - do not ignore "isolated transport" flag if set on share
      
       - set rasize default better
      
       - three fixes related to shutdown and freezing (fixes 4 xfstests, and
         closes deferred handles faster in some places that were missed)
      
      * tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: release leases for deferred close handles when freezing
        smb3: fix problem remounting a share after shutdown
        SMB3: force unmount was failing to close deferred close files
        smb3: improve parallel reads of large files
        do not reuse connection if share marked as isolated
        cifs: fix pcchunk length type in smb2_copychunk_range
    • Merge tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs · df8c2d13
      Linus Torvalds authored
      Pull vfs fix from Christian Brauner:
       "During the pipe nonblock rework the check for both O_NONBLOCK and
        IOCB_NOWAIT was dropped. Both checks need to be performed to ensure
        that files without O_NONBLOCK but IOCB_NOWAIT don't block when writing
        to or reading from a pipe.
      
        This just contains the fix adding the check for IOCB_NOWAIT back in"
      
      * tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
        pipe: check for IOCB_NOWAIT alongside O_NONBLOCK
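
The combined check described in this pull request can be sketched in C as below. This is an illustration only: `SKETCH_IOCB_NOWAIT` is a hypothetical value standing in for the kernel-internal IOCB_NOWAIT flag, and the function name is invented for this example.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical value: IOCB_NOWAIT is a kernel-internal per-request flag. */
#define SKETCH_IOCB_NOWAIT (1u << 0)

/*
 * A pipe read or write must not block if either the file was opened
 * with O_NONBLOCK or this particular request asked for no-wait behavior.
 */
static bool pipe_io_must_not_block(unsigned int file_flags,
				   unsigned int iocb_flags)
{
	return (file_flags & O_NONBLOCK) || (iocb_flags & SKETCH_IOCB_NOWAIT);
}
```

Testing only the file flag would miss the case the message describes: a file without O_NONBLOCK whose request carries the no-wait flag.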
    • Merge tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux · 584dc5db
      Linus Torvalds authored
      Pull io_uring fix from Jens Axboe:
       "Just a single fix making io_uring_sqe_cmd() available regardless of
        CONFIG_IO_URING, fixing a regression introduced during the merge
        window if nvme was selected but io_uring was not"
      
      * tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux:
        io_uring: make io_uring_sqe_cmd() unconditionally available
    • Merge tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · ed6a75e3
      Linus Torvalds authored
      Pull RISC-V fix from Palmer Dabbelt:
       "Just a single fix this week for a build issue. That'd usually be a
        good sign, but we've started to get some reports of boot failures on
        some hardware/bootloader configurations. Nothing concrete yet, but
        I've got a funny feeling that's where much of the bug hunting is going
        right now.
      
        Nothing's reproducing on my end, though, and this fixes some pretty
        concrete issues so I figured there's no reason to delay it:
      
         - a fix to the linker script to avoid orphaned sections in
           kernel/pi"
      
      * tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Fix orphan section warnings caused by kernel/pi