1. 29 Jun, 2022 4 commits
    • nvme: fix regression when disconnect a recovering ctrl · f7f70f4a
      Ruozhu Li authored
      We encountered a problem that the disconnect command hangs.
      After analyzing the log and stack, we found that the triggering
      process is as follows:
      CPU0                          CPU1
                                      nvme_rdma_error_recovery_work
                                        nvme_rdma_teardown_io_queues
      nvme_do_delete_ctrl                 nvme_stop_queues
        nvme_remove_namespaces
        --clear ctrl->namespaces
                                          nvme_start_queues
                                          --no ns in ctrl->namespaces
          nvme_ns_remove                  return(because ctrl is deleting)
            blk_freeze_queue
              blk_mq_freeze_queue_wait
              --wait for ns to unquiesce to clean inflight IO, hang forever
      
      This problem was not found in older kernels because we used to flush
      err work in nvme_stop_ctrl before nvme_remove_namespaces. That ordering
      does not seem to have been changed for functional reasons, so the patch
      can be reverted to solve the problem.
      
      Revert commit 794a4cb3 ("nvme: remove the .stop_ctrl callout")
      Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
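
The interleaving in the trace above can be replayed with a small deterministic toy model. This is only a sketch: every `toy_*` name is hypothetical, and the model ignores real locking, workqueues and blk-mq. It only shows why unquiescing after the namespace list has been emptied leaves the queue quiesced, and why flushing the recovery work first avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): one namespace plus the controller's list. */
struct toy_ns {
	bool quiesced;
};

struct toy_ctrl {
	struct toy_ns *namespaces[1];
	size_t nr_ns;
};

/* Recovery side: quiesce every namespace still on the list. */
static void toy_stop_queues(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->nr_ns; i++)
		c->namespaces[i]->quiesced = true;
}

/* Recovery side: unquiesce every namespace still on the list. */
static void toy_start_queues(struct toy_ctrl *c)
{
	for (size_t i = 0; i < c->nr_ns; i++)
		c->namespaces[i]->quiesced = false;
}

/* Delete side: take the namespaces off the controller's list. */
static void toy_clear_namespaces(struct toy_ctrl *c)
{
	c->nr_ns = 0;
}

/* Order from the trace; returns true if the freeze wait would hang. */
static bool toy_racy_order(void)
{
	struct toy_ns ns = { .quiesced = false };
	struct toy_ctrl c = { .namespaces = { &ns }, .nr_ns = 1 };

	toy_stop_queues(&c);      /* CPU1: teardown quiesces the queue */
	toy_clear_namespaces(&c); /* CPU0: list is emptied */
	toy_start_queues(&c);     /* CPU1: nothing left to unquiesce */
	return ns.quiesced;
}

/* Recovery work completes before the namespaces are removed. */
static bool toy_flushed_order(void)
{
	struct toy_ns ns = { .quiesced = false };
	struct toy_ctrl c = { .namespaces = { &ns }, .nr_ns = 1 };

	toy_stop_queues(&c);
	toy_start_queues(&c);     /* runs while ns is still on the list */
	toy_clear_namespaces(&c);
	return ns.quiesced;
}
```

In this model `toy_racy_order()` ends with the namespace still quiesced, which corresponds to the freeze wait that never completes, while `toy_flushed_order()` does not.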
    • nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G) · 1629de0e
      Pablo Greco authored
      ADATA XPG SPECTRIX S40G drives report bogus eui64 values that appear to
      be the same across drives in one system. Quirk them out so they are
      not marked as "non globally unique" duplicates.
      
      Before:
      [    2.258919] nvme nvme1: pci function 0000:06:00.0
      [    2.264898] nvme nvme2: pci function 0000:05:00.0
      [    2.323235] nvme nvme1: failed to set APST feature (2)
      [    2.326153] nvme nvme2: failed to set APST feature (2)
      [    2.333935] nvme nvme1: allocated 64 MiB host memory buffer.
      [    2.336492] nvme nvme2: allocated 64 MiB host memory buffer.
      [    2.339611] nvme nvme1: 7/0/0 default/read/poll queues
      [    2.341805] nvme nvme2: 7/0/0 default/read/poll queues
      [    2.346114]  nvme1n1: p1
      [    2.347197] nvme nvme2: globally duplicate IDs for nsid 1
      After:
      [    2.427715] nvme nvme1: pci function 0000:06:00.0
      [    2.427771] nvme nvme2: pci function 0000:05:00.0
      [    2.488154] nvme nvme2: failed to set APST feature (2)
      [    2.489895] nvme nvme1: failed to set APST feature (2)
      [    2.498773] nvme nvme2: allocated 64 MiB host memory buffer.
      [    2.500587] nvme nvme1: allocated 64 MiB host memory buffer.
      [    2.504113] nvme nvme2: 7/0/0 default/read/poll queues
      [    2.507026] nvme nvme1: 7/0/0 default/read/poll queues
      [    2.509467] nvme nvme2: Ignoring bogus Namespace Identifiers
      [    2.512804] nvme nvme1: Ignoring bogus Namespace Identifiers
      [    2.513698]  nvme1n1: p1
      Signed-off-by: Pablo Greco <pgreco@centosproject.org>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
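
As a rough illustration of how a per-device quirk can suppress duplicate-identifier reports, the sketch below uses a lookup table keyed by PCI vendor and device ID. All `toy_*` names and the IDs in the table are placeholders, not the driver's real table or real PCI IDs, and treating a quirked identifier as zero is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_QUIRK_BOGUS_NID (1u << 0) /* stand-in for NVME_QUIRK_BOGUS_NID */

struct toy_quirk_entry {
	uint16_t vendor;
	uint16_t device;
	unsigned int quirks;
};

/* Placeholder IDs only; these are not real PCI vendor/device IDs. */
static const struct toy_quirk_entry toy_quirk_table[] = {
	{ 0xAAAA, 0x0001, TOY_QUIRK_BOGUS_NID },
};

/* Return the quirk bits for a device, or 0 if it is not listed. */
static unsigned int toy_lookup_quirks(uint16_t vendor, uint16_t device)
{
	size_t n = sizeof(toy_quirk_table) / sizeof(toy_quirk_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (toy_quirk_table[i].vendor == vendor &&
		    toy_quirk_table[i].device == device)
			return toy_quirk_table[i].quirks;
	}
	return 0;
}

/* Identifier used for uniqueness checks; 0 means "no identifier". */
static uint64_t toy_effective_eui64(unsigned int quirks, uint64_t reported)
{
	return (quirks & TOY_QUIRK_BOGUS_NID) ? 0 : reported;
}

/* Two namespaces clash only if both carry the same non-zero identifier. */
static int toy_is_duplicate(uint64_t a, uint64_t b)
{
	return a != 0 && a == b;
}
```

With the quirk set, two drives that report the same eui64 both end up with no identifier, so the duplicate check no longer fires for them.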
    • nvme-tcp: always fail a request when sending it failed · 41d07df7
      Sagi Grimberg authored
      Queue stoppage and inflight request cancellation are fully fenced from
      io_work, and thus from failing a request in this context. Hence we don't
      need to guess from the socket retcode whether this failure is because
      the queue is about to be torn down.

      It is perfectly safe to just fail the request; it will not be cancelled
      later on.

      This solves possible very long shutdown delays when the user issues
      'nvme disconnect-all'.
      Reported-by: Daniel Wagner <dwagner@suse.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet-tcp: fix regression in data_digest calculation · ed0691cf
      Sagi Grimberg authored
      Data digest calculation iterates over the command's mapped iovec.
      However, since commit bac04454 we unmap the iovec before we handle the
      data digest, and since commit 69b85e1f we clear nr_mapped when we unmap
      the iov.
      
      Instead of open-coding the command iov traversal, simply call
      crypto_ahash_digest with the command sg that is already allocated (we
      already do that for the send path). Rename nvmet_tcp_send_ddgst to
      nvmet_tcp_calc_ddgst and call it from send and recv paths.
      
      Fixes: 69b85e1f ("nvmet-tcp: add an helper to free the cmd buffers")
      Fixes: bac04454 ("nvmet-tcp: fix kmap leak when data digest in use")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
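
The idea of one digest helper shared by the send and receive paths can be sketched in userspace as follows. This is not the kernel code: `struct seg` is a hypothetical stand-in for a scatterlist entry, `calc_ddgst` is an illustrative name, and a plain bitwise CRC32C replaces the kernel's crypto_ahash interface. The point is that digesting the buffer segment by segment gives the same result however the data is split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct seg {
	const void *buf;
	size_t len;
};

/* Bitwise CRC32C update (Castagnoli, reflected polynomial 0x82F63B78). */
static uint32_t crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Single digest helper that both send and receive paths could call. */
static uint32_t calc_ddgst(const struct seg *sg, size_t nsegs)
{
	uint32_t crc = 0xFFFFFFFFu; /* standard CRC32C initial value */

	assert(sg != NULL || nsegs == 0);
	for (size_t i = 0; i < nsegs; i++)
		crc = crc32c_update(crc, sg[i].buf, sg[i].len);
	return crc ^ 0xFFFFFFFFu;   /* standard CRC32C final XOR */
}
```

Because the helper walks a segment list that stays valid for the lifetime of the command, it does not depend on any temporary mapping state.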
  2. 25 Jun, 2022 1 commit
  3. 23 Jun, 2022 5 commits
  4. 21 Jun, 2022 1 commit
  5. 20 Jun, 2022 1 commit
  6. 17 Jun, 2022 4 commits
  7. 16 Jun, 2022 5 commits
  8. 15 Jun, 2022 4 commits
    • Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19 · 04cb45b4
      Jens Axboe authored
      Pull MD fixes from Song.
      
      * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid5-ppl: Fix argument order in bio_alloc_bioset()
        Revert "md: don't unregister sync_thread with reconfig_mutex held"
    • md/raid5-ppl: Fix argument order in bio_alloc_bioset() · f34fdcd4
      Logan Gunthorpe authored
      bio_alloc_bioset() takes a block device, number of vectors, the
      OP flags, the GFP mask and the bio set. However, when the prototype
      was changed, the call site in ppl_do_flush() had the OP flags and
      the GFP flags reversed. This introduced the following sparse warnings:
      
        drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
      				    (different base types)
        drivers/md/raid5-ppl.c:632:57:    expected unsigned int opf
        drivers/md/raid5-ppl.c:632:57:    got restricted gfp_t [usertype]
        drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
        				    (different base types)
        drivers/md/raid5-ppl.c:633:61:    expected restricted gfp_t [usertype]
      				    gfp_mask
        drivers/md/raid5-ppl.c:633:61:    got unsigned long long
      
      The new sparse warnings may not have been reported correctly by 0day
      because of other work that was cleaning up sparse errors in this area.
      
      Fixes: 609be106 ("block: pass a block_device and opf to bio_alloc_bioset")
      Cc: stable@vger.kernel.org # 5.18+
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Song Liu <song@kernel.org>
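
The class of bug described above is easy to reproduce in miniature: when two flag parameters are both plain integers, a C compiler accepts them in either order, which is why a checker such as sparse is needed to flag the mismatch. The sketch below uses made-up `mock_*` types, constants and functions; it mirrors only the parameter order described in the message, not the real bio_alloc_bioset() signature.

```c
#include <assert.h>

/* Hypothetical stand-ins: both are plain integers to the C compiler. */
typedef unsigned int mock_opf_t;
typedef unsigned int mock_gfp_t;

#define MOCK_OP_WRITE 0x01u /* made-up operation flag */
#define MOCK_GFP_NOIO 0x40u /* made-up allocation flag */

struct mock_bio {
	mock_opf_t opf;
	mock_gfp_t gfp;
};

/* Parameter order: number of vectors, operation flags, allocation flags. */
static struct mock_bio mock_alloc(unsigned short nr_vecs, mock_opf_t opf,
				  mock_gfp_t gfp)
{
	struct mock_bio bio = { .opf = opf, .gfp = gfp };

	(void)nr_vecs; /* unused in this sketch */
	return bio;
}

/* Swapped arguments: compiles without complaint, stores the wrong values. */
static struct mock_bio alloc_swapped(void)
{
	return mock_alloc(0, MOCK_GFP_NOIO, MOCK_OP_WRITE);
}

/* Corrected order. */
static struct mock_bio alloc_fixed(void)
{
	return mock_alloc(0, MOCK_OP_WRITE, MOCK_GFP_NOIO);
}
```

In the swapped variant the allocation flag silently ends up in the `opf` field and vice versa, with no compiler diagnostic.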
    • Revert "md: don't unregister sync_thread with reconfig_mutex held" · d0a18034
      Guoqing Jiang authored
      The 07reshape5intr test is broken because of the path below.
      
          md_reap_sync_thread
                  -> mddev_unlock
                  -> md_unregister_thread(&mddev->sync_thread)
      
      md_check_recovery is then triggered by:
      
      mddev_unlock -> md_wakeup_thread(mddev->thread)
      
      Then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
      because MD_RECOVERY_INTR has been cleared in md_check_recovery. As a
      result, MD_FEATURE_RESHAPE_ACTIVE is not set in feature_map and the
      superblock's reshape_position can't be updated accordingly.
      
      Fixes: 8b48ec23 ("md: don't unregister sync_thread with reconfig_mutex held")
      Reported-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
      Signed-off-by: Song Liu <song@kernel.org>
    • Merge tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme into block-5.19 · 2396e958
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 5.19
      
       - quirks, quirks, quirks to work around buggy consumer grade devices
         (Keith Busch, Ning Wang, Stefan Reiter, Rasheed Hsueh)
       - better kernel messages for devices that need quirking (Keith Busch)
       - make a kernel message more useful (Thomas Weißschuh)"
      
      * tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme:
        nvme-pci: disable write zeros support on UMIC and Samsung SSDs
        nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
        nvme-pci: sk hynix p31 has bogus namespace ids
        nvme-pci: smi has bogus namespace ids
        nvme-pci: phison e12 has bogus namespace ids
        nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
        nvme-pci: add trouble shooting steps for timeouts
        nvme: add bug report info for global duplicate id
        nvme: add device name to warning in uuid_show()
  9. 13 Jun, 2022 9 commits
  10. 12 Jun, 2022 6 commits
    • Linux 5.19-rc2 · b13baccc
      Linus Torvalds authored
    • Merge tag 'platform-drivers-x86-v5.19-2' of... · 99795285
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Hans de Goede:
       "Highlights:
      
         - Fix hp-wmi regression on HP Omen laptops introduced in 5.18
      
         - Several hardware-id additions
      
         - A couple of other tiny fixes"
      
      * tag 'platform-drivers-x86-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86/intel: hid: Add Surface Go to VGBS allow list
        platform/x86: hp-wmi: Use zero insize parameter only when supported
        platform/x86: hp-wmi: Resolve WMI query failures on some devices
        platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF
        platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support
        platform/x86: barco-p50-gpio: Add check for platform_driver_register
        platform/x86/intel: pmc: Support Intel Raptorlake P
        platform/x86/intel: Fix pmt_crashlog array reference
        platform/mellanox: Add static in struct declaration.
        platform/mellanox: Spelling s/platfom/platform/
    • Merge tag 'wq-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · b0cb8db3
      Linus Torvalds authored
      Pull workqueue fixes from Tejun Heo:
       "Tetsuo's patch to trigger build warnings if system-wide wq's are
        flushed along with a TP type update and trivial comment update"
      
      * tag 'wq-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: Switch to new kerneldoc syntax for named variable macro argument
        workqueue: Fix type of cpu in trace event
        workqueue: Wrap flush_workqueue() using a macro
    • Merge tag 'kbuild-fixes-v5.19' of... · e3b8e2de
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Make the *.mod build rule portable for POSIX awk
      
       - Fix regression of 'make nsdeps'
      
       - Make scripts/check-local-export work with older bash versions
      
       - Fix scripts/gdb to extract the .config data from vmlinux
      
      * tag 'kbuild-fixes-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        scripts/gdb: change kernel config dumping method
        scripts/check-local-export: avoid 'wait $!' for process substitution
        scripts/nsdeps: adjust to the format change of *.mod files
        kbuild: avoid regex RS for POSIX awk
    • Merge tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 2275c6ba
      Linus Torvalds authored
      Pull cifs client fixes from Steve French:
       "Three reconnect fixes, all for stable as well.
      
        One of these three reconnect fixes does address a problem with
        multichannel reconnect, but this does not include the additional
        fix (still being tested) for dynamically detecting multichannel
        adapter changes which will improve those reconnect scenarios even
        more"
      
      * tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: populate empty hostnames for extra channels
        cifs: return errors during session setup during reconnects
        cifs: fix reconnect on smb3 mount types
    • Merge tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random · 3cae0d84
      Linus Torvalds authored
      Pull random number generator fixes from Jason Donenfeld:
      
       - A fix for a 5.19 regression for a case in which early device tree
         initializes the RNG, which flips a static branch.
      
         On most platforms, jump labels aren't initialized until much later, so
         this caused splats. On a few mailing list threads, we cooked up easy
         fixes for arm64, arm32, and risc-v. But then things looked slightly
         more involved for xtensa, powerpc, arc, and mips. And at that point,
         when we're patching 7 architectures in a place before the console is
         even available, it seems like the cost/risk just wasn't worth it.
      
         So random.c works around it now by checking the already exported
         `static_key_initialized` boolean, as though somebody already ran into
         this issue in the past. I'm not super jazzed about that; it'd be
         prettier to not have to complicate downstream code. But I suppose
         it's practical.
      
       - A few small code nits and adding a missing __init annotation.
      
       - A change to the default config values to use the cpu and bootloader's
         seeds for initializing the RNG earlier.
      
         This brings them into line with what all the distros do (Fedora/RHEL,
         Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void... at
         least), and moreover will now give us test coverage in various test
         beds that might have caught the above device tree bug earlier.
      
       - A change to WireGuard CI's configuration to increase test coverage
         around the RNG.
      
       - A documentation comment fix to unrelated maintainerless CRC code that
         I was asked to take, I guess because it has to do with polynomials
         (which the RNG thankfully no longer uses).
      
      * tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
        wireguard: selftests: use maximum cpu features and allow rng seeding
        random: remove rng_has_arch_random()
        random: credit cpu and bootloader seeds by default
        random: do not use jump labels before they are initialized
        random: account for arch randomness in bits
        random: mark bootloader randomness code as __init
        random: avoid checking crng_ready() twice in random_init()
        crc-itu-t: fix typo in CRC ITU-T polynomial comment