1. 27 Oct, 2022 3 commits
    • Merge tag 'nvme-6.1-2022-10-27' of git://git.infradead.org/nvme into block-6.1 · dea31328
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.1
      
       - make the multipath dma alignment match the non-multipath one
         (Keith Busch)
       - fix a bogus use of sg_init_marker() (Nam Cao)
       - fix circular locking in nvme-tcp (Sagi Grimberg)"
      
      * tag 'nvme-6.1-2022-10-27' of git://git.infradead.org/nvme:
        nvme-multipath: set queue dma alignment to 3
        nvme-tcp: fix possible circular locking when deleting a controller under memory pressure
        nvme-tcp: replace sg_init_marker() with sg_init_table()
    • blk-mq: don't add non-pt request with ->end_io to batch · 2d87d455
      Ming Lei authored
      dm-rq implements the ->end_io callback for requests issued to the
      underlying queue, and these are not passthrough requests.
      
      Commit ab3e1d3b ("block: allow end_io based requests in the completion
      batch handling") doesn't clear rq->bio and rq->__data_len for requests
      with ->end_io in blk_mq_end_request_batch(). This is actually
      dangerous, but so far it only applied to nvme passthrough requests.
      
      dm-rq needs to clean up the remaining bios in case of partial completion,
      and req->bio is required for that, so a use-after-free is triggered.
      The underlying clone request therefore can't be completed in
      blk_mq_end_request_batch().
      
      Fix the panic by not adding such requests to the batch list. The issue
      can be triggered simply by exposing an nvme pci device to dm-mpath.
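The rule described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct fake_request`, `may_add_to_batch()` and `noop_end_io()` are hypothetical names, not the blk-mq code, and the real check may consider further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct request. */
struct fake_request {
	bool passthrough;                         /* models a passthrough request */
	void (*end_io)(struct fake_request *rq);  /* models rq->end_io */
};

/* Returns true if the request may be queued for batched completion. */
bool may_add_to_batch(const struct fake_request *rq)
{
	assert(rq != NULL);
	/*
	 * A non-passthrough request with ->end_io (for example a dm-rq clone)
	 * still needs rq->bio in its callback, so it is completed one by one.
	 */
	if (rq->end_io && !rq->passthrough)
		return false;
	return true;
}

/* Placeholder callback used only to give ->end_io a non-NULL value. */
void noop_end_io(struct fake_request *rq)
{
	(void)rq;
}
```

In this model a plain request and a passthrough request with ->end_io stay eligible for the batch, while a non-passthrough request with ->end_io is rejected.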
      
      Fixes: ab3e1d3b ("block: allow end_io based requests in the completion batch handling")
      Cc: dm-devel@redhat.com
      Cc: Mike Snitzer <snitzer@kernel.org>
      Reported-by: Changhui Zhong <czhong@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20221027085709.513175-1-ming.lei@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • rbd: fix possible memory leak in rbd_sysfs_init() · 7f21735f
      Yang Yingliang authored
      If device_register() returns an error in rbd_sysfs_init(), the kobject
      name, which is allocated by dev_set_name() called from device_add(),
      is leaked.
      
      As the comment on device_add() says, put_device() should be called on
      failure to drop the reference count that was set in device_initialize(),
      so that the name can be freed in kobject_cleanup().
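The reference-counting rule above can be illustrated with a userspace model. It is a sketch under stated assumptions, not the rbd patch: every `fake_*` name and the `live_names` counter are hypothetical stand-ins for the driver-core objects mentioned in the message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct device / struct kobject. */
struct fake_device {
	int refcount;
	char *name;
};

int live_names; /* number of names that are currently allocated */

/* Models device_initialize(): the caller now owns one reference. */
void fake_device_initialize(struct fake_device *dev)
{
	dev->refcount = 1;
	dev->name = NULL;
}

/* Models put_device(): the final put frees the name, like kobject_cleanup(). */
void fake_put_device(struct fake_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0 && dev->name) {
		free(dev->name);
		dev->name = NULL;
		live_names--;
	}
}

/*
 * Models device_register(): the name is allocated first, then a later step
 * fails when inject_failure is non-zero. The reference is not dropped here.
 */
int fake_device_register(struct fake_device *dev, const char *name,
			 int inject_failure)
{
	size_t len = strlen(name) + 1;

	fake_device_initialize(dev);
	dev->name = malloc(len);
	if (!dev->name)
		return -1;
	memcpy(dev->name, name, len);
	live_names++;
	return inject_failure ? -1 : 0;
}

/* Error handling in the style of the fix: put the device on failure. */
int fake_sysfs_init(struct fake_device *dev, int inject_failure)
{
	int ret = fake_device_register(dev, "rbd", inject_failure);

	if (ret < 0)
		fake_put_device(dev);
	return ret;
}
```

Without the `fake_put_device()` call on the error path, `live_names` would stay at 1 after a failed registration, which corresponds to the leak in this model.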
      
      A fault injection test can trigger this problem:
      
      unreferenced object 0xffff88810173aa78 (size 8):
        comm "modprobe", pid 247, jiffies 4294714278 (age 31.789s)
        hex dump (first 8 bytes):
          72 62 64 00 81 88 ff ff                          rbd.....
        backtrace:
          [<00000000f58fae56>] __kmalloc_node_track_caller+0x44/0x1b0
          [<00000000bdd44fe7>] kstrdup+0x3a/0x70
          [<00000000f7844d0b>] kstrdup_const+0x63/0x80
          [<000000001b0a0eeb>] kvasprintf_const+0x10b/0x190
          [<00000000a47bd894>] kobject_set_name_vargs+0x56/0x150
          [<00000000d5edbf18>] dev_set_name+0xab/0xe0
          [<00000000f5153e80>] device_add+0x106/0x1f20
      
      Fixes: dfc5606d ("rbd: replace the rbd sysfs interface")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/20221027091918.2294132-1-yangyingliang@huawei.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  2. 25 Oct, 2022 3 commits
    • nvme-multipath: set queue dma alignment to 3 · fe8714b0
      Keith Busch authored
      The NVMe spec requires all transports to support dword-aligned addresses,
      and this limit is already set in the namespace request_queue. Set the
      same limit in the multipath device's request_queue as well.
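As a rough illustration of what an alignment value of 3 means: it is a bit mask, and an address is dword aligned when its two low bits are zero. The sketch below assumes hypothetical names (`DWORD_ALIGN_MASK`, `dma_addr_aligned()`) and is not the block layer's own check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask for dword (4-byte) alignment: the two low address bits must be 0. */
#define DWORD_ALIGN_MASK 3u

/* Returns true if addr satisfies the given alignment mask. */
bool dma_addr_aligned(uintptr_t addr, unsigned int mask)
{
	return (addr & mask) == 0;
}
```

With the mask set to 3, an address such as 0x1004 passes while 0x1002 does not. A stricter mask such as 511 (512-byte alignment, used here only for comparison) would reject 0x1004 as well.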
      Signed-off-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-tcp: fix possible circular locking when deleting a controller under memory pressure · 83e1226b
      Sagi Grimberg authored
      When destroying a queue, sock_release() is called and the network stack
      might need to allocate an skb to send a FIN/RST. When that happens
      during memory pressure, memory needs to be reclaimed, which in turn
      may ask the nvme-tcp device to write out dirty pages. However, this
      is not possible because a ctrl teardown is in progress.
      
      Set PF_MEMALLOC on the task that releases the socket to grant it access
      to PF_MEMALLOC reserves. In addition, do the same for the nvme-tcp
      thread, as this may also originate from the swap itself and should
      be more resilient to memory pressure situations.
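The idea of setting a task flag around the socket release can be modelled in userspace as a save/restore pair. The kernel offers scoped helpers of this shape, memalloc_noreclaim_save() and memalloc_noreclaim_restore(), but the message does not say which calls the patch uses, so everything below (`fake_task`, `FAKE_PF_MEMALLOC`, the `fake_*` functions) is a hypothetical sketch.

```c
#include <assert.h>

#define FAKE_PF_MEMALLOC 0x1u /* hypothetical bit value for this model */

/* Hypothetical stand-in for the per-task flags word. */
struct fake_task {
	unsigned int flags;
};

/* Sets the flag and returns its previous state so that nesting works. */
unsigned int fake_noreclaim_save(struct fake_task *t)
{
	unsigned int old = t->flags & FAKE_PF_MEMALLOC;

	t->flags |= FAKE_PF_MEMALLOC;
	return old;
}

/* Restores the flag to the state returned by fake_noreclaim_save(). */
void fake_noreclaim_restore(struct fake_task *t, unsigned int old)
{
	t->flags = (t->flags & ~FAKE_PF_MEMALLOC) | old;
}

/* Models an allocation: it may enter reclaim only while the flag is clear. */
int alloc_may_reclaim(const struct fake_task *t)
{
	return !(t->flags & FAKE_PF_MEMALLOC);
}

/*
 * Models releasing the socket inside the guarded region. Returns whether the
 * allocation made during the release could have entered reclaim.
 */
int release_socket_guarded(struct fake_task *t)
{
	unsigned int old = fake_noreclaim_save(t);
	int entered_reclaim = alloc_may_reclaim(t);

	fake_noreclaim_restore(t, old);
	return entered_reclaim;
}
```

Returning the previous flag state from the save step keeps the guard safe to nest: a caller that already had the flag set still has it set afterwards.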
      
      This fixes the following lockdep complaint:
      --
      ======================================================
       WARNING: possible circular locking dependency detected
       6.0.0-rc2+ #25 Tainted: G        W
       ------------------------------------------------------
       kswapd0/92 is trying to acquire lock:
       ffff888114003240 (sk_lock-AF_INET-NVME){+.+.}-{0:0}, at: tcp_sendpage+0x23/0xa0
      
       but task is already holding lock:
       ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (fs_reclaim){+.+.}-{0:0}:
              fs_reclaim_acquire+0x11e/0x160
              kmem_cache_alloc_node+0x44/0x530
              __alloc_skb+0x158/0x230
              tcp_send_active_reset+0x7e/0x730
              tcp_disconnect+0x1272/0x1ae0
              __tcp_close+0x707/0xd90
              tcp_close+0x26/0x80
              inet_release+0xfa/0x220
              sock_release+0x85/0x1a0
              nvme_tcp_free_queue+0x1fd/0x470 [nvme_tcp]
              nvme_do_delete_ctrl+0x130/0x13d [nvme_core]
              nvme_sysfs_delete.cold+0x8/0xd [nvme_core]
              kernfs_fop_write_iter+0x356/0x530
              vfs_write+0x4e8/0xce0
              ksys_write+0xfd/0x1d0
              do_syscall_64+0x58/0x80
              entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
       -> #0 (sk_lock-AF_INET-NVME){+.+.}-{0:0}:
              __lock_acquire+0x2a0c/0x5690
              lock_acquire+0x18e/0x4f0
              lock_sock_nested+0x37/0xc0
              tcp_sendpage+0x23/0xa0
              inet_sendpage+0xad/0x120
              kernel_sendpage+0x156/0x440
              nvme_tcp_try_send+0x48a/0x2630 [nvme_tcp]
              nvme_tcp_queue_rq+0xefb/0x17e0 [nvme_tcp]
              __blk_mq_try_issue_directly+0x452/0x660
              blk_mq_plug_issue_direct.constprop.0+0x207/0x700
              blk_mq_flush_plug_list+0x6f5/0xc70
              __blk_flush_plug+0x264/0x410
              blk_finish_plug+0x4b/0xa0
              shrink_lruvec+0x1263/0x1ea0
              shrink_node+0x736/0x1a80
              balance_pgdat+0x740/0x10d0
              kswapd+0x5f2/0xaf0
              kthread+0x256/0x2f0
              ret_from_fork+0x1f/0x30
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(fs_reclaim);
                                     lock(sk_lock-AF_INET-NVME);
                                     lock(fs_reclaim);
        lock(sk_lock-AF_INET-NVME);
      
       *** DEADLOCK ***
      
      3 locks held by kswapd0/92:
       #0: ffffffff97e95ca0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x987/0x10d0
       #1: ffff88811f21b0b0 (q->srcu){....}-{0:0}, at: blk_mq_flush_plug_list+0x6b3/0xc70
       #2: ffff888170b11470 (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0xeb9/0x17e0 [nvme_tcp]
      
      Fixes: 3f2304f8 ("nvme-tcp: add NVMe over TCP host driver")
      Reported-by: Daniel Wagner <dwagner@suse.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Tested-by: Daniel Wagner <dwagner@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-tcp: replace sg_init_marker() with sg_init_table() · 5fa9add6
      Nam Cao authored
      In nvme_tcp_ddgst_update(), sg_init_marker() is called with an
      uninitialized scatterlist. This is probably fine, but gcc complains:
      
        CC [M]  drivers/nvme/host/tcp.o
      In file included from ./include/linux/dma-mapping.h:10,
                       from ./include/linux/skbuff.h:31,
                       from ./include/net/net_namespace.h:43,
                       from ./include/linux/netdevice.h:38,
                       from ./include/net/sock.h:46,
                       from drivers/nvme/host/tcp.c:12:
      In function ‘sg_mark_end’,
          inlined from ‘sg_init_marker’ at ./include/linux/scatterlist.h:356:2,
          inlined from ‘nvme_tcp_ddgst_update’ at drivers/nvme/host/tcp.c:390:2:
      ./include/linux/scatterlist.h:234:11: error: ‘sg.page_link’ is used uninitialized [-Werror=uninitialized]
        234 |         sg->page_link |= SG_END;
            |         ~~^~~~~~~~~~~
      drivers/nvme/host/tcp.c: In function ‘nvme_tcp_ddgst_update’:
      drivers/nvme/host/tcp.c:388:28: note: ‘sg’ declared here
        388 |         struct scatterlist sg;
            |                            ^~
      cc1: all warnings being treated as errors
      
      Use sg_init_table() instead, which basically memsets the scatterlist to
      zero before calling sg_init_marker().
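The difference between the two helpers can be shown with a userspace model. It is a sketch only: `struct fake_sg`, `FAKE_SG_END` and the `fake_*` functions are hypothetical stand-ins that mirror the behaviour described above, not the definitions in scatterlist.h.

```c
#include <assert.h>
#include <string.h>

#define FAKE_SG_END 0x2ul /* models the bit that marks the last entry */

/* Hypothetical, simplified stand-in for struct scatterlist. */
struct fake_sg {
	unsigned long page_link;
	unsigned int offset;
	unsigned int length;
};

/*
 * Models sg_init_marker(): it only ORs the end bit into the last entry, so
 * that entry must already hold a defined value.
 */
void fake_sg_init_marker(struct fake_sg *sgl, unsigned int nents)
{
	assert(nents > 0);
	sgl[nents - 1].page_link |= FAKE_SG_END;
}

/* Models sg_init_table(): zero every entry first, then mark the end. */
void fake_sg_init_table(struct fake_sg *sgl, unsigned int nents)
{
	memset(sgl, 0, sizeof(*sgl) * nents);
	fake_sg_init_marker(sgl, nents);
}
```

Because the marker step is a read-modify-write, calling it on a stack variable that was never written reads an indeterminate value, which is what the compiler diagnostic points at.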
      Signed-off-by: Nam Cao <namcaov@gmail.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  3. 22 Oct, 2022 1 commit
  4. 20 Oct, 2022 8 commits
  5. 19 Oct, 2022 8 commits
  6. 18 Oct, 2022 1 commit
  7. 16 Oct, 2022 1 commit
  8. 12 Oct, 2022 6 commits
    • Merge tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme into block-6.1 · 3bc429c1
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 6.1
      
       - add NVME_QUIRK_BOGUS_NID for Lexar NM760 (Abhijit)
       - avoid the deepest sleep state on ZHITAI TiPro5000 SSDs (Xi Ruoyao)
       - fix possible hang caused during ctrl deletion (Sagi Grimberg)
       - fix possible hang in live ns resize with ANA access (Sagi Grimberg)"
      
      * tag 'nvme-6.1-2022-10-12' of git://git.infradead.org/nvme:
        nvme-multipath: fix possible hang in live ns resize with ANA access
        nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
        nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
        nvme-tcp: fix possible hang caused during ctrl deletion
        nvme-rdma: fix possible hang caused during ctrl deletion
    • nvme-multipath: fix possible hang in live ns resize with ANA access · 72e3b888
      Sagi Grimberg authored
      When we revalidate paths as part of an ns size change (as of commit
      e7d65803), it is possible that during path revalidation the only
      paths that are IO capable (i.e. optimized/non-optimized) are ones
      whose ns resize has not yet been reported to the host. This causes
      inflight requests to be requeued (as we have available paths but none
      are IO capable). These requests on the requeue list are waiting for
      someone to resubmit them at some point.
      
      The IO capable paths will eventually notify the host of the ns resize,
      but there is nothing that will kick the requeue list to resubmit
      the queued requests.
      
      Fix this by always kicking the requeue list, and if no IO capable path
      exists, these requests will be queued again.
      
      A typical log that indicates that IOs are requeued:
      --
      nvme nvme1: creating 4 I/O queues.
      nvme nvme1: new ctrl: "testnqn1"
      nvme nvme2: creating 4 I/O queues.
      nvme nvme2: mapped 4/0/0 default/read/poll queues.
      nvme nvme2: new ctrl: NQN "testnqn1", addr 127.0.0.1:8009
      nvme nvme1: rescanning namespaces.
      nvme1n1: detected capacity change from 2097152 to 4194304
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      block nvme1n1: no usable path - requeuing I/O
      nvme nvme2: rescanning namespaces.
      --
      Reported-by: Yogev Cohen <yogev@lightbitslabs.com>
      Fixes: e7d65803 ("nvme-multipath: revalidate paths during rescan")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Cc: <stable@vger.kernel.org> # v5.15+
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs · d5d3c100
      Xi Ruoyao authored
      ZHITAI TiPro5000 SSDs have the same APST sleep problem as their cousin,
      TiPro7000.  The quirk for TiPro7000 has been added in
      commit 6b961bce ("nvme-pci: avoid the deepest sleep state on
      ZHITAI TiPro7000 SSDs"), so use the same quirk for TiPro5000.
      
      The APST data from "nvme id-ctrl /dev/nvme1":
      
      vid       : 0x1e49
      ssvid     : 0x1e49
      sn        : ZTA21T0KA2227304LM
      mn        : ZHITAI TiPlus5000 1TB
      fr        : ZTA09139
      [...]
      ps    0 : mp:6.50W operational enlat:0 exlat:0 rrt:0 rrl:0
               rwt:0 rwl:0 idle_power:- active_power:-
      ps    1 : mp:5.80W operational enlat:0 exlat:0 rrt:1 rrl:1
               rwt:1 rwl:1 idle_power:- active_power:-
      ps    2 : mp:3.60W operational enlat:0 exlat:0 rrt:2 rrl:2
               rwt:2 rwl:2 idle_power:- active_power:-
      ps    3 : mp:0.0500W non-operational enlat:5000 exlat:10000 rrt:3 rrl:3
               rwt:3 rwl:3 idle_power:- active_power:-
      ps    4 : mp:0.0025W non-operational enlat:8000 exlat:45000 rrt:4 rrl:4
               rwt:4 rwl:4 idle_power:- active_power:-
      Reported-and-tested-by: Chang Feng <flukehn@gmail.com>
      Signed-off-by: Xi Ruoyao <xry111@xry111.site>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760 · 80b26240
      Abhijit authored
      Add a quirk to fix Lexar NM760 SSD drives reporting duplicate nsids.
      Signed-off-by: Abhijit <abhijit@abhijittomar.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-tcp: fix possible hang caused during ctrl deletion · c4abd875
      Sagi Grimberg authored
      When we delete a controller, we execute the following:
      1. nvme_stop_ctrl() - stop some work elements that may be
      	inflight or scheduled (specifically also .stop_ctrl
      	which cancels ctrl error recovery work)
      2. nvme_remove_namespaces() - which first flushes scan_work
      	to avoid competing ns addition/removal
      3. continue to teardown the controller
      
      However, if err_work was scheduled to run in (1), it is designed to
      cancel any inflight I/O, particularly I/O that is originating from ns
      scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
      prevent forward progress of (2) as ns scanning is blocking on I/O
      (that will never be cancelled).
      
      The race is:
      1. transport layer error observed -> err_work is scheduled
      2. scan_work executes, discovers ns, generates I/O to it
      3. nvme_stop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
         - err_work never executed
      4. nvme_remove_namespaces() -> flush_work(scan_work)
      --> deadlock, because scan_work is blocked on I/O that was supposed
      to be cancelled by err_work, but was cancelled before executing (see
      stack trace [1]).
      
      Fix this by flushing err_work instead of cancelling it, to force it
      to execute and cancel all inflight I/O.
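The difference between cancelling and flushing a pending work item can be modelled in userspace as follows. This is a single-threaded sketch under stated assumptions: the `fake_*` names and `scan_work_can_finish()` are hypothetical and only mimic the semantics described above, not the workqueue implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller state: number of requests still inflight. */
struct fake_ctrl {
	int inflight_io;
};

/* Hypothetical work item with a pending flag and a handler. */
struct fake_work {
	bool pending;
	void (*fn)(struct fake_ctrl *ctrl);
};

/* Models err_work: it cancels all inflight I/O. */
void fake_err_work(struct fake_ctrl *ctrl)
{
	ctrl->inflight_io = 0;
}

/* Models cancel_work_sync(): pending work is dropped without running. */
void fake_cancel_work(struct fake_work *w, struct fake_ctrl *ctrl)
{
	(void)ctrl;
	w->pending = false;
}

/* Models flush_work(): pending work runs to completion before returning. */
void fake_flush_work(struct fake_work *w, struct fake_ctrl *ctrl)
{
	if (w->pending) {
		w->pending = false;
		w->fn(ctrl);
	}
}

/* scan_work can only finish once no I/O is inflight any more. */
bool scan_work_can_finish(const struct fake_ctrl *ctrl)
{
	return ctrl->inflight_io == 0;
}
```

In this model, cancelling leaves the inflight count untouched, so the scan would wait forever, while flushing runs the handler and lets the scan finish.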
      
      [1]:
      --
      Call Trace:
       <TASK>
       __schedule+0x390/0x910
       ? scan_shadow_nodes+0x40/0x40
       schedule+0x55/0xe0
       io_schedule+0x16/0x40
       do_read_cache_page+0x55d/0x850
       ? __page_cache_alloc+0x90/0x90
       read_cache_page+0x12/0x20
       read_part_sector+0x3f/0x110
       amiga_partition+0x3d/0x3e0
       ? osf_partition+0x33/0x220
       ? put_partition+0x90/0x90
       bdev_disk_changed+0x1fe/0x4d0
       blkdev_get_whole+0x7b/0x90
       blkdev_get_by_dev+0xda/0x2d0
       device_add_disk+0x356/0x3b0
       nvme_mpath_set_live+0x13c/0x1a0 [nvme_core]
       ? nvme_parse_ana_log+0xae/0x1a0 [nvme_core]
       nvme_update_ns_ana_state+0x3a/0x40 [nvme_core]
       nvme_mpath_add_disk+0x120/0x160 [nvme_core]
       nvme_alloc_ns+0x594/0xa00 [nvme_core]
       nvme_validate_or_alloc_ns+0xb9/0x1a0 [nvme_core]
       ? __nvme_submit_sync_cmd+0x1d2/0x210 [nvme_core]
       nvme_scan_work+0x281/0x410 [nvme_core]
       process_one_work+0x1be/0x380
       worker_thread+0x37/0x3b0
       ? process_one_work+0x380/0x380
       kthread+0x12d/0x150
       ? set_kthread_struct+0x50/0x50
       ret_from_fork+0x1f/0x30
       </TASK>
      INFO: task nvme:6725 blocked for more than 491 seconds.
            Not tainted 5.15.65-f0.el7.x86_64 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:nvme            state:D stack:    0 pid: 6725 ppid:  1761 flags:0x00004000
      Call Trace:
       <TASK>
       __schedule+0x390/0x910
       ? sched_clock+0x9/0x10
       schedule+0x55/0xe0
       schedule_timeout+0x24b/0x2e0
       ? try_to_wake_up+0x358/0x510
       ? finish_task_switch+0x88/0x2c0
       wait_for_completion+0xa5/0x110
       __flush_work+0x144/0x210
       ? worker_attach_to_pool+0xc0/0xc0
       flush_work+0x10/0x20
       nvme_remove_namespaces+0x41/0xf0 [nvme_core]
       nvme_do_delete_ctrl+0x47/0x66 [nvme_core]
       nvme_sysfs_delete.cold.96+0x8/0xd [nvme_core]
       dev_attr_store+0x14/0x30
       sysfs_kf_write+0x38/0x50
       kernfs_fop_write_iter+0x146/0x1d0
       new_sync_write+0x114/0x1b0
       ? intel_pmu_handle_irq+0xe0/0x420
       vfs_write+0x18d/0x270
       ksys_write+0x61/0xe0
       __x64_sys_write+0x1a/0x20
       do_syscall_64+0x37/0x90
       entry_SYSCALL_64_after_hwframe+0x61/0xcb
      --
      
      Fixes: 3f2304f8 ("nvme-tcp: add NVMe over TCP host driver")
      Reported-by: Jonathan Nicklin <jnicklin@blockbridge.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Tested-by: Jonathan Nicklin <jnicklin@blockbridge.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-rdma: fix possible hang caused during ctrl deletion · a1ae8d4d
      Sagi Grimberg authored
      When we delete a controller, we execute the following:
      1. nvme_stop_ctrl() - stop some work elements that may be
              inflight or scheduled (specifically also .stop_ctrl
              which cancels ctrl error recovery work)
      2. nvme_remove_namespaces() - which first flushes scan_work
              to avoid competing ns addition/removal
      3. continue to teardown the controller
      
      However, if err_work was scheduled to run in (1), it is designed to
      cancel any inflight I/O, particularly I/O that is originating from ns
      scan_work in (2), but because it is cancelled in .stop_ctrl(), we can
      prevent forward progress of (2) as ns scanning is blocking on I/O
      (that will never be cancelled).
      
      The race is:
      1. transport layer error observed -> err_work is scheduled
      2. scan_work executes, discovers ns, generates I/O to it
      3. nvme_stop_ctrl() -> .stop_ctrl() -> cancel_work_sync(err_work)
         - err_work never executed
      4. nvme_remove_namespaces() -> flush_work(scan_work)
      --> deadlock, because scan_work is blocked on I/O that was supposed
      to be cancelled by err_work, but was cancelled before executing.
      
      Fix this by flushing err_work instead of cancelling it, to force it
      to execute and cancel all inflight I/O.
      
      Fixes: b435ecea ("nvme: Add .stop_ctrl to nvme ctrl ops")
      Fixes: f6c8e432 ("nvme: flush namespace scanning work just before removing namespaces")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  9. 10 Oct, 2022 3 commits
  10. 09 Oct, 2022 6 commits
    • Merge tag 'ucount-rlimits-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 493ffd66
      Linus Torvalds authored
      
      Pull ucounts update from Eric Biederman:
       "Split rlimit and ucount values and max values
      
        After the ucount rlimit code was merged a bunch of small but
        significant bugs were found and fixed. At the time it was realized
        that part of the problem was that while the ucount rlimits were very
        similar to the ordinary ucounts (in being nested counts with limits)
        the semantics were slightly different and the code would be less error
        prone if there was less sharing.
      
        This is the long awaited cleanup that should hopefully keep things
        more comprehensible and less error prone for whoever needs to touch
        that code next"
      
      * tag 'ucount-rlimits-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        ucounts: Split rlimit and ucount values and max values
    • Merge tag 'signal-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · e572410e
      Linus Torvalds authored
      
      Pull ptrace update from Eric Biederman:
       "ptrace: Stop supporting SIGKILL for PTRACE_EVENT_EXIT
      
        Recently I had a conversation where it was pointed out to me that
        SIGKILL sent to a tracee stopped in PTRACE_EVENT_EXIT is quite
        difficult for a tracer to handle.
      
        Keeping SIGKILL working after the process has been killed is a pain from
        an implementation point of view.
      
        So since the debuggers don't want this behavior let's see if we can
        remove this wart for the userspace API
      
        If a regression is detected it should only be the last change that
        needs to be reverted. The other two are just general cleanups that
        make the last patch simpler"
      
      * tag 'signal-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        signal: Drop signals received after a fatal signal has been processed
        signal: Guarantee that SIGNAL_GROUP_EXIT is set on process exit
        signal: Ensure SIGNAL_GROUP_EXIT gets set in do_group_exit
    • Merge tag 'retire_mq_sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · 86fb9c53
      Linus Torvalds authored
      
      Pull mqueue fix from Eric Biederman:
       "A fix for an unlikely but possible memory leak"
      
      * tag 'retire_mq_sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        ipc: mqueue: fix possible memory leak in init_mqueue_fs()
    • Merge tag 'interrupting_kthread_stop-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · c71370bd
      Linus Torvalds authored
      
      Pull kthread update from Eric Biederman:
       "Break out of wait loops on kthread_stop()
      
        This is a small tweak to kthread_stop so it breaks out of
        interruptible waits that don't explicitly test for kthread_stop.
      
        These interruptible waits occasionally occur in kernel threads due to
        code sharing"
      
      * tag 'interrupting_kthread_stop-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        signal: break out of wait loops on kthread_stop()
    • Merge tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 4899a36f
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Remove our now never-true definitions for pgd_huge() and p4d_leaf().
      
       - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.
      
       - Add support for syscall wrappers.
      
       - Add support for KFENCE on 64-bit.
      
       - Update 64-bit HV KVM to use the new guest state entry/exit accounting
         API.
      
       - Support execute-only memory when using the Radix MMU (P9 or later).
      
       - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.
      
       - Updates to our linker script to move more data into read-only
         sections.
      
       - Allow the VDSO to be randomised on 32-bit.
      
       - Many other small features and fixes.
      
      Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
      Rajeev, Christophe Leroy, David Hildenbrand, Disha Goel, Fabiano Rosas,
      Gaosheng Cui, Gustavo A. R. Silva, Haren Myneni, Hari Bathini, Jilin
      Yuan, Joel Stanley, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent
      Dufour, Liang He, Li Huafei, Lukas Bulwahn, Madhavan Srinivasan, Nathan
      Chancellor, Nathan Lynch, Nicholas Miehlbradt, Nicholas Piggin, Pali
      Rohár, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool,
      Shrikanth Hegde, Tyrel Datwyler, Wolfram Sang, ye xingchen, and Zheng
      Yongjun.
      
      * tag 'powerpc-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
        KVM: PPC: Book3S HV: Fix stack frame regs marker
        powerpc: Don't add __powerpc_ prefix to syscall entry points
        powerpc/64s/interrupt: Fix stack frame regs marker
        powerpc/64: Fix msr_check_and_set/clear MSR[EE] race
        powerpc/64s/interrupt: Change must-hard-mask interrupt check from BUG to WARN
        powerpc/pseries: Add firmware details to the hardware description
        powerpc/powernv: Add opal details to the hardware description
        powerpc: Add device-tree model to the hardware description
        powerpc/64: Add logical PVR to the hardware description
        powerpc: Add PVR & CPU name to hardware description
        powerpc: Add hardware description string
        powerpc/configs: Enable PPC_UV in powernv_defconfig
        powerpc/configs: Update config files for removed/renamed symbols
        powerpc/mm: Fix UBSAN warning reported on hugetlb
        powerpc/mm: Always update max/min_low_pfn in mem_topology_setup()
        powerpc/mm/book3s/hash: Rename flush_tlb_pmd_range
        powerpc: Drops STABS_DEBUG from linker scripts
        powerpc/64s: Remove lost/old comment
        powerpc/64s: Remove old STAB comment
        powerpc: remove orphan systbl_chk.sh
        ...
    • Merge tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 03785a69
      Linus Torvalds authored
      Pull s390 updates from Vasily Gorbik:
      
       - Make use of the IBM z16 processor activity instrumentation facility
         extension to count neural network processor assist operations: add a
         new PMU device driver so that perf can make use of this.
      
       - Rework memcpy_real() to avoid DAT-off mode.
      
       - Rework absolute lowcore access code.
      
       - Various small fixes and improvements all over the code.
      
      * tag 's390-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/pci: remove unused bus_next field from struct zpci_dev
        s390/cio: remove unused ccw_device_force_console() declaration
        s390/pai: Add support for PAI Extension 1 NNPA counters
        s390/mm: fix no previous prototype warnings in maccess.c
        s390/mm: uninline copy_oldmem_kernel() function
        s390/mm,ptdump: add real memory copy page markers
        s390/mm: rework memcpy_real() to avoid DAT-off mode
        s390/dump: save IPL CPU registers once DAT is available
        s390/pci: convert high_memory to physical address
        s390/smp,ptdump: add absolute lowcore markers
        s390/smp: rework absolute lowcore access
        s390/smp: call smp_reinit_ipl_cpu() before scheduler is available
        s390/ptdump: add missing amode31 markers
        s390/mm: split lowcore pages with set_memory_4k()
        s390/mm: remove unused access parameter from do_fault_error()
        s390/delay: sync comment within __delay() with reality
        s390: move from strlcpy with unused retval to strscpy