1. 09 Feb, 2021 1 commit
  2. 29 Jan, 2021 1 commit
  3. 21 Jan, 2021 2 commits
  4. 15 Jan, 2021 2 commits
    • scsi: ibmvfc: Set default timeout to avoid crash during migration · 76490729
      Brian King authored
      While testing live partition mobility, we have observed occasional crashes
      of the Linux partition. What we've seen is that during the live migration,
      for specific configurations with large amounts of memory, slow network
      links, and workloads that are changing memory a lot, the partition can end
      up being suspended for 30 seconds or longer. This resulted in the following
      scenario:
      
      CPU 0                          CPU 1
      -------------------------------  ----------------------------------
      scsi_queue_rq                    migration_store
       -> blk_mq_start_request          -> rtas_ibm_suspend_me
        -> blk_add_timer                 -> on_each_cpu(rtas_percpu_suspend_me
                    _______________________________________V
                   |
                   V
          -> IPI from CPU 1
           -> rtas_percpu_suspend_me
                                           -> __rtas_suspend_last_cpu
      
      -- Linux partition suspended for > 30 seconds --
                                            -> for_each_online_cpu(cpu)
                                                 plpar_hcall_norets(H_PROD
       -> scsi_dispatch_cmd
                                            -> scsi_times_out
                                             -> scsi_abort_command
                                              -> queue_delayed_work
        -> ibmvfc_queuecommand_lck
         -> ibmvfc_send_event
          -> ibmvfc_send_crq
           - returns H_CLOSED
         <- returns SCSI_MLQUEUE_HOST_BUSY
      -> __blk_mq_requeue_request
      
                                            -> scmd_eh_abort_handler
                                             -> scsi_try_to_abort_cmd
                                               - returns SUCCESS
                                             -> scsi_queue_insert
      
      Normally, the SCMD_STATE_COMPLETE bit would protect against the command
      completion and the timeout, but that doesn't work here, since we don't
      check that at all in the SCSI_MLQUEUE_HOST_BUSY path.
      
      In this case we end up calling scsi_queue_insert on a request that has
      already been queued, or possibly even freed, and we crash.
      
      The patch below simply increases the default I/O timeout to avoid this race
      condition. This is also the timeout value that nearly all IBM SAN storage
      recommends setting as the default value.
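      The timing argument can be sketched in user-space C. This is a
      simplified model under stated assumptions, not driver code: the names
      `demo_timer`, `demo_timer_expired`, and `demo_race_window_open` are
      hypothetical, and time is counted in whole seconds from the moment the
      request timer is armed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: a request timer is armed at `armed_at_s` with a
 * timeout of `timeout_s` seconds. The partition is then frozen for some
 * time before the dispatch path gets to run again.
 */
struct demo_timer {
    unsigned int armed_at_s;
    unsigned int timeout_s;
};

/* True if the timer has already expired at time `now_s`. */
static bool demo_timer_expired(const struct demo_timer *t, unsigned int now_s)
{
    return now_s - t->armed_at_s >= t->timeout_s;
}

/*
 * True if the timer expired while the partition was frozen, so the
 * timeout handler can run alongside the resumed dispatch path.
 */
static bool demo_race_window_open(unsigned int timeout_s, unsigned int suspend_s)
{
    struct demo_timer t = { .armed_at_s = 0, .timeout_s = timeout_s };

    assert(timeout_s > 0);
    return demo_timer_expired(&t, suspend_s);
}
```

      With a 35 second suspension, `demo_race_window_open(30, 35)` is true
      while `demo_race_window_open(120, 35)` is false. Both timeout values
      are illustrative and are not taken from the patch. A longer timeout
      only makes the window unlikely to open; it does not remove the
      underlying race.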
      
      Link: https://lore.kernel.org/r/1610463998-19791-1-git-send-email-brking@linux.vnet.ibm.com
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: target: tcmu: Fix use-after-free of se_cmd->priv · 780e1384
      Shin'ichiro Kawasaki authored
      Commit a3512902 ("scsi: target: tcmu: Use priv pointer in se_cmd")
      modified tcmu_free_cmd() to set the priv pointer in se_cmd to NULL.
      However, se_cmd may already have been freed by the work queue triggered
      in target_complete_cmd(). This caused a KASAN use-after-free BUG [1].
      
      To fix the bug, do not touch the priv pointer in tcmu_free_cmd().
      Instead, set the priv pointer to NULL before calling
      target_complete_cmd(). Also, to avoid an unnecessary priv pointer change
      in tcmu_queue_cmd(), modify the priv pointer in that function only when
      tcmu_free_cmd() is not called.
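      The ordering rule can be illustrated outside the kernel. The following
      is a minimal user-space sketch, not the target core API: `fake_cmd`,
      `fake_complete`, `finish_cmd`, and `alloc_cmd` are hypothetical names,
      and `fake_complete` frees the command immediately to model the work
      queue releasing se_cmd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for se_cmd: only the priv back-pointer matters. */
struct fake_cmd {
    void *priv;
};

/* Records the priv value the completion path observed, for inspection. */
static void *last_priv_seen_at_completion;

/*
 * Models target_complete_cmd(): once this returns, the command may
 * already have been released, so the caller must not touch it again.
 */
static void fake_complete(struct fake_cmd *cmd)
{
    last_priv_seen_at_completion = cmd->priv;
    free(cmd);
}

/* Safe ordering: clear priv first, then hand the command to completion. */
static void finish_cmd(struct fake_cmd *cmd)
{
    cmd->priv = NULL;      /* last access to cmd on this path */
    fake_complete(cmd);    /* cmd must be treated as gone from here on */
}

/* Allocates a command whose priv pointer refers to caller-owned data. */
static struct fake_cmd *alloc_cmd(void *priv)
{
    struct fake_cmd *cmd = malloc(sizeof(*cmd));

    assert(cmd != NULL);
    cmd->priv = priv;
    return cmd;
}
```

      Writing `cmd->priv = NULL` after `fake_complete(cmd)` instead would
      store into freed memory, which is the kind of access KASAN reports in
      the trace.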
      
      [1]
      BUG: KASAN: use-after-free in tcmu_handle_completions+0x1172/0x1770 [target_core_user]
      Write of size 8 at addr ffff88814cf79a40 by task cmdproc-uio0/14842
      
      CPU: 2 PID: 14842 Comm: cmdproc-uio0 Not tainted 5.11.0-rc2 #1
      Hardware name: Supermicro Super Server/X10SRL-F, BIOS 3.2 11/22/2019
      Call Trace:
       dump_stack+0x9a/0xcc
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       print_address_description.constprop.0+0x18/0x130
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       kasan_report.cold+0x7f/0x10e
       ? tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       tcmu_handle_completions+0x1172/0x1770 [target_core_user]
       ? queue_tmr_ring+0x5d0/0x5d0 [target_core_user]
       tcmu_irqcontrol+0x28/0x60 [target_core_user]
       uio_write+0x155/0x230
       ? uio_vma_fault+0x460/0x460
       ? security_file_permission+0x4f/0x440
       vfs_write+0x1ce/0x860
       ksys_write+0xe9/0x1b0
       ? __ia32_sys_read+0xb0/0xb0
       ? syscall_enter_from_user_mode+0x27/0x70
       ? trace_hardirqs_on+0x1c/0x110
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7fcf8b61905f
      Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b9 fc ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 0c fd ff ff 48
      RSP: 002b:00007fcf7b3e6c30 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcf8b61905f
      RDX: 0000000000000004 RSI: 00007fcf7b3e6c78 RDI: 000000000000000c
      RBP: 00007fcf7b3e6c80 R08: 0000000000000000 R09: 00007fcf7b3e6aa8
      R10: 000000000b01c000 R11: 0000000000000293 R12: 00007ffe0c32a52e
      R13: 00007ffe0c32a52f R14: 0000000000000000 R15: 00007fcf7b3e7640
      
      Allocated by task 383:
       kasan_save_stack+0x1b/0x40
       ____kasan_kmalloc.constprop.0+0x84/0xa0
       kmem_cache_alloc+0x142/0x330
       tcm_loop_queuecommand+0x2a/0x4e0 [tcm_loop]
       scsi_queue_rq+0x12ec/0x2d20
       blk_mq_dispatch_rq_list+0x30a/0x1db0
       __blk_mq_do_dispatch_sched+0x326/0x830
       __blk_mq_sched_dispatch_requests+0x2c8/0x3f0
       blk_mq_sched_dispatch_requests+0xca/0x120
       __blk_mq_run_hw_queue+0x93/0xe0
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      Freed by task 11655:
       kasan_save_stack+0x1b/0x40
       kasan_set_track+0x1c/0x30
       kasan_set_free_info+0x20/0x30
       ____kasan_slab_free+0xec/0x120
       slab_free_freelist_hook+0x53/0x160
       kmem_cache_free+0xf4/0x5c0
       target_release_cmd_kref+0x3ea/0x9e0 [target_core_mod]
       transport_generic_free_cmd+0x28b/0x2f0 [target_core_mod]
       target_complete_ok_work+0x250/0xac0 [target_core_mod]
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      Last potentially related work creation:
       kasan_save_stack+0x1b/0x40
       kasan_record_aux_stack+0xa3/0xb0
       insert_work+0x48/0x2e0
       __queue_work+0x4e8/0xdf0
       queue_work_on+0x78/0x80
       tcmu_handle_completions+0xad0/0x1770 [target_core_user]
       tcmu_irqcontrol+0x28/0x60 [target_core_user]
       uio_write+0x155/0x230
       vfs_write+0x1ce/0x860
       ksys_write+0xe9/0x1b0
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Second to last potentially related work creation:
       kasan_save_stack+0x1b/0x40
       kasan_record_aux_stack+0xa3/0xb0
       insert_work+0x48/0x2e0
       __queue_work+0x4e8/0xdf0
       queue_work_on+0x78/0x80
       tcm_loop_queuecommand+0x1c3/0x4e0 [tcm_loop]
       scsi_queue_rq+0x12ec/0x2d20
       blk_mq_dispatch_rq_list+0x30a/0x1db0
       __blk_mq_do_dispatch_sched+0x326/0x830
       __blk_mq_sched_dispatch_requests+0x2c8/0x3f0
       blk_mq_sched_dispatch_requests+0xca/0x120
       __blk_mq_run_hw_queue+0x93/0xe0
       process_one_work+0x7b6/0x1290
       worker_thread+0x590/0xf80
       kthread+0x362/0x430
       ret_from_fork+0x22/0x30
      
      The buggy address belongs to the object at ffff88814cf79800 which belongs
      to the cache tcm_loop_cmd_cache of size 896.
      
      Link: https://lore.kernel.org/r/20210113024508.1264992-1-shinichiro.kawasaki@wdc.com
      Fixes: a3512902 ("scsi: target: tcmu: Use priv pointer in se_cmd")
      Cc: stable@vger.kernel.org # v5.9+
      Acked-by: Bodo Stroesser <bostroesser@gmail.com>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  5. 13 Jan, 2021 4 commits
  6. 08 Jan, 2021 6 commits
  7. 06 Jan, 2021 9 commits
  8. 04 Jan, 2021 1 commit
  9. 03 Jan, 2021 1 commit
  10. 02 Jan, 2021 3 commits
    • Merge tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3516bd72
      Linus Torvalds authored
      Pull s390 cleanups from Vasily Gorbik:
       "Update defconfigs and sort config select list"
      
      * tag 's390-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/Kconfig: sort config S390 select list once again
        s390: update defconfigs
    • Merge tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d9296a7b
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a crash in intel_pstate during resume from suspend-to-RAM
        that may occur after recent changes and two resource leaks in error
        paths in the operating performance points (OPP) framework, add a new
        C-states table to intel_idle and update the cpuidle MAINTAINERS entry
        to cover the governors too.
      
        Specifics:
      
         - Fix recently introduced crash in the intel_pstate driver that
           occurs if scale-invariance is disabled during resume from
           suspend-to-RAM due to inconsistent changes of APERF or MPERF MSR
           values made by the platform firmware (Rafael Wysocki).
      
         - Fix a memory leak and add a missing clk_put() in error paths in the
           OPP framework (Quanyang Wang, Viresh Kumar).
      
         - Add new C-states table for SnowRidge processors to the intel_idle
           driver (Artem Bityutskiy).
      
         - Update the MAINTAINERS entry for cpuidle to make it clear that the
           governors are covered by it too (Lukas Bulwahn)"
      
      * tag 'pm-5.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        intel_idle: add SnowRidge C-state table
        cpufreq: intel_pstate: Fix fast-switch fallback path
        opp: Call the missing clk_put() on error
        opp: fix memory leak in _allocate_opp_table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
    • Merge branches 'pm-cpufreq' and 'pm-cpuidle' · 89ecf09e
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: intel_pstate: Fix fast-switch fallback path
      
      * pm-cpuidle:
        intel_idle: add SnowRidge C-state table
        MAINTAINERS: include governors into CPU IDLE TIME MANAGEMENT FRAMEWORK
  11. 01 Jan, 2021 4 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · eda809ae
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi).
      
        The two big core fixes are for power management ("block: Do not accept
        any requests while suspended" and "block: Fix a race in the runtime
        power management code") which finally sorts out the resume problems
        we've occasionally been having.
      
        To make the resume fix, there are seven necessary precursors which
        effectively rename REQ_PREEMPT to REQ_PM, so every "special" request
        in block is automatically a power management exempt one.
      
        All of the non-PM preempt cases are removed except for the one in the
        SCSI Parallel Interface (spi) domain validation which is a genuine
        case where we have to run requests at high priority to validate the
        bus so this becomes an autopm get/put protected request"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits)
        scsi: cxgb4i: Fix TLS dependency
        scsi: ufs: Un-inline ufshcd_vops_device_reset function
        scsi: ufs: Re-enable WriteBooster after device reset
        scsi: ufs-mediatek: Use correct path to fix compile error
        scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
        scsi: block: Do not accept any requests while suspended
        scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT
        scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE
        scsi: scsi_transport_spi: Set RQF_PM for domain validation commands
        scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT
        scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
        scsi: block: Introduce BLK_MQ_REQ_PM
        scsi: block: Fix a race in the runtime power management code
        scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers
        scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers
        scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
        scsi: ufs-pci: Fix restore from S4 for Intel controllers
        scsi: ufs-mediatek: Keep VCC always-on for specific devices
        scsi: ufs: Allow regulators being always-on
        scsi: ufs: Clear UAC for RPMB after ufshcd resets
        ...
    • Merge tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block · 8b4805c6
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Two minor block fixes from this last week that should go into 5.11:
      
         - Add missing NOWAIT debugfs definition (Andres)
      
         - Fix kerneldoc warning introduced this merge window (Randy)"
      
      * tag 'block-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        block: add debugfs stanza for QUEUE_FLAG_NOWAIT
        fs: block_dev.c: fix kernel-doc warnings from struct block_device changes
    • Merge tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block · dc3e24b2
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A few fixes that should go into 5.11, all marked for stable as well:
      
         - Fix issue around identity COW'ing and users that share a ring
           across processes
      
         - Fix a hang associated with unregistering fixed files (Pavel)
      
         - Move the 'process is exiting' cancelation a bit earlier, so
           task_works aren't affected by it (Pavel)"
      
      * tag 'io_uring-5.11-2021-01-01' of git://git.kernel.dk/linux-block:
        kernel/io_uring: cancel io_uring before task works
        io_uring: fix io_sqe_files_unregister() hangs
        io_uring: add a helper for setting a ref node
        io_uring: don't assume mm is constant across submits
    • depmod: handle the case of /sbin/depmod without /sbin in PATH · cedd1862
      Linus Torvalds authored
      Commit 436e980e ("kbuild: don't hardcode depmod path") stopped
      hard-coding the path of depmod, but in the process caused trouble for
      distributions that had that /sbin location, but didn't have it in the
      PATH (generally because /sbin is limited to the super-user path).
      
      Work around it for now by just adding /sbin to the end of PATH in the
      depmod.sh script.
      Reported-and-tested-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 31 Dec, 2020 3 commits
  13. 30 Dec, 2020 3 commits
    • Merge tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client · f6e1ea19
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A fix for an edge case in MClientRequest encoding and a couple of
        trivial fixups for the new msgr2 support"
      
      * tag 'ceph-for-5.11-rc2' of git://github.com/ceph/ceph-client:
        libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE
        libceph: align session_key and con_secret to 16 bytes
        libceph: fix auth_signature buffer allocation in secure mode
        ceph: reencode gid_list when reconnecting
    • intel_idle: add SnowRidge C-state table · 9cf93f05
      Artem Bityutskiy authored
      Add C-state table for the SnowRidge SoC which is found on Intel Jacobsville
      platforms.
      
      The following has been changed.
      
       1. C1E latency changed from 10us to 15us. It was measured using the
          open source "wult" tool (the "nic" method, 15us is the 99.99th
          percentile).
      
       2. C1E power break even changed from 20us to 25us, which may result
          in less C1E residency in some workloads.
      
       3. C6 latency changed from 50us to 130us. Measured the same way as C1E.
      
      The C6 C-state is supported only by some SnowRidge revisions, so add a
      comment about this to the C-state table.
      
      On SnowRidge, C6 support is enumerated via the usual mechanism: "mwait" leaf of
      the "cpuid" instruction. The 'intel_idle' driver does check this leaf, so even
      though C6 is present in the table, the driver will only use it if the CPU does
      support it.
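      The idea of listing C6 in the table while using it only on CPUs that
      report it can be sketched as follows. This is a simplified user-space
      model, not the intel_idle data structures: `demo_cstate`, `demo_table`,
      `usable_states`, and the `support_bit` field are hypothetical, and the
      supported-state bitmask stands in for the information the driver reads
      from the "mwait" CPUID leaf. Only the C1E and C6 latencies quoted above
      are taken from the commit message.

```c
#include <stddef.h>

/* Hypothetical, reduced description of one idle state. */
struct demo_cstate {
    const char *name;
    unsigned int exit_latency_us;
    unsigned int support_bit;   /* bit index in a "supported states" mask */
};

/* Latencies as quoted in the commit message; bit indices are made up. */
static const struct demo_cstate demo_table[] = {
    { "C1E", 15, 0 },
    { "C6", 130, 1 },
};

/*
 * Copies pointers to the states whose support bit is set in
 * `supported_mask` into `out` and returns how many were selected.
 * `out` must have room for `n` entries.
 */
static size_t usable_states(const struct demo_cstate *table, size_t n,
                            unsigned int supported_mask,
                            const struct demo_cstate **out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (supported_mask & (1u << table[i].support_bit))
            out[count++] = &table[i];
    }
    return count;
}
```

      Calling `usable_states(demo_table, 2, 0x1, out)` selects only C1E,
      while a mask of `0x3` also selects C6, so one table can serve
      revisions with and without C6.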
      Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • cpufreq: intel_pstate: Fix fast-switch fallback path · be128345
      Rafael J. Wysocki authored
      When sugov_update_single_perf() falls back to the "frequency"
      path due to the missing scale-invariance, it will call
      cpufreq_driver_fast_switch() via sugov_fast_switch()
      and the driver's ->fast_switch() callback will be invoked,
      so it must not be NULL.
      
      However, after commit a365ab6b ("cpufreq: intel_pstate: Implement
      the ->adjust_perf() callback") intel_pstate sets ->fast_switch() to
      NULL when it is going to use intel_cpufreq_adjust_perf(), which is a
      mistake, because on x86 the scale-invariance may be turned off
      dynamically, so modify it to retain the original ->adjust_perf()
      callback pointer.
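      The fallback requirement can be modeled with a small callback table.
      This is a user-space sketch under stated assumptions, not the cpufreq
      API: `fake_driver`, `governor_update`, `demo_adjust_perf`, and
      `demo_fast_switch` are hypothetical names, and the governor logic is
      reduced to the single decision described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a driver's callback table. */
struct fake_driver {
    void (*adjust_perf)(unsigned int cpu, unsigned long perf);
    unsigned int (*fast_switch)(unsigned int cpu, unsigned int target_khz);
};

/* Record the last request each callback received, for inspection. */
static unsigned long last_perf;
static unsigned int last_khz;

static void demo_adjust_perf(unsigned int cpu, unsigned long perf)
{
    (void)cpu;
    last_perf = perf;
}

static unsigned int demo_fast_switch(unsigned int cpu, unsigned int target_khz)
{
    (void)cpu;
    last_khz = target_khz;
    return target_khz;
}

/*
 * Models the governor decision: use adjust_perf only while
 * scale-invariance is available, otherwise fall back to the frequency
 * path, which requires a non-NULL fast_switch callback.
 */
static void governor_update(const struct fake_driver *drv, bool scale_invariant,
                            unsigned int cpu, unsigned long perf,
                            unsigned int target_khz)
{
    if (scale_invariant && drv->adjust_perf) {
        drv->adjust_perf(cpu, perf);
        return;
    }
    assert(drv->fast_switch != NULL);   /* the fallback needs this callback */
    drv->fast_switch(cpu, target_khz);
}
```

      Because `scale_invariant` can change between calls, a driver that
      registers `adjust_perf` still needs to leave `fast_switch` populated;
      in this model a NULL `fast_switch` trips the assertion as soon as the
      fallback path is taken.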
      
      Fixes: a365ab6b ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
      Reported-by: Kenneth R. Crudup <kenny@panix.com>
      Tested-by: Kenneth R. Crudup <kenny@panix.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>