Commits · e31e18128eb9dbcda8c169cb33421ae4813afa71 · Kirill Smelkov / linux

23 Dec, 2021 7 commits

scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host · e31e1812

Xiang Chen authored Dec 20, 2021

If a new disk is inserted through an expander when the host was suspended,
it will not necessarily be detected as the topology is not re-scanned
during resume. To detect possible changes in topology during suspension,
insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a
revalidation.

Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.comReviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e31e1812

scsi: mvsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list · 133b688b

Xiang Chen authored Dec 20, 2021

phy_list_lock is not held when using asd_sas_port->phy_list in the mvsas
driver. Add spin_lock/unlock in those places.

Link: https://lore.kernel.org/r/1639999298-244569-7-git-send-email-chenxiang66@hisilicon.comSigned-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

133b688b

scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list · 29e2bac8

Xiang Chen authored Dec 20, 2021

Most places that use asd_sas_port->phy_list are protected by spinlock
asd_sas_port->phy_list_lock, however there are still some places which miss
grabbing the lock. Add it in function hisi_sas_refresh_port_id() when
accessing asd_sas_port->phy_list. This carries a risk that list mutates
while at the same time dropping the lock in function
hisi_sas_send_ata_reset_each_phy(). Read asd_sas_port->phy_mask instead of
accessing asd_sas_port->phy_list to avoid this risk.

Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.comAcked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

29e2bac8

scsi: libsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list · 42159d3c

Xiang Chen authored Dec 20, 2021

Most places that use asd_sas_port->phy_list in libsas are protected by
spinlock asd_sas_port->phy_list_lock. However, there are still a few places
which miss the lock. Add it in those places.

Link: https://lore.kernel.org/r/1639999298-244569-5-git-send-email-chenxiang66@hisilicon.comReviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

42159d3c

scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() · 6e1fcab0

Alan Stern authored Dec 20, 2021

John Garry reported a deadlock that occurs when trying to access a
runtime-suspended SATA device. For obscure reasons, the rescan procedure
causes the link to be hard-reset, which disconnects the device.

The rescan tries to carry out a runtime resume when accessing the device.
scsi_rescan_device() holds the SCSI device lock and won't release it until
it can put commands onto the device's block queue. This can't happen until
the queue is successfully runtime-resumed or the device is unregistered.
But the runtime resume fails because the device is disconnected, and
__scsi_remove_device() can't do the unregistration because it can't get the
device lock.

The best way to resolve this deadlock appears to be to allow the block
queue to start running again even after an unsuccessful runtime resume.
The idea is that the driver or the SCSI error handler will need to be able
to use the queue to resolve the runtime resume failure.

This patch removes the err argument to blk_post_runtime_resume() and makes
the routine act as though the resume was successful always. This fixes the
deadlock.

Link: https://lore.kernel.org/r/1639999298-244569-4-git-send-email-chenxiang66@hisilicon.com
Fixes: e27829dc ("scsi: serialize ->rescan against ->remove")
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6e1fcab0

scsi: Revert "scsi: hisi_sas: Filter out new PHY up events during suspend" · 6cc73908

John Garry authored Dec 20, 2021

This reverts commit b14a37e0.

In that commit, we had to filter out phy-up events during suspend, as it
work cause a deadlock between processing the phyup event and the resume HA
function try to drain the HA event workqueue to complete the resume
process.

Now that we no longer try to drain the HA event queue during the HA resume
processor, the deadlock would not occur, so remove the special handling for
it.

Link: https://lore.kernel.org/r/1639999298-244569-3-git-send-email-chenxiang66@hisilicon.comSigned-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6cc73908

scsi: libsas: Don't always drain event workqueue for HA resume · fbefe228

John Garry authored Dec 20, 2021

For the hisi_sas driver, if a directly attached disk is removed during
suspend, a hang will occur in the resume process:

The background is that in commit 16fd4a7c ("scsi: hisi_sas: Add device
link between SCSI devices and hisi_hba"), it is ensured that the HBA device
cannot be runtime suspended when any SCSI device associated is active.

Other drivers which use libsas don't worry about this as none support
runtime suspend.

The mentioned hang occurs when an disk is removed during suspend. In the
removal process - from PHYE_RESUME_TIMEOUT event processing - we call into
scsi_remove_device(), which is being processed in the HA event workqueue.
Here we wait for all suppliers of the SCSI device to resume, which includes
the HBA device (from the above commit). However the HBA device cannot
resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from
calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.

There does not appear to be any need for the sas_drain_work() to be called
at all in sas_resume_ha() as it is not syncing against anything, so allow
LLDDs to avoid this by providing a variant of sas_resume_ha() which does
"sync", i.e. doesn't drain the event workqueue.

Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.comSigned-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fbefe228

17 Dec, 2021 12 commits

scsi: libsas: Decode SAM status and host byte codes · 4be6181f

John Garry authored Dec 15, 2021

Value 0 is used for SAM status and libsas exec_status bytes codes in
sas_end_task() - use defined macros instead. In addition, change to proper
enum types.

Also replace SAM_STAT_CHECK_CONDITION with SAS_SAM_STAT_CHECK_CONDITION,
the former being a proper member of enum exec_status.

Link: https://lore.kernel.org/r/1639579061-179473-9-git-send-email-john.garry@huawei.comSigned-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4be6181f

scsi: hisi_sas: Fix phyup timeout on FPGA · 37310bad

Qi Liu authored Dec 15, 2021

The OOB interrupt and phyup interrupt handlers may run out-of-order in high
CPU usage scenarios. Since the hisi_sas_phy.timer is added in
hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this out-of-order
execution will cause hisi_sas_phy.timer timeout to trigger.

To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure
that the timer won't be added after phyup handler completes.

Link: https://lore.kernel.org/r/1639579061-179473-8-git-send-email-john.garry@huawei.comSigned-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

37310bad

scsi: hisi_sas: Prevent parallel FLR and controller reset · 16775db6

Qi Liu authored Dec 15, 2021

If we issue a controller reset command during executing a FLR a hung task
may be found:

 Call trace:
  __switch_to+0x158/0x1cc
  __schedule+0x2e8/0x85c
  schedule+0x7c/0x110
  schedule_timeout+0x190/0x1cc
  __down+0x7c/0xd4
  down+0x5c/0x7c
  hisi_sas_task_exec+0x510/0x680 [hisi_sas_main]
  hisi_sas_queue_command+0x24/0x30 [hisi_sas_main]
  smp_execute_task_sg+0xf4/0x23c [libsas]
  sas_smp_phy_control+0x110/0x1e0 [libsas]
  transport_sas_phy_reset+0xc8/0x190 [libsas]
  phy_reset_work+0x2c/0x40 [libsas]
  process_one_work+0x1dc/0x48c
  worker_thread+0x15c/0x464
  kthread+0x160/0x170
  ret_from_fork+0x10/0x18

This is a race condition which occurs when the FLR completes first.

Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as
HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held, so
now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem .

Link: https://lore.kernel.org/r/1639579061-179473-7-git-send-email-john.garry@huawei.comSigned-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

16775db6

scsi: hisi_sas: Prevent parallel controller reset and control phy command · 20c63493

Qi Liu authored Dec 15, 2021

A user may issue a control phy command from sysfs at any time, even if the
controller is resetting.

If a phy is disabled by hardreset/linkreset command before calling
get_phys_state() in the reset path, the saved phy state may be incorrect.

To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure
that the controller reset may not run at the same time as when the phy
control function is running.

Link: https://lore.kernel.org/r/1639579061-179473-6-git-send-email-john.garry@huawei.comSigned-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

20c63493

scsi: hisi_sas: Factor out task prep and delivery code · dc313f6b

John Garry authored Dec 15, 2021

The task prep code is the same between the normal path (in
hisi_sas_task_prep()) and the internal abort path, so factor is out into a
common function.

Link: https://lore.kernel.org/r/1639579061-179473-5-git-send-email-john.garry@huawei.comSigned-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

dc313f6b

scsi: hisi_sas: Pass abort structure for internal abort · 08c61b5d

John Garry authored Dec 15, 2021

To help factor out code in future, it's useful to know if we're executing
an internal abort, so pass a pointer to the structure. The idea is that a
NULL pointer means not an internal abort.

Link: https://lore.kernel.org/r/1639579061-179473-4-git-send-email-john.garry@huawei.comReviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

08c61b5d

scsi: hisi_sas: Make internal abort have no task proto · 934385a4

John Garry authored Dec 15, 2021

For an internal abort, the task does not have a protocol, so set to none.

This will make it easier to differentiate internal abort tasks in future.

Link: https://lore.kernel.org/r/1639579061-179473-3-git-send-email-john.garry@huawei.comReviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

934385a4

scsi: hisi_sas: Start delivery hisi_sas_task_exec() directly · 0e462085

John Garry authored Dec 15, 2021

Currently we start delivery of commands to the DQ after returning from
hisi_sas_task_exec() with success.

Let's just start delivery directly in that function without having to check
if some local variable is set.

Link: https://lore.kernel.org/r/1639579061-179473-2-git-send-email-john.garry@huawei.comReviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

0e462085

scsi: efct: Don't pass GFP_DMA to dma_alloc_coherent() · efac162a

Christoph Hellwig authored Dec 14, 2021

dma_alloc_coherent() ignores the zone specifiers so this is pointless and
confusing.

Link: https://lore.kernel.org/r/20211214163605.416288-1-hch@lst.deReviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

efac162a

scsi: ufs: core: Fix deadlock issue in ufshcd_wait_for_doorbell_clr() · 99c66a88

Bean Huo authored Dec 14, 2021

Call shost_for_each_device() with holding host->host_lock will cause a
deadlock situation, which will cause the system to stall (the log as
follow). Fix this issue by using __shost_for_each_device() in
ufshcd_pending_cmds().

stalls on CPUs/tasks:
all trace:
__switch_to+0x120/0x170
0xffff800011643998
ask dump for CPU 5:
ask:kworker/u16:2   state:R  running task     stack:    0 pid:   80 ppid:     2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
0x0
ask dump for CPU 6:
ask:kworker/u16:6   state:R  running task     stack:    0 pid:  164 ppid:     2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
0xffff54e7c4429f80
ask dump for CPU 7:
ask:kworker/u16:4   state:R  running task     stack:    0 pid:  153 ppid:     2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
blk_mq_run_hw_queue+0x34/0x110
blk_mq_sched_insert_request+0xb0/0x120
blk_execute_rq_nowait+0x68/0x88
blk_execute_rq+0x4c/0xd8
__scsi_execute+0xec/0x1d0
scsi_vpd_inquiry+0x84/0xf0
scsi_get_vpd_buf+0x34/0xb8
scsi_attach_vpd+0x34/0x140
scsi_probe_and_add_lun+0xa6c/0xab8
__scsi_scan_target+0x438/0x4f8
scsi_scan_channel+0x6c/0xa8
scsi_scan_host_selected+0xf0/0x150
do_scsi_scan_host+0x88/0x90
scsi_scan_host+0x1b4/0x1d0
ufshcd_async_scan+0x248/0x310
async_run_entry_fn+0x30/0x178
process_one_work+0x1e8/0x368
worker_thread+0x40/0x478
kthread+0x174/0x180
ret_from_fork+0x10/0x20

Link: https://lore.kernel.org/r/20211214120537.531628-1-huobean@gmail.com
Fixes: 8d077ede ("scsi: ufs: Optimize the command queueing code")
Reported-by: YongQin Liu <yongqin.liu@linaro.org>
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

99c66a88

scsi: qla2xxx: Synchronize rport dev_loss_tmo setting · baea0e83

Hannes Reinecke authored Dec 14, 2021

Currently, the dev_loss_tmo setting is only ever used for SCSI
devices. This patch reshuffles initialisation such that the SCSI remote
ports are registered before the NVMe ones, allowing the dev_loss_tmo
setting to be synchronized between SCSI and NVMe.

Link: https://lore.kernel.org/r/20211214111139.52503-1-dwagner@suse.deReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

baea0e83

Merge branch '5.16/scsi-fixes' into 5.17/scsi-staging · 87f77d37

Martin K. Petersen authored Dec 16, 2021

Pull in the 5.16 fixes branch to resolve a conflict in the UFS driver
core.

Conflicts:
	drivers/scsi/ufs/ufshcd.c
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

87f77d37

14 Dec, 2021 3 commits

scsi: hpsa: Remove an unused variable in hpsa_update_scsi_devices() · 8c2d0455

Christophe JAILLET authored Dec 09, 2021

'lunzerobits' is unused. Remove it.

This a left over of commit 2d62a33e ("hpsa: eliminate fake lun0
enclosures")

Link: https://lore.kernel.org/r/9f80ea569867b5f7ae1e0f99d656e5a8bacad34e.1639084205.git.christophe.jaillet@wanadoo.frSigned-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

8c2d0455

scsi: lpfc: Use struct_group to isolate cast to larger object · c167dd0b

Kees Cook authored Dec 03, 2021

When building under -Warray-bounds, a warning is generated when casting a
u32 into MAILBOX_t (which is larger). This warning is conservative, but
it's not an unreasonable change to make to improve future robustness. Use a
tagged struct_group that can refer to either the specific fields or the
first u32 separately, silencing this warning:

drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_reset_barrier':
drivers/scsi/lpfc/lpfc_sli.c:4787:29: error: array subscript 'MAILBOX_t[0]' is partly outside array bounds of 'volatile uint32_t[1]' {aka 'volatile unsigned int[1]'} [-Werror=array-bounds]
 4787 |         ((MAILBOX_t *)&mbox)->mbxCommand = MBX_KILL_BOARD;
      |                             ^~
drivers/scsi/lpfc/lpfc_sli.c:4752:27: note: while referencing 'mbox'
 4752 |         volatile uint32_t mbox;
      |                           ^~~~

There is no change to the resulting executable instruction code.

Link: https://lore.kernel.org/r/20211203223351.107323-1-keescook@chromium.orgReviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c167dd0b

scsi: lpfc: Use struct_group() to initialize struct lpfc_cgn_info · 532adda9

Kees Cook authored Dec 08, 2021

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Add struct_group() to mark "stat" region of struct lpfc_cgn_info that
should be initialized to zero, and refactor the "data" region memset()
to wipe everything up to the cgn_stats region.

Link: https://lore.kernel.org/r/20211208195957.1603092-1-keescook@chromium.orgReviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

532adda9

07 Dec, 2021 18 commits

scsi: qla2xxx: Format log strings only if needed · 69002c8c

Roman Bolshakov authored Nov 12, 2021

Commit 598a90f2 ("scsi: qla2xxx: add ring buffer for tracing debug
logs") introduced unconditional log string formatting to ql_dbg() even if
ql_dbg_log event is disabled. It harms performance because some strings are
formatted in fastpath and/or interrupt context.

Link: https://lore.kernel.org/r/20211112145446.51210-1-r.bolshakov@yadro.com
Fixes: 598a90f2 ("scsi: qla2xxx: add ring buffer for tracing debug logs")
Cc: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

69002c8c

scsi: lpfc: Update lpfc version to 14.0.0.4 · 4437503b

James Smart authored Dec 03, 2021

Update lpfc version to 14.0.0.4.

Link: https://lore.kernel.org/r/20211204002644.116455-10-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4437503b

scsi: lpfc: Add additional debugfs support for CMF · 6014a246

James Smart authored Dec 03, 2021

Dump raw CMF parameter information in debugfs cgn_buffer.

Link: https://lore.kernel.org/r/20211204002644.116455-9-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6014a246

scsi: lpfc: Cap CMF read bytes to MBPI · 05116ef9

James Smart authored Dec 03, 2021

Ensure read bytes data does not go over MBPI for CMF timer intervals that
are purposely shortened.

Link: https://lore.kernel.org/r/20211204002644.116455-8-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

05116ef9

scsi: lpfc: Adjust CMF total bytes and rxmonitor · a6269f83

James Smart authored Dec 03, 2021

Calculate any extra bytes needed to account for timer accuracy. If we are
less than LPFC_CMF_INTERVAL, then calculate the adjustment needed for total
to reflect a full LPFC_CMF_INTERVAL.

Add additional info to rxmonitor, and adjust some log formatting.

Link: https://lore.kernel.org/r/20211204002644.116455-7-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a6269f83

scsi: lpfc: Trigger SLI4 firmware dump before doing driver cleanup · 7dd2e2a9

James Smart authored Dec 03, 2021

Extraneous teardown routines are present in the firmware dump path causing
altered states in firmware captures.

When a firmware dump is requested via sysfs, trigger the dump immediately
without tearing down structures and changing adapter state.

The driver shall rely on pre-existing firmware error state clean up
handlers to restore the adapter.

Link: https://lore.kernel.org/r/20211204002644.116455-6-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7dd2e2a9

scsi: lpfc: Fix NPIV port deletion crash · 8ed190a9

James Smart authored Dec 03, 2021

The driver is calling schedule_timeout after the DA_ID nameserver request
and LOGO commands are issued to the fabric by the initiator virtual
endport. These fixed delay functions are causing long delays in the
driver's worker thread when processing discovery I/Os in a serialized
fashion, which is then triggering mailbox timeout errors artificially.

To fix this, don't wait on the DA_ID request to complete and call
wait_event_timeout to allow the vport delete thread to make progress on an
event driven basis rather than fixing the wait time.

Link: https://lore.kernel.org/r/20211204002644.116455-5-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

8ed190a9

scsi: lpfc: Fix lpfc_force_rscn ndlp kref imbalance · 7576d48c

James Smart authored Dec 03, 2021

Issuing lpfc_force_rscn twice results in an ndlp kref use-after-free call
trace.

A prior patch reworked the get/put handling by ensuring nlp_get was done
before WQE submission and a put was done in the completion path.
Unfortunately, the issue_els_rscn path had a piece of legacy code that did
a nlp_put, causing an imbalance on the ref counts.

Fixed by removing the unnecessary legacy code snippet.

Link: https://lore.kernel.org/r/20211204002644.116455-4-jsmart2021@gmail.com
Fixes: 4430f7fd ("scsi: lpfc: Rework locations of ndlp reference taking")
Cc: <stable@vger.kernel.org> # v5.11+
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7576d48c

scsi: lpfc: Change return code on I/Os received during link bounce · 2e81b1a3

James Smart authored Dec 03, 2021

During heavy I/O testing with issue_lip to bounce the link, occasionally
I/O is terminated with status 3 result 9, which means the RPI is suspended.
The I/O is completed and this type of error will result in immediate retry
by the SCSI layer. The retry count expires and the I/O fails and returns
error to the application.

To avoid these quick retry/retries exhausted scenarios change the return
code given to the midlayer to DID_REQUEUE rather than DID_ERROR. This gets
them retried, and eventually succeed when the link recovers.

Link: https://lore.kernel.org/r/20211204002644.116455-3-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2e81b1a3

scsi: lpfc: Fix leaked lpfc_dmabuf mbox allocations with NPIV · f0d39196

James Smart authored Dec 03, 2021

During rmmod testing, messages appeared indicating lpfc_mbuf_pool entries
were still busy. This situation was only seen doing rmmod after at least 1
vport (NPIV) instance was created and destroyed. The number of messages
scaled with the number of vports created.

When a vport is created, it can receive a PLOGI from another initiator
Nport. When this happens, the driver prepares to ack the PLOGI and
prepares an RPI for registration (via mbx cmd) which includes an mbuf
allocation. During the unsolicited PLOGI processing and after the RPI
preparation, the driver recognizes it is one of the vport instances and
decides to reject the PLOGI. During the LS_RJT preparation for the PLOGI,
the mailbox struct allocated for RPI registration is freed, but the mbuf
that was also allocated is not released.

Fix by freeing the mbuf with the mailbox struct in the LS_RJT path.

As part of the code review to figure the issue out a couple of other areas
where found that also would not have released the mbuf. Those are cleaned
up as well.

Link: https://lore.kernel.org/r/20211204002644.116455-2-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

f0d39196

scsi: ufs: Implement polling support · eaab9b57

Bart Van Assche authored Dec 03, 2021

The time spent in io_schedule() and also the interrupt latency are
significant when submitting direct I/O to a UFS device. Hence this patch
that implements polling support. User space software can enable polling by
passing the RWF_HIPRI flag to the preadv2() system call or the
IORING_SETUP_IOPOLL flag to the io_uring interface.

Although the block layer supports to partition the tag space for
interrupt-based completions (HCTX_TYPE_DEFAULT) purposes and polling
(HCTX_TYPE_POLL), the choice has been made to use the same hardware queue
for both hctx types because partitioning the tag space would negatively
affect performance.

On my test setup this patch increases IOPS from 2736 to 22000 (8x) for the
following test:

for hipri in 0 1; do
    fio --ioengine=io_uring --iodepth=1 --rw=randread \
    --runtime=60 --time_based=1 --direct=1 --name=qd1 \
    --filename=/dev/block/sda --ioscheduler=none --gtod_reduce=1 \
    --norandommap --hipri=$hipri
done

Link: https://lore.kernel.org/r/20211203231950.193369-18-bvanassche@acm.orgTested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

eaab9b57

scsi: ufs: Optimize the command queueing code · 8d077ede

Bart Van Assche authored Dec 03, 2021

Remove the clock scaling lock from ufshcd_queuecommand() since it is a
performance bottleneck. Instead check the SCSI device budget bitmaps in the
code that waits for ongoing ufshcd_queuecommand() calls. A bit is set in
sdev->budget_map just before scsi_queue_rq() is called and a bit is cleared
from that bitmap if scsi_queue_rq() does not submit the request or after
the request has finished. See also the blk_mq_{get,put}_dispatch_budget()
calls in the block layer.

There is no risk for a livelock since the block layer delays queue reruns
if queueing a request fails because the SCSI host has been blocked.

Link: https://lore.kernel.org/r/20211203231950.193369-17-bvanassche@acm.org
Cc: Asutosh Das (asd) <asutoshd@codeaurora.org>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

8d077ede

scsi: ufs: Stop using the clock scaling lock in the error handler · 5675c381

Bart Van Assche authored Dec 03, 2021

Instead of locking and unlocking the clock scaling lock, surround the
command queueing code with an RCU reader lock and call synchronize_rcu().
This patch prepares for removal of the clock scaling lock.

Link: https://lore.kernel.org/r/20211203231950.193369-16-bvanassche@acm.orgTested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

5675c381

scsi: ufs: Fix a kernel crash during shutdown · 3489c34b

Bart Van Assche authored Dec 03, 2021

Fix the following kernel crash:

Unable to handle kernel paging request at virtual address ffffffc91e735000
Call trace:
 __queue_work+0x26c/0x624
 queue_work_on+0x6c/0xf0
 ufshcd_hold+0x12c/0x210
 __ufshcd_wl_suspend+0xc0/0x400
 ufshcd_wl_shutdown+0xb8/0xcc
 device_shutdown+0x184/0x224
 kernel_restart+0x4c/0x124
 __arm64_sys_reboot+0x194/0x264
 el0_svc_common+0xc8/0x1d4
 do_el0_svc+0x30/0x8c
 el0_svc+0x20/0x30
 el0_sync_handler+0x84/0xe4
 el0_sync+0x1bc/0x1c0

Fix this crash by ungating the clock before destroying the work queue on
which clock gating work is queued.

Link: https://lore.kernel.org/r/20211203231950.193369-15-bvanassche@acm.orgTested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3489c34b

scsi: ufs: Improve SCSI abort handling further · 1fbaa02d

Bart Van Assche authored Dec 03, 2021

Release resources when aborting a command. Make sure that aborted commands
are completed once by clearing the corresponding tag bit from
hba->outstanding_reqs. This patch is an improved version of commit
3ff1f6b6 ("scsi: ufs: core: Improve SCSI abort handling").

Link: https://lore.kernel.org/r/20211203231950.193369-14-bvanassche@acm.org
Fixes: 7a3e97b0 ("[SCSI] ufshcd: UFS Host controller driver")
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

1fbaa02d

scsi: ufs: Introduce ufshcd_release_scsi_cmd() · 6f8dafde

Bart Van Assche authored Dec 03, 2021

The only functional change in this patch is that scsi_done() is now called
after ufshcd_release() and ufshcd_clk_scaling_update_busy() instead of
before.

The next patch in this series will introduce a call to
ufshcd_release_scsi_cmd() in the abort handler.

Link: https://lore.kernel.org/r/20211203231950.193369-13-bvanassche@acm.orgTested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6f8dafde

scsi: ufs: Remove the 'update_scaling' local variable · 3eb9dcc0

Bart Van Assche authored Dec 03, 2021

This patch does not change any functionality but makes the next patch in
this series easier to read.

Link: https://lore.kernel.org/r/20211203231950.193369-12-bvanassche@acm.orgTested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3eb9dcc0

scsi: ufs: Remove hba->cmd_queue · 511a083b

Bart Van Assche authored Dec 03, 2021

The previous patch removed all code that uses hba->cmd_queue. Hence also
remove hba->cmd_queue itself.

Link: https://lore.kernel.org/r/20211203231950.193369-11-bvanassche@acm.orgSuggested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

511a083b