Commits · 596fc8adb171dce3751a359018e2ade612af8d97 · Kirill Smelkov / linux

11 May, 2022 17 commits

scsi: lpfc: Fix dmabuf ptr assignment in lpfc_ct_reject_event() · 596fc8ad

James Smart authored May 05, 2022

Upon driver receipt of a CT cmd for type = 0xFA (Management Server) and
subtype = 0x11 (Fabric Device Management Interface), the driver is
responding with garbage CT cmd data when it should send a properly formed
RJT.

The __lpfc_prep_xmit_seq64_s4() routine was using the wrong buffer for the
reject.

Fix by converting the routine to use the buffer specified in the bde within
the wqe rather than the ill-set bmp element.

Link: https://lore.kernel.org/r/20220506035519.50908-6-jsmart2021@gmail.com
Fixes: 61910d6a ("scsi: lpfc: SLI path split: Refactor CT paths")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

596fc8ad

scsi: lpfc: Inhibit aborts if external loopback plug is inserted · ead76d4c

James Smart authored May 05, 2022

After running a short external loopback test, when the external loopback is
removed and a normal cable inserted that is directly connected to a target
device, the system oops in the llpfc_set_rrq_active() routine.

When the loopback was inserted an FLOGI was transmit. As we're looped back,
we receive the FLOGI request. The FLOGI is ABTS'd as we recognize the same
wppn thus understand it's a loopback. However, as the ABTS sends address
information the port is not set to (fffffe), the ABTS is dropped on the
wire. A short 1 frame loopback test is run and completes before the ABTS
times out. The looback is unplugged and the new cable plugged in, and the
an FLOGI to the new device occurs and completes. Due to a mixup in ref
counting the completion of the new FLOGI releases the fabric ndlp. Then the
original ABTS completes and references the released ndlp generating the
oops.

Correct by no-op'ing the ABTS when in loopback mode (it will be dropped
anyway). Added a flag to track the mode to recognize when it should be
no-op'd.

Link: https://lore.kernel.org/r/20220506035519.50908-5-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ead76d4c

scsi: lpfc: Fix ndlp put following a LOGO completion · b7e952cb

James Smart authored May 05, 2022

During testing with repeated asynchronous resets of the target, an issue
was found when the driver issues a LOGO to disconnect its login and recover
all exchanges. The LOGO command takes a node reference but neglects to
remove it, keeping the node reference count artifically high.

Add a call to lpfc_nlp_put() to lpfc_nlp_logo_unreg() and move the mempool
free call to the routine exit along with the needed put. This is always
safe as this will not be the last reference removed as lpfc_unreg_rpi()
ensures there is an additional reference on the ndlp.

Link: https://lore.kernel.org/r/20220506035519.50908-4-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b7e952cb

scsi: lpfc: Fill in missing ndlp kref puts in error paths · ba3d58a1

James Smart authored May 05, 2022

Code review, following every lpfc_nlp_get() call vs calls during error
handling, discovered cases of missing put calls.

Correct by adding ndlp kref puts in the respective error paths.

Also added comments to several of the error paths to record relationships
to reference counts.

Link: https://lore.kernel.org/r/20220506035519.50908-3-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ba3d58a1

scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4() · 84c6f99e

James Smart authored May 05, 2022

The prior commit that moved from iocb elements to explicit wqe elements
missed a name change.

Correct __lpfc_sli_release_iocbq_s4() to reference wqe rather than iocb.

Link: https://lore.kernel.org/r/20220506035519.50908-2-jsmart2021@gmail.com
Fixes: a680a929 ("scsi: lpfc: SLI path split: Refactor lpfc_iocbq")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

84c6f99e

scsi: ufs: ufshpb: Clean up ufshpb_suspend()/resume() · 18ebe239

Bean Huo authored May 05, 2022

ufshpb_resume() is only called when the HPB state is HPB_SUSPEND, so the
check statement for "ufshpb_get_state(hpb) != HPB_PRESENT" is useless.

Link: https://lore.kernel.org/r/20220505134707.35929-7-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

18ebe239

scsi: ufs: ufshpb: Add handing of device reset regions in HPB device mode · 32d6eab3

Bean Huo authored May 05, 2022

In UFS HPB Spec JESD220-3A,

"5.8. Active and inactive information upon power cycle
...
When the device is powered off by the host, the device may restore L2P map
data upon power up or build from the host's HPB READ command. In case
device powered up and lost HPB information, device can signal to the host
through HPB Sense data, by setting HPB Operation as '2' which will inform
the host that device reset HPB information."

Therefore, for HPB device control mode, if the UFS device is reset via the
RST_N pin, the active region information in the device will be reset. If
the host side receives this notification from the device side, it is
recommended to inactivate all active regions in the host's HPB cache.

Link: https://lore.kernel.org/r/20220505134707.35929-6-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

32d6eab3

scsi: ufs: ufshpb: Change sysfs node hpb_stats/rb_* prefix to start with rcmd_* · d4300c55

Bean Huo authored May 05, 2022

According to the documentation of the sysfs nodes rb_noti_cnt,
rb_active_cnt and rb_inactive_cnt, these are all related to HPB
recommendation in UPIU response packet. 'rcmd' (recommendation) should be
the correct abbreviation.

Change the sysfs documentation about these sysfs nodes to highlight what
they mean under different HPB control modes.

Link: https://lore.kernel.org/r/20220505134707.35929-5-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d4300c55

scsi: ufs: ufshpb: Clean up the handler when device resets HPB information · a3f3c26d

Bean Huo authored May 05, 2022

"When the device is powered off by the host, the device may restore L2P map
data upon power up or build from the host's HPB READ command. In case
device powered up and lost HPB information, device can signal to the host
through HPB Sense data, by setting HPB Operation as '2' which will inform
the host that device reset HPB information."

Clean up the handler and make the intent of this handler more readable, no
functional change.

Link: https://lore.kernel.org/r/20220505134707.35929-4-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a3f3c26d

scsi: ufs: ufshpb: Remove enum initialization value · 6f341ed5

Bean Huo authored May 05, 2022

If the first enumerator has no initializer, the value of the corresponding
constant is zero.

Link: https://lore.kernel.org/r/20220505134707.35929-3-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6f341ed5

scsi: ufs: ufshpb: Merge ufshpb_reset() and ufshpb_reset_host() · facc239c

Bean Huo authored May 05, 2022

There is no functional change in this patch, just merge ufshpb_reset() and
ufshpb_reset_host() into one function ufshpb_toggle_state().

Link: https://lore.kernel.org/r/20220505134707.35929-2-huobean@gmail.comReviewed-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

facc239c

scsi: ufs: qcom: Enable RPM_AUTOSUSPEND for runtime PM · 6f21d927

Manivannan Sadhasivam authored May 04, 2022

In order to allow the block devices to enter autosuspend mode during
runtime, thereby allowing the ufshcd host driver to also runtime suspend,
let's make use of the RPM_AUTOSUSPEND flag.

Without this flag, userspace needs to enable the autosuspend feature of
the block devices through sysfs.

Link: https://lore.kernel.org/r/20220504084212.11605-6-manivannan.sadhasivam@linaro.orgReviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6f21d927

scsi: ufs: core: Remove redundant wmb() in ufshcd_send_command() · 23803bac

Manivannan Sadhasivam authored May 04, 2022

The wmb() inside ufshcd_send_command() is added to make sure that the
doorbell is committed immediately. This leads to couple of expectations:

1. The doorbell write should complete before the function return.

2. The doorbell write should not cross the function boundary.

2nd expectation is fullfilled by the Linux memory model as there is a
guarantee that the critical section won't cross the unlock (release)
operation.

1st expectation is not really needed here as there is no following read/
write that depends on the doorbell to be complete implicitly. Even if the
doorbell write is in a CPUs Write Buffer (WB), wmb() won't flush it. And
there is no real need of a WB flush here as well.

So let's get rid of the wmb() that seems redundant.

Link: https://lore.kernel.org/r/20220504084212.11605-5-manivannan.sadhasivam@linaro.orgReviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

23803bac

scsi: ufs: qcom: Add a readl() to make sure ref_clk gets enabled · 8eecddfc

Manivannan Sadhasivam authored May 04, 2022

In ufs_qcom_dev_ref_clk_ctrl(), it was noted that the ref_clk needs to be
stable for at least 1us. Even though there is wmb() to make sure the write
gets "completed", there is no guarantee that the write actually reached the
UFS device. There is a good chance that the write could be stored in a
Write Buffer (WB). In that case, even though the CPU waits for 1us, the
ref_clk might not be stable for that period.

So lets do a readl() to make sure that the previous write has reached the
UFS device before udelay().

Also, the wmb() after writel_relaxed() is not really needed. Both writel()
and readl() are ordered on all architectures and the CPU won't speculate
instructions after readl() due to the in-built control dependency with read
value on weakly ordered architectures. So it can be safely removed.

Link: https://lore.kernel.org/r/20220504084212.11605-4-manivannan.sadhasivam@linaro.org
Fixes: f06fcc71 ("scsi: ufs-qcom: add QUniPro hardware support and power optimizations")
Cc: stable@vger.kernel.org
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

8eecddfc

scsi: ufs: qcom: Simplify handling of devm_phy_get() · c9ed9a6c

Manivannan Sadhasivam authored May 04, 2022

There is no need to call devm_phy_get() if ACPI is used, so skip it. The
host->generic_phy pointer should already be NULL due to the kzalloc(), so
no need to set it NULL again.

While at it, also remove the comment that has no relationship with
devm_phy_get().

Link: https://lore.kernel.org/r/20220504084212.11605-3-manivannan.sadhasivam@linaro.orgReviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c9ed9a6c

scsi: ufs: qcom: Fix acquiring the optional reset control line · 223b17ed

Manivannan Sadhasivam authored May 04, 2022

On Qcom UFS platforms, the reset control line seems to be optional (for
SoCs like MSM8996 and probably for others too). The current logic tries to
mimic the devm_reset_control_get_optional() API but it also continues the
probe if there is an error with the declared reset line in DT/ACPI.

In an ideal case, if the reset line is not declared in DT/ACPI, the probe
should continue. But if there is problem in acquiring the declared reset
line (like EPROBE_DEFER) it should fail and return the appropriate error
code.

Link: https://lore.kernel.org/r/20220504084212.11605-2-manivannan.sadhasivam@linaro.orgReviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

223b17ed

scsi: hisi_sas: Undo RPM resume for failed notify phy event for v3 HW · 9b5387fe

Xiang Chen authored May 06, 2022

If we fail to notify the phy up event then undo the RPM resume, as the phy
up notify event handling pairs with that RPM resume.

Link: https://lore.kernel.org/r/1651839939-101188-1-git-send-email-john.garry@huawei.comReported-by: Yihang Li <liyihang6@hisilicon.com>
Tested-by: Yihang Li <liyihang6@hisilicon.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9b5387fe

02 May, 2022 21 commits

scsi: mpi3mr: Update driver version to 8.0.0.69.0 · f304d35e

Sumit Saxena authored Apr 29, 2022

Link: https://lore.kernel.org/r/20220429211641.642010-9-sumit.saxena@broadcom.comSigned-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

f304d35e

scsi: mpi3mr: Add support for NVMe passthrough · 7dbd0dd8

Sumit Saxena authored Apr 29, 2022

Add support for management applications to send an MPI3 Encapsulated NVMe
passthru command to the NVMe devices attached to an Avenger controller.
Since the NVMe drives are exposed as SCSI devices by the controller, the
standard NVMe applications cannot be used to interact with the drives and
the command sets supported are also limited by the controller firmware.
Special handling is required for MPI3 Encapsulated NVMe passthru commands
for PRP/SGL setup in the commands.

Link: https://lore.kernel.org/r/20220429211641.642010-8-sumit.saxena@broadcom.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7dbd0dd8

scsi: mpi3mr: Expose adapter state to sysfs · 986d6bad

Sumit Saxena authored Apr 29, 2022

Link: https://lore.kernel.org/r/20220429211641.642010-7-sumit.saxena@broadcom.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

986d6bad

scsi: mpi3mr: Add support for PEL commands · 43ca1100

Sumit Saxena authored Apr 29, 2022

Implement driver support for management applications to enable persistent
event log (PEL) notifications. Upon receipt of events, the driver will
increment a sysfs variable named event_counter. The management application
will poll for event_counter value changes and signal the application about
events.

Link: https://lore.kernel.org/r/20220429211641.642010-6-sumit.saxena@broadcom.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

43ca1100

scsi: mpi3mr: Add support for MPT commands · 506bc1a0

Sumit Saxena authored Apr 29, 2022

There are certain management commands which require firmware intervention.
These commands are termed MPT commands. Add support for them.

Link: https://lore.kernel.org/r/20220429211641.642010-5-sumit.saxena@broadcom.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

506bc1a0

scsi: mpi3mr: Move data structures/definitions from MPI headers to uapi header · f3de4706

Sumit Saxena authored Apr 29, 2022

This patch moves the data structures/definitions which are used by
userspace applications from MPI headers to uapi/scsi/scsi_bsg_mpi3mr.h

Link: https://lore.kernel.org/r/20220429211641.642010-4-sumit.saxena@broadcom.com
Reported by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

f3de4706

scsi: mpi3mr: Add support for driver commands · f5e6d5a3

Sumit Saxena authored Apr 29, 2022

There are certain bsg commands which need to be completed by the driver
without involving firmware. These requests are termed driver commands. Add
support for these.

Link: https://lore.kernel.org/r/20220429211641.642010-3-sumit.saxena@broadcom.com
Reported by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

f5e6d5a3

scsi: mpi3mr: Add bsg device support · 4268fa75

Sumit Saxena authored Apr 29, 2022

Create bsg device per controller for controller management purposes.

bsg device nodes will be named /dev/bsg/mpi3mrctl0, /dev/bsg/mpi3mrctl1,
etc.

Link: https://lore.kernel.org/r/20220429211641.642010-2-sumit.saxena@broadcom.comReported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4268fa75

scsi: sr: Add memory allocation failure handling for get_capabilities() · ebc95c79

Enze Li authored Apr 27, 2022

The function get_capabilities() has the possibility of failing to allocate
the transfer buffer but it does not currently handle this. This may lead to
exceptions when accessing the buffer.

Add error handling when memory allocation fails.

Link: https://lore.kernel.org/r/20220427025647.298358-1-lienze@kylinos.cnSigned-off-by: Enze Li <lienze@kylinos.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ebc95c79

scsi: target: tcmu: Fix possible data corruption · bb9b9eb0

Xiaoguang Wang authored Apr 21, 2022

When tcmu_vma_fault() gets a page successfully, before the current context
completes page fault procedure, find_free_blocks() may run and call
unmap_mapping_range() to unmap the page. Assume that when
find_free_blocks() initially completes and the previous page fault
procedure starts to run again and completes, then one truncated page has
been mapped to userspace. But note that tcmu_vma_fault() has gotten a
refcount for the page so any other subsystem won't be able to use the page
unless the userspace address is unmapped later.

If another command subsequently runs and needs to extend dbi_thresh it may
reuse the corresponding slot for the previous page in data_bitmap. Then
though we'll allocate new page for this slot in data_area, no page fault
will happen because we have a valid map and the real request's data will be
lost.

Filesystem implementations will also run into this issue but they usually
lock the page when vm_operations_struct->fault gets a page and unlock the
page after finish_fault() completes. For truncate filesystems lock pages in
truncate_inode_pages() to protect against racing wrt. page faults.

To fix this possible data corruption scenario we can apply a method similar
to the filesystems. For pages that are to be freed, tcmu_blocks_release()
locks and unlocks. Make tcmu_vma_fault() also lock found page under
cmdr_lock. At the same time, since tcmu_vma_fault() gets an extra page
refcount, tcmu_blocks_release() won't free pages if pages are in page fault
procedure, which means it is safe to call tcmu_blocks_release() before
unmap_mapping_range().

With these changes tcmu_blocks_release() will wait for all page faults to
be completed before calling unmap_mapping_range(). And later, if
unmap_mapping_range() is called, it will ensure stale mappings are removed.

Link: https://lore.kernel.org/r/20220421023735.9018-1-xiaoguang.wang@linux.alibaba.comReviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

bb9b9eb0

scsi: lpfc: Remove redundant lpfc_sli_prep_wqe() call · c2024e3b

James Smart authored Apr 27, 2022

Prior patch added a call to lpfc_sli_prep_wqe() prior to
lpfc_sli_issue_iocb().  This call should not have been added as prep_wqe is
called within the issue_iocb routine. So it's called twice now.

Remove the redundant prep call.

Link: https://lore.kernel.org/r/20220427222223.57920-1-jsmart2021@gmail.com
Fixes: 31a59f75 ("scsi: lpfc: SLI path split: Refactor Abort paths")
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c2024e3b

scsi: lpfc: Fix additional reference counting in lpfc_bsg_rport_els() · 92bd903d

James Smart authored Apr 27, 2022

Code inspection has found an additional reference is taken in
lpfc_bsg_rport_els(). Results in the ndlp not being freed thus is leaked.

Fix by removing the redundant refcount taken before WQE submission.

Link: https://lore.kernel.org/r/20220427222158.57867-1-jsmart2021@gmail.comCo-developed-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
Signed-off-by: Nigel Kirkland <nigel.kirkland@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

92bd903d

scsi: sd: Reorganize DIF/DIX code to avoid calling revalidate twice · 1e029397

Martin K. Petersen authored Mar 02, 2022

During device discovery we ended up calling revalidate twice and thus
requested the same parameters multiple times. This was originally necessary
due to the request_queue and gendisk needing to be instantiated to
configure the block integrity profile.

Since this dependency no longer exists, reorganize the integrity probing
code so it can be run once at the end of discovery and drop the superfluous
revalidate call. Postponing the registration step involves splitting
sd_read_protection() into two functions, one to read the device protection
type and one to configure the mode of operation.

As part of this cleanup, make the printing code a bit less verbose.

Link: https://lore.kernel.org/r/20220302053559.32147-14-martin.petersen@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

1e029397

scsi: sd: Optimal I/O size should be a multiple of reported granularity · 631669a2

Martin K. Petersen authored Mar 02, 2022

Commit a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple of
physical block size") validated the reported optimal I/O size against the
physical block size to overcome problems with devices reporting nonsensical
transfer sizes.

However, some devices claim conformity to older SCSI versions that predate
the physical block size being reported. Other devices do not report a
physical block size at all. We need to be able to validate the optimal I/O
size on those devices as well.

Many devices report an OPTIMAL TRANSFER LENGTH GRANULARITY in the same VPD
page as the OPTIMAL TRANSFER LENGTH. Use this value to validate the optimal
I/O size. Also check that the reported granularity is a multiple of the
physical block size, if supported.

Link: https://lore.kernel.org/r/33fb522e-4f61-1b76-914f-c9e6a3553c9b@gmail.com
Link: https://lore.kernel.org/r/20220302053559.32147-9-martin.petersen@oracle.comReported-by: Bernhard Sulzer <micraft.b@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

631669a2

scsi: sd: Switch to using scsi_device VPD pages · 7fb019c4

Martin K. Petersen authored Mar 02, 2022

Use the VPD pages already provided by the SCSI midlayer. No need to request
them individually in the SCSI disk driver.

Link: https://lore.kernel.org/r/20220302053559.32147-8-martin.petersen@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7fb019c4

scsi: sd: Use cached ATA Information VPD page · e38d9e83

Martin K. Petersen authored Mar 02, 2022

Since the ATA Information VPD is now cached at device discovery time it is
no longer necessary to request this page when we configure WRITE SAME.
Instead use the cached information to determine if this disk sits behind a
SCSI-ATA translation layer.

Link: https://lore.kernel.org/r/20220302053559.32147-7-martin.petersen@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e38d9e83

scsi: core: Do not truncate INQUIRY data on modern devices · d657700c

Martin K. Petersen authored Mar 02, 2022

Low-level device drivers have had the ability to limit the size of an
INQUIRY for many years. This made sense for a wide variety of legacy
devices. However, we are unnecessarily truncating the INQUIRY response for
many modern devices. This prevents us from consulting fields beyond the
first 36 bytes.

If a device reports that it supports a larger INQUIRY response, and the
device also reports that it implements SPC-4 or newer, allow the larger
INQUIRY to proceed.

Link: https://lore.kernel.org/r/20220302053559.32147-4-martin.petersen@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d657700c

scsi: core: Cache VPD pages b0, b1, b2 · e60ac0b9

Martin K. Petersen authored Mar 02, 2022

The SCSI disk driver consults VPD pages b0 (Block Limits), b1 (Block Device
Characteristics), and b2 (Logical Block Provisioning). Instead of having
sd.c request these pages every revalidate cycle, cache them along with the
other commonly used VPDs.

Link: https://lore.kernel.org/r/20220302053559.32147-6-martin.petersen@oracle.comReviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e60ac0b9

scsi: core: Pick suitable allocation length in scsi_report_opcode() · e17d6340

Martin K. Petersen authored Mar 02, 2022

Some devices hang when a buffer size larger than expected is passed in the
ALLOCATION LENGTH field. For REPORT SUPPORTED OPERATION CODES we currently
only request a single command descriptor at a time and therefore the actual
size of the command is known ahead of time. Limit the ALLOCATION LENGTH to
the header size plus the command length of the opcode we are asking about.

Link: https://lore.kernel.org/r/20220302053559.32147-5-martin.petersen@oracle.comReviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e17d6340

scsi: core: Query VPD size before getting full page · c92a6b5d

Martin K. Petersen authored Mar 02, 2022

We currently default to 255 bytes when fetching VPD pages during discovery.
However, we have had a few devices that are known to wedge if the requested
buffer exceeds a certain size. See commit af73623f ("[SCSI] sd: Reduce
buffer size for vpd request") which works around one example of this
problem in the SCSI disk driver.

With commit d188b067 ("scsi: core: Add sysfs attributes for VPD pages
0h and 89h") we now risk triggering the same issue in the generic midlayer
code.

The problem with the ATA VPD page in particular is that the SCSI portion of
the page is trailed by 512 bytes of verbatim ATA Identify Device
information. However, not all controllers actually provide the additional
512 bytes and will lock up if one asks for more than the 64 bytes
containing the SCSI protocol fields.

Instead of picking a new, somewhat arbitrary, number of bytes for the VPD
buffer size, start fetching the 4-byte header for each page. The header
contains the size of the page as far as the device is concerned. We can use
the reported size to specify the correct allocation length when
subsequently fetching the full page.

The header validation is done by a new helper function scsi_get_vpd_size()
and both scsi_get_vpd_page() and scsi_get_vpd_buf() now rely on this to
query the page size.

In addition, scsi_get_vpd_page() is simplified to mirror the logic in
scsi_get_vpd_page(). This involves removing the Supported VPD Pages lookup
prior to attempting to query a page. There does not appear any evidence,
even in the oldest SCSI specs, that this step is required. We already rely
on scsi_get_vpd_page() throughout the stack and this function never
consulted the Supported VPD Pages. Since this has not caused any problems
it should be safe to remove the precondition from scsi_get_vpd_page().

Instrumented runs also revealed that the Supported VPD Pages lookup had
little effect since the device page index often was larger than the
supplied buffer size. As a result, inquiries frequently bypassed the index
check and went through the "If we ran off the end of the buffer, give us
the benefit of the doubt" code path which assumed the page was present
despite not being listed. The revised code takes both the page size
reported by the device as well as the size of the buffer provided by the
scsi_get_vpd_page() caller into account.

Link: https://lore.kernel.org/r/20220302053559.32147-3-martin.petersen@oracle.com
Fixes: d188b067 ("scsi: core: Add sysfs attributes for VPD pages 0h and 89h")
Reported-by: Maciej W. Rozycki <macro@orcam.me.uk>
Tested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c92a6b5d

scsi: mpt3sas: Use cached ATA Information VPD page · dc117876

Martin K. Petersen authored Mar 02, 2022

We now cache VPD page 0x89 (ATA Information) so there is no need to request
it from the hardware. Make mpt3sas use the cached page.

Link: https://lore.kernel.org/r/20220302053559.32147-2-martin.petersen@oracle.com
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

dc117876

27 Apr, 2022 2 commits

scsi: lpfc: Fix resource leak in lpfc_sli4_send_seq_to_ulp() · 646db1a5

James Smart authored Apr 26, 2022

If no handler is found in lpfc_complete_unsol_iocb() to match the rctl of a
received frame, the frame is dropped and resources are leaked.

Fix by returning resources when discarding an unhandled frame type. Update
lpfc_fc_frame_check() handling of NOP basic link service.

Link: https://lore.kernel.org/r/20220426181419.9154-1-jsmart2021@gmail.comCo-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

646db1a5

scsi: lpfc: Remove unnecessary null ndlp check in lpfc_sli_prep_wqe() · 3d1d34ec

James Smart authored Apr 26, 2022

Smatch had the following warning:

drivers/scsi/lpfc/lpfc_sli.c:22305 lpfc_sli_prep_wqe() error: we previously assumed 'ndlp' could be null (see line 22298)

Remove the unnecessary null check.

Link: https://lore.kernel.org/r/20220426181315.8990-1-jsmart2021@gmail.com
Fixes: d51cf5bd ("scsi: lpfc: Fix field overload in lpfc_iocbq data structure")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3d1d34ec