Commits · 626bac73371eed79e2afa2966de393da96cf925e · nexedi / linux

27 Mar, 2020 5 commits

scsi: target: iscsi: calling iscsit_stop_session() inside iscsit_close_session() has no effect · 626bac73

Maurizio Lombardi authored Mar 13, 2020

iscsit_close_session() can only be called when nconn is zero (otherwise a
kernel panic is triggered). If nconn is zero then iscsit_stop_session()
does nothing and exits, so calling it makes no sense.

We still need to call iscsit_check_session_usage_count() because this
function will sleep if the session's refcount is not zero and we don't want
to destroy the session structure if it's still being referenced.

Link: https://lore.kernel.org/r/20200313170656.9716-4-mlombard@redhat.comTested-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

626bac73

scsi: target: fix hang when multiple threads try to destroy the same iscsi session · 57c46e9f

Maurizio Lombardi authored Mar 13, 2020

A number of hangs have been reported against the target driver; they are
due to the fact that multiple threads may try to destroy the iscsi session
at the same time. This may be reproduced for example when a "targetcli
iscsi/iqn.../tpg1 disable" command is executed while a logout operation is
underway.

When this happens, two or more threads may end up sleeping and waiting for
iscsit_close_connection() to execute "complete(session_wait_comp)". Only
one of the threads will wake up and proceed to destroy the session
structure, the remaining threads will hang forever.

Note that if the blocked threads are somehow forced to wake up with
complete_all(), they will try to free the same iscsi session structure
destroyed by the first thread, causing double frees, memory corruptions
etc...

With this patch, the threads that want to destroy the iscsi session will
increase the session refcount and will set the "session_close" flag to 1;
then they wait for the driver to close the remaining active connections.
When the last connection is closed, iscsit_close_connection() will wake up
all the threads and will wait for the session's refcount to reach zero;
when this happens, iscsit_close_connection() will destroy the session
structure because no one is referencing it anymore.

INFO: task targetcli:5971 blocked for more than 120 seconds.
Tainted: P OE 4.15.0-72-generic #81~16.04.1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
targetcli D 0 5971 1 0x00000080
Call Trace:
__schedule+0x3d6/0x8b0
? vprintk_func+0x44/0xe0
schedule+0x36/0x80
schedule_timeout+0x1db/0x370
? __dynamic_pr_debug+0x8a/0xb0
wait_for_completion+0xb4/0x140
? wake_up_q+0x70/0x70
iscsit_free_session+0x13d/0x1a0 [iscsi_target_mod]
iscsit_release_sessions_for_tpg+0x16b/0x1e0 [iscsi_target_mod]
iscsit_tpg_disable_portal_group+0xca/0x1c0 [iscsi_target_mod]
lio_target_tpg_enable_store+0x66/0xe0 [iscsi_target_mod]
configfs_write_file+0xb9/0x120
__vfs_write+0x1b/0x40
vfs_write+0xb8/0x1b0
SyS_write+0x5c/0xe0
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Link: https://lore.kernel.org/r/20200313170656.9716-3-mlombard@redhat.comReported-by: Matt Coleman <mcoleman@datto.com>
Tested-by: Matt Coleman <mcoleman@datto.com>
Tested-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

57c46e9f

scsi: target: remove boilerplate code · e49a7d99

Maurizio Lombardi authored Mar 13, 2020

iscsit_free_session() is equivalent to iscsit_stop_session() followed by a
call to iscsit_close_session().

Link: https://lore.kernel.org/r/20200313170656.9716-2-mlombard@redhat.comTested-by: Rahul Kundu <rahul.kundu@chelsio.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e49a7d99

scsi: aha1740: Fix an errro handling path in aha1740_probe() · 0f3d6791

Christophe JAILLET authored Feb 28, 2020

If 'dma_map_single()' fails, the ref counted 'shpnt' will be decremented
twice because 'scsi_host_put()' is called in the if block, and in the error
handling path.

Axe one of these calls.

Link: https://lore.kernel.org/r/20200228215948.7473-1-christophe.jaillet@wanadoo.fr
Fixes: 1dc09e12 ("scsi: aha1740: stop using scsi_unregister")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

0f3d6791

scsi: qla2xxx: Remove non functional code · 1b72e86d

Daniel Wagner authored Feb 06, 2020

Remove code which has no functional use anymore since commit 3c75ad1d
("scsi: qla2xxx: Remove defer flag to indicate immeadiate port loss").

While at it remove also the stale function documentation.

Link: https://lore.kernel.org/r/20200206135443.110701-1-dwagner@suse.deReviewed-by: Arun Easi <aeasi@marvell.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

1b72e86d

17 Mar, 2020 29 commits

scsi: pm80xx: Introduce read and write length for IOCTL payload structure · 9b889846

Viswas G authored Mar 16, 2020

Removed the common length and introduce read and write length for IOCTL
payload structure.

[mkp: fixed SoB ordering]

Link: https://lore.kernel.org/r/20200316074906.9119-7-deepak.ukey@microchip.comAcked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Viswas G <viswas.g@microchip.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9b889846

scsi: pm80xx: sysfs attribute for non fatal dump · dba2cc03

Deepak Ukey authored Mar 16, 2020

Added the sysfs attribute for non fatal log so that management utility can
get the non fatal dump from driver. The non-fatal error is an error
condition or abnormal behavior detected by the host, or detected and
reported by the controller to the host.The non-fatal error does not stop
the controller firmware and enables it to still respond to host requests.
A typical example of a non-fatal error is an I/O timeout or an unusual
error notification from the controller. Since the firmware is operational,
the error dump information is pushed to host memory (by firmware) upon
request from the host.

Link: https://lore.kernel.org/r/20200316074906.9119-6-deepak.ukey@microchip.comAcked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

dba2cc03

scsi: pm80xx: Cleanup initialization loading fail path · b40f2882

Peter Chang authored Mar 16, 2020

1) Move the instance tracking down after we think the instance is good to
   go. Avoids having a use-after free.

2) There are goto targets for trying to cleanup if the hw fails to
   initialize, but there's some overlap depending on who thinks they own
   the sub-structures.

Link: https://lore.kernel.org/r/20200316074906.9119-5-deepak.ukey@microchip.comAcked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Peter Chang <dpf@google.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b40f2882

scsi: pm80xx: Free the tag when mpi_set_phy_profile_resp is received · 9d9c7c20

yuuzheng authored Mar 16, 2020

In pm80xx driver, the command mpi_set_phy_profile_req is sent by host
during boot to configure the phy profile such as analog setting page, rate
control page. However, the tag is not freed when its response is
received. As a result, 16 tags are missing for each HBA after boot.  When
NCQ is enabled with queue depth 16, it needs at least, 15 * 16 = 240 tags
for each HBA to achieve the best performance. In current pm80xx driver with
setting CCB_MAX = 256, the total number of tags in each HBA is 255 for data
IO. Hence, without returning those tags to the pool after boot, some device
will finally be forced to non-ncq mode by ATA layer due to excessive errors
(i.e. LLDD cannot allocate tag for queued task).

Link: https://lore.kernel.org/r/20200316074906.9119-4-deepak.ukey@microchip.comAcked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: yuuzheng <yuuzheng@google.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

9d9c7c20

scsi: pm80xx: Deal with kexec reboots · d384be6e

Vikram Auradkar authored Mar 16, 2020

A kexec reboot causes the controller fw to assert. This assertion shows up
in two ways, the controller doesn't show up as ready and an interrupt is
waiting as soon as the handler is registered. To resolve this added below
fix:

 - Split the interrupt handling setup into two parts, setup and request.

 - If the controller ready register indicates not-ready, but that the not
   readiness is only on the IOC units we can still try a reset to bring the
   system back to the pre-reboot state.

Link: https://lore.kernel.org/r/20200316074906.9119-3-deepak.ukey@microchip.comAcked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Vikram Auradkar <auradkar@google.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d384be6e

scsi: pm80xx: Increase request sg length · 58bf14c1

Peter Chang authored Mar 16, 2020

Increasing the per-request size maximum (max_sectors_kb) runs into the
per-device DMA scatter gather list limit (max_segments) for users of the io
vector system calls (eg, readv and writev). This is because the kernel
combines io vectors into DMA segments when possible, but it doesn't work
for our user because the vectors in the buffer cache get scrambled. This
change bumps the advertised max scatter gather length to 528 to cover 2M w/
x86's 4k pages and some extra for the user checksum. It trims the size of
some of the tables we don't care about and exposes all of the command slots
upstream to the SCSI layer. Also reduced the PM8001_MAX_CCB to 256 as
pm8001 driver has memory limit depend on machine capability. If we increase
the sg length, we need to trade-off it by decreasing PM8001_MAX_CCB.
PM8001_MAX_CCB = 256 does not have any influence on normal use

Link: https://lore.kernel.org/r/20200316074906.9119-2-deepak.ukey@microchip.comReported-by: kbuild test robot <lkp@intel.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Peter Chang <dpf@google.com>
Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com>
Signed-off-by: Viswas G <Viswas.G@microchip.com>
Signed-off-by: Radha Ramachandran <radha@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

58bf14c1

scsi: smartpqi: Use scnprintf() for avoiding potential buffer overflow · 181aea89

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-9-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Don Brace <don.brace@microsemi.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

181aea89

scsi: core: Use scnprintf() for avoiding potential buffer overflow · 81546b32

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-8-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Bart van Assche <bvanassche@acm.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

81546b32

scsi: megaraid_sas: Use scnprintf() for avoiding potential buffer overflow · ff33d0e2

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Also corrected the wrongly passed limit size.  The remaining buffer size
must be decremented.

Link: https://lore.kernel.org/r/20200315094241.9086-7-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ff33d0e2

scsi: ipr: Use scnprintf() for avoiding potential buffer overflow · 6f0cf424

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-6-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Brian King <brking@us.ibm.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6f0cf424

scsi: gdth: Use scnprintf() for avoiding potential buffer overflow · 473e554d

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

[mkp: checkpatch fix]

Link: https://lore.kernel.org/r/20200315094241.9086-5-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Achim Leubner <achim_leubner@adaptec.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

473e554d

scsi: fnic: Use scnprintf() for avoiding potential buffer overflow · 2605fbd8

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-4-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Satish Kharat <satishkh@cisco.com>
Cc: Sesidhar Baddela <sebaddel@cisco.com>
Cc: Karan Tilak Kumar <kartilak@cisco.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2605fbd8

scsi: be2iscsi: Use scnprintf() for avoiding potential buffer overflow · 7cd1615e

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-3-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Subbu Seetharaman <subbu.seetharaman@broadcom.com>
Cc: Ketan Mukadam <ketan.mukadam@broadcom.com>
Cc: Jitendra Bhivare <jitendra.bhivare@broadcom.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7cd1615e

scsi: aacraid: Use scnprintf() for avoiding potential buffer overflow · 82893ced

Takashi Iwai authored Mar 15, 2020

Since snprintf() returns the would-be-output size instead of the actual
output size, the succeeding calls may go beyond the given buffer limit.
Fix it by replacing with scnprintf().

Link: https://lore.kernel.org/r/20200315094241.9086-2-tiwai@suse.de
Cc: "James E . J . Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K . Petersen" <martin.petersen@oracle.com>
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Cc: linux-scsi@vger.kernel.org
Acked-by: Balsundar P <Balsundar.P@microchip.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

82893ced

scsi: zfcp: log FC Endpoint Security errors · 42cabdaf

Jens Remus authored Mar 12, 2020

Log any FC Endpoint Security errors to the kernel ring buffer with rate-
limiting.

Link: https://lore.kernel.org/r/20200312174505.51294-11-maier@linux.ibm.comReviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

42cabdaf

scsi: zfcp: enhance handling of FC Endpoint Security errors · e53d9285

Jens Remus authored Mar 12, 2020

Enable for explicit FCP channel FC Endpoint Security error reporting and
handle any FSF security errors according to specification. Take the
following recovery actions when a FSF_SECURITY_ERROR is reported for the
specified FSF commands:

- Open Port: Retry the command if possible
- Send FCP : Physically close the remote port and reopen

For Open Port the command status is set to error, which triggers a retry.
For Send FCP the command status is set to error and recovery is triggered
to physically reopen the remote port.

Link: https://lore.kernel.org/r/20200312174505.51294-10-maier@linux.ibm.comReviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e53d9285

scsi: zfcp: trace FC Endpoint Security of FCP devices and connections · 616da39e

Jens Remus authored Mar 12, 2020

Trace changes in Fibre Channel Endpoint Security capabilities of FCP
devices as well as changes in Fibre Channel Endpoint Security state of
their connections to FC remote ports as FC Endpoint Security changes with
trace level 3 in HBA DBF.

A change in FC Endpoint Security capabilities of FCP devices is traced as
response to FSF command FSF_QTCB_EXCHANGE_PORT_DATA with a trace tag of
"fsfcesa" and a WWPN of ZFCP_DBF_INVALID_WWPN = 0x0000000000000000 (see
FC-FS-4 §18 "Name_Identifier Formats", NAA field).

A change in FC Endpoint Security state of connections between FCP devices
and FC remote ports is traced as response to FSF command
FSF_QTCB_OPEN_PORT_WITH_DID with a trace tag of "fsfcesp".

Example trace record of FC Endpoint Security capability change of FCP
device formatted with zfcpdbf from s390-tools:

Timestamp : ...
Area : HBA
Subarea : 00
Level : 3
Exception : -
CPU ID : ...
Caller : 0x...
Record ID : 5 ZFCP_DBF_HBA_FCES
Tag : fsfcesa FSF FC Endpoint Security adapter
Request ID : 0x...
Request status : 0x00000010
FSF cmnd : 0x0000000e FSF_QTCB_EXCHANGE_PORT_DATA
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : n/a
Prot stat : n/a
Prot stat qual : n/a
Port handle : 0x00000000 none (invalid)
LUN handle : n/a
WWPN : 0x0000000000000000 ZFCP_DBF_INVALID_WWPN
FCES old : 0x00000000 old FC Endpoint Security
FCES new : 0x00000007 new FC Endpoint Security

Example trace record of FC Endpoint Security change of connection to
FC remote port formatted with zfcpdbf from s390-tools:

Timestamp : ...
Area : HBA
Subarea : 00
Level : 3
Exception : -
CPU ID : ...
Caller : 0x...
Record ID : 5 ZFCP_DBF_HBA_FCES
Tag : fsfcesp FSF FC Endpoint Security port
Request ID : 0x...
Request status : 0x00000010
FSF cmnd : 0x00000005 FSF_QTCB_OPEN_PORT_WITH_DID
FSF sequence no: 0x...
FSF issued : ...
FSF stat : 0x00000000 FSF_GOOD
FSF stat qual : n/a
Prot stat : n/a
Prot stat qual : n/a
Port handle : 0x...
WWPN : 0x500507630401120c WWPN
FCES old : 0x00000000 old FC Endpoint Security
FCES new : 0x00000004 new FC Endpoint Security

Link: https://lore.kernel.org/r/20200312174505.51294-9-maier@linux.ibm.comReviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

616da39e

scsi: zfcp: log FC Endpoint Security of connections · f0d26ae8

Jens Remus authored Mar 12, 2020

Log the usage of and subsequent changes in FC Endpoint Security of
connections between FCP devices and FC remote ports to the kernel ring
buffer. Activation of FC Endpoint Security is logged as informational.
Change and deactivation are logged as warning.

No logging takes place, if FC Endpoint Security is not used (i.e. never
activated) on a connection or if it does not change during reopen of a port
(e.g. due to adapter or port recovery).

Link: https://lore.kernel.org/r/20200312174505.51294-8-maier@linux.ibm.comReviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

f0d26ae8

scsi: zfcp: report FC Endpoint Security in sysfs · a17c7846

Jens Remus authored Mar 12, 2020

Add an interface to read Fibre Channel Endpoint Security information of FCP
channels and their connections to FC remote ports. It comes in the form of
new sysfs attributes that are attached to the CCW device representing the
FCP device and its zfcp port objects.

The read-only sysfs attribute "fc_security" of a CCW device representing a
FCP device shows the FC Endpoint Security capabilities of the device.
Possible values are: "unknown", "unsupported", "none", or a comma-
separated list of one or more mnemonics and/or one hexadecimal value
representing the supported FC Endpoint Security:

  Authentication: Authentication supported
  Encryption    : Encryption supported

The read-only sysfs attribute "fc_security" of a zfcp port object shows the
FC Endpoint Security used on the connection between its parent FCP device
and the FC remote port. Possible values are: "unknown", "unsupported",
"none", or a mnemonic or hexadecimal value representing the FC Endpoint
Security used:

  Authentication: Connection has been authenticated
  Encryption    : Connection is encrypted

Both sysfs attributes may return hexadecimal values instead of mnemonics,
if the mnemonic lookup table does not contain an entry for the FC Endpoint
Security reported by the FCP device.

Link: https://lore.kernel.org/r/20200312174505.51294-7-maier@linux.ibm.comReviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a17c7846

scsi: zfcp: auto variables for dereferenced structs in open port handler · 185f2d2d

Jens Remus authored Mar 12, 2020

Introduce automatic variables for adapter and QTCB bottom in
zfcp_fsf_open_port_handler(). This facilitates subsequent changes to meet
the 80 character per line limit.

Link: https://lore.kernel.org/r/20200312174505.51294-6-maier@linux.ibm.comReviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

185f2d2d

scsi: zfcp: fix fc_host attributes that should be unknown on local link down · 7e0e4e09

Steffen Maier authored Mar 12, 2020

When we get an unsolicited notification on local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.

Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.

Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down.  Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.

The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value.  Only if something triggered an
exchange port data, some values changed.  Active_fc4s became all zeros as
unknown equivalent during link down.  Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph).  Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.

(cf. v4.9 commit bd77befa ("zfcp: fix fc_host port_type with NPIV"))

This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x01 0x00 ..."
    ...
    fabric_name         = "0x10000027f8e04c49"
    ...
    permanent_port_name = "0xc05076e4588059c1"
    port_id             = "0x244800"
    port_state          = "Linkdown"
    port_type           = "NPort (fabric via point-to-point)"
    ...
    speed               = "16 Gbit"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "37"
    ...
    utilization         = "0 0 0"

The second and next invocations produce (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x00 0x00 ..."
    ...
    fabric_name         = "0x0"
    ...
    permanent_port_name = "0x0"
    port_id             = "0x000000"
    port_state          = "Linkdown"
    port_type           = "Unknown"
    ...
    speed               = "unknown"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "38"
    ...
    utilization         = "0 0 0"

Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down().  All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval().  Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().

Port_name and node_name are always valid, so never reset them.

Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.

Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().

When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.

Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.comReviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7e0e4e09

scsi: zfcp: wire previously driver-specific sysfs attributes also to fc_host · 538c6e91

Steffen Maier authored Mar 12, 2020

Manufacturer, HBA model, firmware version, and hardware version. Use the
same value format as for the driver-specific attributes. Keep the
driver-specific attributes for stable user space sysfs API.

Link: https://lore.kernel.org/r/20200312174505.51294-4-maier@linux.ibm.comReviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

538c6e91

scsi: zfcp: expose fabric name as common fc_host sysfs attribute · e05a10a0

Steffen Maier authored Mar 12, 2020

FICON Express8S or older, as well as card features newer than FICON
Express16S+ have no certain firmware level requirement.

FICON Express16S or FICON Express16S+ have the following
minimum firmware level requirements to show a proper fabric name value:

z13 machine
FICON Express16S , MCL P08424.005 , LIC version 0x00000721
z14 machine
FICON Express16S , MCL P42611.008 , LIC version 0x10200069
FICON Express16S+ , MCL P42625.010 , LIC version 0x10300147

Otherwise, the read value is not the fabric name.

Each FCP channel of these card features might need one SAN fabric re-login
after concurrent microcode update in order to show the proper fabric name.
Possible ways to trigger a SAN fabric re-login are one of: Pull fibres
between FCP channel port and SAN switch port on either side and re-plug,
disable SAN switch port adjacent to FCP channel port and re-enable switch
port, or at Service Element toggle off all CHPIDs of FCP channel over all
LPARs and toggle CHPIDs on again. Zfcp operating subchannels (FCP devices)
on such FCP channel recovers a fabric re-login.

Initialize fabric name for any topology and have it an invalid WWPN 0x0 for
anything but fabric topology. Otherwise for e.g. point-to-point topology
one could see the initial -1 from fc_host_setup() and after a link unplug
our fabric name would turn to 0x0 (with subsequent commit ("zfcp: fix
fc_host attributes that should be unknown on local link down") and stay 0x0
on link replug. I did not initialize to 0x0 somewhere even earlier in the
code path such that it would not flap from real to 0x0 to real on e.g. an
exchange config data with fabric topology.

Link: https://lore.kernel.org/r/20200312174505.51294-3-maier@linux.ibm.comReviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

e05a10a0

scsi: zfcp: fix missing erp_lock in port recovery trigger for point-to-point · 819732be

Steffen Maier authored Mar 12, 2020

v2.6.27 commit cc8c2829 ("[SCSI] zfcp: Automatically attach remote
ports") introduced zfcp automatic port scan.

Before that, the user had to use the sysfs attribute "port_add" of an FCP
device (adapter) to add and open remote (target) ports, even for the remote
peer port in point-to-point topology. That code path did a proper port open
recovery trigger taking the erp_lock.

Since above commit, a new helper function zfcp_erp_open_ptp_port()
performed an UNlocked port open recovery trigger. This can race with other
parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt
e.g. adapter->erp_total_count or adapter->erp_ready_head.

As already found for fabric topology in v4.17 commit fa89adba ("scsi:
zfcp: fix infinite iteration on ERP ready list"), there was an endless loop
during tracing of rport (un)block.  A subsequent v4.18 commit 9e156c54
("scsi: zfcp: assert that the ERP lock is held when tracing a recovery
trigger") introduced a lockdep assertion for that case.

As a side effect, that lockdep assertion now uncovered the unlocked code
path for PtP. It is from within an adapter ERP action:

zfcp_erp_strategy[1479]  intentionally DROPs erp lock around
                         zfcp_erp_strategy_do_action()
zfcp_erp_strategy_do_action[1441]      NO erp lock
zfcp_erp_adapter_strategy[876]         NO erp lock
zfcp_erp_adapter_strategy_open[855]    NO erp lock
zfcp_erp_adapter_strategy_open_fsf[806]NO erp lock
zfcp_erp_adapter_strat_fsf_xconf[772]  erp lock only around
                                       zfcp_erp_action_to_running(),
                                       BUT *_not_* around
                                       zfcp_erp_enqueue_ptp_port()
zfcp_erp_enqueue_ptp_port[728]         BUG: *_not_* taking erp lock
_zfcp_erp_port_reopen[432]             assumes to be called with erp lock
zfcp_erp_action_enqueue[314]           assumes to be called with erp lock
zfcp_dbf_rec_trig[288]                 _checks_ to be called with erp lock:
	lockdep_assert_held(&adapter->erp_lock);

It causes the following lockdep warning:

WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288
                            zfcp_dbf_rec_trig+0x16a/0x188
no locks held by zfcperp0.0.17c0/775.

Fix this by using the proper locked recovery trigger helper function.

Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com
Fixes: cc8c2829 ("[SCSI] zfcp: Automatically attach remote ports")
Cc: <stable@vger.kernel.org> #v2.6.27+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

819732be

scsi: scsi_trace: Use get_unaligned_be24() · 3cef5948

Bart Van Assche authored Mar 13, 2020

This makes the SCSI tracing code slightly easier to read.

Link: https://lore.kernel.org/r/20200313203102.16613-6-bvanassche@acm.org
Fixes: bf816235 ("[SCSI] add scsi trace core functions and put trace points")
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3cef5948

scsi: st: Use get_unaligned_be24() and sign_extend32() · 35b703db

Bart Van Assche authored Mar 13, 2020

Use these functions instead of open-coding them.

Link: https://lore.kernel.org/r/20200313203102.16613-5-bvanassche@acm.org
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

35b703db

scsi: treewide: Consolidate {get,put}_unaligned_[bl]e24() definitions · a7afff31

Bart Van Assche authored Mar 13, 2020

Move the get_unaligned_be24(), get_unaligned_le24() and
put_unaligned_le24() definitions from various drivers into
include/linux/unaligned/generic.h. Add a put_unaligned_be24()
implementation.

Link: https://lore.kernel.org/r/20200313203102.16613-4-bvanassche@acm.org
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jens Axboe <axboe@fb.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> # For drivers/usb
Reviewed-by: Felipe Balbi <balbi@kernel.org> # For drivers/usb/gadget
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

a7afff31

scsi: c6x: Include <linux/unaligned/generic.h> instead of duplicating it · 7251c0a4

Bart Van Assche authored Mar 13, 2020

Use the generic __{get,put}_unaligned_[bl]e() definitions instead of
duplicating these. Since a later patch will add more definitions into
<linux/unaligned/generic.h>, this patch ensures that these definitions have
to be added only once. See also commit a7f626c1 ("C6X: headers").  See
also commit 6510d419 ("kernel: Move arches to use common unaligned
access").

Link: https://lore.kernel.org/r/20200313203102.16613-3-bvanassche@acm.org
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7251c0a4

scsi: linux/unaligned/byteshift.h: Remove superfluous casts · 19f747f7

Bart Van Assche authored Mar 13, 2020

The C language supports implicitly casting a void pointer into a non-void
pointer. Remove explicit void pointer to non-void pointer casts because
these are superfluous.

Link: https://lore.kernel.org/r/20200313203102.16613-2-bvanassche@acm.org
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

19f747f7

16 Mar, 2020 2 commits

scsi: core: Allow non-root users to perform ZBC commands · 6fdb79ff

Ryan Attard authored Feb 26, 2020

Allow users with read permissions to issue REPORT ZONE commands and users
with write permissions to manage zones on block devices supporting the ZBC
specification.

Link: https://lore.kernel.org/r/20200226170518.92963-2-ryanattard@ryanattard.infoSigned-off-by: Ryan Attard <ryanattard@ryanattard.info>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6fdb79ff

scsi: hisi_sas: Use dev_err() in read_iost_itct_cache_v3_hw() · 1e067dd8

Luo Jiaxing authored Mar 11, 2020

The print of pr_err() does not come with device information, so replace it
with dev_err(). Also improve the grammar in the message.

Link: https://lore.kernel.org/r/1583940144-230800-1-git-send-email-john.garry@huawei.comSigned-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

1e067dd8

12 Mar, 2020 4 commits

scsi: core: avoid repetitive logging of device offline messages · b0962c53

Ewan D. Milne authored Mar 11, 2020

Large queues of I/O to offline devices that are eventually submitted when
devices are unblocked result in a many repeated "rejecting I/O to offline
device" messages. These messages can fill up the dmesg buffer in crash
dumps so no useful prior messages remain. In addition, if a serial console
is used, the flood of messages can cause a hard lockup in the console code.

Introduce a flag indicating the message has already been logged for the
device, and reset the flag when scsi_device_set_state() changes the device
state.

Link: https://lore.kernel.org/r/20200311143930.20674-1-emilne@redhat.comReviewed-by: Bart van Assche <bvanassche@acm.org>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b0962c53

scsi: docs: convert arcmsr_spec.txt to ReST · dade67f4

Mauro Carvalho Chehab authored Mar 02, 2020

This file had its own peculiar style, not following any other
files inside the Kernel (as far as I saw).

Had to do a number of changes here, starting by removing the two
leading asterisks from each line, adding table and literal
block markups and changing whitespace and blank lines.

The end result is that (IMHO), it is now a lot easier to read
it as a text file, while producing a good html output.

Link: https://lore.kernel.org/r/6f8e4da4ea643adbe048f55504a59427c5e50c97.1583136624.git.mchehab+huawei@kernel.orgSigned-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

dade67f4

scsi: docs: convert wd719x.txt to ReST · 40ee6309

Mauro Carvalho Chehab authored Mar 02, 2020

Link: https://lore.kernel.org/r/23e5b13d810b7dd8126b126173999c02eac50e74.1583136624.git.mchehab+huawei@kernel.orgSigned-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

40ee6309

scsi: docs: convert ufs.txt to ReST · b64f6822

Mauro Carvalho Chehab authored Mar 02, 2020

Link: https://lore.kernel.org/r/052d45576e342a217185e91a83793b384b1592a4.1583136624.git.mchehab+huawei@kernel.orgAcked-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b64f6822