    scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) · e0c4ec19
    Steffen Maier authored
    commit ef4021fe upstream.
    
    When the user tries to remove a zfcp port via sysfs, we only rejected it if
    there are zfcp unit children under the port. With purely automatically
    scanned LUNs there are no zfcp units but only SCSI devices. In such cases,
    the port_remove erroneously continued. We close the port and this
    implicitly closes all LUNs under the port. The SCSI devices survive with
    their private zfcp_scsi_dev still holding a reference to the "removed"
    zfcp_port (still allocated but invisible in sysfs) [zfcp_get_port_by_wwpn
    in zfcp_scsi_slave_alloc]. This is not a problem as long as the fc_rport
    stays blocked. Once (auto) port scan brings back the removed port, we
    unblock its fc_rport again by design.  However, there is no mechanism that
    would recover (open) the LUNs under the port (no "ersfs_3" without
    zfcp_unit [zfcp_erp_strategy_followup_success]).  Any pending or new I/O to
    such a LUN leads to repeated:
    
      Done: NEEDS_RETRY Result: hostbyte=DID_IMM_RETRY driverbyte=DRIVER_OK
    
    See also v4.10 commit 6f2ce1c6 ("scsi: zfcp: fix rport unblock race
    with LUN recovery"). Even a manual LUN recovery
    (echo 0 > /sys/bus/scsi/devices/H:C:T:L/zfcp_failed)
    does not help, as the LUN links to the old "removed" port, which still
    lacks ZFCP_STATUS_COMMON_RUNNING [zfcp_erp_required_act].
    The only workaround is to first ensure that the fc_rport is blocked
    (e.g. port_remove again in case it was re-discovered by (auto) port scan),
    then delete the SCSI devices, and finally re-discover by (auto) port scan.
    The port scan includes an fc_rport unblock, which in turn triggers
    a new scan on the scsi target to freshly get new pure auto scan LUNs.
    
    Fix this by rejecting port_remove also if there are SCSI devices
    (even without any zfcp_unit) under this port. Re-use mechanics from v3.7
    commit d99b601b ("[SCSI] zfcp: restore refcount check on port_remove").
    However, we have to give up zfcp_sysfs_port_units_mutex earlier in unit_add
    to prevent a deadlock with scsi_host scan taking shost->scan_mutex first
    and then zfcp_sysfs_port_units_mutex now in our zfcp_scsi_slave_alloc().
    
    Signed-off-by: Steffen Maier <maier@linux.ibm.com>
    Fixes: b62a8d9b ("[SCSI] zfcp: Use SCSI device data zfcp scsi dev instead of zfcp unit")
    Fixes: f8210e34 ("[SCSI] zfcp: Allow midlayer to scan for LUNs when running in NPIV mode")
    Cc: <stable@vger.kernel.org> #2.6.37+
    Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zfcp_sysfs.c 20 KB