1. 24 Aug, 2016 23 commits
  2. 19 Aug, 2016 7 commits
  3. 12 Aug, 2016 10 commits
    • zfcp: trace full payload of all SAN records (req,resp,iels) · aceeffbb
      Steffen Maier authored
      This was lost with commit 2c55b750
      ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      but is necessary for problem determination, e.g. to see the
      currently active zone set during automatic port scan.
      
      For the large GPN_FT response (4 pages), save space by not dumping
      any empty residual entries.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 2c55b750 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com>
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
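The space saving for the large GPN_FT response can be sketched in plain C (not kernel code). The entry layout, the `SKETCH_LAST_ENTRY` flag, and `sketch_gpn_ft_used_len()` are assumptions made for illustration and are not the driver's actual structures: the sketch assumes each response entry carries a flag marking the last valid entry, so everything after it is residual and need not be traced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-byte response entry: one control byte plus a 3-byte port ID. */
struct sketch_entry {
    uint8_t control;      /* SKETCH_LAST_ENTRY is set on the final valid entry */
    uint8_t port_id[3];
};

#define SKETCH_LAST_ENTRY 0x80

/*
 * Return how many bytes of the entry array are worth tracing:
 * everything up to and including the entry flagged as last.
 * If no entry is flagged, the whole array is traced.
 */
static size_t sketch_gpn_ft_used_len(const struct sketch_entry *entries,
                                     size_t max_entries)
{
    size_t i;

    assert(entries != NULL);
    for (i = 0; i < max_entries; i++) {
        if (entries[i].control & SKETCH_LAST_ENTRY)
            return (i + 1) * sizeof(entries[0]);
    }
    return max_entries * sizeof(entries[0]);
}
```

With a response buffer sized for many entries but only a few ports in the zone, only the leading used part would end up in the trace.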
    • zfcp: fix payload trace length for SAN request&response · 94db3725
      Steffen Maier authored
      Commit 2c55b750
      ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      started to add FC_CT_HDR_LEN to the payload trace length, which made
      zfcp dump random out-of-bounds data for RSPN GS responses because u.rspn.rsp
      is the largest and last field in the union of struct zfcp_fc_req.
      Other request/response types only happened to stay within bounds
      due to the padding of the union or
      due to the trace capping of u.gspn.rsp to ZFCP_DBF_SAN_MAX_PAYLOAD.
      
      Timestamp      : ...
      Area           : SAN
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU id         : ..
      Caller         : ...
      Record id      : 2
      Tag            : fsscth2
      Request id     : 0x...
      Destination ID : 0x00fffffc
      Payload short  : 01000000 fc020000 80020000 00000000
                       xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx <===
                       00000000 00000000 00000000 00000000
      Payload length : 32                                  <===
      
      struct zfcp_fc_req {
          [0] struct zfcp_fsf_ct_els ct_els;
         [56] struct scatterlist sg_req;
         [96] struct scatterlist sg_rsp;
              union {
                  struct {req; rsp;} adisc;    SIZE: 28+28=   56
                  struct {req; rsp;} gid_pn;   SIZE: 24+20=   44
                  struct {rspsg; req;} gpn_ft; SIZE: 40*4+20=180
                  struct {req; rsp;} gspn;     SIZE: 20+273= 293
                  struct {req; rsp;} rspn;     SIZE: 277+16= 293
        [136] } u;
      }
      SIZE: 432
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 2c55b750 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Alexey Ishchuk <aishchuk@linux.vnet.ibm.com>
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
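The bounds problem described in this commit comes down to tracing more bytes than the payload buffer holds. A minimal sketch in plain C, assuming a hypothetical cap `SKETCH_MAX_PAYLOAD` and a hypothetical helper `sketch_trace_payload()` (neither is the driver's real name or value): the traced length is the real payload size, capped, with nothing added on top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PAYLOAD 32  /* hypothetical cap on traced bytes */

/*
 * Copy at most SKETCH_MAX_PAYLOAD bytes of a payload into a trace record.
 * 'len' must be the real size of the payload buffer; no header length is
 * added to it, so the copy cannot read past the end of 'payload'.
 * Returns the number of bytes actually traced.
 */
static size_t sketch_trace_payload(unsigned char *rec, const void *payload,
                                   size_t len)
{
    size_t n = len < SKETCH_MAX_PAYLOAD ? len : SKETCH_MAX_PAYLOAD;

    assert(rec != NULL && payload != NULL);
    memcpy(rec, payload, n);
    return n;
}
```

A 16-byte response is traced in full, while a 293-byte one is cut at the cap; in neither case does the copy leave the source buffer.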
    • zfcp: fix D_ID field with actual value on tracing SAN responses · 771bf035
      Steffen Maier authored
      With commit 2c55b750
      ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      we lost the N_Port-ID where an ELS response comes from.
      With commit 7c7dc196
      ("[SCSI] zfcp: Simplify handling of ct and els requests")
      we lost the N_Port-ID where a CT response comes from.
      The N_Port-ID in the response record is especially useful if the
      request SAN trace record with D_ID was already lost due to trace buffer wrap.
      
      GS uses an open WKA port handle and ELS just a D_ID, and
      only for ELS we could get D_ID from QTCB bottom via zfcp_fsf_req.
      To cover both cases, add a new field to zfcp_fsf_ct_els
      and fill it in on request to use in SAN response trace.
      Strictly speaking the D_ID on SAN response is the FC frame's S_ID.
      We don't need a field for the other end which is always us.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 2c55b750 ("[SCSI] zfcp: Redesign of the debug tracing for SAN records.")
      Fixes: 7c7dc196 ("[SCSI] zfcp: Simplify handling of ct and els requests")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
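The approach of remembering the D_ID at request time so the response trace can report it might look roughly like the following plain C sketch. The struct and function names (`sketch_ct_els`, `sketch_send()`, `sketch_trace_response()`) are hypothetical and only mirror the idea in the commit message, not the actual zfcp code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request descriptor with the newly added field. */
struct sketch_ct_els {
    uint32_t d_id;   /* destination N_Port-ID, filled in when the request is sent */
};

/* Hypothetical SAN response trace record. */
struct sketch_san_record {
    uint32_t d_id;   /* strictly speaking the S_ID of the response frame */
};

/* Remember the destination when issuing the request (GS or ELS alike). */
static void sketch_send(struct sketch_ct_els *req, uint32_t d_id)
{
    assert(req != NULL);
    req->d_id = d_id;
}

/* On response, take the ID from the request descriptor. */
static void sketch_trace_response(const struct sketch_ct_els *req,
                                  struct sketch_san_record *rec)
{
    assert(req != NULL && rec != NULL);
    rec->d_id = req->d_id;
}
```

Because the value travels with the request descriptor, the response record stays meaningful even after the matching request record has been overwritten.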
    • zfcp: restore tracing of handle for port and LUN with HBA records · 7c964ffe
      Steffen Maier authored
      This information was lost with
      commit a54ca0f6
      ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      but is required to debug e.g. invalid handle situations.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: a54ca0f6 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • zfcp: trace on request for open and close of WKA port · d27a7cb9
      Steffen Maier authored
      Since commit a54ca0f6
      ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      HBA records no longer contain WWPN, D_ID, or LUN
      to reduce duplicate information which is already in REC records.
      In contrast to "regular" target ports, we don't use recovery to open
      WKA ports such as directory/nameserver, so we don't get REC records.
      Therefore, introduce pseudo REC running records without any
      actual recovery action but including D_ID of WKA port on open/close.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: a54ca0f6 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace · 0102a30a
      Steffen Maier authored
      Bring back
      commit d21e9daa
      ("[SCSI] zfcp: Dont use 0 to indicate invalid LUN in rec trace")
      which was lost with
      commit ae0904f6
      ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: ae0904f6 ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • zfcp: retain trace level for SCSI and HBA FSF response records · 35f040df
      Steffen Maier authored
      While retaining the actual filtering according to trace level,
      the following commits started to write such filtered records
      with a hardcoded record level of 1 instead of the actual record level:
      commit 250a1352
      ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
      commit a54ca0f6
      ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      
      Now we can distinguish written records again for offline level filtering.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 250a1352 ("[SCSI] zfcp: Redesign of the debug tracing for SCSI records.")
      Fixes: a54ca0f6 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
      Cc: <stable@vger.kernel.org> #2.6.38+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
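The difference between filtering by level and recording the level can be illustrated with a small plain C sketch. All names here (`sketch_record`, `sketch_trace()`) are hypothetical, and the sketch assumes a record is written when its level does not exceed the currently active level; the point is only that the stored level is the record's own level rather than a constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace record that remembers the level it was written with. */
struct sketch_record {
    int level;
    const char *tag;
};

/*
 * Write a record if 'level' passes the filter given by 'active_level'.
 * Returns 1 if the record was written, 0 if it was filtered out.
 */
static int sketch_trace(struct sketch_record *rec, int active_level,
                        int level, const char *tag)
{
    assert(rec != NULL);
    if (level > active_level)
        return 0;            /* filtered: nothing is written */
    rec->level = level;      /* keep the real level, not a hardcoded 1 */
    rec->tag = tag;
    return 1;
}
```

Storing the real level means a trace captured at a verbose setting can later be filtered offline down to the less verbose records.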
    • zfcp: close window with unblocked rport during rport gone · 4eeaa4f3
      Steffen Maier authored
      On a successful end of reopen port forced,
      zfcp_erp_strategy_followup_success() re-uses the port erp_action
      and the subsequent zfcp_erp_action_cleanup() now
      sees ZFCP_ERP_SUCCEEDED with
      erp_action->action==ZFCP_ERP_ACTION_REOPEN_PORT
      instead of ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
      but must not perform zfcp_scsi_schedule_rport_register().
      
      We can detect this because the fresh port reopen erp_action
      is in its very first step ZFCP_ERP_STEP_UNINITIALIZED.
      
      Otherwise this opens a time window with an unblocked rport
      (until the follow-up port reopen recovery blocks it again).
      If a scsi_cmnd timeout occurs during this time window,
      fc_timed_out() cannot work as desired, and the command
      really times out and triggers scsi_eh. This prevents
      a clean and timely path failover.
      This should not happen if the path issue can be recovered
      at the FC transport layer, such as path issues involving RSCNs.
      
      Also, unnecessary and repeated DID_IMM_RETRY for pending and
      undesired new requests occur because internally zfcp still
      has its zfcp_port blocked.
      
      As a follow-on error with scsi_eh, in the worst case this can
      cause permanently lost paths due to one of:
      sd <scsidev>: [<scsidisk>] Medium access timeout failure. Offlining disk!
      sd <scsidev>: Device offlined - not ready after error recovery
      
      For fix validation and to aid future debugging with other recoveries
      we now also trace (un)blocking of rports.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 5767620c ("[SCSI] zfcp: Do not unblock rport from REOPEN_PORT_FORCED")
      Fixes: a2fa0aed ("[SCSI] zfcp: Block FC transport rports early on errors")
      Fixes: 5f852be9 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
      Fixes: 338151e0 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
      Fixes: 3859f6a2 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
      Cc: <stable@vger.kernel.org> #2.6.32+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
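The detection rule from this commit message (a successful port reopen whose erp_action is still in its very first step is the re-used follow-up action and must not register the rport) can be written as a small decision function. This is a plain C sketch with hypothetical enum and function names, not the code in zfcp_erp_action_cleanup().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ERP action, step, and result values. */
enum sketch_action { SKETCH_REOPEN_PORT, SKETCH_REOPEN_PORT_FORCED };
enum sketch_step   { SKETCH_STEP_UNINITIALIZED, SKETCH_STEP_STARTED };
enum sketch_result { SKETCH_FAILED, SKETCH_SUCCEEDED };

struct sketch_erp_action {
    enum sketch_action action;
    enum sketch_step step;
};

/* Return 1 if rport registration (unblocking) may be scheduled, else 0. */
static int sketch_may_register_rport(const struct sketch_erp_action *act,
                                     enum sketch_result result)
{
    assert(act != NULL);
    if (result != SKETCH_SUCCEEDED)
        return 0;
    if (act->action != SKETCH_REOPEN_PORT)
        return 0;   /* forced reopen never unblocks the rport itself */
    if (act->step == SKETCH_STEP_UNINITIALIZED)
        return 0;   /* fresh follow-up reopen that has not run yet */
    return 1;
}
```

Only a port reopen that has actually progressed past its first step and succeeded leads to registration; the re-used follow-up action keeps the rport blocked until its own recovery finishes.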
    • zfcp: fix ELS/GS request&response length for hardware data router · 70369f8e
      Steffen Maier authored
      In the hardware data router case, introduced with kernel 3.2
      commit 86a9668a ("[SCSI] zfcp: support for hardware data router")
      the ELS/GS request&response length needs to be initialized
      as in the chained SBAL case.
      
      Otherwise, the FCP channel rejects ELS requests with
      FSF_REQUEST_SIZE_TOO_LARGE.
      
      Such ELS requests can be issued by user space through BSG / HBA API,
      or zfcp itself uses ADISC ELS for remote port link test on RSCN.
      The latter can cause a short path outage due to
      unnecessary remote target port recovery because the always
      failing ADISC cannot detect extremely short path interruptions
      beyond the local FCP channel.
      
      The example below was decoded with zfcpdbf from s390-tools:
      
      Timestamp      : ...
      Area           : SAN
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU id         : ..
      Caller         : zfcp_dbf_san_req+0408
      Record id      : 1
      Tag            : fssels1
      Request id     : 0x<reqid>
      Destination ID : 0x00<target d_id>
      Payload info   : 52000000 00000000 <our wwpn       >           [ADISC]
                       <our wwnn       > 00<s_id> 00000000
                       00000000 00000000 00000000 00000000
      
      Timestamp      : ...
      Area           : HBA
      Subarea        : 00
      Level          : 1
      Exception      : -
      CPU id         : ..
      Caller         : zfcp_dbf_hba_fsf_res+0740
      Record id      : 1
      Tag            : fs_ferr
      Request id     : 0x<reqid>
      Request status : 0x00000010
      FSF cmnd       : 0x0000000b               [FSF_QTCB_SEND_ELS]
      FSF sequence no: 0x...
      FSF issued     : ...
      FSF stat       : 0x00000061               [FSF_REQUEST_SIZE_TOO_LARGE]
      FSF stat qual  : 00000000 00000000 00000000 00000000
      Prot stat      : 0x00000100
      Prot stat qual : 00000000 00000000 00000000 00000000
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 86a9668a ("[SCSI] zfcp: support for hardware data router")
      Cc: <stable@vger.kernel.org> # 3.2+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • zfcp: fix fc_host port_type with NPIV · bd77befa
      Steffen Maier authored
      For an NPIV-enabled FCP device, zfcp can erroneously show
      "NPort (fabric via point-to-point)" instead of "NPIV VPORT"
      for the port_type sysfs attribute of the corresponding
      fc_host.
      s390-tools that can be affected are dbginfo.sh and ziomon.
      
      zfcp_fsf_exchange_config_evaluate() ignores
      fsf_qtcb_bottom_config.connection_features indicating NPIV
      and only sets fc_host_port_type to FC_PORTTYPE_NPORT if
      fsf_qtcb_bottom_config.fc_topology is FSF_TOPO_FABRIC.
      
      Only the independent zfcp_fsf_exchange_port_evaluate()
      evaluates connection_features to overwrite fc_host_port_type
      to FC_PORTTYPE_NPIV in case of NPIV.
      Code was introduced with upstream kernel 2.6.30
      commit 0282985d
      ("[SCSI] zfcp: Report fc_host_port_type as NPIV").
      
      This works during FCP device recovery (such as set online)
      because it performs FSF_QTCB_EXCHANGE_CONFIG_DATA followed by
      FSF_QTCB_EXCHANGE_PORT_DATA in sequence.
      
      However, the zfcp-specific scsi host sysfs attributes
      "requests", "megabytes", or "seconds_active" trigger only
      zfcp_fsf_exchange_config_evaluate() resetting fc_host
      port_type to FC_PORTTYPE_NPORT despite NPIV.
      
      The zfcp-specific scsi host sysfs attribute "utilization"
      triggers only zfcp_fsf_exchange_port_evaluate() correcting
      the fc_host port_type again in case of NPIV.
      
      Evaluate fsf_qtcb_bottom_config.connection_features
      in zfcp_fsf_exchange_config_evaluate() where it belongs.
      Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Fixes: 0282985d ("[SCSI] zfcp: Report fc_host_port_type as NPIV")
      Cc: <stable@vger.kernel.org> #2.6.30+
      Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
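The fix described here amounts to deciding the port type in one place, from both the topology and the connection features. A plain C sketch of that decision follows; the enums, the `SKETCH_FEATURE_NPIV` bit, and `sketch_port_type()` are hypothetical stand-ins, and only the fabric case mentioned in the commit message is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FEATURE_NPIV 0x1u   /* hypothetical connection feature bit */

enum sketch_topology  { SKETCH_TOPO_OTHER, SKETCH_TOPO_FABRIC };
enum sketch_port_type { SKETCH_PORTTYPE_UNKNOWN, SKETCH_PORTTYPE_NPORT,
                        SKETCH_PORTTYPE_NPIV };

/* Hypothetical subset of the exchanged configuration data. */
struct sketch_config {
    enum sketch_topology topology;
    unsigned int connection_features;
};

/*
 * Derive the port type from the configuration data alone, so that every
 * caller that re-evaluates the configuration gets the same answer.
 */
static enum sketch_port_type sketch_port_type(const struct sketch_config *cfg)
{
    assert(cfg != NULL);
    if (cfg->topology != SKETCH_TOPO_FABRIC)
        return SKETCH_PORTTYPE_UNKNOWN;   /* other topologies not modelled */
    if (cfg->connection_features & SKETCH_FEATURE_NPIV)
        return SKETCH_PORTTYPE_NPIV;
    return SKETCH_PORTTYPE_NPORT;
}
```

With the NPIV check next to the topology check, reading attributes that refresh only the configuration data no longer flips the reported type back to an N_Port.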