Commit 07aaefdf authored by James Smart's avatar James Smart Committed by Martin K. Petersen

scsi: lpfc: Fix crash when a fabric node is released prematurely

The driver's management of the fabric controller (aka pseudo-scsi
initiator) node in SLI3 mode is causing this crash. The crash occurs
because of a node reference imbalance that frees the fabric controller node
while devloss is outstanding from the SCSI transport.  This is triggered by
an odd behavior where the switch reacts to a rejected RDP request with a
PLOGI and nothing else, not even a LOGO.  The driver ACKS the PLOGI and
after successfully registering the RPI, incorrectly registers the fabric
controller node because it has the NLP_FC4_FCP flag still set from the
fabric controller PRLI.  If a LIP is issued, the driver attempts to cleanup
on Link Up and ends up executing too many puts.

Fix by detecting the fabric node type and clearing out the nodes internal
flags that triggered a SCSI transport registration and subsequence dev_loss
event.  The driver cannot count on any persistence from fabric controller
nodes.

Link: https://lore.kernel.org/r/20210104180240.46824-5-jsmart2021@gmail.comCo-developed-by: default avatarDick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: default avatarDick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: default avatarJames Smart <jsmart2021@gmail.com>
Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
parent ecf041fe
......@@ -73,6 +73,16 @@ static void lpfc_unregister_fcfi_cmpl(struct lpfc_hba *, LPFC_MBOXQ_t *);
static int lpfc_fcf_inuse(struct lpfc_hba *);
static void lpfc_mbx_cmpl_read_sparam(struct lpfc_hba *, LPFC_MBOXQ_t *);
static int
lpfc_valid_xpt_node(struct lpfc_nodelist *ndlp)
{
if (ndlp->nlp_fc4_type ||
ndlp->nlp_DID == Fabric_DID ||
ndlp->nlp_DID == NameServer_DID ||
ndlp->nlp_DID == FDMI_DID)
return 1;
return 0;
}
/* The source of a terminate rport I/O is either a dev_loss_tmo
* event or a call to fc_remove_host. While the rport should be
* valid during these downcalls, the transport can call twice
......@@ -4318,7 +4328,8 @@ lpfc_nlp_state_cleanup(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
/* FCP and NVME Transport interface */
if ((old_state == NLP_STE_MAPPED_NODE ||
old_state == NLP_STE_UNMAPPED_NODE)) {
if (ndlp->rport) {
if (ndlp->rport &&
lpfc_valid_xpt_node(ndlp)) {
vport->phba->nport_event_cnt++;
lpfc_unregister_remote_port(ndlp);
}
......@@ -4340,10 +4351,7 @@ lpfc_nlp_state_cleanup(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
if (new_state == NLP_STE_MAPPED_NODE ||
new_state == NLP_STE_UNMAPPED_NODE) {
if (ndlp->nlp_fc4_type ||
ndlp->nlp_DID == Fabric_DID ||
ndlp->nlp_DID == NameServer_DID ||
ndlp->nlp_DID == FDMI_DID) {
if (lpfc_valid_xpt_node(ndlp)) {
vport->phba->nport_event_cnt++;
/*
* Tell the fc transport about the port, if we haven't
......
......@@ -1021,7 +1021,12 @@ lpfc_rcv_prli(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
ndlp->nlp_fc4_type |= NLP_FC4_NVME;
lpfc_nlp_set_state(vport, ndlp, NLP_STE_UNMAPPED_NODE);
}
if (npr->prliType == PRLI_FCP_TYPE)
/* Fabric Controllers send FCP PRLI as an initiator but should
* not get recognized as FCP type and registered with transport.
*/
if (npr->prliType == PRLI_FCP_TYPE &&
!(ndlp->nlp_type & NLP_FABRIC))
ndlp->nlp_fc4_type |= NLP_FC4_FCP;
}
if (rport) {
......@@ -2044,6 +2049,7 @@ lpfc_cmpl_reglogin_reglogin_issue(struct lpfc_vport *vport,
* must complete PRLI.
*/
if (ndlp->nlp_type & NLP_FABRIC) {
ndlp->nlp_fc4_type &= ~NLP_FC4_FCP;
ndlp->nlp_prev_state = NLP_STE_REG_LOGIN_ISSUE;
lpfc_nlp_set_state(vport, ndlp, NLP_STE_UNMAPPED_NODE);
}
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment