-
James Smart authored
In tests with remote ports contantly logging out/logging coupled with occassional local link bounce, if a remote port is disocnnected for longer than devloss_tmo and then subsequently reconnected, eventually the test will fail to login with the remote port and remote port connectivity is lost. When devloss_tmo expires, the driver does not free the node struct until the port or npiv instances is being deleted. The node is left allocated but the state set to UNUSED. If the node was in the process of logging in when the local link drop occurred, meaning the RPI was allocated for the node in order to send the ELS, but not yet registered which comes after successful login, the node is moved to the NPR state, and if devloss expires, to UNUSED state. If the remote port comes back, the node associated with it is restarted and this path happens to allocate a new RPI and overwrites the prior RPI value. In the cases where the port was logged in and loggs out, the path did release the RPI but did not set the node rpi value. In the cases where the remote port never finished logging in, the path never did the call to release the rpi. In this latter case, when the node is subsequently restore, the new rpi allocation overwrites the rpi that was not released, and the rpi is now leaked. Eventually the port will run out of RPI resources to log into new remote ports. Fix by following changes: - When an rpi is released, do so under locks and ensure the node rpi value is set to a non-allocated value (LPFC_RPI_ALLOC_ERROR). Note: refactored to a small service routine to avoid indentation issues. - When re-enabling a node, check the rpi value to determine if a new allocation is necessary. If already set, use the prior rpi. Enhanced logging to help in the future. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
b95b2119