• James Smart's avatar
    scsi: lpfc: Fix crash due to port reset racing vs adapter error handling · 8c24a4f6
    James Smart authored
    If the adapter encounters a condition which causes the adapter to fail
    (driver must detect the failure) simultaneously to a request to the driver
    to reset the adapter (such as a host_reset), the reset path will be racing
    with the asynchronously-detect adapter failure path.  In the failing
    situation, one path has started to tear down the adapter data structures
    (io_wq's) while the other path has initiated a repeat of the teardown and
    is in the lpfc_sli_flush_xxx_rings path and attempting to access the
    just-freed data structures.
    
    Fix by the following:
    
     - In cases where an adapter failure is detected, rather than explicitly
       calling offline_eratt() to start the teardown, change the adapter state
       and let the later calls of posted work to the slowpath thread invoke the
       adapter recovery.  In essence, this means all requests to reset are
       serialized on the slowpath thread.
    
     - Clean up the routine that restarts the adapter. If there is a failure
       from brdreset, don't immediately error and leave things in a partial
       state. Instead, ensure the adapter state is set and finish the teardown
       of structures before returning.
    
     - If in the scsi host reset handler and the board fails to reset and
       restart (which can be due to parallel reset/recovery paths), instead of
       hard failing and explicitly calling offline_eratt() (which gets into the
       redundant path), just fail out and let the asynchronous path resolve the
       adapter state.
    Signed-off-by: default avatarDick Kennedy <dick.kennedy@broadcom.com>
    Signed-off-by: default avatarJames Smart <jsmart2021@gmail.com>
    Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
    8c24a4f6
lpfc_sli.c 618 KB