    scsi: mpt3sas: Run SAS DEVICE STATUS CHANGE EVENT from ISR · 54d74e6b
    Suganath Prabu authored
    In some cases, such as during extensive expander resets or phy resets,
    drives may not be visible in the OS. The driver's firmware worker thread
    is blocked for more than 120 seconds, resulting in a call trace.
    
    The deadlock arises as follows:
    
    1. The driver received a target add event for device A and registered
    the device with the SML by calling sas_rphy_add(). The SML partially
    added the device, returned control to the driver from sas_rphy_add(),
    and started background scanning of device A.
    
    2. While background scanning of device A was in progress, the driver
    received a SAS DEVICE STATUS CHANGE EVENT with reason code (RC)
    "Internal device reset" and set the tm_busy flag for device A from FW
    worker thread context. While tm_busy is set, the driver returns SCSI
    commands with device busy status, asking the kernel to retry them
    later. The background scan of device A therefore waits for tm_busy to
    be cleared.
    
    3. Meanwhile, the driver received a target add event for device B and
    called sas_rphy_add() to register it with the SML. Because the
    background scan of device A was still pending, the SML did not return
    from sas_rphy_add(), and the driver's firmware worker thread blocked.
    
    4. The driver then received a SAS DEVICE STATUS CHANGE EVENT with
    reason code "Internal device reset complete". Because the firmware
    worker thread was blocked in step 3, it could not process this event
    and never cleared the tm_busy flag, so a deadlock occurred: the SML was
    waiting for tm_busy to be cleared, and the FW worker thread was waiting
    for the SML to return from sas_rphy_add().
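
The four steps above can be replayed in a small user-space model. This is a sketch only, not driver code: the names `fw_event`, `FW_TARGET_ADD`, `FW_RESET_START`, `FW_RESET_COMPLETE`, and `replay_on_single_worker` are hypothetical, a single tm_busy flag stands in for device A, and the model assumes an add blocks whenever an earlier scan is pending while tm_busy is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds, named after the steps above. */
enum fw_event {
    FW_TARGET_ADD,      /* worker calls into the SML to add a device */
    FW_RESET_START,     /* "Internal device reset": sets tm_busy */
    FW_RESET_COMPLETE,  /* "Internal device reset complete": clears tm_busy */
};

/*
 * Replay events through one worker, strictly in order. Returns the index
 * of the event at which the worker blocks forever, or n if every event
 * is processed.
 */
size_t replay_on_single_worker(const enum fw_event *events, size_t n)
{
    bool tm_busy = false;      /* commands are bounced while this is set */
    bool scan_pending = false; /* SML background scan from an earlier add */

    assert(events != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (events[i]) {
        case FW_TARGET_ADD:
            /*
             * The add has to wait for the earlier scan, and that scan
             * cannot finish while tm_busy bounces its commands. Only
             * this worker could clear tm_busy, so it never returns.
             */
            if (scan_pending && tm_busy)
                return i;
            scan_pending = true; /* earlier scan drained, new one starts */
            break;
        case FW_RESET_START:
            tm_busy = true;
            break;
        case FW_RESET_COMPLETE:
            tm_busy = false;
            break;
        }
    }
    return n;
}
```

Feeding the model the sequence add, reset start, add, reset complete makes it stop at the second add, so the reset-complete event that would clear tm_busy is never reached.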
    
    The same deadlock is observed if device B is being removed in step 3.
    To limit deadlocks of this type, the driver now processes SAS DEVICE
    STATUS CHANGE EVENT from ISR context instead of from worker thread
    context. This avoids the deadlock described above.
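
The routing change can be pictured with the following user-space sketch. It is an illustration under stated assumptions, not the mpt3sas implementation: every identifier here (`ev_kind`, `ev_reason`, `ev_route`, `struct event`, `struct target`, `dispatch_from_isr`) is hypothetical, and the model assumes that handling a status change only updates a flag and never sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event model; these are not the driver's real identifiers. */
enum ev_kind { EV_TARGET_ADD, EV_TARGET_REMOVE, EV_DEVICE_STATUS_CHANGE };
enum ev_reason { RC_OTHER, RC_INTERNAL_RESET, RC_INTERNAL_RESET_COMPLETE };
enum ev_route { ROUTE_ISR, ROUTE_WORKER };

struct event {
    enum ev_kind kind;
    enum ev_reason rc;
};

struct target {
    bool tm_busy; /* commands are bounced with busy status while set */
};

/*
 * Called from (simulated) interrupt context for every firmware event.
 * Status changes are applied on the spot, so tm_busy can be cleared even
 * while the worker is blocked. Everything else is left for the worker.
 */
enum ev_route dispatch_from_isr(struct target *t, const struct event *ev)
{
    assert(t && ev);

    if (ev->kind != EV_DEVICE_STATUS_CHANGE)
        return ROUTE_WORKER; /* may sleep in the SML: keep out of the ISR */

    if (ev->rc == RC_INTERNAL_RESET)
        t->tm_busy = true;
    else if (ev->rc == RC_INTERNAL_RESET_COMPLETE)
        t->tm_busy = false;

    return ROUTE_ISR;
}
```

With this routing, a reset-complete event that arrives while the worker is stuck in a device add still clears tm_busy, which lets the pending scan finish and the add return.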
    
    Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mpt3sas_scsih.c 319 KB