    scsi: hisi_sas: Modify v3 HW SATA disk error state completion processing · 4ef4f1a6
    Xingui Yang authored
    When an NCQ error occurs, the controller abnormally completes the I/Os
    that are newly delivered to the disk, and bit 8 in CQ dw3 is set to
    indicate that the SATA disk is in error state. The current processing flow
    sets ts->stat to SAS_OPEN_REJECT, and sas_ata_task_done() then sets the
    FIS status to ATA_ERR. After ata_eh_analyze_tf() analyzes the I/O,
    err_mask is set to AC_ERR_HSM. If a media error occurs four times
    within 10 minutes and the chip rejects new I/Os four times, NCQ is
    disabled due to excessive errors, which is undesirable.
    
    Therefore, use sas_task_abort() to handle abnormally completed I/Os when
    the SATA disk is in error state, as these I/Os have already been
    processed by sas_ata_device_link_abort() and their qc->flags have
    ATA_QCFLAG_FAILED set. With sas_task_abort(), qc->err_mask is not
    modified in EH. Unlike the current flow, this does not increase the
    ECAT_TOUT_HSM count and does not turn off NCQ. These I/Os are retried
    after EH, like the other I/Os on the disk that have no error but do not
    return after the NCQ error.
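    
    The difference between the two flows can be illustrated with a small
    userspace model. This is only a sketch under stated assumptions, not the
    driver code: every sketch_* and SKETCH_* name is hypothetical, and the
    counter only stands in for the completions that EH would analyze as
    AC_ERR_HSM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: bit 8 of CQ dw3 means "SATA disk is in error state". */
#define SKETCH_DW3_SATA_DISK_IN_ERR (UINT32_C(1) << 8)

/* Hypothetical model of how one completion is handed to the upper layers. */
enum sketch_action {
	SKETCH_COMPLETE_OK,       /* bit 8 clear (or not SATA): not this case */
	SKETCH_COMPLETE_REJECTED, /* old flow: ts->stat = SAS_OPEN_REJECT */
	SKETCH_ABORT_TASK,        /* new flow: hand the task to sas_task_abort() */
};

/* Classify one completion from its dw3 value. */
enum sketch_action sketch_classify(uint32_t dw3, bool is_sata, bool use_abort)
{
	if (!is_sata || !(dw3 & SKETCH_DW3_SATA_DISK_IN_ERR))
		return SKETCH_COMPLETE_OK;

	return use_abort ? SKETCH_ABORT_TASK : SKETCH_COMPLETE_REJECTED;
}

/*
 * Toy stand-in for the error accounting: count the completions that would
 * be reported as rejected and therefore analyzed as HSM errors in EH.
 * Aborted tasks are not counted, mirroring "qc->err_mask is not modified".
 */
unsigned int sketch_count_hsm(const uint32_t *dw3, size_t n, bool use_abort)
{
	unsigned int hsm = 0;

	for (size_t i = 0; i < n; i++)
		if (sketch_classify(dw3[i], true, use_abort) ==
		    SKETCH_COMPLETE_REJECTED)
			hsm++;

	return hsm;
}
```

    In this model, four completions with bit 8 set yield four HSM-class
    errors in the old flow and none in the new one, which is why the new
    flow does not push the disk toward having NCQ turned off.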
    
    Signed-off-by: Xingui Yang <yangxingui@huawei.com>
    Signed-off-by: John Garry <john.garry@huawei.com>
    Link: https://lore.kernel.org/r/1665998435-199946-5-git-send-email-john.garry@huawei.com
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
hisi_sas_v3_hw.c 146 KB