1. 25 Jan, 2024 5 commits
    • Merge patch series "scsi: hisi_sas: Minor fixes and cleanups" · 2b9bc9ef
      Martin K. Petersen authored
      chenxiang <chenxiang66@hisilicon.com> says:
      
      This series contains some fixes and cleanups including:
      
       - Fix a deadlock issue related to automatic debugfs;
      
       - Remove redundant checks for automatic debugfs;
      
       - Check whether debugfs is enabled before removing or releasing it;
      
       - Remove hisi_hba->timer for v3 hw.
      
      Link: https://lore.kernel.org/r/1705904747-62186-1-git-send-email-chenxiang66@hisilicon.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: hisi_sas: Remove hisi_hba->timer for v3 hw · f9242f16
      Xiang Chen authored
      hisi_hba->timer is not used for v3 hw, but v3 hw still reaches two places
      that operate on it:
      
       - Deleting the timer in function hisi_sas_v3_hw() which is only for v3 hw;
      
       - Deleting the timer in function hisi_sas_controller_reset_prepare() which
         is common for v1/v2/v3 hw.
      
      The timer deletion can simply be dropped in the first case. In the second
      case it must be skipped only for v3 hw, so check hw->sht, which is NULL
      only for v3 hw, before deleting hisi_hba->timer.
      Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
      Link: https://lore.kernel.org/r/1705904747-62186-5-git-send-email-chenxiang66@hisilicon.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
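The hw->sht check described above can be illustrated with a small userspace model. This is only a sketch: the struct and function names ending in `_model` are hypothetical stand-ins, and only the field names `hw->sht` and `hisi_hba->timer` come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types (not the kernel structs). */
struct sht_model { int unused; };
struct hw_model { const struct sht_model *sht; };  /* NULL models v3 hw */
struct timer_model { bool deleted; };
struct hba_model {
    const struct hw_model *hw;
    struct timer_model timer;
};

/* Stands in for the kernel's timer deletion; only records the call. */
static void timer_delete_model(struct timer_model *t)
{
    t->deleted = true;
}

/* Common reset-prepare path: touch the timer only when hw->sht is set,
 * which per the commit message excludes v3 hw. */
static void reset_prepare_model(struct hba_model *hba)
{
    if (hba->hw->sht)
        timer_delete_model(&hba->timer);
}
```

In this model, a controller whose `hw->sht` is NULL passes through the common path without the timer being touched, while the other variants keep the deletion.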
    • scsi: hisi_sas: Check whether debugfs is enabled before removing or releasing it · 69097a63
      Yihang Li authored
      hisi_sas debugfs should be removed only when debugfs is enabled. Check
      whether debugfs is enabled before removing or releasing it.
      Signed-off-by: Yihang Li <liyihang9@huawei.com>
      Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
      Link: https://lore.kernel.org/r/1705904747-62186-4-git-send-email-chenxiang66@hisilicon.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
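A minimal userspace sketch of the guard described above, assuming a single boolean enable flag. The names `debugfs_enable_model`, `debugfs_remove_model()` and `driver_exit_model()` are hypothetical and do not reproduce the driver's code.

```c
#include <stdbool.h>

/* Hypothetical flag modelling the driver's debugfs enable switch. */
static bool debugfs_enable_model;

/* Counts how often the (modelled) debugfs teardown ran. */
static int remove_calls;

static void debugfs_remove_model(void)
{
    remove_calls++;
}

/* Teardown path: only undo debugfs setup if it was enabled to begin with. */
static void driver_exit_model(void)
{
    if (debugfs_enable_model)
        debugfs_remove_model();
}
```

With the flag cleared, the teardown path never calls the removal helper, so nothing that was never created gets released.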
    • scsi: hisi_sas: Remove redundant checks for automatic debugfs dump · 3f030550
      Yihang Li authored
      Commit 63f0733d ("scsi: hisi_sas: Allocate DFX memory during dump
      trigger") moved the DFX memory allocation from device initialization to
      the time a dump is triggered, so .debugfs_itct is not a valid address
      before that and does not need to be checked.
      
      The parameter hisi_sas_debugfs_enable is enough to decide whether an
      automatic debugfs dump is triggered, so remove the redundant checks.
      
      Fixes: 63f0733d ("scsi: hisi_sas: Allocate DFX memory during dump trigger")
      Signed-off-by: Yihang Li <liyihang9@huawei.com>
      Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
      Link: https://lore.kernel.org/r/1705904747-62186-3-git-send-email-chenxiang66@hisilicon.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
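The effect of allocating the buffer at dump time can be sketched in userspace C. This is an illustrative model under the assumption that the buffer pointer is still NULL before the first dump; every name ending in `_model`, both `trigger_*` functions, and the 64-byte buffer size are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Models the hisi_sas_debugfs_enable parameter. */
static bool debugfs_enable_model;

struct dump_model {
    unsigned char *itct;  /* models .debugfs_itct; NULL until a dump runs */
    int dumps_done;
};

/* Allocate the buffer lazily, at dump time, then record the dump. */
static bool do_dump_model(struct dump_model *d)
{
    if (!d->itct) {
        d->itct = calloc(1, 64);  /* arbitrary size for the sketch */
        if (!d->itct)
            return false;
    }
    d->dumps_done++;
    return true;
}

/* Trigger that also checks the pointer: it cannot fire before the first
 * dump, because nothing has allocated the buffer yet. */
static bool trigger_with_pointer_check(struct dump_model *d)
{
    if (!debugfs_enable_model || !d->itct)
        return false;
    return do_dump_model(d);
}

/* Trigger that relies on the enable flag alone. */
static bool trigger_flag_only(struct dump_model *d)
{
    if (!debugfs_enable_model)
        return false;
    return do_dump_model(d);
}
```

In this model the pointer check is not just redundant but blocks the very first dump, which is why relying on the enable flag alone is the simpler condition.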
    • scsi: hisi_sas: Fix a deadlock issue related to automatic dump · 3c4f53b2
      Yihang Li authored
      If we issue a command to disable a PHY, the device attached to it goes
      offline. If a 2 bit ECC error occurs at the same time, a hung task may be
      reported:
      
      [ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds.
      [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 4613.674809] task:kworker/u256:0  state:D stack:    0 pid:165233 ppid:     2 flags:0x00000208
      [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas]
      [ 4613.691518] Call trace:
      [ 4613.694678]  __switch_to+0xf8/0x17c
      [ 4613.698872]  __schedule+0x660/0xee0
      [ 4613.703063]  schedule+0xac/0x240
      [ 4613.706994]  schedule_timeout+0x500/0x610
      [ 4613.711705]  __down+0x128/0x36c
      [ 4613.715548]  down+0x240/0x2d0
      [ 4613.719221]  hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main]
      [ 4613.726618]  sas_execute_internal_abort+0x144/0x310 [libsas]
      [ 4613.732976]  sas_execute_internal_abort_dev+0x44/0x60 [libsas]
      [ 4613.739504]  hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main]
      [ 4613.747499]  hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main]
      [ 4613.753682]  sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas]
      [ 4613.759781]  sas_unregister_common_dev+0x4c/0x7a0 [libsas]
      [ 4613.765962]  sas_destruct_devices+0xb8/0x120 [libsas]
      [ 4613.771709]  sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas]
      [ 4613.778930]  sas_revalidate_domain+0x60/0xa4 [libsas]
      [ 4613.784716]  process_one_work+0x248/0x950
      [ 4613.789424]  worker_thread+0x318/0x934
      [ 4613.793878]  kthread+0x190/0x200
      [ 4613.797810]  ret_from_fork+0x10/0x18
      [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds.
      [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 4613.824538] task:kworker/u256:4  state:D stack:    0 pid:316722 ppid:     2 flags:0x00000208
      [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main]
      [ 4613.841491] Call trace:
      [ 4613.844647]  __switch_to+0xf8/0x17c
      [ 4613.848852]  __schedule+0x660/0xee0
      [ 4613.853052]  schedule+0xac/0x240
      [ 4613.856984]  schedule_timeout+0x500/0x610
      [ 4613.861695]  __down+0x128/0x36c
      [ 4613.865542]  down+0x240/0x2d0
      [ 4613.869216]  hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main]
      [ 4613.876324]  hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main]
      [ 4613.883019]  process_one_work+0x248/0x950
      [ 4613.887732]  worker_thread+0x318/0x934
      [ 4613.892204]  kthread+0x190/0x200
      [ 4613.896118]  ret_from_fork+0x10/0x18
      [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds.
      [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [ 4613.922852] task:kworker/u256:1  state:D stack:    0 pid:348985 ppid:     2 flags:0x00000208
      [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
      [ 4613.939549] Call trace:
      [ 4613.942702]  __switch_to+0xf8/0x17c
      [ 4613.946892]  __schedule+0x660/0xee0
      [ 4613.951083]  schedule+0xac/0x240
      [ 4613.955015]  schedule_timeout+0x500/0x610
      [ 4613.959725]  wait_for_common+0x200/0x610
      [ 4613.964349]  wait_for_completion+0x3c/0x5c
      [ 4613.969146]  flush_workqueue+0x198/0x790
      [ 4613.973776]  sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas]
      [ 4613.979960]  sas_port_event_worker+0x54/0xa0 [libsas]
      [ 4613.985708]  process_one_work+0x248/0x950
      [ 4613.990420]  worker_thread+0x318/0x934
      [ 4613.994868]  kthread+0x190/0x200
      [ 4613.998800]  ret_from_fork+0x10/0x18
      
      This happens because, when the device goes offline, we take the hisi_hba
      semaphore and send the ABORT_DEV command to the device. The internal
      abort then times out because of the 2 bit ECC error, and the timeout
      triggers an automatic dump. Since the hisi_hba semaphore is already held,
      the dump cannot be executed and the controller cannot be reset.
      
      The deadlock therefore arises from the following circular dependency:
      hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ...
      -> hisi_sas_internal_abort_timeout() -> down().
      
      The deadlock is triggered only when the timeout occurs while a device is
      going offline. To fix this issue, use .rst_ha_timeout to distinguish the
      scenario where a device goes offline from other scenarios.
      
      Fixes: 2ff07b5c ("scsi: hisi_sas: Directly call register snapshot instead of using workqueue")
      Signed-off-by: Yihang Li <liyihang9@huawei.com>
      Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
      Link: https://lore.kernel.org/r/1705904747-62186-2-git-send-email-chenxiang66@hisilicon.com
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
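The circular dependency and the flag-based fix can be modelled in userspace C without actually blocking. This is a sketch, not the driver's code: `sem_model`, `sem_try_down()`, `sem_up()`, `abort_timeout_model()` and the `ACTION_*` values are hypothetical, and treating `rst_ha_timeout == false` as the device-gone case is an assumption. Only the field name `.rst_ha_timeout` comes from the commit message.

```c
#include <stdbool.h>

/* Counting-semaphore model. The kernel's down() would block forever in the
 * situation where sem_try_down() returns false here. */
struct sem_model { int count; };

static bool sem_try_down(struct sem_model *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

static void sem_up(struct sem_model *s)
{
    s->count++;
}

enum timeout_action {
    ACTION_SKIP_DUMP,       /* caller may hold the semaphore; do not dump */
    ACTION_DUMP,            /* semaphore taken, dump ran, semaphore released */
    ACTION_WOULD_DEADLOCK   /* a blocking down() would never return */
};

/* Models the internal-abort timeout handler. The dump is attempted only
 * when rst_ha_timeout is set (assumed to mean "not the device-gone path"). */
static enum timeout_action abort_timeout_model(struct sem_model *sem,
                                               bool rst_ha_timeout)
{
    if (!rst_ha_timeout)
        return ACTION_SKIP_DUMP;
    if (!sem_try_down(sem))
        return ACTION_WOULD_DEADLOCK;
    /* The register dump would run here, under the semaphore. */
    sem_up(sem);
    return ACTION_DUMP;
}
```

If the device-gone path already holds the semaphore, a handler that always tries to dump reports `ACTION_WOULD_DEADLOCK`, whereas passing the flag as false skips the dump and lets the caller finish and release the semaphore.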
  2. 24 Jan, 2024 16 commits
  3. 21 Jan, 2024 19 commits