1. 03 Aug, 2018 1 commit
  2. 09 Mar, 2018 1 commit
  3. 03 Feb, 2018 2 commits
  4. 23 Oct, 2017 1 commit
      nvme-rdma: fix possible hang when issuing commands during ctrl removal · 7db81446
      Sagi Grimberg authored
      
      nvme_rdma_queue_is_ready() fails requests in case a queue is not
      LIVE. If the controller is in RECONNECTING state, we might be in
      this state for a long time (until we successfully reconnect) and
      we are better off with failing the request fast. Otherwise, we
      fail with BLK_STS_RESOURCE to have the block layer try again
      soon.
      
      In case we are removing the controller when the admin queue
      is not LIVE, we will terminate the request with BLK_STS_RESOURCE
      but it happens before we call blk_mq_start_request() so the
      request timeout never expires, and the queue will never get
      back to LIVE (because we are removing the controller). This
      causes the removal operation to block indefinitely [1].
      
      Thus, if we are removing (state DELETING), and the queue is
      not LIVE, we need to fail the request permanently as there is
      no chance for it to ever complete successfully.
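
The decision described above can be sketched as a small standalone C model. This is only an illustration, not the actual patch: the enum names, the `queue_is_ready()` signature, and the `STS_FAIL_PERMANENT` value are hypothetical stand-ins for the kernel's controller states and block-layer status codes, and `CTRL_RESETTING` merely represents "some other state in which the queue may become LIVE again".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the controller states named in the message. */
enum ctrl_state { CTRL_LIVE, CTRL_RESETTING, CTRL_RECONNECTING, CTRL_DELETING };

/* Hypothetical stand-ins for the block-layer status codes. */
enum blk_status {
    STS_OK,             /* queue is LIVE, the request may proceed */
    STS_RESOURCE,       /* temporary condition, block layer retries soon */
    STS_FAIL_PERMANENT  /* fail the request now, no retry */
};

/* Simplified model of the readiness check described in the commit message. */
enum blk_status queue_is_ready(bool queue_live, enum ctrl_state state)
{
    if (queue_live)
        return STS_OK;

    /*
     * RECONNECTING may last a long time, so fail fast.
     * DELETING means the queue will never be LIVE again, so a retry
     * can never succeed and would block controller removal.
     */
    if (state == CTRL_RECONNECTING || state == CTRL_DELETING)
        return STS_FAIL_PERMANENT;

    /* Any other state: ask the block layer to try again soon. */
    return STS_RESOURCE;
}

/* Usage example: the four cases the message distinguishes. */
void queue_is_ready_examples(void)
{
    assert(queue_is_ready(true, CTRL_LIVE) == STS_OK);
    assert(queue_is_ready(false, CTRL_RESETTING) == STS_RESOURCE);
    assert(queue_is_ready(false, CTRL_RECONNECTING) == STS_FAIL_PERMANENT);
    assert(queue_is_ready(false, CTRL_DELETING) == STS_FAIL_PERMANENT);
}
```

In this model the only behavioral change the message argues for is the `CTRL_DELETING` case: without it, a non-LIVE queue during removal would fall through to the retry path, which is the situation shown in the blocked-task trace.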
      
      [1]
      --
      sysrq: SysRq : Show Blocked State
        task                        PC stack   pid father
      kworker/u66:2   D    0   440      2 0x80000000
      Workqueue: nvme-wq nvme_rdma_del_ctrl_work [nvme_rdma]
      Call Trace:
       __schedule+0x3e9/0xb00
       schedule+0x40/0x90
       schedule_timeout+0x221/0x580
       io_schedule_timeout+0x1e/0x50
       wait_for_completion_io_timeout+0x118/0x180
       blk_execute_rq+0x86/0xc0
       __nvme_submit_sync_cmd+0x89/0xf0
       nvmf_reg_write32+0x4b/0x90 [nvme_fabrics]
       nvme_shutdown_ctrl+0x41/0xe0
       nvme_rdma_shutdown_ctrl+0xca/0xd0 [nvme_rdma]
       nvme_rdma_remove_ctrl+0x2b/0x40 [nvme_rdma]
       nvme_rdma_del_ctrl_work+0x25/0x30 [nvme_rdma]
       process_one_work+0x1fd/0x630
       worker_thread+0x1db/0x3b0
       kthread+0x11e/0x150
       ret_from_fork+0x27/0x40
      01              D    0  2868   2862 0x00000000
      Call Trace:
       __schedule+0x3e9/0xb00
       schedule+0x40/0x90
       schedule_timeout+0x260/0x580
       wait_for_completion+0x108/0x170
       flush_work+0x1e0/0x270
       nvme_rdma_del_ctrl+0x5a/0x80 [nvme_rdma]
       nvme_sysfs_delete+0x2a/0x40
       dev_attr_store+0x18/0x30
       sysfs_kf_write+0x45/0x60
       kernfs_fop_write+0x124/0x1c0
       __vfs_write+0x28/0x150
       vfs_write+0xc7/0x1b0
       SyS_write+0x49/0xa0
       entry_SYSCALL_64_fastpath+0x18/0xad
      --
      Reported-by: Bart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
  5. 19 Oct, 2017 2 commits
  6. 25 Sep, 2017 2 commits
  7. 30 Aug, 2017 1 commit
  8. 28 Aug, 2017 14 commits
  9. 18 Aug, 2017 2 commits
  10. 08 Aug, 2017 1 commit
  11. 06 Jul, 2017 4 commits
  12. 04 Jul, 2017 1 commit
  13. 02 Jul, 2017 2 commits
  14. 28 Jun, 2017 2 commits
  15. 15 Jun, 2017 4 commits