    nvme: implement Enhanced Command Retry · 49cd84b6
    Keith Busch authored
    A controller may be in an internal state in which it cannot successfully
    process commands for a short duration. In such states, an immediate
    command requeue is expected to fail. The driver may then exceed its max
    retry count and permanently fail a command that would have succeeded
    had the driver waited for the controller to become ready.
    
    The ratified NVMe TP 4033 provides a delay hint in the completion status
    code for failed commands. Implement the retry delay based on the command
    completion status and the controller's requested delay.
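
As a rough illustration of the delay hint described above, here is a minimal userspace sketch in C. It is not the driver code from this commit. It assumes the status value has the phase bit already shifted out, so that the Command Retry Delay (CRD) field sits in bits 12:11 and Do Not Retry (DNR) in bit 14, and that the controller advertises three delay times in units of 100 ms. All `sketch_`/`SKETCH_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit layout of the completion status (phase bit removed). */
#define SKETCH_SC_CRD_MASK  0x1800u /* Command Retry Delay, bits 12:11 */
#define SKETCH_SC_CRD_SHIFT 11
#define SKETCH_SC_DNR       0x4000u /* Do Not Retry, bit 14 */

/*
 * Return the retry delay in milliseconds hinted by a failed completion.
 * crdt[] holds the three controller-advertised delay times, assumed to
 * be expressed in units of 100 ms. A CRD of 0 means "no delay hint".
 */
unsigned int sketch_retry_delay_ms(uint16_t status, const uint16_t crdt[3])
{
	unsigned int crd = (status & SKETCH_SC_CRD_MASK) >> SKETCH_SC_CRD_SHIFT;

	if (crd == 0)
		return 0;
	/* crd is 1..3 here, so crd - 1 indexes crdt[0..2]. */
	return (unsigned int)crdt[crd - 1] * 100u;
}

/*
 * Decide whether a failed command may be retried at all: the controller
 * must not have set DNR and the retry budget must not be exhausted.
 */
int sketch_should_retry(uint16_t status, unsigned int retries,
			unsigned int max_retries)
{
	if (status & SKETCH_SC_DNR)
		return 0;
	return retries < max_retries;
}
```

With hypothetical delay times `{1, 5, 20}`, a status carrying CRD value 2 maps to a 500 ms wait before the command is requeued, while a status with CRD 0 is requeued without delay, as before.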
    
    Note that requeued commands are handled per request_queue, not per
    individual request. If multiple commands fail, the controller should
    consistently report the desired delay time for retryable commands in
    all CQEs; otherwise the requeue list may be kicked too soon.
    
    Signed-off-by: Keith Busch <keith.busch@intel.com>
    Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
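
The note in the commit message about per-request_queue handling can be pictured with a small toy model in C. This is a sketch under a stated assumption, not the block layer's implementation: it assumes one kick timer per queue and that the most recent kick request replaces whatever deadline was pending. The `sketch_` names are hypothetical.

```c
#include <stdint.h>

/* Toy model: one requeue list and one kick deadline per queue. */
struct sketch_queue {
	unsigned int pending; /* commands waiting on the requeue list */
	uint64_t kick_at_ms;  /* time at which the whole list is re-dispatched */
};

/*
 * Requeue one failed command and (re)arm the queue-wide kick timer.
 * Assumption: the newest delay overwrites the previous deadline, so the
 * timer is shared by every command on the list.
 */
void sketch_requeue(struct sketch_queue *q, uint64_t now_ms,
		    unsigned int delay_ms)
{
	q->pending++;
	q->kick_at_ms = now_ms + delay_ms;
}
```

In this model, if one command fails at time 0 with a 500 ms hint and a second fails at 10 ms with no hint, the deadline moves from 500 ms to 10 ms and both commands are retried together, well before the first one's requested delay has elapsed. That is the "kicked too soon" case the message warns about when a controller reports inconsistent delays.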
core.c 94.6 KB