    scsi: virtio_scsi: Reject commands when virtqueue is broken · 322baf72
    Eric Farman authored
    
    [ Upstream commit 773c7220 ]
    
    In the case of a graceful set of detaches, where the virtio-scsi-ccw
    disk is removed from the guest prior to the controller, the guest
    behaves quite normally.  Specifically, the detach gets us into
    sd_sync_cache to issue a Synchronize Cache(10) command, which
    immediately fails (and is retried a couple of times) because the device
    has been removed.  Later, the removal of the controller sees two CRWs
    presented, but there's no further indication of the removal from the
    guest viewpoint.
    
     [   17.217458] sd 0:0:0:0: [sda] Synchronizing SCSI cache
     [   17.219257] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
     [   21.449400] crw_info : CRW reports slct=0, oflw=0, chn=1, rsc=3, anc=0, erc=4, rsid=2
     [   21.449406] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=3, anc=0, erc=4, rsid=0
    
    However, on s390, the SCSI disks can be removed "by surprise" when an
    entire controller (host) is removed and all associated disks are removed
    via the loop in scsi_forget_host.  The same call to sd_sync_cache is
    made, but because the controller has already been removed, the
    Synchronize Cache(10) command is neither issued (and then failed) nor
    rejected.
    
    Because the I/O is never returned, the guest cannot add or remove other
    devices, and other tasks (such as shutdown or reboot) issued by the
    guest will not complete either.  The virtio ring has already been
    marked as broken (via virtio_break_device in virtio_ccw_remove), but we
    still attempt to queue the command, only to have it remain there.  The
    calling sequence lets us tell the two failure cases apart:
    
      virtscsi_queuecommand()
       -> virtscsi_kick_cmd()
        -> virtscsi_add_cmd()
         -> virtqueue_add_sgs()
          -> virtqueue_add()
             if success
               return 0
             elseif vq->broken or vring_mapping_error()
               return -EIO
             else
               return -ENOSPC
    
    A return of ENOSPC is generally a temporary condition, so returning
    "host busy" from virtscsi_queuecommand makes sense here, to have it
    redriven in a moment or two.  But the EIO return code is more of a
    permanent error and so it would be wise to return the I/O itself and
    allow the calling thread to finish gracefully.  The result is these four
    kernel messages in the guest (the fourth one does not occur prior to
    this patch):
    
     [   22.921562] crw_info : CRW reports slct=0, oflw=0, chn=1, rsc=3, anc=0, erc=4, rsid=2
     [   22.921580] crw_info : CRW reports slct=0, oflw=0, chn=0, rsc=3, anc=0, erc=4, rsid=0
     [   22.921978] sd 0:0:0:0: [sda] Synchronizing SCSI cache
     [   22.921993] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
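    
    The return-code handling described above can be sketched as a small
    user-space model.  This is not the driver code: the struct and
    function names (model_vq, model_cmd, model_virtqueue_add,
    model_queuecommand) are hypothetical stand-ins, and the two constants
    are redefined locally on the assumption that they match the kernel's
    values.  It only illustrates the decision: -EIO completes the command
    with DID_BAD_TARGET, any other failure reports "host busy".

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's values; redefined for a user-space model. */
#define DID_BAD_TARGET          0x04
#define SCSI_MLQUEUE_HOST_BUSY  0x1055

/* Hypothetical stand-in for a virtqueue: only the state the decision needs. */
struct model_vq {
	bool broken;             /* set once the device has been removed */
	unsigned int free_slots; /* remaining ring entries */
};

/* Hypothetical stand-in for a SCSI command. */
struct model_cmd {
	int result;     /* host byte lives in bits 16..23 */
	bool completed; /* true once the command was handed back to the caller */
};

/* Models the return codes listed in the calling sequence. */
static int model_virtqueue_add(struct model_vq *vq)
{
	if (vq->broken)
		return -EIO;     /* permanent: the ring will never drain */
	if (vq->free_slots == 0)
		return -ENOSPC;  /* temporary: the ring is full right now */
	vq->free_slots--;
	return 0;
}

/* Models the queuecommand decision: fail the I/O on -EIO, else retry later. */
static int model_queuecommand(struct model_vq *vq, struct model_cmd *cmd)
{
	int ret = model_virtqueue_add(vq);

	if (ret == -EIO) {
		/* Same host byte as the graceful detach reports. */
		cmd->result = DID_BAD_TARGET << 16;
		cmd->completed = true;
		return 0;
	}
	if (ret != 0)
		return SCSI_MLQUEUE_HOST_BUSY; /* ask for the command to be redriven */

	return 0; /* queued; completion arrives later */
}
```

    In this model, a broken queue yields a completed command whose host
    byte is DID_BAD_TARGET, while a full but healthy queue leaves the
    command untouched and returns the busy code so it can be retried.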
    
    I opted to fill in the same response data that is returned from the more
    graceful device detach, where the disk device is removed prior to the
    controller device.
    
    Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
    Reviewed-by: Fam Zheng <famz@redhat.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
virtio_scsi.c 30.7 KB