    [SCSI] lpfc 8.1.9 : Stall eh handlers if resetting while rport blocked · a90f5684
    James Smart authored
    Stall error handler if attempting resets/aborts while an rport is blocked.
    This avoids device offline scenarios due to errors in the error handler.
    
    Background:
      Although the transport is using the scsi_timed_out functionality to
      restart the timeout if the rport is blocked, if the timeout has already
      fired before the block occurs, the eh handler still runs and can take
      the device offline. Ultimately, this window cannot be resolved without
      significant work in the error handler thread. Christoph identified the first
      level of these issues when he pointed out the poor error response handling
      by the error thread.
    
      We found, under heavy load and error testing, that the time window from when
      scsi_times_out() adds the I/O to the queue to when scsi_error_handler()
      gets around to servicing it can be several seconds long. In most cases,
      these test conditions are highly unusual, but possible.
      As a result, we're stalling the error handler in this race window so that
      we can avoid the device_offline transitions.
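
The stall described above can be pictured with a small user-space sketch. This is not the lpfc driver code: `mock_rport`, `mock_poll_delay`, `stall_while_blocked` and `mock_eh_abort` are hypothetical names invented for illustration, and the "sleep" is simulated by a countdown so the example runs deterministically. It only shows the idea of an error handler entry point waiting until the remote port leaves the blocked state before it proceeds.

```c
/* Illustrative simulation only; all names here are hypothetical. */

enum mock_port_state { MOCK_PORT_ONLINE, MOCK_PORT_BLOCKED };

struct mock_rport {
    enum mock_port_state state;
    int polls_until_unblock;   /* simulation knob: polls left before unblocking */
};

/*
 * Stand-in for sleeping one poll interval. In this simulation the port
 * leaves the blocked state once the countdown reaches zero, so the
 * stall loop below always terminates.
 */
static void mock_poll_delay(struct mock_rport *rport)
{
    if (rport->polls_until_unblock > 0)
        rport->polls_until_unblock--;
    if (rport->polls_until_unblock == 0)
        rport->state = MOCK_PORT_ONLINE;
}

/* Stall while the port is blocked; returns the number of polls waited. */
static int stall_while_blocked(struct mock_rport *rport)
{
    int polls = 0;

    while (rport->state == MOCK_PORT_BLOCKED) {
        mock_poll_delay(rport);
        polls++;
    }
    return polls;
}

/*
 * Hypothetical abort/reset entry point: wait out the blocked window
 * first, then report whether the port is usable (0) or not (-1).
 */
static int mock_eh_abort(struct mock_rport *rport)
{
    stall_while_blocked(rport);
    return rport->state == MOCK_PORT_ONLINE ? 0 : -1;
}
```

The point of stalling at the entry of the handler, rather than failing immediately, is that an abort or reset issued while the port is blocked is likely to fail and push the device toward offline, whereas waiting lets the block resolve first.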
    Signed-off-by: James Smart <James.Smart@emulex.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
lpfc_scsi.c 36.5 KB