    IB/qib: Prevent double completions after a timeout or RNR error · c0af2c05
    Mike Marciniszyn authored
    There is a double completion associated with error handling for RC QPs.
    
    The sequence is:
    
     - The do_rc_ack() routine fields an RNR nack and there are 0
       rnr_retries configured on the QP.
     - qib_error_qp() stops the pending timer
     - qib_rc_send_complete() is called from sdma_complete()
     - qib_rc_send_complete() starts the timer because the msb of the psn
       just completed says an ack is needed.
     - a bunch of flushes occur as ipoib posts WQEs to an error'ed QP
     - rc_timeout() calls qib_restart_rc()
     - qib_restart_rc() calls qib_send_complete() with an
       IB_WC_RETRY_EXC_ERR on a wqe that has already been completed
    
    The fix avoids starting the timer in qib_rc_send_complete() when the
    QP is in the error state, since another packet will never arrive.
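
The sequence above can be sketched as a small stand-alone model. This is an illustration only, not the driver code: the struct, the enum, and every model_* function are hypothetical stand-ins for the qib routines named in the comments. It also assumes that the first completion is reported on the error path and that the fix amounts to a QP-state check before arming the timer, neither of which this message spells out.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; these are not the real qib types. */
enum model_qp_state { MODEL_QP_RTS, MODEL_QP_ERR };

struct model_qp {
    enum model_qp_state state;
    bool timer_armed;   /* models the pending retry timer */
    int completions;    /* completions reported for the one modeled WQE */
};

/*
 * Models the RNR nack path ending in qib_error_qp(): the QP enters the
 * error state, the pending timer is stopped, and (by assumption) the
 * WQE is completed once with an error status.
 */
static void model_error_qp(struct model_qp *qp)
{
    qp->state = MODEL_QP_ERR;
    qp->timer_armed = false;
    qp->completions++;
}

/*
 * Models qib_rc_send_complete() being called from sdma_complete().
 * Without the fix, a completed packet that requested an ack re-arms the
 * timer even though the QP is already in error. With the assumed fix,
 * the timer is left alone for a QP in the error state.
 */
static void model_rc_send_complete(struct model_qp *qp, bool ack_requested,
                                   bool apply_fix)
{
    if (apply_fix && qp->state == MODEL_QP_ERR)
        return;
    if (ack_requested)
        qp->timer_armed = true;
}

/*
 * Models rc_timeout() calling qib_restart_rc(), which completes the WQE
 * with IB_WC_RETRY_EXC_ERR. Nothing happens if the timer is not armed.
 */
static void model_rc_timeout(struct model_qp *qp)
{
    if (!qp->timer_armed)
        return;
    qp->timer_armed = false;
    qp->completions++;
}

/* Replays the sequence and returns how many times the WQE completed. */
static int run_sequence(bool apply_fix)
{
    struct model_qp qp = {
        .state = MODEL_QP_RTS,
        .timer_armed = true,
        .completions = 0,
    };

    model_error_qp(&qp);
    model_rc_send_complete(&qp, true, apply_fix);
    model_rc_timeout(&qp);
    return qp.completions;
}
```

In this model, run_sequence(false) counts two completions for the single WQE, and run_sequence(true) counts one because the timer is never re-armed.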

    Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
    Signed-off-by: Roland Dreier <roland@purestorage.com>
qib_rc.c 61.1 KB