    svcrdma: Revert "svcrdma: Reduce Receive doorbell rate" · bade4be6
    Chuck Lever authored
    I tested commit 43042b90 ("svcrdma: Reduce Receive doorbell
    rate") with mlx4 (IB) and software iWARP and didn't find any
    issues. However, I recently got my hardware iWARP setup back on
    line (FastLinQ) and it's crashing hard on this commit (confirmed
    via bisect).
    
    The failure mode is complex.
     - After a connection is established, the first Receive completes
       normally.
     - But the second and third Receives have garbage in their Receive
       buffers. The server responds with ERR_VERS as a result.
     - When the client tears down the connection to retry, a couple
       of posted Receives flush twice, and that corrupts the recv_ctxt
       free list.
     - __svc_rdma_free then faults or loops infinitely while destroying
       the xprt's recv_ctxts.
    
    Since 43042b90 ("svcrdma: Reduce Receive doorbell rate") does
    not fix a bug but is a scalability enhancement, it's safe and
    appropriate to revert it while working on a replacement.
    
    Fixes: 43042b90 ("svcrdma: Reduce Receive doorbell rate")
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
svc_rdma_recvfrom.c 25.3 KB