    SUNRPC: fix hang due to eventd deadlock... · c1384c9c
    Trond Myklebust authored
    Brian Behlendorf writes:
    
    The root cause of the NFS hang we were observing appears to be a rare
    deadlock between the kernel-provided usermodehelper API and the Linux NFS
    client.  The deadlock can arise because both of these services use the
    generic Linux work queues.  The usermodehelper API runs the specified user
    application in the context of the work queue, and NFS submits both cleanup
    and reconnect work to the generic work queue for handling.  Normally this
    is fine, but a deadlock can result in the following situation:
    
      - NFS client is in a disconnected state
      - [events/0] runs a usermodehelper app with an NFS-dependent operation;
        this triggers an NFS reconnect.
      - NFS reconnect happens to be submitted to the [events/0] work queue.
      - Deadlock: the [events/0] work queue will never process the
        reconnect because it is blocked on the previous NFS-dependent
        operation, which will not complete.
    
    The solution is simply to run reconnect requests on rpciod.
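
The scenario and the fix can be illustrated with a small userspace sketch. This is not kernel code: a single-threaded Python executor stands in for the [events/0] work queue, a second one stands in for rpciod, and a timeout stands in for the hang so that the script terminates. The names `run_scenario`, `helper`, and `reconnect` are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def run_scenario(shared_queue: bool, timeout: float = 0.5) -> str:
    """Simulate the hang (shared_queue=True) or the fix (shared_queue=False)."""
    # One worker thread models the single [events/0] worker.
    events = ThreadPoolExecutor(max_workers=1)
    # Before the fix, reconnect work lands on the same queue; after the fix
    # it runs on a separate queue (rpciod in the commit message).
    rpciod = events if shared_queue else ThreadPoolExecutor(max_workers=1)

    def reconnect() -> str:
        return "reconnected"

    def helper() -> str:
        # Models the usermodehelper app: its NFS-dependent operation cannot
        # finish until the reconnect work item has run.
        pending = rpciod.submit(reconnect)
        try:
            return pending.result(timeout=timeout)
        except TimeoutError:
            # In the kernel there is no timeout; the worker blocks forever.
            pending.cancel()
            return "deadlock"

    try:
        return events.submit(helper).result()
    finally:
        events.shutdown()
        rpciod.shutdown()


if __name__ == "__main__":
    print(run_scenario(shared_queue=True))
    print(run_scenario(shared_queue=False))
```

With a shared queue, the only worker is occupied by `helper`, so the reconnect item it is waiting for can never be picked up. Giving reconnect work its own queue breaks that dependency.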

    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
xprtsock.c 42.8 KB