• Andy Adamson's avatar
    NFSv4.1 Fix a pNFS session draining deadlock · 774d5f14
    Andy Adamson authored
    On a CB_RECALL the callback service thread flushes the inode using
    filemap_flush prior to scheduling the state manager thread to return the
    delegation. When pNFS is used and I/O has not yet gone to the data server
    servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
    filemap_flush call, the LAYOUTGET must proceed to completion.
    
    If the state manager starts to recover data while the inode flush is sending
    the LAYOUTGET, a deadlock occurs as the callback service thread holds the
    single callback session slot until the flushing is done which blocks the state
    manager thread, and the state manager thread has set the session draining bit
    which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
    table waitq.
    
    Separate the draining of the back channel from the draining of the fore channel
    by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
    and back slot tables.  Drain the back channel first allowing the LAYOUTGET
    call to proceed (and fail) so the callback service thread frees the callback
    slot. Then proceed with draining the forechannel.
    Signed-off-by: default avatarAndy Adamson <andros@netapp.com>
    Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
    774d5f14
callback_xdr.c 25 KB