    NFSv4.1: Prevent a 3-way deadlock between layoutreturn, open and state recovery · f22e5edd
    Trond Myklebust authored
    Andy Adamson reports:
    
    The state manager is recovering expired state and recovery OPENs are being
    processed. If kswapd is pruning inodes at the same time, a deadlock can occur
    when kswapd calls evict_inode on an NFSv4.1 inode with a layout, and the
    resultant layoutreturn gets an error that the state manager must handle,
    causing the layoutreturn to wait on the (NFS client) cl_rpcwaitq.
    
    At the same time an open is waiting for the inode deletion to complete in
    __wait_on_freeing_inode.
    
    If that open is either the recovery OPEN issued by the state manager, or an
    open from the same open owner that holds the NFSv4 sequence id (so that the
    state manager's OPEN has to wait for the sequence id on the Seqid_waitqueue),
    then the state manager is deadlocked with kswapd.
    
    The fix is simply to have layoutreturn ignore all errors except NFS4ERR_DELAY.
    We already know that layouts are dropped on all server reboots, and that
    the server has to be coded to deal with the "forgetful client model", in
    which the client doesn't send layoutreturns.
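
    As a rough illustration of the policy described above (not the actual
    nfs4proc.c change), the status handling could be sketched in userspace C
    as follows. The helper name layoutreturn_should_retry() is hypothetical;
    the error numbers are the NFS4ERR_* values from RFC 5661, negated the way
    the kernel reports them in a task status.

```c
#include <stdbool.h>

/* NFSv4.1 error numbers as defined in RFC 5661. */
#define NFS4ERR_DELAY   10008
#define NFS4ERR_EXPIRED 10011

/*
 * Hypothetical helper, for illustration only: decide what a layoutreturn
 * completion does with the status it received.
 *
 * Returns true if the call should be retried (NFS4ERR_DELAY). For any
 * other error, *status is cleared so that nothing is handed to the state
 * manager and the caller never waits on recovery.
 */
static bool layoutreturn_should_retry(int *status)
{
	switch (*status) {
	case 0:
		/* Success: nothing to do. */
		return false;
	case -NFS4ERR_DELAY:
		/* The server asked us to back off and try again. */
		return true;
	default:
		/* Ignore everything else; the layout is simply forgotten. */
		*status = 0;
		return false;
	}
}
```

    With this shape, an error such as NFS4ERR_EXPIRED no longer parks the
    layoutreturn behind state recovery, so the inode eviction can complete.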

    Reported-by: Andy Adamson <andros@netapp.com>
    Link: http://lkml.kernel.org/r/1385402270-14284-1-git-send-email-andros@netapp.com
    Signed-off-by: Trond Myklebust <Trond.Myklebust@primarydata.com>
nfs4proc.c 227 KB