    dlm: fix remove member after close call · 2776635e
    Alexander Aring authored
    The idea of commit 63e711b0 ("fs: dlm: create midcomms nodes when
    configure") is to set the midcomms node lifetime when a node joins or
    leaves the cluster. Currently we can hit the following warning:
    
    [10844.611495] ------------[ cut here ]------------
    [10844.615913] WARNING: CPU: 4 PID: 84304 at fs/dlm/midcomms.c:1263
    dlm_midcomms_remove_member+0x13f/0x180 [dlm]
    
    or run into a state where the usage count of a midcomms node goes
    negative:
    
    [  260.830782] node 2 users dec count -1
    
    The first warning happens when a specific node does not exist,
    probably because it was already removed by dlm_midcomms_close(), which
    is called when a node leaves the cluster. The second kernel log message
    probably occurs when dlm_midcomms_addr() is called as a node joins the
    cluster, but the node then left the cluster due to fencing without
    being removed from the lockspace. If the node rejoins the cluster after
    it was removed due to fencing, the first call, triggered by user space,
    is to remove the node from the lockspaces. In both cases, if the node
    wasn't found or the user count is zero, dlm_midcomms_remove_member()
    should skip any additional midcomms handling.
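
    The guard described above can be illustrated with a small user-space
    sketch. This is not the kernel code from fs/dlm/midcomms.c: the
    struct, the table and every demo_* name are hypothetical stand-ins,
    chosen only to show the "node not found or user count zero means do
    nothing" rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a midcomms node; not the kernel's struct. */
struct demo_node {
	int nodeid;
	int users;	/* lockspaces that currently reference this node */
	bool present;	/* false once the node has been closed/removed */
};

#define DEMO_MAX_NODES 4

/* Hypothetical node table; zero-initialized, so all slots start absent. */
struct demo_node demo_nodes[DEMO_MAX_NODES];

/* Return the node with the given id, or NULL if it is not present. */
struct demo_node *demo_lookup(int nodeid)
{
	for (size_t i = 0; i < DEMO_MAX_NODES; i++) {
		if (demo_nodes[i].present && demo_nodes[i].nodeid == nodeid)
			return &demo_nodes[i];
	}
	return NULL;
}

/*
 * Drop one user reference for nodeid.
 * Returns true if the count was decremented, false if the call was
 * ignored because the node is unknown or already has no users.
 */
bool demo_remove_member(int nodeid)
{
	struct demo_node *node = demo_lookup(nodeid);

	/* Node already gone (e.g. closed earlier): nothing to do. */
	if (!node)
		return false;

	/* Never let the usage count go negative. */
	if (node->users == 0)
		return false;

	node->users--;
	return true;
}
```

    In this sketch, a second removal of the same node, or a removal of a
    node that was never added, returns false and leaves the count
    untouched instead of driving it to -1.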
    
    Fixes: 63e711b0 ("fs: dlm: create midcomms nodes when configure")
    Signed-off-by: Alexander Aring <aahringo@redhat.com>
    Signed-off-by: David Teigland <teigland@redhat.com>