    [DLM] fix stopping unstarted recovery · 2cdc98aa
    David Teigland authored
    Red Hat BZ 211914
    
    When many nodes are joining a lockspace simultaneously, the dlm gets a
    quick sequence of stop/start events, a pair for adding each node.
    dlm_controld in user space sends dlm_recoverd in the kernel each stop and
    start event.  dlm_controld will sometimes send the stop before
    dlm_recoverd has had a chance to take up the previously queued start.  The
    stop aborts the processing of the previous start by setting the
    RECOVERY_STOP flag.  dlm_recoverd erroneously clears this flag and
    ignores the stop/abort if it happens to take up the start after the stop
    meant to abort it has already arrived.  The fix is to check the sequence
    number, which is incremented for each stop/start, before clearing the
    flag.
    Signed-off-by: David Teigland <teigland@redhat.com>
    Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
recoverd.c 6.69 KB
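
The sequence-number check described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the struct and function names (`lockspace`, `start_args`, `ls_stop`, `ls_start`, `take_up_start`) are hypothetical, a plain `bool` stands in for the RECOVERY_STOP flag, and the locking a real implementation would need is omitted. It assumes the counter is bumped on every stop and every start, and that each queued start remembers the counter value it was issued with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a queued start event (not the kernel struct). */
struct start_args {
    uint64_t seq;               /* counter value when this start was queued */
};

/* Hypothetical model of the per-lockspace recovery state. */
struct lockspace {
    uint64_t recover_seq;       /* incremented for each stop and each start */
    bool recovery_stop;         /* stands in for the RECOVERY_STOP flag */
    struct start_args *pending; /* queued start not yet taken up */
};

/* A stop aborts any in-progress or queued recovery. */
void ls_stop(struct lockspace *ls)
{
    ls->recovery_stop = true;
    ls->recover_seq++;
}

/* A start queues new recovery work and records its sequence number. */
void ls_start(struct lockspace *ls, struct start_args *rv)
{
    rv->seq = ++ls->recover_seq;
    ls->pending = rv;
}

/*
 * The recovery thread takes up the queued start. The stop flag is cleared
 * only if no stop or start has arrived since this start was queued, that
 * is, only if the sequence numbers still match. Returns true if recovery
 * may proceed.
 */
bool take_up_start(struct lockspace *ls)
{
    struct start_args *rv = ls->pending;

    ls->pending = NULL;
    if (rv == NULL)
        return false;
    if (rv->seq == ls->recover_seq)
        ls->recovery_stop = false;
    return !ls->recovery_stop;
}

/* Stop arrives before the previous start is taken up: it must stay aborted. */
void demo_stale_start(void)
{
    struct lockspace ls = {0};
    struct start_args a = {0};

    ls_stop(&ls);
    ls_start(&ls, &a);
    ls_stop(&ls);                   /* lands before the thread runs */
    assert(!take_up_start(&ls));    /* stale start is not allowed to proceed */
    assert(ls.recovery_stop);       /* the abort request is preserved */
}
```

Without the `rv->seq == ls->recover_seq` comparison, `take_up_start` would clear the flag unconditionally and the second stop in `demo_stale_start` would be lost, which is the race the commit message describes.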