    ocfs2/dlm: fix a race between purge and migration · 30bee898
    Xue jiufei authored
    We found a race between purge and migration during code review.
    Node A puts a lockres on the purge list before receiving the migrate
    message from node B, which is the master.  Node A calls
    dlm_mig_lockres_handler to handle this message.
    
    dlm_mig_lockres_handler
      dlm_lookup_lockres
      >>>>>> race window: dlm_run_purge_list may run here, send a
             deref message to the master, and wait for the response
      spin_lock(&res->spinlock);
      res->state |= DLM_LOCK_RES_MIGRATING;
      spin_unlock(&res->spinlock);
      dlm_mig_lockres_handler returns
    
      >>>>>> dlm_thread receives the master's response to the deref
      message and triggers the BUG, because the lockres now has
      DLM_LOCK_RES_MIGRATING set, with the following message:
    
    dlm_purge_lockres:209 ERROR: 6633EB681FA7474A9C280A4E1A836F0F: res
    M0000000000000000030c0300000000 in use after deref
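
    The interleaving above can be replayed in a small single-threaded
    sketch. It is only an illustration, not the kernel code: the struct,
    the flag values, and the function names are hypothetical stand-ins,
    and the "guarded" variant is one possible way to close the window
    (refusing the migration while a deref is in flight), assumed here
    rather than taken from this message. In the real code the check and
    the state update would have to happen under res->spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; NOT the kernel's definitions or values. */
#define RES_MIGRATING    0x1u
#define RES_DROPPING_REF 0x2u   /* assumed marker: deref sent, reply pending */

struct sketch_lockres {
    unsigned int state;
    bool on_purge_list;
};

/* Purge side, step 1: a deref is sent to the master; the reply is pending. */
static void purge_send_deref(struct sketch_lockres *res)
{
    assert(res->on_purge_list);   /* only purge-listed resources get here */
    res->state |= RES_DROPPING_REF;
}

/* Purge side, step 2: the reply arrived. Returns true when the resource
 * looks "in use after deref", i.e. the condition that triggers the BUG. */
static bool purge_finish_hits_bug(const struct sketch_lockres *res)
{
    return (res->state & RES_MIGRATING) != 0;
}

/* Migration handler after the lookup. With guarded == true it refuses to
 * mark the resource while a deref is in flight (hypothetical guard). */
static int mig_handler_mark(struct sketch_lockres *res, bool guarded)
{
    if (guarded && (res->state & RES_DROPPING_REF))
        return -1;                /* reject the migrate request */
    res->state |= RES_MIGRATING;
    return 0;
}

/* Replays the trace: lookup, purge runs in the window, handler marks the
 * resource, deref reply arrives. Returns true if the BUG would fire. */
static bool replay_race(bool guarded)
{
    struct sketch_lockres res = { .state = 0, .on_purge_list = true };

    /* The lookup has returned res; the race window is open. */
    purge_send_deref(&res);
    (void)mig_handler_mark(&res, guarded);
    return purge_finish_hits_bug(&res);
}
```

    Calling replay_race(false) follows the unguarded path from the
    trace, while replay_race(true) exercises the hypothetical guard.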

    Signed-off-by: Jiufei Xue <xuejiufei@huawei.com>
    Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
    Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
    Cc: Mark Fasheh <mfasheh@suse.de>
    Cc: Joel Becker <jlbec@evilplan.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dlmrecovery.c 86.9 KB