    ocfs2: fix deadlock between o2hb thread and o2net_wq · 70e82a12
    Joseph Qi authored
    The following case may lead to o2net_wq and o2hb thread deadlock on
    o2hb_callback_sem.
    Suppose the cluster currently has two nodes, N1 and N2. N2 goes down
    and, at the same time, N3 tries to join the cluster, so N1 handles the
    node-down event (N2) and the join request (N3) simultaneously.
        o2hb                               o2net_wq
        ->o2hb_do_disk_heartbeat
        ->o2hb_check_slot
        ->o2hb_run_event_list
        ->o2hb_fire_callbacks
        ->down_write(&o2hb_callback_sem)
        ->o2net_hb_node_down_cb
        ->flush_workqueue(o2net_wq)
                                           ->o2net_process_message
                                           ->dlm_query_join_handler
                                           ->o2hb_check_node_heartbeating
                                           ->o2hb_fill_node_map
                                           ->down_read(&o2hb_callback_sem)
    
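    The o2hb thread holds o2hb_callback_sem for write while
    flush_workqueue() waits for the queued work, and that work blocks in
    down_read() on the same semaphore, so neither side can make progress.
    There is no need to take o2hb_callback_sem in dlm_query_join_handler;
    o2hb_live_lock is enough to protect the live node map.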
    Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
    Cc: Mark Fasheh <mfasheh@suse.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: jiangyiwen <jiangyiwen@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>