• Xin Long's avatar
    tipc: call tipc_lxc_xmit without holding node_read_lock · 88956177
    Xin Long authored
    When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
    peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
    might be called, and it's pretty much like in tipc_rcv().
    
    Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
    to protect the peer_net not being freed by another thread. However, when
    receiving these packets, tipc_node_add_conn() might be called where the
    peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
    lockdep detector, although it is not a real dead lock:
    
        WARNING: possible recursive locking detected
        --------------------------------------------
        conn_server/1086 is trying to acquire lock:
        ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
                         at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
    
        but task is already holding lock:
        ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                         at: tipc_node_xmit+0x285/0xb30 [tipc]
    
        other info that might help us debug this:
         Possible unsafe locking scenario:
    
               CPU0
               ----
          lock(&n->lock#2);
          lock(&n->lock#2);
    
         *** DEADLOCK ***
    
         May be due to missing lock nesting notation
    
        4 locks held by conn_server/1086:
         #0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
                              at: tipc_accept+0x9c0/0x10b0 [tipc]
         #1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
                              at: tipc_accept+0x363/0x10b0 [tipc]
         #2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                              at: tipc_node_xmit+0x285/0xb30 [tipc]
         #3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
                              at: tipc_sk_rcv+0x2da/0x1b40 [tipc]
    
        Call Trace:
         <TASK>
         dump_stack_lvl+0x44/0x5b
         __lock_acquire.cold.77+0x1f2/0x3d7
         lock_acquire+0x1d2/0x610
         _raw_write_lock_bh+0x38/0x80
         tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
         tipc_sk_finish_conn+0x21e/0x640 [tipc]
         tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
         tipc_sk_rcv+0xbb4/0x1b40 [tipc]
         tipc_lxc_xmit+0x225/0x26b [tipc]
         tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
         __tipc_sendstream+0x879/0xff0 [tipc]
         tipc_accept+0x966/0x10b0 [tipc]
         do_accept+0x37d/0x590
    
    This patch avoids this warning by not holding the 'node rw lock' before
    calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
    should be enough, as in cleanup_net() when freeing the netns, it calls
    synchronize_rcu() before the free is continued.
    
    Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
    sense to call it under rcu_read_lock(). Note that the right lock order
    must be:
    
       rcu_read_lock();
       tipc_node_read_lock(n);
       tipc_node_read_unlock(n);
       tipc_lxc_xmit();
       rcu_read_unlock();
    
    instead of:
    
       tipc_node_read_lock(n);
       rcu_read_lock();
       tipc_node_read_unlock(n);
       tipc_lxc_xmit();
       rcu_read_unlock();
    
    and we have to call tipc_node_read_lock/unlock() twice in
    tipc_node_xmit().
    
    Fixes: f73b1281 ("tipc: improve throughput between nodes in netns")
    Reported-by: default avatarShuang Li <shuali@redhat.com>
    Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
    Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
    88956177
node.c 79.1 KB