• Parav Pandit's avatar
    RDMA/core: Sync unregistration with netlink commands · 01b67117
    Parav Pandit authored
    When the rdma device is getting removed, get resource info can race with
    device removal, as below:
    
          CPU-0                                  CPU-1
        --------                               --------
        rdma_nl_rcv_msg()
           nldev_res_get_cq_dumpit()
              mutex_lock(device_lock);
              get device reference
              mutex_unlock(device_lock);        [..]
                                                ib_unregister_device()
                                                /* Valid reference to
                                                 * device->dev exists.
                                                 */
                                                 ib_dealloc_device()
    
              [..]
              provider->fill_res_entry();
    
    Even though device object is not freed, fill_res_entry() can get called on
    device which doesn't have a driver anymore. Kernel core device reference
    count is not sufficient, as this only keeps the structure valid, and
    doesn't guarantee the driver is still loaded.
    
    Similar race can occur with device renaming and device removal, where
    device_rename() tries to rename a unregistered device. While this is fine
    for devices of a class which are not net namespace aware, but it is
    incorrect for net namespace aware class coming in subsequent series.  If a
    class is net namespace aware, then the below [1] call trace is observed in
    above situation.
    
    Therefore, to avoid the race, keep a reference count and let device
    unregistration wait until all netlink users drop the reference.
    
    [1] Call trace:
    kernfs: ns required in 'infiniband' for 'mlx5_0'
    WARNING: CPU: 18 PID: 44270 at fs/kernfs/dir.c:842 kernfs_find_ns+0x104/0x120
    libahci i2c_core mlxfw libata dca [last unloaded: devlink]
    RIP: 0010:kernfs_find_ns+0x104/0x120
    Call Trace:
    kernfs_find_and_get_ns+0x2e/0x50
    sysfs_rename_link_ns+0x40/0xb0
    device_rename+0xb2/0xf0
    ib_device_rename+0xb3/0x100 [ib_core]
    nldev_set_doit+0x165/0x190 [ib_core]
    rdma_nl_rcv_msg+0x249/0x250 [ib_core]
    ? netlink_deliver_tap+0x8f/0x3e0
    rdma_nl_rcv+0xd6/0x120 [ib_core]
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2f0/0x3e0
    sock_sendmsg+0x30/0x40
    __sys_sendto+0xdc/0x160
    
    Fixes: da5c8507 ("RDMA/nldev: add driver-specific resource tracking")
    Signed-off-by: default avatarParav Pandit <parav@mellanox.com>
    Signed-off-by: default avatarLeon Romanovsky <leonro@mellanox.com>
    Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
    01b67117
nldev.c 30.9 KB