    RDMA/cma: Fix use after free race in roce multicast join · b5de0c60
    Jason Gunthorpe authored
    The roce path triggers a work queue that continues to touch the id_priv
    but doesn't hold any reference on it. Further, unlike in the IB case, the
    work queue is not fenced during rdma_destroy_id().
    
    This can trigger a use after free if a destroy is triggered in the
    incredibly narrow window after the queue_work and the work starting and
    obtaining the handler_mutex.
    
    The only purpose of this work queue is to run the ULP event callback from
    the standard context, so switch the design to use the existing
    cma_work_handler() scheme. This simplifies quite a lot of the flow:
    
    - Use the cma_work_handler() callback to launch the work for roce. This
      requires generating the event synchronously inside the
      rdma_join_multicast(), which in turn means the dummy struct
      ib_sa_multicast can become a simple stack variable.
    
    - cma_work_handler() uses the id_priv kref, so we can entirely eliminate
      the kref inside struct cma_multicast. Since the cma_multicast never
      leaks into an unprotected work queue the kfree can be done at the same
      time as for IB.
    
    - Eliminating the general multicast.ib requires using cma_set_mgid() in a
      few places to recompute the mgid.
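
    The lifetime rule behind this change can be sketched in user space as
    follows. This is only an illustration, not the code in cma.c: struct
    fake_id, struct fake_work and every fake_*() helper are hypothetical
    stand-ins, the refcount is a plain int instead of a kref, and the
    "work queue" is a direct function call in a single thread. It shows the
    event being captured synchronously when the work is queued, and the
    queued work holding its own reference so that a destroy arriving before
    the work runs cannot free the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for id_priv; not the kernel structure. */
struct fake_id {
    int refcount;      /* models the id_priv kref (non-atomic, demo only) */
    bool *freed_flag;  /* set on release so the demo can observe the free */
};

/* Hypothetical stand-in for a queued work item. */
struct fake_work {
    struct fake_id *id;  /* pinned by a reference taken at queue time */
    int event;           /* event generated synchronously by the caller */
};

static struct fake_id *fake_id_create(bool *freed_flag)
{
    struct fake_id *id = malloc(sizeof(*id));

    if (!id)
        return NULL;
    id->refcount = 1;            /* the owner's reference */
    id->freed_flag = freed_flag;
    *freed_flag = false;
    return id;
}

/* Drop one reference; the last put frees the object. */
static void fake_id_put(struct fake_id *id)
{
    if (--id->refcount == 0) {
        *id->freed_flag = true;
        free(id);
    }
}

/* Queue work: take a reference on behalf of the work item. */
static struct fake_work *fake_queue_work(struct fake_id *id, int event)
{
    struct fake_work *work = malloc(sizeof(*work));

    if (!work)
        return NULL;
    id->refcount++;
    work->id = id;
    work->event = event;
    return work;
}

/* Work handler: deliver the event, then drop the work's reference. */
static int fake_work_handler(struct fake_work *work)
{
    int delivered = work->event;  /* the ULP callback would run here */

    fake_id_put(work->id);
    free(work);
    return delivered;
}

/* Returns 1 if the id survives a destroy that lands before the work runs. */
static int demo_destroy_before_work(void)
{
    bool freed;
    struct fake_id *id = fake_id_create(&freed);
    struct fake_work *work;
    bool alive_in_window;
    int ev;

    if (!id)
        return 0;
    work = fake_queue_work(id, 42);
    if (!work) {
        fake_id_put(id);
        return 0;
    }

    fake_id_put(id);              /* "destroy": owner drops its reference */
    alive_in_window = !freed;     /* the work's reference keeps id alive */
    ev = fake_work_handler(work); /* last put happens inside the handler */
    return alive_in_window && freed && ev == 42;
}
```

    Calling demo_destroy_before_work() exercises the narrow window described
    above: the owner's put happens first, and the object is only released
    once the work handler has finished with it.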
    
    Fixes: 3c86aa70 ("RDMA/cm: Add RDMA CM support for IBoE devices")
    Link: https://lore.kernel.org/r/20200902081122.745412-9-leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>