RDMA/cma: Fix use after free race in roce multicast join
The roce path triggers a work queue that continues to touch the id_priv but doesn't hold any reference on it. Futher, unlike in the IB case, the work queue is not fenced during rdma_destroy_id(). This can trigger a use after free if a destroy is triggered in the incredibly narrow window after the queue_work and the work starting and obtaining the handler_mutex. The only purpose of this work queue is to run the ULP event callback from the standard context, so switch the design to use the existing cma_work_handler() scheme. This simplifies quite a lot of the flow: - Use the cma_work_handler() callback to launch the work for roce. This requires generating the event synchronously inside the rdma_join_multicast(), which in turn means the dummy struct ib_sa_multicast can become a simple stack variable. - cm_work_handler() used the id_priv kref, so we can entirely eliminate the kref inside struct cma_multicast. Since the cma_multicast never leaks into an unprotected work queue the kfree can be done at the same time as for IB. - Eliminating the general multicast.ib requires using cma_set_mgid() in a few places to recompute the mgid. Fixes: 3c86aa70 ("RDMA/cm: Add RDMA CM support for IBoE devices") Link: https://lore.kernel.org/r/20200902081122.745412-9-leon@kernel.orgSigned-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Showing
This diff is collapsed.
Please register or sign in to comment