net/mlx5: E-Switch, Use atomic rep state to serialize state change
[ Upstream commit 6f4e0219 ] When the state of rep was introduced, it was also designed to prevent duplicate unloading of the same rep. Considering the following two flows when an eswitch manager is at switchdev mode with n VF reps loaded. +--------------------------------------+--------------------------------+ | cpu-0 | cpu-1 | | -------- | -------- | | mlx5_ib_remove | mlx5_eswitch_disable_sriov | | mlx5_ib_unregister_vport_reps | esw_offloads_cleanup | | mlx5_eswitch_unregister_vport_reps | esw_offloads_unload_all_reps | | __unload_reps_all_vport | __unload_reps_all_vport | +--------------------------------------+--------------------------------+ These two flows will try to unload the same rep. Per original design, once one flow unloads the rep, the state moves to REGISTERED. The 2nd flow will no longer needs to do the unload and bails out. However, as read and write of the state is not atomic, when 1st flow is doing the unload, the state is still LOADED, 2nd flow is able to do the same unload action. Kernel crash will happen. To solve this, driver should do atomic test-and-set for the state. So that only one flow can change the rep state from LOADED to REGISTERED, and proceed to do the actual unloading. Since the state is changing to atomic type, all other read/write should be atomic action as well. Fixes: f121e0ea (net/mlx5: E-Switch, Add state to eswitch vport representors) Signed-off-by: Bodong Wang <bodong@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Vu Pham <vuhuong@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Showing
Please register or sign in to comment