Commit 9078e843 authored by Shay Drory, committed by Saeed Mahameed

net/mlx5: Avoid recovery in probe flows

Currently, recovery is done without considering whether the device is
still in probe flow.
This may lead to recovery before the device has finished probing
successfully, e.g. while mlx5_init_one() is running. The recovery flow
uses functionality that is loaded only by mlx5_init_one(), and there
is no point in running recovery before mlx5_init_one() has finished
successfully.

Fix it by waiting for probe flow to finish and checking whether the
device is probed before trying to perform recovery.

Fixes: 51d138c2 ("net/mlx5: Fix health error state handling")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
parent 44aee8ea
@@ -674,6 +674,12 @@ static void mlx5_fw_fatal_reporter_err_work(struct work_struct *work)
 	dev = container_of(priv, struct mlx5_core_dev, priv);
 	devlink = priv_to_devlink(dev);
+	mutex_lock(&dev->intf_state_mutex);
+	if (test_bit(MLX5_DROP_NEW_HEALTH_WORK, &health->flags)) {
+		mlx5_core_err(dev, "health works are not permitted at this stage\n");
+		return;
+	}
+	mutex_unlock(&dev->intf_state_mutex);
 	enter_error_state(dev, false);
 	if (IS_ERR_OR_NULL(health->fw_fatal_reporter)) {
 		devl_lock(devlink);
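
The "wait for probe, then check a flag" pattern described in the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions, not the driver code: `struct demo_dev`, `demo_dev_init()`, `demo_probe()` and `demo_fatal_err_work()` are hypothetical names, a pthread mutex stands in for `intf_state_mutex`, and a plain boolean stands in for `MLX5_DROP_NEW_HEALTH_WORK`. It also assumes that probe holds the lock for its whole duration and clears the flag only on success. Note that the sketch releases the lock on the early-return path, which the hunk as shown does not.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the device state; not the mlx5 structs. */
struct demo_dev {
	pthread_mutex_t state_lock;  /* plays the role of intf_state_mutex */
	bool drop_new_health_work;   /* plays the role of MLX5_DROP_NEW_HEALTH_WORK */
	int recoveries;              /* counts recovery attempts that actually ran */
};

static void demo_dev_init(struct demo_dev *dev)
{
	pthread_mutex_init(&dev->state_lock, NULL);
	dev->drop_new_health_work = true;  /* assumption: set until probe succeeds */
	dev->recoveries = 0;
}

/*
 * Hypothetical probe: holds the lock for its whole duration, so a health
 * worker that takes the same lock effectively waits for probe to finish.
 */
static void demo_probe(struct demo_dev *dev, bool succeed)
{
	pthread_mutex_lock(&dev->state_lock);
	/* ... device initialization would happen here ... */
	if (succeed)
		dev->drop_new_health_work = false;
	pthread_mutex_unlock(&dev->state_lock);
}

/* Returns true if recovery ran, false if it was skipped. */
static bool demo_fatal_err_work(struct demo_dev *dev)
{
	pthread_mutex_lock(&dev->state_lock);  /* blocks while probe is running */
	if (dev->drop_new_health_work) {
		fprintf(stderr, "health works are not permitted at this stage\n");
		pthread_mutex_unlock(&dev->state_lock);  /* release before returning */
		return false;
	}
	pthread_mutex_unlock(&dev->state_lock);

	dev->recoveries++;  /* stand-in for the actual recovery steps */
	return true;
}
```

With this model, a call to `demo_fatal_err_work()` before a successful `demo_probe()` is skipped, and a call after it runs the recovery stand-in.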