    net/mlx5: Fix health error state handling · 51d138c2
    Shay Drory authored
    Currently, when we discover a fatal error, we queue a work item that
    must wait for a lock before it can move the device into error state.
    Meanwhile, FW commands are still being processed and time out. This
    can block the driver for a few minutes before the work manages to
    take the lock and enter error state.
    
    Set the device to error state before queueing the health work, so
    that FW commands are no longer processed while the work is waiting
    for the lock.
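
The ordering described above can be illustrated with a minimal userspace sketch. It is not the driver code: `sketch_dev`, `sketch_cmd_exec`, `sketch_trigger_fatal` and the state names are hypothetical, and the deferred work is reduced to a flag. The point it shows is that once the state is flipped before the work is queued, the command path can fail fast without needing the lock the work is waiting on.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device states, named for illustration only. */
enum sketch_state { SKETCH_STATE_UP, SKETCH_STATE_INTERNAL_ERROR };

/* Hypothetical device: an atomically readable state plus a flag that
 * stands in for "recovery work has been queued". */
struct sketch_dev {
    atomic_int state;
    bool recovery_queued;
};

/* Command path: refuse immediately when the device is already in error
 * state, instead of issuing a FW command that would only time out. */
int sketch_cmd_exec(struct sketch_dev *dev)
{
    if (atomic_load(&dev->state) == SKETCH_STATE_INTERNAL_ERROR)
        return -EIO;
    return 0; /* assumption: the command would be sent to FW here */
}

/* Fatal error path: mark the error state first, then queue the work.
 * The work may still wait a long time for its lock, but commands issued
 * in the meantime already see the error state. */
void sketch_trigger_fatal(struct sketch_dev *dev)
{
    atomic_store(&dev->state, SKETCH_STATE_INTERNAL_ERROR);
    dev->recovery_queued = true; /* stand-in for queueing the health work */
}
```

With the reverse order (queue first, set the state inside the work), `sketch_cmd_exec` would keep returning 0 until the work finally ran, which corresponds to the window in which real FW commands time out.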
    
    Fixes: c1d4d2e9 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
    Signed-off-by: Shay Drory <shayd@nvidia.com>
    Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
health.c 22.8 KB