    mlxsw: pci: Fix driver initialization with Spectrum-4 · 0602697d
    Ido Schimmel authored
    Cited commit added support for a new reset flow ("all reset") which is
    deeper than the existing reset flow ("software reset") and allows the
    device's PCI firmware to be upgraded.
    
    In the new flow the driver first tells the firmware that "all reset" is
    required by issuing a new reset command (i.e., MRSR.command=6) and then
    triggers the reset by having the PCI core issue a secondary bus reset
    (SBR).
    
    However, due to a race condition in the device's firmware the device is
    not always able to recover from this reset, resulting in initialization
    failures [1].
    
    New firmware versions include a fix for the bug and advertise it using a
    new capability bit in the Management Capabilities Mask (MCAM) register.
    
    Avoid initialization failures by reading the new capability bit and
    triggering the new reset flow only if the bit is set. If the bit is not
    set, trigger a normal PCI hot reset by skipping the write to the
    Management Reset and Shutdown Register (MRSR).
    
    A normal PCI hot reset is weaker than "all reset", but it results in a
    fully operational driver and allows users to flash new firmware if
    they want to.
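    
    The decision described above can be sketched in standalone C. This is
    only an illustration of the control flow, not the driver code: struct
    fake_dev, its all_reset_fixed field, fake_reset() and the enum names
    are hypothetical, and only the MRSR command value 6 is taken from this
    message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* MRSR.command value for "all reset", as given in the commit message. */
#define MRSR_CMD_ALL_RESET 6

/* Hypothetical labels for the two outcomes; not mlxsw identifiers. */
enum reset_action {
	RESET_ACTION_HOT_RESET,	/* SBR only, MRSR write skipped */
	RESET_ACTION_ALL_RESET,	/* MRSR.command=6, then SBR */
};

/* Hypothetical model of the device state relevant to the reset. */
struct fake_dev {
	bool all_reset_fixed;	/* assumed name for the new MCAM capability bit */
	int mrsr_command;	/* last command written to MRSR, 0 if none */
	bool sbr_issued;	/* whether a secondary bus reset was triggered */
};

/*
 * Arm "all reset" only when the firmware advertises the fix. The
 * secondary bus reset is issued in both cases, so without the MRSR
 * write the device simply goes through a normal PCI hot reset.
 */
static enum reset_action fake_reset(struct fake_dev *dev)
{
	enum reset_action action = RESET_ACTION_HOT_RESET;

	assert(dev != NULL);

	if (dev->all_reset_fixed) {
		dev->mrsr_command = MRSR_CMD_ALL_RESET;
		action = RESET_ACTION_ALL_RESET;
	}

	dev->sbr_issued = true;
	return action;
}
```

    With all_reset_fixed cleared, fake_reset() leaves mrsr_command
    untouched and still issues the bus reset, which models the fallback
    to a normal PCI hot reset on firmware without the fix.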
    
    [1]
    mlxsw_spectrum4 0000:01:00.0: not ready 1023ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 2047ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 4095ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 8191ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 16383ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 32767ms after bus reset; waiting
    mlxsw_spectrum4 0000:01:00.0: not ready 65535ms after bus reset; giving up
    mlxsw_spectrum4 0000:01:00.0: PCI function reset failed with -25
    mlxsw_spectrum4 0000:01:00.0: cannot register bus device
    mlxsw_spectrum4: probe of 0000:01:00.0 failed with error -25
    
    Fixes: f257c73e ("mlxsw: pci: Add support for new reset flow")
    Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Tested-by: Maksym Yaremchuk <maksymy@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Petr Machata <petrm@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>