    RDMA/mlx5: Fix affinity assignment · 617f5db1
    Mark Bloch authored
    The cited commit aimed to ensure that Virtual Functions (VFs) assign a
    queue affinity to a Queue Pair (QP) to distribute traffic when
    the LAG master creates a hardware LAG. If the affinity was set while
    the hardware was not in LAG, the firmware would ignore the affinity value.
    
    However, the cited commit unintentionally assigned an affinity to QPs on the LAG
    master's VPORT even if the RDMA device was not marked as LAG-enabled.
    In most cases, this was not an issue because when the hardware entered
    hardware LAG configuration, the RDMA device of the LAG master would be
    destroyed and a new one would be created, marked as LAG-enabled.
    
    The problem arises when a user configures Equal-Cost Multipath (ECMP).
    In ECMP mode, traffic can be directed to different physical ports based on
    the queue affinity, which is intended for use by VPORTS other than the
    E-Switch manager. ECMP mode is supported only if both E-Switch managers are
    in switchdev mode and the appropriate route is configured via IP. In this
    configuration, the RDMA device is not destroyed, and we retain the RDMA
    device that is not marked as LAG-enabled.
    
    To ensure correct behavior, Send Queues (SQs) opened by the E-Switch
    manager through verbs should be assigned strict affinity. This means they
    will only be able to communicate through the native physical port
    associated with the E-Switch manager. This will prevent the firmware from
    assigning affinity and will not allow the SQs to be remapped in case of
    failover.
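
The policy described above can be sketched as a small decision function. This is a minimal illustration and not the code from this commit: every type, field, and function name below is hypothetical, and the round-robin port choice for the non-manager case is an assumption made only to keep the example concrete.

```c
#include <stdbool.h>

/* Hypothetical, simplified device state; not the mlx5 driver's structures. */
struct sketch_dev {
	bool lag_enabled;         /* RDMA device is marked as LAG-enabled */
	bool eswitch_manager;     /* device sits on the E-Switch manager's VPORT */
	unsigned int native_port; /* 1-based physical port native to this device */
	unsigned int num_ports;   /* number of physical ports that can carry traffic */
};

enum sketch_affinity_mode {
	SKETCH_AFFINITY_NONE,   /* leave the affinity unset */
	SKETCH_AFFINITY_SPREAD, /* driver picks a port to distribute traffic */
	SKETCH_AFFINITY_STRICT  /* pinned to the native port, no remap on failover */
};

struct sketch_affinity {
	enum sketch_affinity_mode mode;
	unsigned int port; /* 1-based port, 0 when mode is NONE */
};

/*
 * Pick the affinity for a new SQ. The counter stands in for whatever
 * per-device state a driver would use to spread SQs (assumption).
 */
static struct sketch_affinity
sketch_pick_sq_affinity(const struct sketch_dev *dev, unsigned int counter)
{
	struct sketch_affinity a = { SKETCH_AFFINITY_NONE, 0 };

	/*
	 * E-Switch manager whose RDMA device is not LAG-enabled (the ECMP
	 * case): pin the SQ to the native port so that neither the firmware
	 * nor a failover can move it.
	 */
	if (dev->eswitch_manager && !dev->lag_enabled) {
		a.mode = SKETCH_AFFINITY_STRICT;
		a.port = dev->native_port;
		return a;
	}

	/* Other VPORTs, or a LAG-enabled device: spread SQs across ports. */
	if (dev->num_ports > 1) {
		a.mode = SKETCH_AFFINITY_SPREAD;
		a.port = counter % dev->num_ports + 1;
	}

	return a;
}
```

In this sketch, an E-Switch manager with `native_port = 2` and `lag_enabled = false` always gets a strict affinity to port 2, whatever the counter value, while a VF on a two-port device alternates between ports 1 and 2.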
    
    Fixes: 802dcc7f ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode")
    Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
    Signed-off-by: Mark Bloch <mbloch@nvidia.com>
    Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
qp.c 158 KB