net/mlx4_en: avoid one cache line miss to ring doorbell
This patch caches the doorbell address directly in struct mlx4_en_tx_ring, so the fast path no longer has to pull the whole struct mlx4_uar into CPU caches.

Note that mlx4_uar is not guaranteed to be on the local node, because mlx4_bf_alloc() uses a single free list (priv->bf_list) regardless of its node parameter.

This kind of change matters under light to moderate traffic. Under heavy load, this read-only line would stay hot in caches anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
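
The idea described above can be illustrated with a small user-space sketch. It is not the driver code: struct uar, struct tx_ring, SEND_DOORBELL_OFFSET and the helper functions are simplified, hypothetical stand-ins for struct mlx4_uar, struct mlx4_en_tx_ring and the real doorbell write, and memcpy() stands in for the MMIO write a driver would use. It only shows why caching the final address in the ring removes one dependent load from the hot path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offset of the send doorbell inside the mapped page. */
#define SEND_DOORBELL_OFFSET 0x14

/* Stand-in for struct mlx4_uar: a shared object that may live on a
 * remote NUMA node, so touching it can cost a cache line miss. */
struct uar {
    unsigned long pfn;
    int index;
    uint8_t *map;               /* base of the mapped doorbell page */
};

/* Stand-in for struct mlx4_en_tx_ring: per-ring data that the
 * transmit path already has in cache. */
struct tx_ring {
    struct uar *uar;
    uint8_t *doorbell_address;  /* computed once at setup time */
    uint32_t doorbell_qpn;
};

/* Setup (slow path): resolve the doorbell address once and cache it. */
static void tx_ring_init(struct tx_ring *ring, struct uar *uar, uint32_t qpn)
{
    assert(ring != NULL && uar != NULL && uar->map != NULL);
    ring->uar = uar;
    ring->doorbell_qpn = qpn;
    ring->doorbell_address = uar->map + SEND_DOORBELL_OFFSET;
}

/* Before: the hot path loads ring->uar, then uar->map, which brings
 * the cache line holding struct uar into the CPU cache. */
static void ring_doorbell_via_uar(const struct tx_ring *ring)
{
    memcpy(ring->uar->map + SEND_DOORBELL_OFFSET,
           &ring->doorbell_qpn, sizeof(ring->doorbell_qpn));
}

/* After: the hot path only reads fields of the ring itself. */
static void ring_doorbell_cached(const struct tx_ring *ring)
{
    memcpy(ring->doorbell_address,
           &ring->doorbell_qpn, sizeof(ring->doorbell_qpn));
}
```

Both helpers write the same value to the same location; the cached variant simply avoids dereferencing the shared object on every doorbell ring.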