Commit fba2e4e5 authored by Somnath Kotur's avatar Somnath Kotur Committed by Jakub Kicinski

bnxt_en: Allocate page pool per numa node

Driver's Page Pool allocation code looks at the node local
to the PCIe device to determine where to allocate memory.
In scenarios where the core count per NUMA node is low (< default rings)
it makes sense to exhaust page pool allocations on
Node 0 first and then moving on to allocating page pools
for the remaining rings from Node 1.

With this patch, and the following configuration on the NIC
$ ethtool -L ens1f0np0 combined 16
(core count/node = 12, first 12 rings on node#0, last 4 rings node#1)
and traffic redirected to a ring on node#1 , we see a performance
improvement of ~20%
Signed-off-by: default avatarSomnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: default avatarAndy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: default avatarMichael Chan <michael.chan@broadcom.com>
Signed-off-by: default avatarPavan Chebbi <pavan.chebbi@broadcom.com>
Acked-by: default avatarPaolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240402093753.331120-4-pavan.chebbi@broadcom.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parent 8635ae8e
......@@ -3559,14 +3559,15 @@ static void bnxt_free_rx_rings(struct bnxt *bp)
}
static int bnxt_alloc_rx_page_pool(struct bnxt *bp,
struct bnxt_rx_ring_info *rxr)
struct bnxt_rx_ring_info *rxr,
int numa_node)
{
struct page_pool_params pp = { 0 };
pp.pool_size = bp->rx_agg_ring_size;
if (BNXT_RX_PAGE_MODE(bp))
pp.pool_size += bp->rx_ring_size;
pp.nid = dev_to_node(&bp->pdev->dev);
pp.nid = numa_node;
pp.napi = &rxr->bnapi->napi;
pp.netdev = bp->dev;
pp.dev = &bp->pdev->dev;
......@@ -3586,7 +3587,8 @@ static int bnxt_alloc_rx_page_pool(struct bnxt *bp,
static int bnxt_alloc_rx_rings(struct bnxt *bp)
{
int i, rc = 0, agg_rings = 0;
int numa_node = dev_to_node(&bp->pdev->dev);
int i, rc = 0, agg_rings = 0, cpu;
if (!bp->rx_ring)
return -ENOMEM;
......@@ -3597,10 +3599,15 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp)
for (i = 0; i < bp->rx_nr_rings; i++) {
struct bnxt_rx_ring_info *rxr = &bp->rx_ring[i];
struct bnxt_ring_struct *ring;
int cpu_node;
ring = &rxr->rx_ring_struct;
rc = bnxt_alloc_rx_page_pool(bp, rxr);
cpu = cpumask_local_spread(i, numa_node);
cpu_node = cpu_to_node(cpu);
netdev_dbg(bp->dev, "Allocating page pool for rx_ring[%d] on numa_node: %d\n",
i, cpu_node);
rc = bnxt_alloc_rx_page_pool(bp, rxr, cpu_node);
if (rc)
return rc;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment