Commit 82463d9d authored by Jakub Kicinski's avatar Jakub Kicinski

Merge branch 'fix-trainwreck-with-ocelot-switch-statistics-counters'

Vladimir Oltean says:

====================
Fix trainwreck with Ocelot switch statistics counters

While testing the patch set for preemptible traffic classes with some
controlled traffic and measuring counter deltas:
https://lore.kernel.org/netdev/20230220122343.1156614-1-vladimir.oltean@nxp.com/

I noticed that in the output of "ethtool -S swp0 --groups eth-mac
eth-phy eth-ctrl rmon -- --src emac | grep -v ': 0'", the TX counters
were off. Quickly I realized that their values were permutated by 1
compared to their names, and that for example
tx-rmon-etherStatsPkts64to64Octets was incrementing when
tx-rmon-etherStatsPkts65to127Octets should have.

Initially I suspected something having to do with the bulk reading
logic, and indeed I found a bug there (fixed as 1/3), but that was not
the source of the problems. Instead it revealed other problems.

While dumping the regions created by the driver on my switch, I figured
out that it sees a discontinuity which shouldn't have existed between
reg 0x278 and reg 0x280.

Discontinuity between last reg 0x0 and new reg 0x0, creating new region
Discontinuity between last reg 0x108 and new reg 0x200, creating new region
Discontinuity between last reg 0x278 and new reg 0x280, creating new region
Discontinuity between last reg 0x2b0 and new reg 0x400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 31 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 13 contiguous counters starting with SYS:STAT:CNT[0x0a0]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

That is where TX_MM_HOLD should have been, and that was the bug, since
it was missing. After adding it, the regions look like this and the
off-by-one issue is resolved:

Discontinuity between last reg 0x000000 and new reg 0x000000, creating new region
Discontinuity between last reg 0x000108 and new reg 0x000200, creating new region
Discontinuity between last reg 0x0002b0 and new reg 0x000400, creating new region
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

However, as I am thinking out loud, it should have not reported the
other counters as off by one even when skipping TX_MM_HOLD... after all,
on Ocelot/Seville, there are more counters which need to be skipped.

Which is when I investigated and noticed the bug solved in 2/3.
I've validated that both on native VSC9959 (which uses
ocelot_mm_stats_layout) as well as by faking the other switches by
making VSC9959 use the plain ocelot_stats_layout.

To summarize: on all Ocelot switches, the TX counters and drop counters
are completely broken. The RX counters are mostly fine.

With this occasion, I have collected more cleanup patches in this area,
which I'm going to submit after the net -> net-next merge.
====================

Link: https://lore.kernel.org/r/20230321010325.897817-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
parents 8e50ed77 5291099e
......@@ -258,6 +258,7 @@ struct ocelot_stat_layout {
struct ocelot_stats_region {
struct list_head node;
u32 base;
enum ocelot_stat first_stat;
int count;
u32 *buf;
};
......@@ -273,6 +274,7 @@ static const struct ocelot_stat_layout ocelot_mm_stats_layout[OCELOT_NUM_STATS]
OCELOT_STAT(RX_ASSEMBLY_OK),
OCELOT_STAT(RX_MERGE_FRAGMENTS),
OCELOT_STAT(TX_MERGE_FRAGMENTS),
OCELOT_STAT(TX_MM_HOLD),
OCELOT_STAT(RX_PMAC_OCTETS),
OCELOT_STAT(RX_PMAC_UNICAST),
OCELOT_STAT(RX_PMAC_MULTICAST),
......@@ -341,11 +343,12 @@ static int ocelot_port_update_stats(struct ocelot *ocelot, int port)
*/
static void ocelot_port_transfer_stats(struct ocelot *ocelot, int port)
{
unsigned int idx = port * OCELOT_NUM_STATS;
struct ocelot_stats_region *region;
int j;
list_for_each_entry(region, &ocelot->stats_regions, node) {
unsigned int idx = port * OCELOT_NUM_STATS + region->first_stat;
for (j = 0; j < region->count; j++) {
u64 *stat = &ocelot->stats[idx + j];
u64 val = region->buf[j];
......@@ -355,8 +358,6 @@ static void ocelot_port_transfer_stats(struct ocelot *ocelot, int port)
*stat = (*stat & ~(u64)U32_MAX) + val;
}
idx += region->count;
}
}
......@@ -899,7 +900,8 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot)
if (!layout[i].reg)
continue;
if (region && layout[i].reg == last + 4) {
if (region && ocelot->map[SYS][layout[i].reg & REG_MASK] ==
ocelot->map[SYS][last & REG_MASK] + 4) {
region->count++;
} else {
region = devm_kzalloc(ocelot->dev, sizeof(*region),
......@@ -914,6 +916,7 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot)
WARN_ON(last >= layout[i].reg);
region->base = layout[i].reg;
region->first_stat = i;
region->count = 1;
list_add_tail(&region->node, &ocelot->stats_regions);
}
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment