Commits · 3d6b24a2ada3d53f7da9362d03856bdec74806e5 · Kirill Smelkov / linux

13 Oct, 2021 16 commits

ravb: Update ravb_emac_init_gbeth() · 3d6b24a2

Biju Das authored Oct 12, 2021

This patch enables Receive/Transmit port of TOE and removes
the setting of promiscuous bit from EMAC configuration mode register.

This patch also update EMAC configuration mode comment from
"PAUSE prohibition" to "EMAC Mode: PAUSE prohibition; Duplex; TX;
RX; CRC Pass Through".
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3d6b24a2

ravb: Document PFRI register bit · 95e99b10

Biju Das authored Oct 12, 2021

Document PFRI register bit, as it is documented on R-Car Gen3 and
RZ/G2L hardware manuals.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

95e99b10

ravb: Rename "nc_queue" feature bit · 1091da57

Biju Das authored Oct 12, 2021

Rename the feature bit "nc_queue" with "nc_queues" as AVB DMAC has
RX and TX NC queues.

There is no functional change.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

1091da57

ravb: Optimize ravb_emac_init_gbeth function · 030634f3

Biju Das authored Oct 12, 2021

Optimize CXR31 register initialization on ravb_emac_init_gbeth
function.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

030634f3

ravb: Rename "tsrq" variable · 4ea3167b

Biju Das authored Oct 12, 2021

Rename the variable "tsrq" with "tccr_mask" as we are passing
TCCR mask to the ravb_wait() function.

There is no functional change.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

4ea3167b

ravb: Add support to retrieve stats for GbEthernet · 0ee65bc1

Biju Das authored Oct 12, 2021

Add support for retrieving stats information for GbEthernet.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0ee65bc1

ravb: Add carrier_counters to struct ravb_hw_info · b6a4ee6e

Biju Das authored Oct 12, 2021

RZ/G2L E-MAC supports carrier counters.
Add a carrier_counter hw feature bit to struct ravb_hw_info
to add this feature only for RZ/G2L.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b6a4ee6e

ravb: Fillup ravb_rx_gbeth() stub · 1c59eb67

Biju Das authored Oct 12, 2021

Fillup ravb_rx_gbeth() function to support RZ/G2L.

This patch also renames ravb_rcar_rx to ravb_rx_rcar to be
consistent with the naming convention used in sh_eth driver.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

1c59eb67

ravb: Fillup ravb_rx_ring_format_gbeth() stub · 16a6e245

Biju Das authored Oct 12, 2021

Fillup ravb_rx_ring_format_gbeth() function to support RZ/G2L.

This patch also renames ravb_rx_ring_format to ravb_rx_ring_format_rcar
to be consistent with the naming convention used in sh_eth driver.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

16a6e245

ravb: Fillup ravb_rx_ring_free_gbeth() stub · 2458b8ed

Biju Das authored Oct 12, 2021

Fillup ravb_rx_ring_free_gbeth() function to support RZ/G2L.

This patch also renames ravb_rx_ring_free to ravb_rx_ring_free_rcar
to be consistent with the naming convention used in sh_eth driver.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2458b8ed

ravb: Fillup ravb_alloc_rx_desc_gbeth() stub · 3d4e37df

Biju Das authored Oct 12, 2021

Fillup ravb_alloc_rx_desc_gbeth() function to support RZ/G2L.

This patch also renames ravb_alloc_rx_desc to ravb_alloc_rx_desc_rcar
to be consistent with the naming convention used in sh_eth driver.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

3d4e37df

ravb: Add rx_max_buf_size to struct ravb_hw_info · 2e95e08a

Biju Das authored Oct 12, 2021

R-Car AVB-DMAC has maximum 2K size on RX buffer, whereas on RZ/G2L
it is 8K. We need to allow for changing the MTU within the limit
of the maximum size of a descriptor.

Add a rx_max_buf_size variable to struct ravb_hw_info to handle
this difference.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2e95e08a

ravb: Use ALIGN macro for max_rx_len · 23144a91

Biju Das authored Oct 12, 2021

Use ALIGN macro for calculating the value for max_rx_len.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Suggested-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

23144a91

net: qed_debug: fix check of false (grc_param < 0) expression · 50515cac

Jean Sacren authored Oct 12, 2021

The type of enum dbg_grc_params has the enumerator list starting from 0.
When grc_param is declared by enum dbg_grc_params, (grc_param < 0) is
always false. We should remove the check of this expression.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Acked-by: Shai Malin <smalin@marvell.com>
Link: https://lore.kernel.org/r/20211012074645.12864-1-sakiwit@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

50515cac

net: enetc: include ip6_checksum.h for csum_ipv6_magic · edce2a93

Ioana Ciornei authored Oct 12, 2021

For those architectures which do not define_HAVE_ARCH_IPV6_CSUM, we need
to include ip6_checksum.h which provides the csum_ipv6_magic() function.

Fixes: fb8629e2 ("net: enetc: add support for software TSO")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211012121358.16641-1-ioana.ciornei@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

edce2a93

ionic: no devlink_unregister if not registered · d1f24712

Shannon Nelson authored Oct 12, 2021

Don't try to unregister the devlink if it hasn't been registered
yet. This bit of error cleanup code got missed in the recent
devlink registration changes.

Fixes: 7911c8bd ("ionic: Move devlink registration to be last devlink command")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20211012231520.72582-1-snelson@pensando.ioSigned-off-by: Jakub Kicinski <kuba@kernel.org>

d1f24712

12 Oct, 2021 24 commits

Merge branch 'devlink-reload-simplification' · 0e258cec

Jakub Kicinski authored Oct 12, 2021

Leon Romanovsky says:

====================
devlink reload simplification

Simplify devlink reload APIs.
====================

Link: https://lore.kernel.org/r/cover.1634044267.git.leonro@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

0e258cec

devlink: Delete reload enable/disable interface · 82465bec

Leon Romanovsky authored Oct 12, 2021

Commit a0c76345 ("devlink: disallow reload operation during device
cleanup") added devlink_reload_{enable,disable}() APIs to prevent reload
operation from racing with device probe/dismantle.

After recent changes to move devlink_register() to the end of device
probe and devlink_unregister() to the beginning of device dismantle,
these races can no longer happen. Reload operations will be denied if
the devlink instance is unregistered and devlink_unregister() will block
until all in-flight operations are done.

Therefore, remove these devlink_reload_{enable,disable}() APIs.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

82465bec

net/mlx5: Set devlink reload feature bit for supported devices only · 96869f19

Leon Romanovsky authored Oct 12, 2021

Mulitport slave device doesn't support devlink reload, so instead of
complicating initialization flow with devlink_reload_enable() which
will be removed in next patch, don't set DEVLINK_F_RELOAD feature bit
for such devices.

This fixes an error when reload counters exposed (and equal zero) for
the mode that is not supported at all.

Fixes: d89ddaae ("net/mlx5: Disable devlink reload for multi port slave device")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

96869f19

devlink: Allow control devlink ops behavior through feature mask · bd032e35

Leon Romanovsky authored Oct 12, 2021

Introduce new devlink call to set feature mask to control devlink
behavior during device initialization phase after devlink_alloc()
is already called.

This allows us to set reload ops based on device property which
is not known at the beginning of driver initialization.

For the sake of simplicity, this API lacks any type of locking and
needs to be called before devlink_register() to make sure that no
parallel access to the ops is possible at this stage.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

bd032e35

devlink: Annotate devlink API calls · b88f7b12

Leon Romanovsky authored Oct 12, 2021

Initial annotation patch to separate calls that needs to be executed
before or after devlink_register().
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b88f7b12

devlink: Move netdev_to_devlink helpers to devlink.c · 2bc50987

Leon Romanovsky authored Oct 12, 2021

Both netdev_to_devlink and netdev_to_devlink_port are used in devlink.c
only, so move them in order to reduce their scope.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2bc50987

devlink: Reduce struct devlink exposure · 21314638

Leon Romanovsky authored Oct 12, 2021

The declaration of struct devlink in general header provokes the
situation where internal fields can be accidentally used by the driver
authors. In order to reduce such possible situations, let's reduce the
namespace exposure of struct devlink.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

21314638

ethernet: tulip: avoid duplicate variable name on sparc · 177c9235

Jakub Kicinski authored Oct 11, 2021

I recently added a variable called addr to tulip_init_one()
but for sparc there's already a variable called that half
way thru the function. Rename it to fix build.

Fixes: ca879317 ("ethernet: tulip: remove direct netdev->dev_addr writes")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

177c9235

net: hns3: debugfs add support dumping page pool info · 850bfb91

Hao Chen authored Oct 11, 2021

Add a file node "page_pool_info" for debugfs, then cat this
file node to dump page pool info as below:

QUEUE_ID ALLOCATE_CNT FREE_CNT POOL_SIZE(PAGE_NUM) ORDER NUMA_ID MAX_LEN
0 512 0 512 0 2 4K
1 512 0 512 0 2 4K
2 512 0 512 0 2 4K
3 512 0 512 0 2 4K
4 512 0 512 0 2 4K
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

850bfb91

tulip: fix setting device address from rom · 25b90c19

Jakub Kicinski authored Oct 11, 2021

I missed removing i from the array index when converting
from a loop to a direct copy.

Fixes: ca879317 ("ethernet: tulip: remove direct netdev->dev_addr writes")
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

25b90c19

Merge branch 'Managed-Neighbor-Entries' · 2ed08b5e

David S. Miller authored Oct 12, 2021

Daniel Borkmann says:

====================
Managed Neighbor Entries

This series adds a couple of fixes related to NTF_EXT_LEARNED and NTF_USE
neighbor flags, extends the UAPI with a new NDA_FLAGS_EXT netlink attribute
in order to be able to add new neighbor flags from user space given all
current struct ndmsg / ndm_flags bits are used up. Finally, the core of this
series adds a new NTF_EXT_MANAGED flag to neighbors, which allows user space
control planes to add 'managed' neighbor entries. Meaning, user space may
either transition existing entries or can push down new L3 entries without
lladdr into the kernel where the latter will periodically try to keep such
NTF_EXT_MANAGED managed entries in reachable state. Main use case for this
series are XDP / tc BPF load-balancers which make use of the bpf_fib_lookup()
helper for backends. For more details, please see individual patches. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2ed08b5e

net, neigh: Add NTF_MANAGED flag for managed neighbor entries · 7482e384

Daniel Borkmann authored Oct 11, 2021

Allow a user space control plane to insert entries with a new NTF_EXT_MANAGED
flag. The flag then indicates to the kernel that the neighbor entry should be
periodically probed for keeping the entry in NUD_REACHABLE state iff possible.

The use case for this is targeting XDP or tc BPF load-balancers which use
the bpf_fib_lookup() BPF helper in order to piggyback on neighbor resolution
for their backends. Given they cannot be resolved in fast-path, a control
plane inserts the L3 (without L2) entries manually into the neighbor table
and lets the kernel do the neighbor resolution either on the gateway or on
the backend directly in case the latter resides in the same L2. This avoids
to deal with L2 in the control plane and to rebuild what the kernel already
does best anyway.

NTF_EXT_MANAGED can be combined with NTF_EXT_LEARNED in order to avoid GC
eviction. The kernel then adds NTF_MANAGED flagged entries to a per-neighbor
table which gets triggered by the system work queue to periodically call
neigh_event_send() for performing the resolution. The implementation allows
migration from/to NTF_MANAGED neighbor entries, so that already existing
entries can be converted by the control plane if needed. Potentially, we could
make the interval for periodically calling neigh_event_send() configurable;
right now it's set to DELAY_PROBE_TIME which is also in line with mlxsw which
has similar driver-internal infrastructure c723c735 ("mlxsw: spectrum_router:
Periodically update the kernel's neigh table"). In future, the latter could
possibly reuse the NTF_MANAGED neighbors as well.

Example:

# ./ip/ip n replace 192.168.178.30 dev enp5s0 managed extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a managed extern_learn REACHABLE
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Link: https://linuxplumbersconf.org/event/11/contributions/953/Signed-off-by: David S. Miller <davem@davemloft.net>

7482e384

net, neigh: Extend neigh->flags to 32 bit to allow for extensions · 2c611ad9

Roopa Prabhu authored Oct 11, 2021

Currently, all bits in struct ndmsg's ndm_flags are used up with the most
recent addition of 435f2e7c ("net: bridge: add support for sticky fdb
entries"). This makes it impossible to extend the neighboring subsystem
with new NTF_* flags:

  struct ndmsg {
    __u8   ndm_family;
    __u8   ndm_pad1;
    __u16  ndm_pad2;
    __s32  ndm_ifindex;
    __u16  ndm_state;
    __u8   ndm_flags;
    __u8   ndm_type;
  };

There are ndm_pad{1,2} attributes which are not used. However, due to
uncareful design, the kernel does not enforce them to be zero upon new
neighbor entry addition, and given they've been around forever, it is
not possible to reuse them today due to risk of breakage. One option to
overcome this limitation is to add a new NDA_FLAGS_EXT attribute for
extended flags.

In struct neighbour, there is a 3 byte hole between protocol and ha_lock,
which allows neigh->flags to be extended from 8 to 32 bits while still
being on the same cacheline as before. This also allows for all future
NTF_* flags being in neigh->flags rather than yet another flags field.
Unknown flags in NDA_FLAGS_EXT will be rejected by the kernel.
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2c611ad9

net, neigh: Enable state migration between NUD_PERMANENT and NTF_USE · 3dc20f47

Daniel Borkmann authored Oct 11, 2021

Currently, it is not possible to migrate a neighbor entry between NUD_PERMANENT
state and NTF_USE flag with a dynamic NUD state from a user space control plane.
Similarly, it is not possible to add/remove NTF_EXT_LEARNED flag from an existing
neighbor entry in combination with NTF_USE flag.

This is due to the latter directly calling into neigh_event_send() without any
meta data updates as happening in __neigh_update(). Thus, to enable this use
case, extend the latter with a NEIGH_UPDATE_F_USE flag where we break the
NUD_PERMANENT state in particular so that a latter neigh_event_send() is able
to re-resolve a neighbor entry.

Before fix, NUD_PERMANENT -> NUD_* & NTF_USE:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
  [...]

As can be seen, despite the admin-triggered replace, the entry remains in the
NUD_PERMANENT state.

After fix, NUD_PERMANENT -> NUD_* & NTF_USE:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
  [...]
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn STALE
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a PERMANENT
  [...]

After the fix, the admin-triggered replace switches to a dynamic state from
the NTF_USE flag which triggered a new neighbor resolution. Likewise, we can
transition back from there, if needed, into NUD_PERMANENT.

Similar before/after behavior can be observed for below transitions:

Before fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [...]

After fix, NTF_USE -> NTF_USE | NTF_EXT_LEARNED -> NTF_USE:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
  [...]
  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [..]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3dc20f47

net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE · e4400bbf

Daniel Borkmann authored Oct 11, 2021

The NTF_EXT_LEARNED neigh flag is usually propagated back to user space
upon dump of the neighbor table. However, when used in combination with
NTF_USE flag this is not the case despite exempting the entry from the
garbage collector. This results in inconsistent state since entries are
typically marked in neigh->flags with NTF_EXT_LEARNED, but here they are
not. Fix it by propagating the creation flag to ___neigh_create().

Before fix:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [...]

After fix:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
  [...]

Fixes: 9ce33e46 ("neighbour: support for NTF_EXT_LEARNED flag")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4400bbf

net: hns: Prefer struct_size over open coded arithmetic · 7bb39a39

Len Baker authored Oct 11, 2021

As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.

So, take the opportunity to refactor the hnae_handle structure to switch
the last member to flexible array, changing the code accordingly. Also,
fix the comment in the hnae_vf_cb structure to inform that the ae_handle
member must be the last member.

Then, use the struct_size() helper to do the arithmetic instead of the
argument "size + count * size" in the kzalloc() function.

This code was detected with the help of Coccinelle and audited and fixed
manually.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-argumentsSigned-off-by: Len Baker <len.baker@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7bb39a39

Merge branch 'mlxsw-ECN-mirroring' · 249ae949

David S. Miller authored Oct 12, 2021

Ido Schimmel says:

====================
mlxsw: Add support for ECN mirroring

Petr says:

Patches in this set have been floating around for some time now together
with trap_fwd support. That will however need more work, time for which is
nowhere to be found, apparently. Instead, this patchset enables offload of
only packet mirroring on RED mark qevent, enabling mirroring of ECN-marked
packets.

Formally it enables offload of filters added to blocks bound to the RED
qevent mark if:

- The switch ASIC is Spectrum-2 or above.
- Only a single filter is attached at the block, at chain 0 (the default),
  and its classifier is matchall.
- The filter has hw_stats set to disabled.
- The filter has a single action, which is mirror.

This differs from early_drop qevent offload, which supports mirroring and
trapping. However trapping in context of ECN-marked packets is not
suitable, because the HW does not drop the packet, as the trap action
implies. And there is as of now no way to express only the part of trapping
that transfers the packet to the SW datapath, sans the HW-datapath drop.

The patchset progresses as follows:

Patch #1 is an extack propagation.

Mirroring of ECN-marked packets is configured in the ASIC through an ECN
trigger, which is considered "egress", unlike the EARLY_DROP trigger.
In patch #2, add a helper to classify triggers as ingress.

As clarified above, traps cannot be offloaded on mark qevent. Similarly,
given a trap_fwd action, it would not be offloadable on early_drop qevent.
In patch #3, introduce support for tracking actions permissible on a given
block.

Patch #4 actually adds the mark qevent offload.

In patch #5, fix a small style issue in one of the selftests, and in
patch #6 add mark offload selftests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

249ae949

selftests: mlxsw: RED: Add selftests for the mark qevent · 0cd6fa99

Petr Machata authored Oct 10, 2021

Add do_mark_test(), which is to do_ecn_test() like do_drop_test() is to
do_red_test(): meant to test that actions on the RED mark qevent block are
offloaded, and executed on ECN-marked packets.

The test splits install_qdisc() into its constituents, install_root_qdisc()
and install_qdisc_tcX(). This is in order to test that when mirroring is
enabled on one TC, the other TC does not mirror.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cd6fa99

selftests: mlxsw: sch_red_core: Drop two unused variables · a703b517

Petr Machata authored Oct 10, 2021

These variables are cut'n'pasted from other functions in the file and not
actually used.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a703b517

mlxsw: spectrum_qdisc: Offload RED qevent mark · 9c18eaf2

Petr Machata authored Oct 10, 2021

The RED "mark" qevent can be offloaded under similar conditions as the RED
"early_drop" qevent. Therefore recognize its binding type in the
TC_SETUP_BLOCK handler and translate to the right SPAN trigger, with the
right set of supported actions.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9c18eaf2

mlxsw: spectrum_qdisc: Track permissible actions per binding · 099bf89d

Petr Machata authored Oct 10, 2021

One block can be bound to several qevents. The qevent type that the block
is bound to determines which actions make sense in a given context. In the
particular case of mlxsw, trap cannot be offloaded on a RED mark qevent,
because the trap contract specifies that the packet is dropped in the HW
datapath, and the HW trigger that the action is offloaded to is always
forwarding the packet (in addition to marking in).

Therefore keep track of which actions are permissible at each binding
block. When an attempt is made to bind a certain action at a binding point
where it is not supported, bounce the request.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

099bf89d

mlxsw: spectrum_qdisc: Distinguish between ingress and egress triggers · 0908e42a

Petr Machata authored Oct 10, 2021

The following patches will configure the MLXSW_SP_SPAN_TRIGGER_ECN
mirroring trigger. This trigger is considered "egress", unlike the
previously-offloaded _EARLY_DROP. Add a helper to spectrum_span,
mlxsw_sp_span_trigger_is_ingress(), to classify triggers to ingress and
egress. Pass result of this instead of hardcoding true when calling
mlxsw_sp_span_analyzed_port_get()/_put().
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0908e42a

mlxsw: spectrum_qdisc: Pass extack to mlxsw_sp_qevent_entry_configure() · a34dda72

Petr Machata authored Oct 10, 2021

This function will report a new failure in the following patches.
Pass extack so that the failure is explicable.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a34dda72

Merge branch 'nfc-minor-printk-cleanup' · ff7f0e4e

Jakub Kicinski authored Oct 11, 2021

Krzysztof Kozlowski says:

====================
nfc: minor printk cleanup

v2: Correct SPDX license in patch 2/7 (as Joe pointed out).
v1: Remove unused variable in pn533 (reported by kbuild).
====================

Link: https://lore.kernel.org/r/20211011133835.236347-1-krzysztof.kozlowski@canonical.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>

ff7f0e4e