Commits · b1351527f1eeb9624c301ecb7d8adbc4f543e045 · Kirill Smelkov / linux

16 Mar, 2022 18 commits

Merge branch 'devlink-expose-instance-locking-and-simplify-port-splitting' · b1351527

Jakub Kicinski authored Mar 16, 2022

Jakub Kicinski says:

====================
devlink: expose instance locking and simplify port splitting

This series puts the devlink ports fully under the devlink instance
lock's protection. As discussed in the past it implements my preferred
solution of exposing the instance lock to the drivers. This way drivers
which want to support port splitting can lock the devlink instance
themselves on the probe path, and we can take that lock in the core
on the split/unsplit paths.

nfp and mlxsw are converted, with slightly deeper changes done in
nfp since I'm more familiar with that driver.

Now that the devlink port is protected we can pass a pointer to
the drivers, instead of passing a port index and forcing the drivers
to do their own lookups. Both nfp and mlxsw can container_of() to
their own structures.
====================

Link: https://lore.kernel.org/r/20220315060009.1028519-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

b1351527

devlink: pass devlink_port to port_split / port_unsplit callbacks · 706217c1

Jakub Kicinski authored Mar 14, 2022

Now that devlink ports are protected by the instance lock
it seems natural to pass devlink_port as an argument to
the port_split / port_unsplit callbacks.

This should save the drivers from doing a lookup.

In theory drivers may have supported unsplitting ports
which were not registered prior to this change.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

706217c1
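The container_of() approach mentioned in the 706217c1 entry above can be sketched in userspace C. This is only an illustration: struct example_port, example_port_from_devlink() and the simplified example_port_split() signature are hypothetical, and struct devlink_port is reduced to a single field.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-in for struct devlink_port; the real struct has many more fields. */
struct devlink_port {
	unsigned int index;
};

/* Hypothetical driver-private port structure embedding the devlink port. */
struct example_port {
	unsigned int lanes;
	struct devlink_port dl_port;
};

/* Recover the driver's port from the devlink_port handed to the callback. */
static struct example_port *example_port_from_devlink(struct devlink_port *dl_port)
{
	return container_of(dl_port, struct example_port, dl_port);
}

/* Hypothetical split callback: no index lookup is needed, the port is passed in. */
static int example_port_split(struct devlink_port *dl_port, unsigned int count)
{
	struct example_port *port = example_port_from_devlink(dl_port);

	if (count == 0 || port->lanes % count != 0)
		return -1;	/* kernel code would return a negative errno instead */
	port->lanes /= count;
	return 0;
}
```

Because the callback receives the embedded structure directly, the driver does not need to walk a port table keyed by index.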

devlink: hold the instance lock in port_split / port_unsplit callbacks · 49e83bbe

Jakub Kicinski authored Mar 14, 2022

Let the core take the devlink instance lock around port splitting
and remove the now redundant locking in the drivers.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

49e83bbe

eth: mlxsw: switch to explicit locking for port registration · 5e8930aa

Jakub Kicinski authored Mar 14, 2022

Explicitly lock the devlink instance and use devl_ API.

This will be used by the subsequent patch to invoke
.port_split / .port_unsplit callbacks with devlink
instance lock held.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

5e8930aa

eth: nfp: replace driver's "pf" lock with devlink instance lock · 162cca42

Jakub Kicinski authored Mar 14, 2022

The whole reason for existence of the pf mutex is that we could
not lock the devlink instance around port splitting. There are
more types of reconfig which can make ports appear or disappear.
Now that the devlink instance lock is exposed to drivers and
"locked" helpers exist we can switch to using the devlink lock
directly.

Next patches will move the locking inside .port_(un)split to
the core.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

162cca42

eth: nfp: wrap locking assertions in helpers · 8a38f2cc

Jakub Kicinski authored Mar 14, 2022

We can replace the PF lock with devlink instance lock in subsequent
changes. To make the patches easier to comprehend and limit line
lengths - factor out the existing locking assertions.

No functional changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

8a38f2cc

devlink: expose instance locking and add locked port registering · 2cb7b489

Jakub Kicinski authored Mar 14, 2022

It should be familiar and beneficial to expose devlink instance
lock to the drivers. This way drivers can block devlink from
calling them during critical sections without breakneck locking.

Add port helpers, port splitting callbacks will be the first
target.

Use 'devl_' prefix for "explicitly locked" API. Initial RFC used
'__devlink' but that's too much typing.

devl_lock_is_held() is not defined without lockdep, which is
the same behavior as lockdep_is_held() itself.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2cb7b489
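The split between self-locking helpers and "explicitly locked" devl_ helpers described in the 2cb7b489 entry above can be modelled roughly as follows. This is a userspace toy, not the kernel implementation: every model_* name is hypothetical, and a boolean flag stands in for the real instance mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a devlink instance: a flag stands in for the real mutex. */
struct model_devlink {
	bool locked;
	unsigned int nports;
};

static void model_devl_lock(struct model_devlink *dl)
{
	assert(!dl->locked);	/* a real mutex would block here instead */
	dl->locked = true;
}

static void model_devl_unlock(struct model_devlink *dl)
{
	assert(dl->locked);
	dl->locked = false;
}

/* "Explicitly locked" variant: the caller must already hold the instance lock. */
static int model_devl_port_register(struct model_devlink *dl)
{
	assert(dl->locked);
	dl->nports++;
	return 0;
}

/* Self-locking variant: takes the lock around the locked helper. */
static int model_devlink_port_register(struct model_devlink *dl)
{
	int err;

	model_devl_lock(dl);
	err = model_devl_port_register(dl);
	model_devl_unlock(dl);
	return err;
}

/* Hypothetical probe path: hold the lock across several registrations. */
static int model_probe(struct model_devlink *dl, unsigned int count)
{
	unsigned int i;
	int err = 0;

	model_devl_lock(dl);
	for (i = 0; i < count && !err; i++)
		err = model_devl_port_register(dl);
	model_devl_unlock(dl);
	return err;
}
```

Holding the lock across the whole probe loop is what lets a driver keep devlink from calling into it while ports are still appearing.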

Merge branch 'mediatek-next' · 49045b9c

David S. Miller authored Mar 16, 2022

Biao Huang says:

====================
MediaTek Ethernet Patches on MT8195

Changes in v13:
1. add reviewed-by in "net: dt-bindings: dwmac: add support for mt8195"
   as Rob's comments.
2. drop num_clks defined in mediatek_dwmac_plat_data struct in "stmmac:
   dwmac-mediatek: Reuse more common features" as Angelo's comments.

Changes in v12:
1. add a new patch "stmmac: dwmac-mediatek: re-arrange clock setting" to
   this series, to simplify clock handling in the driver, which also
   benefits the binding file mediatek-dwmac.yaml.
2. modify dt-binding description in patch "net: dt-bindings: dwmac: add
   support for mt8195" as Rob's comments in v10 series, put mac_cg to the
   end of clock list.
3. there are small changes in patch "stmmac: dwmac-mediatek: add support
   for mt8195", @AngeloGioacchino, please review it kindly.

Changes in v11:
1. add reviewed-by in "net: dt-bindings: dwmac: Convert mediatek-dwmac to
   DT schema" as Rob's comments.
2. fall back "net: dt-bindings: dwmac: add support for mt8195" to v8 version
   as mentioned in previous reply(https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20211216055328.15953-7-biao.huang@mediatek.com/):
   2.1 there is already a special clock named "rmii_internal", which need to
       be put to the end of the clock list(driver special handling),
       so we can't simply put new "mac_cg" for mt8195 to the end of the clock
       list.
   2.2 we prefer the if-then schema, which will make mt8195 clock list clearer
       with some duplicated information.
   2.3 we expect the future IC will follow mt2712 or mt8195, so we only need
       add new IC name to compatible list for future IC, and will not make the
       clock list binding files worse.

Changes in v10:
1. add detailed description in "arm64: dts: mt2712: update ethernet
   device node" to make the modifications clearer as Matthias's comments.
2. modify dt-binding description as Rob's comments, and "make dtbs_check" runs
   pass locally with "arm64: dts: mt2712: update ethernet device node"
   in this series.

Changes in v9:
1. remove oneOf for 1 entry as Rob's comments.
2. add new clocks to the end of existing clocks to simplify
   the binding as Rob's comments.

Changes in v8:
1. add acked-by in "stmmac: dwmac-mediatek: add platform level clocks
   management" patch

Changes in v7:
1. fix uninitialized warning as Jakub's comments.

Changes in v6:
1. update commit message as Jakub's comments.
2. split mt8195 eth dts patch("arm64: dts: mt8195: add ethernet device
   node") from this series, since the mt8195 dtsi/dts basic patches are still
   under review.
   https://patchwork.kernel.org/project/linux-mediatek/list/?series=579071
   we'll resend mt8195 eth dts patch once all the dependent patches are
   accepted.

Changes in v5:
1. remove useless inclusion in dwmac-mediatek.c as Angelo's comments.
2. add acked-by in "net-next: stmmac: dwmac-mediatek: add support for
   mt8195" patch

Changes in v4:
1. add changes in commit message in "net-next: dt-bindings: dwmac:
   Convert mediatek-dwmac to DT schema" patch.
2. remove ethernet-controller.yaml since snps,dwmac.yaml already include it.

Changes in v3:
1. Add prefix "net-next" to support new IC as Denis's suggestion.
2. Split dt-bindings to two patches, one for conversion, and the other for
   new IC.
3. add a new patch to update device node in mt2712-evb.dts to accommodate to
   changes in driver.
4. remove unnecessary wrapper as Angelo's suggestion.
5. Add acked-by in "net-next: stmmac: dwmac-mediatek: Reuse more common
   features" patch.

Changes in v2:
1. fix errors/warnings in mediatek-dwmac.yaml with upgraded dtschema tools

Changes in v1:
This series includes 5 patches:
1. add platform level clocks management for dwmac-mediatek
2. reuse more common features defined in stmmac_platform.c
3. add ethernet entry for mt8195
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

49045b9c

net: dt-bindings: dwmac: add support for mt8195 · ee410d51

Biao Huang authored Mar 14, 2022

Add binding document for the ethernet on mt8195.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee410d51

stmmac: dwmac-mediatek: add support for mt8195 · f2d356a6

Biao Huang authored Mar 14, 2022

Add Ethernet support for MediaTek SoCs from the mt8195 family.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2d356a6

net: dt-bindings: dwmac: Convert mediatek-dwmac to DT schema · 150b6add

Biao Huang authored Mar 14, 2022

Convert mediatek-dwmac to DT schema, and delete old mediatek-dwmac.txt.
The .yaml differs from the .txt in the following ways; everything else stays almost the same:
  1. compatible "const: snps,dwmac-4.20a".
  2. delete "snps,reset-active-low;" in example, since the driver removed this
     property long ago.
  3. add "snps,reset-delays-us = <0 10000 10000>" in example.
  4. the example is for rgmii interface, keep related properties only.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

150b6add

arm64: dts: mt2712: update ethernet device node · 79e11778

Biao Huang authored Mar 14, 2022

Since there are some changes in the ethernet driver, update the ethernet
device node in the dts to accommodate them.

1. stmmac_probe_config_dt() in stmmac_platform.c will initialize specified
   parameters according to compatible string "snps,dwmac-4.20a", so
   dwmac-mediatek.c can skip that initialization once the compatible string
   "snps,dwmac-4.20a" is added to the eth device node.
2. commit 882007ed ("net-next: dt-binding: dwmac-mediatek: add more
   description for RMII") added rmii internal support, we should add
   corresponding clocks/clocks-names in eth device node.
3. add "snps,reset-delays-us = <0 10000 10000>;" to ensure reset delay
   can meet PHY requirement.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

79e11778

stmmac: dwmac-mediatek: re-arrange clock setting · 4fe3075f

Biao Huang authored Mar 14, 2022

The rmii_internal clock is needed only when the PHY
interface is RMII and the reference clock comes from the MAC.

Re-arrange the clock setting as follows:
1. the optional "rmii_internal" clock is handled via devm_clk_get(),
2. the other clocks are still configured by devm_clk_bulk_get().
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4fe3075f
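The condition stated in the 4fe3075f entry above (the optional clock matters only for RMII with a MAC-provided reference clock) can be written as a small predicate. This is a sketch only: the enum and the function name are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of PHY interface modes, for illustration only. */
enum example_phy_mode {
	EXAMPLE_PHY_MODE_MII,
	EXAMPLE_PHY_MODE_RMII,
	EXAMPLE_PHY_MODE_RGMII,
};

/*
 * The optional "rmii_internal" clock matters only when the PHY interface is
 * RMII and the MAC supplies the reference clock.
 */
static bool example_needs_rmii_internal(enum example_phy_mode mode,
					bool refclk_from_mac)
{
	return mode == EXAMPLE_PHY_MODE_RMII && refclk_from_mac;
}
```

Keeping this clock out of the bulk list means boards that never use it do not have to describe it.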

stmmac: dwmac-mediatek: Reuse more common features · a71e67b2

Biao Huang authored Mar 14, 2022

This patch makes dwmac-mediatek reuse more features
supported by stmmac_platform.c.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a71e67b2

stmmac: dwmac-mediatek: add platform level clocks management · 3186bdad

Biao Huang authored Mar 14, 2022

This patch implements clks_config callback for dwmac-mediatek platform,
which could support platform level clocks management.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3186bdad

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 79b04108

David S. Miller authored Mar 16, 2022

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-03-15

Jacob Keller says:

The ice_sriov.c file now houses almost all of the virtualization code in the
ice driver. This includes both Single Root specific implementation as well
as generic functionality such as the virtchnl interface.

We are planning to implement support for Scalable IOV in the ice driver in
the future. This implementation will want to use the generic functionality
in ice_sriov.c

Rather than dump the Scalable IOV code into ice_sriov.c, we will want to
implement it in a separate file, ice_siov.c

To help with this, refactor the code in ice_sriov.c and split the generic
functionality out into separate files.

Reorganize code to make the non-implementation specific bits into new files
with the following general guidelines:

* ice_vf_lib.[ch]

Basic VF structures and accessors. This is where scheme-independent
code will reside.

* ice_virtchnl.[ch]

Virtchnl message handling. This is where the bulk of the logic for
processing messages from VFs using the virtchnl messaging scheme will
reside. This is separated from ice_vf_lib.c because it is somewhat
distinct and stand alone.

* ice_sriov.[ch]

Single Root IOV implementation, including initialization and the
routines for interacting with SR-IOV based netdev operations.

* (future) ice_siov.[ch]

Scalable IOV implementation.

The end goal is to make it easier to re-use the generic parts of the
virtualization logic while keeping separate the concerns of the Single Root
implementation.

In addition to the pure code moves, this series has a reset refactor which
cleans up the functionality to make it easier to reuse the reset code. A new
ops table is introduced to make the VF reset logic more generic. The Single
Root specific details are implemented in ice_sriov.c. A future series
implementing Scalable IOV support will use this ops table to allow re-use of
the reset logic which is now in ice_vf_lib.c
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

79b04108

net: sparx5: Use Switchdev fdb events for managing fdb entries · 9f01cfbf

Casper Andersson authored Mar 14, 2022

Changes the handling of fdb entries to use Switchdev events,
instead of the previous "sync_bridge" and "sync_port" which
only run when adding or removing VLANs on the bridge.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Link: https://lore.kernel.org/r/20220314160918.4rfrrfgmbsf2pxl3@wse-c0155
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

9f01cfbf

net: Add l3mdev index to flow struct and avoid oif reset for port devices · 40867d74

David Ahern authored Mar 14, 2022

The fundamental premise of VRF and l3mdev core code is binding a socket
to a device (l3mdev or netdev with an L3 domain) to indicate L3 scope.
Legacy code resets flowi_oif to the l3mdev losing any original port
device binding. Ben (among others) has demonstrated use cases where the
original port device binding is important and needs to be retained.
This patch handles that by adding a new entry to the common flow struct
that can indicate the l3mdev index for later rule and table matching
avoiding the need to reset flowi_oif.

In addition to allowing more use cases that require port device binds,
this patch brings a few datapath simplifications:

1. l3mdev_fib_rule_match is only called when walking fib rules and
always after l3mdev_update_flow. That allows an optimization to bail
early for non-VRF type use cases when flowi_l3mdev is not set. Also,
only that index needs to be checked for the FIB table id.

2. l3mdev_update_flow can be called with flowi_oif set to a l3mdev
(e.g., VRF) device. By resetting flowi_oif only for this case the
FLOWI_FLAG_SKIP_NH_OIF flag is no longer needed and can be removed,
removing several checks in the datapath. The flowi_iif path can be
simplified to only be called if it is not loopback (loopback can
not be assigned to an L3 domain) and the l3mdev index is not already
set.

3. Avoid another device lookup in the output path when the fib lookup
returns a reject failure.

Note: 2 functional tests for local traffic with reject fib rules are
updated to reflect the new direct failure at FIB lookup time for ping
rather than the failure on packet path. The current code fails like this:

HINT: Fails since address on vrf device is out of device scope
COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
ping: Warning: source address might be selected on device other than: eth1
PING 172.16.3.1 (172.16.3.1) from 172.16.3.1 eth1: 56(84) bytes of data.

--- 172.16.3.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

where the test now directly fails:

HINT: Fails since address on vrf device is out of device scope
COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
ping: connect: No route to host
Signed-off-by: David Ahern <dsahern@kernel.org>
Tested-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20220314204551.16369-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

40867d74
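The idea in the 40867d74 entry above, keeping the port binding in the output-interface field and recording the L3 domain in a separate field, can be sketched with a toy flow key. Everything here is hypothetical (struct example_flow, the example_* helpers, and the interface numbers 3 and 10); it models the behaviour described in the message, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy flow key: only the fields relevant to this sketch. */
struct example_flow {
	int oif;	/* output interface the socket was bound to */
	int l3mdev;	/* index of the L3 master device, 0 if none */
};

/* Assumption for the example: ifindex 10 is a VRF, port 3 is enslaved to it. */
static bool example_is_l3mdev(int ifindex)
{
	return ifindex == 10;
}

static int example_master_ifindex(int ifindex)
{
	return ifindex == 3 ? 10 : 0;
}

/* Record the L3 domain in its own field; keep a port binding in oif intact. */
static void example_update_flow(struct example_flow *fl)
{
	if (fl->l3mdev)
		return;		/* already resolved */
	if (example_is_l3mdev(fl->oif)) {
		/* Bound to the VRF device itself: only here is oif reset. */
		fl->l3mdev = fl->oif;
		fl->oif = 0;
		return;
	}
	fl->l3mdev = example_master_ifindex(fl->oif);
}

/* Rule matching bails out early for flows that have no l3mdev index. */
static bool example_fib_rule_match(const struct example_flow *fl, int vrf_ifindex)
{
	if (!fl->l3mdev)
		return false;
	return fl->l3mdev == vrf_ifindex;
}
```

A flow bound to the enslaved port keeps oif set to that port while still matching the VRF's rule through the separate index.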

15 Mar, 2022 22 commits

ice: remove PF pointer from ice_check_vf_init · 5a57ee83

Jacob Keller authored Feb 22, 2022

The ice_check_vf_init function takes both a PF and a VF pointer. Every
caller looks up the PF pointer from the VF structure. Some callers use
the PF pointer only to call this function. Move the lookup inside
ice_check_vf_init and drop the unnecessary argument.

Clean up the callers to drop the now unnecessary local variables. In
particular, replace the local PF pointer with a HW structure pointer in
ice_vc_get_vf_res_msg which simplifies a few accesses to the HW
structure in that function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

5a57ee83

ice: introduce ice_virtchnl.c and ice_virtchnl.h · bf93bf79

Jacob Keller authored Feb 22, 2022

Just as we moved the generic virtualization library logic into
ice_vf_lib.c, move the virtchnl message handling into ice_virtchnl.c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

bf93bf79

ice: cleanup long lines in ice_sriov.c · 8cf52bec

Jacob Keller authored Feb 22, 2022

Before we move the virtchnl message handling from ice_sriov.c into
ice_virtchnl.c, clean up some long line warnings to avoid checkpatch.pl
complaints.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

8cf52bec

ice: introduce ICE_VF_RESET_LOCK flag · f5f085c0

Jacob Keller authored Feb 22, 2022

The ice_reset_vf function performs actions which must be taken only
while holding the VF configuration lock. Some flows already acquired the
lock, while other flows must acquire it just for the reset function. Add
the ICE_VF_RESET_LOCK flag to the function so that it can take and
release the lock itself at the appropriate scope.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

f5f085c0

ice: introduce ICE_VF_RESET_NOTIFY flag · 9dbb33da

Jacob Keller authored Feb 22, 2022

In some cases of resetting a VF, the PF would like to first notify the
VF that a reset is impending. This is currently done via
ice_vc_notify_vf_reset. A wrapper to ice_reset_vf, ice_vc_reset_vf, is
used to call this function first before calling ice_reset_vf.

In fact, every single call to ice_vc_notify_vf_reset occurs just prior
to a call to ice_vc_reset_vf.

Now that ice_reset_vf has flags, replace this separate call with an
ICE_VF_RESET_NOTIFY flag. This removes the unnecessary exported function
ice_vc_notify_vf_reset, and leaves a single function for resetting VFs
(ice_reset_vf).
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

9dbb33da

ice: convert ice_reset_vf to take flags · 7eb517e4

Jacob Keller authored Feb 22, 2022

The ice_reset_vf function takes a boolean parameter which indicates
whether or not the reset is due to a VFLR event.

This is somewhat confusing to read because readers must interpret what
"true" and "false" mean when seeing a line of code like
"ice_reset_vf(vf, false)".

We will want to add another toggle to the ice_reset_vf in a following
change. To avoid proliferating many arguments, convert this function to
take flags instead. ICE_VF_RESET_VFLR will indicate if this is a VFLR
reset. A value of 0 indicates no flags.

One could argue that "ice_reset_vf(vf, 0)" is no more readable than
"ice_reset_vf(vf, false)". However, this type of flags interface is
somewhat common and using 0 to mean "no flags" makes sense in this
context. We could bother to add a define for "ICE_VF_RESET_PLAIN" or
something similar, but this can be confusing since it's not an actual bit
flag.

This paves the way to add another flag to the function in a following
change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

7eb517e4
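The flags interface described in the 7eb517e4 entry above, together with the ICE_VF_RESET_NOTIFY and ICE_VF_RESET_LOCK flags from the two entries before it, might look roughly like the sketch below. The flag names come from the commit messages; the bit positions, struct example_vf and example_reset_vf() are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names come from the commit messages; the bit positions are assumed. */
enum example_vf_reset_flags {
	ICE_VF_RESET_VFLR   = 1u << 0,	/* reset was triggered by a VFLR event */
	ICE_VF_RESET_NOTIFY = 1u << 1,	/* notify the VF before resetting it */
	ICE_VF_RESET_LOCK   = 1u << 2,	/* take the VF config lock inside the reset */
};

/* Toy VF state used only to observe what the reset did. */
struct example_vf {
	bool cfg_locked;
	bool last_was_vflr;
	unsigned int notified;
	unsigned int resets;
};

static int example_reset_vf(struct example_vf *vf, unsigned int flags)
{
	if (flags & ICE_VF_RESET_NOTIFY)
		vf->notified++;

	if (flags & ICE_VF_RESET_LOCK) {
		assert(!vf->cfg_locked);	/* a real mutex would block here */
		vf->cfg_locked = true;
	}

	/* Without ICE_VF_RESET_LOCK the caller is expected to hold the lock. */
	assert(vf->cfg_locked);
	vf->last_was_vflr = (flags & ICE_VF_RESET_VFLR) != 0;
	vf->resets++;

	if (flags & ICE_VF_RESET_LOCK)
		vf->cfg_locked = false;
	return 0;
}
```

A call such as example_reset_vf(vf, ICE_VF_RESET_NOTIFY | ICE_VF_RESET_LOCK) names each behaviour at the call site, which a pair of booleans would not.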

ice: convert ice_reset_vf to standard error codes · 4fe193cc

Jacob Keller authored Feb 22, 2022

The ice_reset_vf function returns a boolean value indicating whether or
not the VF was reset. This is a bit confusing since it means that callers
need to know how to interpret the return value when needing to indicate
an error.

Refactor the function and call sites to report a regular error code. We
still report success (i.e. return 0) in cases where the reset is in
progress or is disabled.

Existing callers don't care because they do not check the return value.
We keep the error code anyways instead of a void return because we
expect future code which may care about or at least report the error
value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

4fe193cc
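The return convention described in the 4fe193cc entry above (0 for success, including the "in progress" and "disabled" cases, and a negative error code otherwise) can be illustrated as follows. The state structure, the function name and the choice of -EIO are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy VF state; field names are made up for this sketch. */
struct example_vf_state {
	bool disabled;		/* VF is disabled, nothing to reset */
	bool resetting;		/* a reset is already in progress */
	bool hw_ok;		/* whether the simulated hardware reset succeeds */
};

/* Returns 0 on success (including "nothing to do"), negative errno on failure. */
static int example_reset_vf_status(const struct example_vf_state *vf)
{
	if (vf->disabled)
		return 0;
	if (vf->resetting)
		return 0;
	if (!vf->hw_ok)
		return -EIO;	/* error code chosen for illustration only */
	return 0;
}
```

Callers can then use the usual "if (err)" idiom without having to remember what true or false meant.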

ice: make ice_reset_all_vfs void · fe99d1c0

Jacob Keller authored Feb 22, 2022

The ice_reset_all_vfs function returns true if any VFs were reset, and
false otherwise. However, no callers check the return value.

Drop this return value and make the function void since the callers do
not care about this.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

fe99d1c0

ice: drop is_vflr parameter from ice_reset_all_vfs · dac57288

Jacob Keller authored Feb 22, 2022

The ice_reset_all_vfs function takes a parameter to indicate whether it's
operating after a VFLR event or not. This is not necessary as every
caller always passes true. Simplify the interface by removing the
parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

dac57288

ice: move reset functionality into ice_vf_lib.c · 16686d7f

Jacob Keller authored Feb 22, 2022

Now that the reset functions do not rely on Single Root specific
behavior, move the ice_reset_vf, ice_reset_all_vfs, and
ice_vf_rebuild_host_cfg functions and their dependent helper functions
out of ice_sriov.c and into ice_vf_lib.c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

16686d7f

ice: fix a long line warning in ice_reset_vf · 5de95744

Jacob Keller authored Feb 22, 2022

We're about to move ice_reset_vf out of ice_sriov.c and into
ice_vf_lib.c

One of the dev_err statements has a checkpatch.pl violation due to
putting the vf->vf_id on the same line as the dev_err. Fix this style
issue first before moving the code.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

5de95744

ice: introduce VF operations structure for reset flows · 9c6f7878

Jacob Keller authored Feb 22, 2022

The ice driver currently supports virtualization using Single Root IOV,
with code in the ice_sriov.c file. In the future, we plan to also
implement support for Scalable IOV, which uses slightly different
hardware implementations for some functionality.

To eventually allow this, we introduce a new ice_vf_ops structure which
will contain the basic operations that are different between the two IOV
implementations. This primarily includes logic for how to handle the VF
reset registers, as well as what to do before and after rebuilding the
VF's VSI.

Implement these ops structures and call the ops table instead of
directly calling the SR-IOV specific function. This will allow us to
easily add the Scalable IOV implementation in the future. Additionally,
it helps separate the generalized VF logic from SR-IOV specifics. This
change allows us to move the reset logic out of ice_sriov.c and into
ice_vf_lib.c without placing any Single Root specific details into the
generic file.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

9c6f7878
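The ops-table idea from the 9c6f7878 entry above can be sketched as a struct of function pointers that generic reset code calls through. The hook names, struct example_vf and the sriov_* functions below are illustrative assumptions and are not the driver's actual ice_vf_ops members.

```c
#include <assert.h>
#include <stdbool.h>

struct example_vf;

/* Hypothetical ops table; hook names are illustrative, not the driver's. */
struct example_vf_ops {
	void (*trigger_reset)(struct example_vf *vf);
	bool (*poll_reset_done)(struct example_vf *vf);
	void (*post_rebuild)(struct example_vf *vf);
};

struct example_vf {
	const struct example_vf_ops *ops;
	unsigned int trace;	/* bitmask of hooks that ran */
};

/* Single Root flavoured hooks; each one just records that it ran. */
static void sriov_trigger_reset(struct example_vf *vf)
{
	vf->trace |= 1u << 0;
}

static bool sriov_poll_reset_done(struct example_vf *vf)
{
	vf->trace |= 1u << 1;
	return true;
}

static void sriov_post_rebuild(struct example_vf *vf)
{
	vf->trace |= 1u << 2;
}

static const struct example_vf_ops example_sriov_ops = {
	.trigger_reset = sriov_trigger_reset,
	.poll_reset_done = sriov_poll_reset_done,
	.post_rebuild = sriov_post_rebuild,
};

/* Generic reset flow: knows the order of steps, not how each one is done. */
static int example_generic_reset(struct example_vf *vf)
{
	vf->ops->trigger_reset(vf);
	if (!vf->ops->poll_reset_done(vf))
		return -1;
	/* implementation-independent rebuild work would go here */
	vf->ops->post_rebuild(vf);
	return 0;
}
```

A second implementation would supply its own table and reuse example_generic_reset() unchanged, which is the reuse the commit message is aiming for.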

ice: fix incorrect dev_dbg print mistaking 'i' for vf->vf_id · f5840e0d

Jacob Keller authored Feb 22, 2022

If we fail to clear the malicious VF indication after a VF reset, the
dev_dbg message which is printed uses the local variable 'i' when it
meant to use vf->vf_id. Fix this.

Fixes: 0891c896 ("ice: warn about potentially malicious VFs")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

f5840e0d

ice: introduce ice_vf_lib.c, ice_vf_lib.h, and ice_vf_lib_private.h · 109aba47

Jacob Keller authored Feb 22, 2022

Introduce the ice_vf_lib.c file along with the ice_vf_lib.h and
ice_vf_lib_private.h header files.

These files will house the generic VF structures and access functions.
Move struct ice_vf and its dependent definitions into this new header
file.

The ice_vf_lib.c is compiled conditionally on CONFIG_PCI_IOV. Some of
its functionality is required by all driver files. However, some of its
functionality will only be required by other files also conditionally
compiled based on CONFIG_PCI_IOV.

Declaring these functions used only in CONFIG_PCI_IOV files in
ice_vf_lib.h is verbose. This is because we must provide a fallback
implementation for each function in this header since it is included in
files which may not be compiled with CONFIG_PCI_IOV.

Instead, introduce a new ice_vf_lib_private.h header which verifies that
CONFIG_PCI_IOV is enabled. This header is intended to be directly
included in .c files which are CONFIG_PCI_IOV only. Add a #error
indication that will complain if the file ever gets included by another
C file on a kernel with CONFIG_PCI_IOV disabled. Add a comment
indicating the nature of the file and why it is useful.

This makes it so that we can easily define functions exposed from
ice_vf_lib.c into other virtualization files without needing to add
fallback implementations for every single function.

This begins the path to separate out generic code which will be reused
by other virtualization implementations from ice_sriov.h and ice_sriov.c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

109aba47
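The two header patterns contrasted in the 109aba47 entry above can be shown side by side in one file. This is a sketch under stated assumptions: EXAMPLE_CONFIG_PCI_IOV stands in for the real Kconfig symbol, and both functions are hypothetical.

```c
#include <assert.h>

/* Pretend the build was configured with IOV support for this sketch. */
#define EXAMPLE_CONFIG_PCI_IOV 1

/*
 * Public header pattern: the header is included everywhere, so every
 * function needs a fallback stub for builds without IOV.
 */
#ifdef EXAMPLE_CONFIG_PCI_IOV
static int example_vf_count(void)
{
	return 4;	/* stands in for the real implementation */
}
#else
static inline int example_vf_count(void)
{
	return 0;	/* fallback when IOV is compiled out */
}
#endif

/*
 * Private header pattern: refuse to build without IOV, so declarations
 * below this check need no fallback stubs at all.
 */
#ifndef EXAMPLE_CONFIG_PCI_IOV
#error "private VF header included in a build without IOV support"
#endif

static int example_vf_private_helper(int vf_id)
{
	return vf_id * 2;
}
```

Removing the #define at the top makes the build stop at the #error line, which is the early failure the private header is meant to provide.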

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · c84d86a0

Jakub Kicinski authored Mar 15, 2022

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-03-14

Jacob Keller says:

The ice_virtchnl_pf.c file has become a single place for a lot of
virtualization functionality. This includes most of the virtchnl message
handling, integration with kernel hooks like the .ndo operations, reset
logic, and more.

We are planning in the future to implement and support Scalable IOV in the
ice driver. To do this, much (but not all) of the code in ice_virtchnl_pf.c
will want to be reused.

Rather than dump all of the Scalable IOV implementation into
ice_virtchnl_pf.c it makes sense to house it in a separate file. But that
still leaves all of the Single Root IOV code littered among more generic
logic.

The long term goal is to re-organize the code such that generic re-usable
code is split into separate files. The ice_sriov.c file would end up
containing all of the Single Root IOV implementation specific details, while
ice_vf_lib.[ch] and ice_virtchnl.[ch] contain the generic pieces.

As a first step, notice that ice_sriov.c currently does not contain much of
the SR-IOV implementation. This is housed primarily in ice_virtchnl_pf.c

The code in ice_sriov.c is really generic and relates to the VF mailbox,
including mailbox overflow detection.

Rename ice_sriov.c to ice_vf_mbx.c, and then rename ice_virtchnl_pf.c to
ice_sriov.c

A later series will finish the refactor by splitting ice_sriov.c into
multiple files, moving the generic code into ice_vf_lib.c and ice_virtchnl.c

To prepare for that series, perform some basic cleanup and other refactors
that we've accumulated during this development cycle.

This series builds on top of the recent hash table refactor work.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: use ice_is_vf_trusted helper function
  ice: log an error message when eswitch fails to configure
  ice: cleanup error logging for ice_ena_vfs
  ice: move ice_set_vf_port_vlan near other .ndo ops
  ice: refactor spoofchk control code in ice_sriov.c
  ice: rename ICE_MAX_VF_COUNT to avoid confusion
  ice: remove unused definitions from ice_sriov.h
  ice: convert vf->vc_ops to a const pointer
  ice: remove circular header dependencies on ice.h
  ice: rename ice_virtchnl_pf.c to ice_sriov.c
  ice: rename ice_sriov.c to ice_vf_mbx.c
====================

Link: https://lore.kernel.org/r/20220315011155.2166817-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c84d86a0

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next · abe2fec8

Jakub Kicinski authored Mar 15, 2022

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

1) Revert CHECKSUM_UNNECESSARY for UDP packet from conntrack.

2) Reject unsupported families when creating tables, from Phil Sutter.

3) GRE support for the flowtable, from Toshiaki Makita.

4) Add GRE offload support for act_ct, also from Toshiaki.

5) Update mlx5 driver to support GRE flowtable offload,
   from Toshiaki Makita.

6) Oneliner to clean up incorrect indentation in nf_conntrack_bridge,
   from Jiapeng Chong.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: bridge: clean up some inconsistent indenting
  net/mlx5: Support GRE conntrack offload
  act_ct: Support GRE offload
  netfilter: flowtable: Support GRE
  netfilter: nf_tables: Reject tables of unsupported family
  Revert "netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY"
====================

Link: https://lore.kernel.org/r/20220315091513.66544-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

abe2fec8

net: mscc: ocelot: fix build error due to missing IEEE_8021QAZ_MAX_TCS · 72f56fdb

Vladimir Oltean authored Mar 15, 2022

IEEE_8021QAZ_MAX_TCS is defined in include/uapi/linux/dcbnl.h, which is
included by net/dcbnl.h. Then, linux/netdevice.h conditionally includes
net/dcbnl.h if CONFIG_DCB is enabled.

Therefore, when CONFIG_DCB is disabled, this indirect dependency is
broken.

There isn't a good reason to include net/dcbnl.h headers into the ocelot
switch library which exports low-level hardware API, so replace
IEEE_8021QAZ_MAX_TCS with OCELOT_NUM_TC which has the same value.

Fixes: 978777d0 ("net: dsa: felix: configure default-prio and dscp priorities")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220315131215.273450-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

72f56fdb
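
The breakage described in the ocelot fix above comes from relying on a macro that is only visible through a conditionally included header. A minimal sketch of that pattern follows, with the kernel headers replaced by inline stand-ins. The `#ifdef` layout, the value 8, and the `ocelot_prio_is_valid()` helper are illustrative assumptions, not the driver's actual code.

```c
#include <stdbool.h>

/*
 * Stand-in for linux/netdevice.h: the dcbnl definitions are only
 * pulled in when CONFIG_DCB is defined. This sketch leaves it
 * undefined on purpose, so IEEE_8021QAZ_MAX_TCS does not exist here.
 */
#ifdef CONFIG_DCB
#define IEEE_8021QAZ_MAX_TCS 8
#endif

/* Stand-in for the ocelot library's own constant (assumed value). */
#define OCELOT_NUM_TC 8

/* Hypothetical helper: depends only on the library's own constant. */
static bool ocelot_prio_is_valid(int prio)
{
	return prio >= 0 && prio < OCELOT_NUM_TC;
}
```

Swapping `OCELOT_NUM_TC` for `IEEE_8021QAZ_MAX_TCS` in the helper would fail to compile in this sketch, which mirrors the CONFIG_DCB=n build error.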

net: sparx5: fix a couple warning messages · c24f6577

Dan Carpenter authored Mar 14, 2022

The WARN_ON() macro takes a condition, not a warning message.

Fixes: 0933bd04 ("net: sparx5: Add support for ptp clocks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220314140327.GB30883@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

c24f6577
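
To illustrate the point of the sparx5 fix above, here is a small userspace sketch of why passing a message to a condition-only macro goes wrong. The `WARN_ON()` and `WARN()` macros below are simplified stand-ins written for this example (the kernel versions also print a stack trace), and the message text and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Last message recorded by the stand-in macros below. */
static char last_warning[64];

/* Hypothetical userspace stand-in for WARN_ON(): condition only. */
#define WARN_ON(cond) \
	((cond) ? (snprintf(last_warning, sizeof(last_warning), "WARNING"), true) : false)

/* Hypothetical userspace stand-in for WARN(): condition plus message. */
#define WARN(cond, msg) \
	((cond) ? (snprintf(last_warning, sizeof(last_warning), "WARNING: %s", (msg)), true) : false)

/*
 * Buggy form: the message pointer is used as the condition. It is
 * non-NULL, so the warning always fires, and the text is never logged.
 */
static bool report_buggy(void)
{
	const char *msg = "bad state";

	return WARN_ON(msg);
}

/* Fixed form: an explicit condition, with the message passed separately. */
static bool report_fixed(void)
{
	return WARN(1, "bad state");
}
```

Both functions trigger a warning, but only `report_fixed()` records the intended text.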

Merge branch 'netdevsim-support-for-l3-hw-stats' · 583024cf

Paolo Abeni authored Mar 15, 2022

Petr Machata says:

====================
netdevsim: Support for L3 HW stats

"L3 stats" is a suite of interface statistics aimed at reflecting traffic
taking place in a HW device, on an object corresponding to some software
netdevice. Support for this stats suite has been added recently, in commit
ca0a53dc ("Merge branch 'net-hw-counters-for-soft-devices'").

In this patch set:

- Patch #1 adds support for L3 stats to netdevsim.

  Real devices can have various conditions for when an L3 counter is
  available. To simulate this, netdevsim maintains a list of devices
  suitable for HW stats collection. Only when l3_stats is enabled on both a
  netdevice itself, and in netdevsim, will netdevsim contribute values to
  L3 stats.

  This enablement and disablement is done via debugfs:

    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/enable_ifindex
    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/disable_ifindex

  Besides this, there is a third toggle to mark a device for future failure:

    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/fail_next_enable

- This allows HW-independent testing of stats reporting and in-kernel APIs,
  as well as a test for enablement rollback, which is difficult to do
  otherwise. This netdevsim-specific selftest is added in patch #2.

- Patch #3 adds another driver-specific selftest, namely a test aimed at
  checking mlxsw-induced stats monitoring events.

====================

Link: https://lore.kernel.org/r/cover.1647265833.git.petrm@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

583024cf
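
The debugfs toggles named in the cover letter above can also be driven from a test program rather than from `echo`. The sketch below assumes the path layout shown in the cover letter; the helper names `nsim_l3_toggle_path()` and `nsim_write_ifindex()` are made up for this example and are not part of netdevsim or its selftests.

```c
#include <stdio.h>

/*
 * Build the path of one of the L3 hwstats toggles. `dev` is the
 * netdevsim instance name and `toggle` is one of "enable_ifindex",
 * "disable_ifindex" or "fail_next_enable".
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int nsim_l3_toggle_path(char *buf, size_t len,
			       const char *dev, const char *toggle)
{
	int n = snprintf(buf, len,
			 "/sys/kernel/debug/netdevsim/%s/hwstats/l3/%s",
			 dev, toggle);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write an ifindex to a toggle file, like `echo $ifindex > path`. */
static int nsim_write_ifindex(const char *path, int ifindex)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", ifindex) > 0;
	/* fclose() flushes, so a rejected write can surface here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Writing to debugfs needs root and a loaded netdevsim instance, so outside such a setup `nsim_write_ifindex()` is expected to return -1.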

selftests: mlxsw: hw_stats_l3: Add a new test · ed2ae69c

Petr Machata authored Mar 14, 2022

Add a test that verifies that UAPI notifications are emitted, as mlxsw
installs and deinstalls HW counters for the L3 offload xstats.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

ed2ae69c

selftests: netdevsim: hw_stats_l3: Add a new test · 9b18942e

Petr Machata authored Mar 14, 2022

Add a test that verifies basic UAPI contracts, netdevsim operation,
rollbacks after partial enablement in core, and UAPI notifications.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

9b18942e

netdevsim: Introduce support for L3 offload xstats · 1a6d7ae7

Petr Machata authored Mar 14, 2022

Add support for testing of HW stats support that was added recently, namely
the L3 stats support. L3 stats are provided for devices for which the L3
stats have been turned on, and that were enabled for netdevsim through a
debugfs toggle:

    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/enable_ifindex

For fully enabled netdevices, netdevsim counts 10pps of ingress traffic and
20pps of egress traffic. Similarly, L3 stats can be disabled for a given
device, and netdevsim ceases pretending there is any HW traffic going on:

    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/disable_ifindex

Besides this, there is a third toggle to mark a device for future failure:

    # echo $ifindex > /sys/kernel/debug/netdevsim/$DEV/hwstats/l3/fail_next_enable

A future request to enable L3 stats on such netdevice will be bounced by
netdevsim:

    # ip -j l sh dev d | jq '.[].ifindex'
    66
    # echo 66 > /sys/kernel/debug/netdevsim/netdevsim10/hwstats/l3/enable_ifindex
    # echo 66 > /sys/kernel/debug/netdevsim/netdevsim10/hwstats/l3/fail_next_enable
    # ip stats set dev d l3_stats on
    Error: netdevsim: Stats enablement set to fail.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

1a6d7ae7
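
The interaction of the three toggles described in the commit above can be modeled in a few lines. This is a simplified sketch, not the netdevsim implementation: the struct, field, and function names are invented for illustration, and it assumes that a `fail_next_enable` mark is consumed by the first enable request it bounces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-netdevice state kept by the simulated driver. */
struct l3_sim_dev {
	bool nsim_enabled;	/* set via enable_ifindex, cleared via disable_ifindex */
	bool fail_next;		/* set via fail_next_enable */
	bool l3_on;		/* an "l3_stats on" request has succeeded */
	uint64_t rx_packets;
	uint64_t tx_packets;
};

/* Request to turn L3 stats on; bounced if the device is marked to fail. */
static int l3_sim_stats_on(struct l3_sim_dev *d)
{
	if (d->fail_next) {
		d->fail_next = false;	/* assumption: the failure is one-shot */
		return -1;
	}
	d->l3_on = true;
	return 0;
}

/* Advance simulated time; only fully enabled devices accumulate traffic. */
static void l3_sim_tick(struct l3_sim_dev *d, unsigned int seconds)
{
	if (!d->nsim_enabled || !d->l3_on)
		return;
	d->rx_packets += 10ULL * seconds;	/* 10pps of ingress traffic */
	d->tx_packets += 20ULL * seconds;	/* 20pps of egress traffic */
}
```

In this model a device marked with `fail_next` rejects the first `l3_sim_stats_on()` call and counts nothing until a later call succeeds, which is the rollback scenario the selftests are meant to exercise.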