- 24 Mar, 2021 40 commits
-
-
Ido Schimmel authored
The number of nexthop buckets in a resilient nexthop group never changes, so when the gateway address of a nexthop cannot be resolved, the nexthop buckets are programmed to trap packets to the CPU in order to trigger resolution. For example: # ip nexthop add id 1 via 198.51.100.1 dev swp3 # ip nexthop add id 10 group 1 type resilient buckets 32 # ip nexthop bucket get id 10 index 0 id 10 index 0 idle_time 1.44 nhid 1 trap Where 198.51.100.1 is a made-up IP. Test that in this case packets are indeed trapped to the CPU via the unresolved neigh trap. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Now that mlxsw supports resilient nexthop groups, allow them to be programmed after validating that their configuration conforms to the device's limitations (e.g., number of buckets is within predefined range). Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The kernel periodically checks the idle time of nexthop buckets to determine if they are idle and can be re-populated with a new nexthop. When the resilient nexthop group is offloaded to hardware, the kernel will not see activity on nexthop buckets unless it is reported from hardware. Therefore, periodically (every 1 second) query the hardware for activity of adjacency entries used as part of a resilient nexthop group and report it to the nexthop code. The activity is only queried if resilient nexthop groups are in use. The delayed work is canceled otherwise. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The RATRAD register is used to dump and optionally clear activity bits of router adjacency table entries. Will be used by the next patch to query and clear the activity of nexthop buckets in a resilient nexthop group. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
So far, mlxsw only updated hardware flags ('offload' / 'trap') on nexthop objects. For resilient nexthop groups, these flags need to be updated on individual nexthop buckets as well. Update these flags whenever updating the flags of the encapsulating nexthop object and whenever a nexthop bucket is replaced. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Replace a single nexthop bucket upon receiving a 'NEXTHOP_EVENT_BUCKET_REPLACE' notification. When the 'force' parameter is not set, instruct the device to only overwrite an adjacency entry if its activity is cleared, so as not to break existing flows using the adjacency entry. The device does not provide feedback if the replacement was successful in this case, so the contents of the adjacency entry after the replacement are compared with the replacement request. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Have the caller pass a pointer to the payload of the RATR register to the function updating a single nexthop / adjacency entry. In a subsequent patch, this will allow the caller to make sure replacement was successful by querying the state of the adjacency entry after replacement and comparing with the initial request. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Allow the driver to instruct the device to only overwrite an adjacency entry if its activity is cleared. Currently, adjacency entry is always overwritten, regardless of activity. This will be used by subsequent patches to prevent replacement of active nexthop buckets. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Parse the configuration of resilient nexthop groups to existing mlxsw data structures. Unlike non-resilient groups, nexthops without a valid MAC or router interface (RIF) are programmed with a trap action instead of not being programmed at all. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cooper Lees authored
- The Open Routing (Open/R) network protocol netlink handler uses ID 99 - Will also add to `/etc/iproute2/rt_protos` once this is accepted - For more information: https://github.com/facebook/openrSigned-off-by: From: Cooper Lees <me@cooperlees.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
When enetc runs out of exact match entries for unicast address filtering, it switches to an approach based on hash tables, where multiple MAC addresses might end up in the same bucket. However, the enetc_set_mac_ht_flt function currently depends on the system endianness, because it interprets the 64-bit hash value as an array of two u32 elements. Modify this to use lower_32_bits and upper_32_bits. Tested by forcing enetc to go into hash table mode by creating two macvlan upper interfaces: ip link add link eno0 address 00:01:02:03:00:00 eno0.0 type macvlan && ip link set eno0.0 up ip link add link eno0 address 00:01:02:03:00:01 eno0.1 type macvlan && ip link set eno0.1 up and verified that the same bit values are written to the registers before and after: enetc_sync_mac_filters: addr 00:00:80:00:40:10 exact match 0 enetc_sync_mac_filters: addr 00:00:00:00:80:00 exact match 0 enetc_set_mac_ht_flt: hash 0x80008000000000 UMHFR0 0x0 UMHFR1 0x800080 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
ENETC has a 64-entry hash table for VLAN RX filtering per Station Interface, which is accessed through two 32-bit registers: VHFR0 holding the low portion, and VHFR1 holding the high portion. The enetc_set_vlan_ht_filter function looks at the pf->vlan_ht_filter bitmap, which is fundamentally an unsigned long variable, and casts it to a u32 array of two elements. It puts the first u32 element into VHFR0 and the second u32 element into VHFR1. It is easy to imagine that this will not work on big endian systems (although, yes, we have bigger problems, because currently enetc assumes that the CPU endianness is equal to the controller endianness, aka little endian - but let's assume that we could add a cpu_to_le32 in enetc_wd_reg and a le32_to_cpu in enetc_rd_reg). Let's use lower_32_bits and upper_32_bits which are designed to work regardless of endianness. Tested that both the old and the new method produce the same results: $ ethtool -K eth1 rx-vlan-filter on $ ip link add link eth1 name eth1.100 type vlan id 100 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x20 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x20 $ ip link add link eth1 name eth1.101 type vlan id 101 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x30 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x30 $ ip link add link eth1 name eth1.34 type vlan id 34 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x34 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x34 $ ip link add link eth1 name eth1.1024 type vlan id 1024 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x1 VHFR1 0x34 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x1 VHFR1 0x34 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Use memdup_user_nul() helper instead of open-coding to simplify the code. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sai Kalyaan Palla authored
Made changes to coding style as suggested by checkpatch.pl changes are of the type: space required before the open parenthesis '(' space required after that ',' Signed-off-by: Sai Kalyaan Palla <saikalyaan63@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Wong Vee Khee says: ==================== Add support for Clause-45 PHY Loopback This patch series add support for Clause-45 PHY loopback. It involves adding a generic API in the PHY framework, which can be accessed by all C45 PHY drivers using the .set_loopback callback. Also, enable PHY loopback for the Marvell 88x3310/88x2110 driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wong Vee Khee authored
Add support for PHY loopback for Marvell 88x2110 and Marvell 88x3310. This allow user to perform PHY loopback test using ethtool selftest. Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wong Vee Khee authored
Add generic code to enable C45 PHY loopback into the common phy-c45.c file. This will allow C45 PHY drivers aceess this by setting .set_loopback. Suggested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
sprintf() is declared with a restrict keyword to not allow input and output to point to the same buffer: lib/test_rhashtable.c: In function 'print_ht': lib/test_rhashtable.c:504:4: error: 'sprintf' argument 3 overlaps destination object 'buff' [-Werror=restrict] 504 | sprintf(buff, "%s\nbucket[%d] -> ", buff, i); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/test_rhashtable.c:489:7: note: destination object referenced by 'restrict'-qualified argument 1 was declared here 489 | char buff[512] = ""; | ^~~~ Rework this function to remember the last offset instead to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
When compile testing this driver on a platform on which probe() is known to fail at compile time, gcc warns about the cgx_lmactype_string[] array being uninitialized: In function 'strncpy', inlined from 'link_status_user_format' at /git/arm-soc/drivers/net/ethernet/marvell/octeontx2/af/cgx.c:838:2, inlined from 'cgx_link_change_handler' at /git/arm-soc/drivers/net/ethernet/marvell/octeontx2/af/cgx.c:853:2: include/linux/fortify-string.h:27:30: error: argument 2 null where non-null expected [-Werror=nonnull] 27 | #define __underlying_strncpy __builtin_strncpy Address this by turning the runtime initialization into a fixed array, which should also produce better code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tan Tee Min authored
Cross timestamping is supported on Integrated Ethernet Controller in Intel SoC such as EHL and TGL with Always Running Timer. The hardware cross-timestamp result is made available to applications through the PTP_SYS_OFFSET_PRECISE ioctl which calls stmmac_getcrosststamp(). Device time is stored in the MAC Auxiliary register. The 64-bit System time (ART timestamp) is stored in registers that are only addressable by using MDIO space. Signed-off-by: Tan Tee Min <tee.min.tan@intel.com> Co-developed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bhaskar Chowdhury authored
s/maintaning/maintaining/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bhaskar Chowdhury authored
s/procdure/procedure/ s/maintanance/maintenance/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bhaskar Chowdhury authored
s/preceeds/precedes/ .....two different places s/rsponse/response/ s/cetain/certain/ s/precison/precision/ Fix a sentence construction as per suggestion. Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Pablo Neira Ayuso says: ==================== netfilter: flowtable enhancements [ This is v2 that includes documentation enhancements, including existing limitations. This is a rebase on top on net-next. ] The following patchset augments the Netfilter flowtable fastpath to support for network topologies that combine IP forwarding, bridge, classic VLAN devices, bridge VLAN filtering, DSA and PPPoE. This includes support for the flowtable software and hardware datapaths. The following pictures provides an example scenario: fast path! .------------------------. / \ | IP forwarding | | / \ \/ | br0 wan ..... eth0 . / \ host C -> veth1 veth2 . switch/router . . eth0 host A The bridge master device 'br0' has an IP address and a DHCP server is also assumed to be running to provide connectivity to host A which reaches the Internet through 'br0' as default gateway. Then, packet enters the IP forwarding path and Netfilter is used to NAT the packets before they leave through the wan device. The general idea is to accelerate forwarding by building a fast path that takes packets from the ingress path of the bridge port and place them in the egress path of the wan device (and vice versa). Hence, skipping the classic bridge and IP stack paths. ** Patch from #1 to #6 add the infrastructure which describes the list of netdevice hops to reach a given destination MAC address in the local network topology. Patch #1 adds dev_fill_forward_path() and .ndo_fill_forward_path() to netdev_ops. Patch #2 adds .ndo_fill_forward_path for vlan devices, which provides the next device hop via vlan->real_dev, the vlan ID and the protocol. Patch #3 adds .ndo_fill_forward_path for bridge devices, which allows to make lookups to the FDB to locate the next device hop (bridge port) in the forwarding path. Patch #4 extends bridge .ndo_fill_forward_path to support for bridge VLAN filtering. Patch #5 adds .ndo_fill_forward_path for PPPoE devices. Patch #6 adds .ndo_fill_forward_path for DSA. Patches from #7 to #14 update the flowtable software datapath: Patch #7 adds the transmit path type field to the flow tuple. Two transmit paths are supported so far: the neighbour and the xfrm transmit paths. Patch #8 and #9 update the flowtable datapath to use dev_fill_forward_path() to obtain the real ingress/egress device for the flowtable datapath. This adds the new ethernet xmit direct path to the flowtable. Patch #10 adds native flowtable VLAN support (up to 2 VLAN tags) through dev_fill_forward_path(). The flowtable stores the VLAN id and protocol in the flow tuple. Patch #11 adds native flowtable bridge VLAN filter support through dev_fill_forward_path(). Patch #12 adds native flowtable bridge PPPoE through dev_fill_forward_path(). Patch #13 adds DSA support through dev_fill_forward_path(). Patch #14 extends flowtable selftests to cover for flowtable software datapath enhancements. ** Patches from #15 to #20 update the flowtable hardware offload datapath: Patch #15 extends the flowtable hardware offload to support for the direct ethernet xmit path. This also includes VLAN support. Patch #16 stores the egress real device in the flow tuple. The software flowtable datapath uses dev_hard_header() to transmit packets, hence it might refer to VLAN/DSA/PPPoE software device, not the real ethernet device. Patch #17 deals with switchdev PVID hardware offload to skip it on egress. Patch #18 adds FLOW_ACTION_PPPOE_PUSH to the flow_offload action API. Patch #19 extends the flowtable hardware offload to support for PPPoE Patch #20 adds TC_SETUP_FT support for DSA. ** Patches from #20 to #23: Felix Fietkau adds a new driver which support hardware offload for the mtk PPE engine through the existing flow offload API which supports for the flowtable enhancements coming in this batch. Patch #24 extends the documentation and describe existing limitations. Please, apply, thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
This patch updates the flowtable documentation to describe recent enhancements: - Offload action is available after the first packets go through the classic forwarding path. - IPv4 and IPv6 are supported. Only TCP and UDP layer 4 are supported at this stage. - Tuple has been augmented to track VLAN id and PPPoE session id. - Bridge and IP forwarding integration, including bridge VLAN filtering support. - Hardware offload support. - Describe the [OFFLOAD] and [HW_OFFLOAD] tags in the conntrack table listing. - Replace 'flow offload' by 'flow add' in example rulesets (preferred syntax). - Describe existing cache limitations. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Fietkau authored
This adds support for offloading IPv4 routed flows, including SNAT/DNAT, one VLAN, PPPoE and DSA. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Fietkau authored
The PPE (packet processing engine) is used to offload NAT/routed or even bridged flows. This patch brings up the PPE and uses it to get a packet hash. It also contains some functionality that will be used to bring up flow offloading. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Fietkau authored
When using DSA, set the special tag in GDM ingress control to allow the MAC to parse packets properly earlier. This affects rx DMA source port reporting. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
The dsa infrastructure provides a well-defined hierarchy of devices, pass up the call to set up the flow block to the master device. From the software dataplane, the netfilter infrastructure uses the dsa slave devices to refer to the input and output device for the given skbuff. Similarly, the flowtable definition in the ruleset refers to the dsa slave port devices. This patch adds the glue code to call ndo_setup_tc with TC_SETUP_FT with the master device via the dsa slave devices. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Add a PPPoE push action if layer 2 protocol is ETH_P_PPP_SES to add PPPoE flowtable hardware offload support. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Add an action to represent the PPPoE hardware offload support that includes the session ID. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Fietkau authored
The switch might have already added the VLAN tag through PVID hardware offload. Keep this extra VLAN in the flowtable but skip it on egress. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
If there is a forward path to reach an ethernet device and hardware offload is enabled, then use the direct xmit path. Moreover, store the real device in the direct xmit path info since software datapath uses dev_hard_header() to push the layer encapsulation headers while hardware offload refers to the real device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
When the flow tuple xmit_type is set to FLOW_OFFLOAD_XMIT_DIRECT, the dst_cache pointer is not valid, and the h_source/h_dest/ifidx out fields need to be used. This patch also adds the FLOW_ACTION_VLAN_PUSH action to pass the VLAN tag to the driver. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
This patch adds two new tests to cover bridge and vlan support: - Add a bridge device to the Router1 (nsr1) container and attach the veth0 device to the bridge. Set the IP address to the bridge device to exercise the bridge forwarding path. - Add vlan encapsulation between to the bridge device in the Router1 and one of the sender containers (ns1). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Replace the master ethernet device by the dsa slave port. Packets coming in from the software ingress path use the dsa slave port as input device. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Add the PPPoE protocol and session id to the flow tuple using the encap fields to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Add the vlan tag based when PVID is set on. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
Add the vlan id and protocol to the flow tuple to uniquely identify flows from the receive path. For the transmit path, dev_hard_header() on the vlan device push the headers. This patch includes support for two vlan headers (QinQ) from the ingress path. Add a generic encap field to the flowtable entry which stores the protocol and the tag id. This allows to reuse these fields in the PPPoE support coming in a later patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
The egress device in the tuple is obtained from route. Use dev_fill_forward_path() instead to provide the real egress device for this flow whenever this is available. The new FLOW_OFFLOAD_XMIT_DIRECT type uses dev_queue_xmit() to transmit ethernet frames. Cache the source and destination hardware address to use dev_queue_xmit() to transfer packets. The FLOW_OFFLOAD_XMIT_DIRECT replaces FLOW_OFFLOAD_XMIT_NEIGH if dev_fill_forward_path() finds a direct transmit path. In case of topology updates, if peer is moved to different bridge port, the connection will time out, reconnect will result in a new entry with the correct path. Snooping fdb updates would allow for cleaning up stale flowtable entries. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-