- 18 Mar, 2021 40 commits
-
-
Tobias Waldekranz authored
Move the intricacies of correctly iterating over the VTU to a common implementation. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tobias Waldekranz authored
When a port is a part of a LAG, the ATU will create dynamic entries belonging to the LAG ID when learning is enabled. So trying to fast-age those out using the constituent port will have no effect. Unfortunately the hardware does not support move operations on LAGs so there is no obvious way to transform the request to target the LAG instead. Instead we document this known limitation and at least avoid wasting any time on it. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tobias Waldekranz authored
In order for a driver to be able to query a bridge for information about itself, e.g. reading out port flags, it has to use a netdev that is known to the bridge. In the simple case, that is just the netdev representing the port, e.g. swp0 or swp1 in this example: br0 / \ swp0 swp1 But in the case of an offloaded lag, this will be the bond or team interface, e.g. bond0 in this example: br0 / bond0 / \ swp0 swp1 Add a helper that hides some of this complexity from the drivers. Then, redefine dsa_port_offloads_bridge_port using the helper to avoid double accounting of the set of possible offloaded uppers. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: support 32-bit targets There is currently a configuration dependency that restricts IPA to be supported only on 64-bit machines. There are only a few things that really require that, and those are fixed in this series. The last patch in the series removes the CONFIG_64BIT build dependency for IPA. Version 2 of this series uses upper_32_bits() rather than creating a new function to extract bits out of a DMA address. Version 3 of uses lower_32_bits() as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
We currently assume the IPA driver is built only for a 64 bit kernel. When this constraint was put in place it eliminated some do_div() calls, replacing them with the "/" and "%" operators. We now only use these operations on u32 and size_t objects. In a 32-bit kernel build, size_t will be 32 bits wide, so there remains no reason to use do_div() for divide and modulo. A few recent commits also fix some code that assumes that DMA addresses are 64 bits wide. With that, we can get rid of the 64-bit build requirement. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
We currently have a build-time check to ensure that the minimum DMA allocation alignment satisfies the constraint that IPA filter and route tables must point to rules that are 128-byte aligned. But what's really important is that the actual allocated DMA memory has that alignment, even if the minimum is smaller than that. Remove the BUILD_BUG_ON() call checking against minimim DMA alignment and instead verify at rutime that the allocated memory is properly aligned. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Use upper_32_bits() to extract the high-order 32 bits of a DMA address. This avoids doing a 32-position shift on a DMA address if it happens not to be 64 bits wide. Use lower_32_bits() to extract the low-order 32 bits (because that's what it's for). Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Some build time checks in ipa_table_validate_build() assume that a DMA address is 64 bits wide. That is more restrictive than it has to be. A route or filter table is 64 bits wide no matter what the size of a DMA address is on the AP. The code actually uses a pointer to __le64 to access table entries, and a fixed constant IPA_TABLE_ENTRY_SIZE to describe the size of those entries. Loosen up two checks so they still verify some requirements, but such that they do not assume the size of a DMA address is 64 bits. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Julian Wiedmann says: ==================== s390/qeth: updates 2021-03-18 please apply the following patch series for qeth to netdev's net-next tree. This brings two small optimizations (replace a hard-coded GFP_ATOMIC, pass through the NAPI budget to enable napi_consume_skb()), and removes some redundant VLAN filter code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
The callbacks have been slimmed down to a level where they no longer do any actual work. So stop pretending that we support the NETIF_F_HW_VLAN_CTAG_FILTER feature. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Pending TX buffers are completed from the same NAPI code as normal TX buffers. Pass the NAPI budget to qeth_tx_complete_buf() so that the freeing of the completed skbs can be deferred. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
qeth_init_qdio_out_buf() is typically called during initialization, and the GFP_ATOMIC is only needed for a very specific & rare case during TX completion. Allow callers to specify a gfp mask, so that the initialization path can select GFP_KERNEL. While at it also clarify the function name. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Antoine Tenart says: ==================== net: xps: improve the xps maps handling This series aims at fixing various issues with the xps code, including out-of-bound accesses and use-after-free. While doing so we try to improve the xps code maintainability and readability. The main change is moving dev->num_tc and dev->nr_ids in the xps maps, to avoid out-of-bound accesses as those two fields can be updated after the maps have been allocated. This allows further reworks, to improve the xps code readability and allow to stop taking the rtnl lock when reading the maps in sysfs. The maps are moved to an array in net_device, which simplifies the code a lot. One future improvement may be to remove the use of xps_map_mutex from net/core/dev.c, but that may require extra care. Thanks! Antoine Since v3: - Removed the 3 patches about the rtnl lock and __netif_set_xps_queue as there are extra issues. Those patches were not tied to the others, and I'll see want can be done as a separate effort. - One small fix in patch 12. Since v2: - Patches 13-16 are new to the series. - Fixed another issue I found while preparing v3 (use after free of old xps maps). - Kept the rtnl lock when calling netdev_get_tx_queue and netdev_txq_to_tc. - Use get_device/put_device when using the sb_dev. - Take the rtnl lock in mlx5 and virtio_net when calling netif_set_xps_queue. - Fixed a coding style issue. Since v1: - Reordered the patches to improve readability and avoid introducing issues in between patches. - Use dev_maps->nr_ids to allocate the mask in xps_queue_show but still default to nr_cpu_ids/dev->num_rx_queues in xps_queue_show when dev_maps hasn't been allocated yet for backward compatibility.:w ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
In __netif_set_xps_queue, old map entries from the old dev_maps are freed but their corresponding entry in the old dev_maps aren't NULLed. Fix this. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
When setting up an new dev_maps in __netif_set_xps_queue, we remove and free maps from unused CPUs/rx-queues near the end of the function; by calling remove_xps_queue. However it's possible those maps are also part of the old not-freed-yet dev_maps, which might be used concurrently. When that happens, a map can be freed while its corresponding entry in the old dev_maps table isn't NULLed, leading to: "BUG: KASAN: use-after-free" in different places. This fixes the map freeing logic for unused CPUs/rx-queues, to also NULL the map entries from the old dev_maps table. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Most of the xps_cpus_show and xps_rxqs_show functions share the same logic. Having it in two different functions does not help maintenance. This patch moves their common logic into a new function, xps_queue_show, to improve this. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Now that nr_ids and num_tc are stored in the xps dev_maps, which are RCU protected, we do not have the need to protect the maps in the rtnl lock. Move the rtnl unlock up so we reduce the rtnl locking section. We also increase the reference count on the subordinate device if any, as we don't want this device to be freed while we use it (now that the rtnl lock isn't protecting it in the whole function). Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Improve the readability of the loop removing tx-queue from unused CPUs/rx-queues in __netif_set_xps_queue. The change should only be cosmetic. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
This patch adds an helper, xps_copy_dev_maps, to copy maps from dev_maps to new_dev_maps at a given index. The logic should be the same, with an improved code readability and maintenance. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Move the xps maps (xps_cpus_map and xps_rxqs_map) to an array in net_device. That will simplify a lot the code removing the need for lots of if/else conditionals as the correct map will be available using its offset in the array. This should not modify the xps maps behaviour in any way. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Remove the xps possible_mask. It was an optimization but we can just loop from 0 to nr_ids now that it is embedded in the xps dev_maps. That simplifies the code a bit. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Embed nr_ids (the number of cpu for the xps cpus map, and the number of rxqs for the xps cpus map) in dev_maps. That will help not accessing out of bound memory if those values change after dev_maps was allocated. Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
The xps cpus/rxqs map is accessed using dev->num_tc, which is used when allocating the map. But later updates of dev->num_tc can lead to having a mismatch between the maps and how they're accessed. In such cases the map values do not make any sense and out of bound accesses can occur (that can be easily seen using KASAN). This patch aims at fixing this by embedding num_tc into the maps, using the value at the time the map is created. This brings two improvements: - The maps can be accessed using the embedded num_tc, so we know for sure we won't have out of bound accesses. - Checks can be made before accessing the maps so we know the values retrieved will make sense. We also update __netif_set_xps_queue to conditionally copy old maps from dev_maps in the new one only if the number of traffic classes from both maps match. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Make the implementations of xps_cpus_show and xps_rxqs_show to converge, as the two share the same logic but diverted over time. This should not modify their behaviour but will help future changes and improve maintenance. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
In net-sysfs, get_netdev_queue_index returns an unsigned int. Some of its callers use an unsigned long to store the returned value. Update the code to be consistent, this should only be cosmetic. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Tenart authored
Use bitmap_zalloc instead of zalloc_cpumask_var in xps_cpus_show to align with xps_rxqs_show. This will improve maintenance and allow us to factorize the two functions. The function should behave the same. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rafał Miłecki authored
BCM4908 has only 1 RGMII reg for controlling port 7. Fixes: 73b7a604 ("net: dsa: bcm_sf2: support BCM4908's integrated switch") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rafał Miłecki authored
Simple macro like REG_RGMII_CNTRL_P() is insufficient as: 1. It doesn't validate port argument 2. It doesn't support chipsets with non-lineral RGMII regs layout Missing port validation could result in getting register offset from out of array. Random memory -> random offset -> random reads/writes. It affected e.g. BCM4908 for REG_RGMII_CNTRL_P(7). Fixes: a78e86ed ("net: dsa: bcm_sf2: Prepare for different register layouts") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Álvaro Fernández Rojas authored
Add device tree support to b53_mmap.c while keeping platform devices support. Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Mohammad Athari Bin Ismail says: ==================== net: stmmac: EST interrupts and ethtool This patchset adds support for handling EST interrupts and reporting EST errors. Additionally, the errors are added into ethtool statistic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ong Boon Leong authored
Below EST errors are added into ethtool statistic: 1) Constant Gate Control Error (CGCE): The counter "mtl_est_cgce" increases everytime CGCE interrupt is triggered. 2) Head-of-Line Blocking due to Scheduling (HLBS): The counter "mtl_est_hlbs" increases everytime HLBS interrupt is triggered. 3) Head-of-Line Blocking due to Frame Size (HLBF): The counter "mtl_est_hlbf" increases everytime HLBF interrupt is triggered. 4) Base Time Register error (BTRE): The counter "mtl_est_btre" increases everytime BTRE interrupt is triggered but BTRL not reaches maximum value of 15. 5) Base Time Register Error Loop Count (BTRL) reaches maximum value: The counter "mtl_est_btrlm" increases everytime BTRE interrupt is triggered and BTRL value reaches maximum value of 15. Please refer to MTL_EST_STATUS register in DesignWare Cores Ethernet Quality-of-Service Databook for more detail explanation. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Voon Weifeng authored
Enabled EST related interrupts as below: 1) Constant Gate Control Error (CGCE) 2) Head-of-Line Blocking due to Scheduling (HLBS) 3) Head-of-Line Blocking due to Frame Size (HLBF). 4) Base Time Register error (BTRE) 5) Switch to S/W owned list Complete (SWLC) For HLBS, the user will get the info of all the queues that shows this error. For HLBF, the user will get the info of all the queue with the latest frame size which causes the error. Frame size 0 indicates no error. The ISR handling takes place when EST feature is enabled by user. Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ong Boon Leong says: ==================== stmmac: add VLAN priority based RX steering The current tc flower implementation in stmmac supports both L3 and L4 filter offloading. This patch adds the support of VLAN priority based RX frame steering into different Rx Queues. The patches have been tested on both configuration test (include L3/L4) and traffic test (multi VLAN ping streams with RX Frame Steering) below:- > tc qdisc delete dev eth0 ingress > tc qdisc del dev eth0 parent root 2&> /dev/null > tc qdisc del dev eth0 parent ffff: 2&> /dev/null > tc qdisc add dev eth0 ingress > tc filter add dev eth0 parent ffff: protocol ip flower dst_ip 192.168.0.1 \ src_ip 192.168.1.1 ip_proto tcp dst_port 5201 src_port 6201 action drop > tc filter add dev eth0 parent ffff: protocol ip flower dst_ip 192.168.0.2 \ src_ip 192.168.1.2 ip_proto tcp dst_port 5202 src_port 6202 action drop > tc filter show dev eth0 ingress filter parent ffff: protocol ip pref 49151 flower chain 0 filter parent ffff: protocol ip pref 49151 flower chain 0 handle 0x1 eth_type ipv4 ip_proto tcp dst_ip 192.168.0.2 src_ip 192.168.1.2 dst_port 5202 src_port 6202 in_hw in_hw_count 1 action order 1: gact action drop random type none pass val 0 index 2 ref 1 bind 1 filter parent ffff: protocol ip pref 49152 flower chain 0 filter parent ffff: protocol ip pref 49152 flower chain 0 handle 0x1 eth_type ipv4 ip_proto tcp dst_ip 192.168.0.1 src_ip 192.168.1.1 dst_port 5201 src_port 6201 in_hw in_hw_count 1 action order 1: gact action drop random type none pass val 0 index 1 ref 1 bind 1 > tc qdisc delete dev eth0 ingress > tc qdisc del dev eth0 parent root 2&> /dev/null > tc qdisc del dev eth0 parent ffff: 2&> /dev/null > tc qdisc add dev eth0 ingress > tc qdisc add dev eth0 root mqprio num_tc 4 \ map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 \ queues 1@0 1@1 1@2 1@3 hw 0 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 0 hw_tc 3 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 1 hw_tc 2 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 2 hw_tc 1 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 3 hw_tc 0 > tc filter show dev eth0 ingress filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 handle 0x1 hw_tc 0 vlan_prio 3 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 handle 0x1 hw_tc 1 vlan_prio 2 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 handle 0x1 hw_tc 2 vlan_prio 1 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 handle 0x1 hw_tc 3 vlan_prio 0 in_hw in_hw_count 1 > tc qdisc delete dev eth0 ingress > ip address flush dev eth0 > ip address add 169.254.1.11/24 dev eth0 > ip link delete dev eth0.vlan1 2> /dev/null > ip link add link eth0 name eth0.vlan1 type vlan id 1 > ip address flush dev eth0.vlan1 2> /dev/null > ip address add 169.254.11.11/24 dev eth0.vlan1 > ip link delete dev eth0.vlan2 2> /dev/null > ip link add link eth0 name eth0.vlan2 type vlan id 2 > ip address flush dev eth0.vlan2 2> /dev/null > ip address add 169.254.12.11/24 dev eth0.vlan2 > ip link delete dev eth0.vlan3 2> /dev/null > ip link add link eth0 name eth0.vlan3 type vlan id 3 > ip address flush dev eth0.vlan3 2> /dev/null > ip address add 169.254.13.11/24 dev eth0.vlan3 > ip link delete dev eth0.vlan4 2> /dev/null > ip link add link eth0 name eth0.vlan4 type vlan id 4 > ip address flush dev eth0.vlan4 2> /dev/null > ip address add 169.254.14.11/24 dev eth0.vlan4 > ip address flush dev eth0 > ip address add 169.254.1.22/24 dev eth0 > ip link delete dev eth0.vlan1 2> /dev/null > ip link add link eth0 name eth0.vlan1 type vlan id 1 > ip address flush dev eth0.vlan1 2> /dev/null > ip address add 169.254.11.22/24 dev eth0.vlan1 > ip link delete dev eth0.vlan2 2> /dev/null > ip link add link eth0 name eth0.vlan2 type vlan id 2 > ip address flush dev eth0.vlan2 2> /dev/null > ip address add 169.254.12.22/24 dev eth0.vlan2 > ip link delete dev eth0.vlan3 2> /dev/null > ip link add link eth0 name eth0.vlan3 type vlan id 3 > ip address flush dev eth0.vlan3 2> /dev/null > ip address add 169.254.13.22/24 dev eth0.vlan3 > ip link delete dev eth0.vlan4 2> /dev/null > ip link add link eth0 name eth0.vlan4 type vlan id 4 > ip address flush dev eth0.vlan4 2> /dev/null > ip address add 169.254.14.22/24 dev eth0.vlan4 > mkdir -p /sys/fs/cgroup/net_prio/grp0 > echo eth0 0 > /sys/fs/cgroup/net_prio/grp0/net_prio.ifpriomap > echo eth0.vlan1 0 > /sys/fs/cgroup/net_prio/grp0/net_prio.ifpriomap > mkdir -p /sys/fs/cgroup/net_prio/grp1 > echo eth0 0 > /sys/fs/cgroup/net_prio/grp1/net_prio.ifpriomap > echo eth0.vlan2 1 > /sys/fs/cgroup/net_prio/grp1/net_prio.ifpriomap > mkdir -p /sys/fs/cgroup/net_prio/grp2 > echo eth0 0 > /sys/fs/cgroup/net_prio/grp2/net_prio.ifpriomap > echo eth0.vlan3 2 > /sys/fs/cgroup/net_prio/grp2/net_prio.ifpriomap > mkdir -p /sys/fs/cgroup/net_prio/grp3 > echo eth0 0 > /sys/fs/cgroup/net_prio/grp3/net_prio.ifpriomap > echo eth0.vlan4 3 > /sys/fs/cgroup/net_prio/grp3/net_prio.ifpriomap > tc qdisc del dev eth0 parent root 2&> /dev/null > tc qdisc del dev eth0 parent ffff: 2&> /dev/null > tc qdisc add dev eth0 ingress > tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 1@1 1@2 1@3 hw 0 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 0 hw_tc 0 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 1 hw_tc 1 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 2 hw_tc 2 > tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 3 hw_tc 3 > ip link set eth0.vlan1 type vlan egress-qos-map 0:0 > ip link set eth0.vlan2 type vlan egress-qos-map 1:1 > ip link set eth0.vlan3 type vlan egress-qos-map 2:2 > ip link set eth0.vlan4 type vlan egress-qos-map 3:3 > tc filter show dev eth0 ingress filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 filter parent ffff: protocol 802.1Q pref 49149 flower chain 0 handle 0x1 hw_tc 3 vlan_prio 3 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 filter parent ffff: protocol 802.1Q pref 49150 flower chain 0 handle 0x1 hw_tc 2 vlan_prio 2 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 filter parent ffff: protocol 802.1Q pref 49151 flower chain 0 handle 0x1 hw_tc 1 vlan_prio 1 in_hw in_hw_count 1 filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 filter parent ffff: protocol 802.1Q pref 49152 flower chain 0 handle 0x1 hw_tc 0 vlan_prio 0 in_hw in_hw_count 1 > echo 1 > /proc/irq/131/smp_affinity > echo 1 > /proc/irq/132/smp_affinity > echo 4 > /proc/irq/133/smp_affinity > echo 4 > /proc/irq/134/smp_affinity > echo 4 > /proc/irq/135/smp_affinity > echo 4 > /proc/irq/136/smp_affinity > echo 2 > /proc/irq/137/smp_affinity > echo 2 > /proc/irq/138/smp_affinity > ping -i 0.001 169.254.11.22 2&> /dev/null & > PID1="$!" > echo $PID1 > /sys/fs/cgroup/net_prio/grp0/cgroup.procs > ping -i 0.001 169.254.12.22 2&> /dev/null & > PID2="$!" > echo $PID2 > /sys/fs/cgroup/net_prio/grp1/cgroup.procs > ping -i 0.001 169.254.13.22 2&> /dev/null & > PID3="$!" > echo $PID3 > /sys/fs/cgroup/net_prio/grp2/cgroup.procs > ping -i 0.001 169.254.14.22 2&> /dev/null & > PID4="$!" > echo $PID4 > /sys/fs/cgroup/net_prio/grp3/cgroup.procs > ping -i 0.001 169.254.11.11 2&> /dev/null & > PID1="$!" > echo $PID1 > /sys/fs/cgroup/net_prio/grp0/cgroup.procs > ping -i 0.001 169.254.12.11 2&> /dev/null & > PID2="$!" > echo $PID2 > /sys/fs/cgroup/net_prio/grp1/cgroup.procs > ping -i 0.001 169.254.13.11 2&> /dev/null & > PID3="$!" > echo $PID3 > /sys/fs/cgroup/net_prio/grp2/cgroup.procs > ping -i 0.001 169.254.14.11 2&> /dev/null & > PID4="$!" > echo $PID4 > /sys/fs/cgroup/net_prio/grp3/cgroup.procs > watch -n 0.5 -d "cat /proc/interrupts | grep eth0" 131: 251918 41 0 0 IR-PCI-MSI 477184-edge eth0:rx-0 132: 18969 1 0 0 IR-PCI-MSI 477185-edge eth0:tx-0 133: 0 0 295872 0 IR-PCI-MSI 477186-edge eth0:rx-1 134: 0 0 16136 0 IR-PCI-MSI 477187-edge eth0:tx-1 135: 0 0 288042 0 IR-PCI-MSI 477188-edge eth0:rx-2 136: 0 0 16135 0 IR-PCI-MSI 477189-edge eth0:tx-2 137: 0 211177 0 0 IR-PCI-MSI 477190-edge eth0:rx-3 138: 2 16144 0 0 IR-PCI-MSI 477191-edge eth0:tx-3 139: 0 0 0 0 IR-PCI-MSI 477192-edge eth0:rx-4 140: 0 0 0 0 IR-PCI-MSI 477193-edge eth0:tx-4 141: 0 0 0 0 IR-PCI-MSI 477194-edge eth0:rx-5 142: 0 0 0 0 IR-PCI-MSI 477195-edge eth0:tx-5 143: 0 0 0 0 IR-PCI-MSI 477196-edge eth0:rx-6 144: 0 0 0 0 IR-PCI-MSI 477197-edge eth0:tx-6 145: 0 0 0 0 IR-PCI-MSI 477198-edge eth0:rx-7 146: 0 0 0 0 IR-PCI-MSI 477199-edge eth0:tx-7 157: 0 0 0 0 IR-PCI-MSI 477210-edge eth0:safety-ue ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ong Boon Leong authored
We extend tc flower to support configuration of VLAN priority-based RX frame steering hardware offloading. To map VLAN <PCP> to Traffic Class <TC>: $ tc filter add dev <IFNAME> parent ffff: protocol 802.1Q flower \ vlan_prio <PCP> hw_tc <TC> Note: <TC> < N whereby "tc qdisc ... num_tc N ..." To delete all tc flower configurations: $ tc qdisc delete dev <IFNAME> ingress Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ong Boon Leong authored
The current tc_add_flow() and tc_del_flow() use hardware L3 & L4 filters as offloading. The number of L3/L4 filters is read from L3L4FNUM field from MAC_HW_Feature1 register and is used to alloc priv->tc_entries[]. For RX frame steering based on VLAN priority offloading, we use MAC_RXQ_CTRL2 & MAC_RXQ_CTRL3 registers and all VLAN priority level can be configured independent from L3 & L4 filters. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Naveen Mamindlapalli says: ==================== Add tc hardware offloads This patch series adds support for tc hardware offloads. Patch #1 adds support for offloading flows that matches IP tos and IP protocol which will be used by tc hw offload support. Also added ethtool n-tuple filter to code to offload the flows matching the above fields. Patch #2 adds tc flower hardware offload support on ingress traffic. Patch #3 adds TC flower offload stats. Patch #4 adds tc TC_MATCHALL egress ratelimiting offload. * tc flower hardware offload in PF driver The driver parses the flow match fields and actions received from the tc subsystem and adds/delete MCAM rules for the same. Each flow contains set of match and action fields. If the action or fields are not supported, the rule cannot be offloaded to hardware. The tc uses same set of MCAM rules allocated for ethtool n-tuple filters. So, at a time only one entity can offload the flows to hardware, they're made mutually exclusive in the driver. Following match and actions are supported. Match: Eth dst_mac, EtherType, 802.1Q {vlan_id,vlan_prio}, vlan EtherType, IP proto {tcp,udp,sctp,icmp,icmp6}, IPv4 tos, IPv4{dst_ip,src_ip}, L4 proto {dst_port|src_port number}. Actions: drop, accept, vlan pop, redirect to another port on the device. The Hardware stats are also supported. Currently only packet counter stats are updated. * tc egress rate limiting support Added TC-MATCHALL classifier offload with police action applied for all egress traffic on the specified interface. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sunil Goutham authored
Add TC_MATCHALL egress ratelimiting offload support with POLICE action for entire traffic going out of the interface. Eg: To ratelimit egress traffic to 100Mbps $ ethtool -K eth0 hw-tc-offload on $ tc qdisc add dev eth0 clsact $ tc filter add dev eth0 egress matchall skip_sw \ action police rate 100Mbit burst 16Kbit HW supports a max burst size of ~128KB. Only one ratelimiting filter can be installed at a time. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Naveen Mamindlapalli authored
Add support to get the stats for tc flower flows that are offloaded to hardware. To support this feature, added a new AF mbox handler which returns the MCAM entry stats for a flow that has hardware stat counter enabled. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Naveen Mamindlapalli authored
This patch adds support for tc flower hardware offload on ingress traffic. Since the tc-flower filter rules use the same set of MCAM rules as the n-tuple filters, the n-tuple filters and tc flower rules are mutually exclusive. When one of the feature is enabled using ethtool, the other feature is disabled in the driver. By default the driver enables n-tuple filters during initialization. The following flow keys are supported. -> Ethernet: dst_mac -> L2 proto: all protocols -> VLAN (802.1q): vlan_id/vlan_prio -> IPv4: dst_ip/src_ip/ip_proto{tcp|udp|sctp|icmp}/ip_tos -> IPv6: ip_proto{icmpv6} -> L4(tcp/udp/sctp): dst_port/src_port The following flow actions are supported. -> drop -> accept -> redirect -> vlan pop The flow action supports multiple actions when vlan pop is specified as the first action. The redirect action supports redirecting to the PF/VF of same PCI device. Redirecting to other PCI NIX devices is not supported. Example #1: Add a tc filter rule to drop UDP traffic with dest port 80 # ethtool -K eth0 hw-tc-offload on # tc qdisc add dev eth0 ingress # tc filter add dev eth0 protocol ip parent ffff: flower ip_proto \ udp dst_port 80 action drop Example #2: Add a tc filter rule to redirect ingress traffic on eth0 with vlan id 3 to eth6 (ex: eth0 vf0) after stripping the vlan hdr. # ethtool -K eth0 hw-tc-offload on # tc qdisc add dev eth0 ingress # tc filter add dev eth0 parent ffff: protocol 802.1Q flower \ vlan_id 3 vlan_ethtype ipv4 action vlan pop action mirred \ ingress redirect dev eth6 Example #3: List the ingress filter rules # tc -s filter show dev eth4 ingress Example #4: Delete tc flower filter rule with handle 0x1 # tc filter del dev eth0 ingress protocol ip pref 49152 \ handle 1 flower Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Naveen Mamindlapalli authored
Add support for programming the HW MCAM match key with IP tos, IP(v6) proto icmp/icmpv6, allowing flow offload rules to be installed using those fields. The NPC HW extracts layer type, which will be used as a matching criteria for different IP protocols. The ethtool n-tuple filter logic has been updated to parse the IP tos and l4proto for HW offloading. l4proto tcp/udp/sctp/ah/esp/icmp are supported. See example usage below. Ex: Redirect l4proto icmp to vf 0 queue 0 ethtool -U eth0 flow-type ip4 l4proto 1 action vf 0 queue 0 Ex: Redirect flow with ip tos 8 to vf 0 queue 0 ethtool -U eth0 flow-type ip4 tos 8 vf 0 queue 0 Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-