- 20 Feb, 2020 23 commits
-
-
Paul Blakey authored
Chain ids are mapped to the lower part of reg C, and after loopback are copied to to CQE via a restore rule's flow_tag. To let tc continue in the correct chain, we find the corresponding chain id in the eswitch chain id <-> reg C mapping, and set the SKB's tc extension chain to it. That tells tc to continue processing from this set chain. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
Copy the current rep mpwqe rx handler which is also used by nic profile. In the next patch, we will add rep specific logic, just for the rep profile rx handler. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
Currently, if we miss in hardware after jumping to some chain, we continue in chain 0 in software. This is wrong, and with the new tc skb extension we can now restore the chain id on the skb, so tc can continue with in the correct chain. To restore the chain id in software after a miss in hardware, we create a register mapping from 32bit chain ids to 16bit of reg_c0 (that survives loopback), to 32bit chain ids. We then mark packets that miss on some chain with the current chain id mapping on their reg_c0 field. Using this mapping, we will support up to 64K concurrent chains. This register survives loopback and gets to the CQE on flow_tag via the eswitch restore rules. In next commit, we will reverse the mapping we got on the CQE to a chain id and tell tc to continue in the sw chain where we left off via the tc skb extension. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
On RX side create a restore table in OFFLOADS namespace. This table will match on all values for reg_c0 we will use, and set it to the flow_tag. This flow tag can then be read on the CQE. As there is no copy action from reg c0 to flow tag, instead we have to set the flow tag explictily. We add an API so callers can add all the used reg_c0 values (tags) and for each of those we add a restore rule. This will be used in a following patch to save the miss chain mapping tag on reg_c0 and from it restore the tc chain on the skb. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
Multi chain support requires the miss path to continue the processing from the last chain id, and for that we need to save the chain miss tag (a mapping for 32bit chain id) on reg_c0 which will come in a next patch. Currently reg_c0 is exclusively used to store the source port metadata, giving it 32bit, it is created from 16bits of vcha_id, and 16bits of vport number. We will move this source port metadata to upper 16bits, and leave the lower bits for the chain miss tag. We compress the reg_c0 source port metadata to 16bits by taking 8 bits from vhca_id, and 8bits from the vport number. Since we compress the vport number to 8bits statically, and leave two top ids for special PF/ECPF numbers, we will only support a max of 254 vports with this strategy. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
Add a new interface for mapping data to a given id range (max_id), and back again. It uses xarray as the id allocator and for finding a given id. For locking it uses xa_lock (spin_lock) for add()/del(), and rcu_read_lock for find(). This mapping interface also supports delaying the mapping removal via a workqueue. This is for cases where we need the mapping to have some grace period in regards to finding it back again, for example for packets arriving from hardware that were marked with by a rule with an old mapping that no longer exists. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
Set the starting chain from the tc skb ext chain value. Once we read the tc skb ext, delete it, so cloned/redirect packets won't inherit it. In order to lookup a chain by the chain index on the ingress block at ingress classification, provide a lookup function. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
To allow lookup of a block's chain under atomic context. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
On ingress and cls_act qdiscs init, save the block on ingress mini_Qdisc and and pass it on to ingress classification, so it can be used for the looking up a specified chain index. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Paul Blakey authored
TC multi chain configuration can cause offloaded tc chains to miss in hardware after jumping to some chain. In such cases the software should continue from the chain that missed in hardware, as the hardware may have manipulated the packet and updated some counters. Currently a single tcf classification function serves both ingress and egress. However, multi chain miss processing (get tc skb extension on hw miss, set tc skb extension on tc miss) should happen only on ingress. Refactor the code to use ingress classification function, and move setting the tc skb extension from general classification to it, as a prestep for supporting the hw miss scenario. Co-developed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Roman Mashak authored
Added tests for 'u32' extended match rules for u8 alignment. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: phy: Better support for BCM54810 This patch series updates the broadcom PHY driver to better support the BCM54810 and allow it to make use of the exiting bcm54xx_adjust_rxrefclk() as well as fix suspend/resume for it. Changes in v2: - added Reviewed-by tags from Andrew for patches #1 and #3 - expanded commit message in #2 to explain the change ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The BCM54810 PHY can use the standard BMCR Power down suspend, but needs a custom resume routine which first clear the Power down bit, and then re-initializes the PHY. While in low-power mode, the PHY only accepts writes to the BMCR register. The datasheet clearly says it: Reads or writes to any MII register other than MII Control register (address 00h) while the device is in the standby power-down mode may cause unpredictable results. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
bcm54xx_adjust_rxrefclk() already checks for PHY_BRCM_AUTO_PWRDWN_ENABLE and PHY_BRCM_DIS_TXCRXC_NOENRGY in order to set the appropriate bit. The situation is a bit more complicated with the flag PHY_BRCM_RX_REFCLK_UNUSED but essentially amounts to the same situation. The default setting for the 125MHz clock is to be on for all PHYs and we still treat BCM50610 and BCM50610M specifically with the polarity of the bit reversed. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The function bcm54xx_adjust_rxrefclk() works correctly on the BCM54810 PHY, allow this device ID to proceed through. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YueHaibing authored
drivers/net/ethernet/sfc/efx.c:116:38: warning: efx_default_channel_type defined but not used [-Wunused-const-variable=] commit 83975485 ("sfc: move channel alloc/removal code") left behind this, remove it. Reported-by: Hulk Robot <hulkci@huawei.com> Fixes: 83975485 ("sfc: move channel alloc/removal code") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Huazhong Tan says: ==================== net: hns3: misc updates for -net-next This series includes some misc updates for the HNS3 ethernet driver. [patch 1] modifies an unsuitable print when setting dulex mode. [patch 2] adds some debugfs info for TC and DWRR. [patch 3] adds some debugfs info for loopback. [patch 4] adds a missing help info for QS shaper in debugfs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yonglong Liu authored
HNS3 driver can dump QS shaper configs via debugfs, but missing help info in debugfs for this operation. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yufeng Mo authored
The MAC ID and loopback status information are obtained from the hardware, which will be helpful for debugging. This patch adds support for these two items in debugfs. Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yonglong Liu authored
The actual enabled TC numbers and the DWRR weight of each TC may be helpful for debugging, so adds them into debugfs. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guangbin Huang authored
Currently, if device is in link down status and user uses 'ethtool -s' command to set speed but not specify duplex mode, the duplex mode passed from ethtool to driver is unknown value(255), and the fibre port will identify this value as half duplex mode and print "only copper port supports half duplex!". This message is confusing. So for fibre port, only the setting duplex is half, prints error and returns. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732 ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Oros authored
commit 93c09704 ("net: phy: consider latched link-down status in polling mode") removed double-read of latched link-state register for polling mode from genphy_update_link(). This added extra ~1s delay into sequence link down->up. Following scenario: - After boot link goes up - phy_start() is called triggering an aneg restart, hence link goes down and link-down info is latched. - After aneg has finished link goes up. In phy_state_machine is checked link state but it is latched "link is down". The state machine is scheduled after one second and there is detected "link is up". This extra delay can be avoided when we keep link-state register double read in case when link was down previously. With this solution we don't miss a link-down event in polling mode and link-up is faster. Details about this quirky behavior on Realtek phy: Without patch: T0: aneg is started, link goes down, link-down status is latched T0+3s: state machine runs, up-to-date link-down is read T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1), here i read link-down (BMSR_LSTATUS==0), T0+5s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1), up-to-date link-up is read (BMSR_LSTATUS==1), phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING With patch: T0: aneg is started, link goes down, link-down status is latched T0+3s: state machine runs, up-to-date link-down is read T0+4s: state machine runs, aneg is finished (BMSR_ANEGCOMPLETE==1), first BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==0, second BMSR read: BMSR_ANEGCOMPLETE==1 and BMSR_LSTATUS==1, phydev->link goes up, state change PHY_NOLINK to PHY_RUNNING Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Feb, 2020 17 commits
-
-
David S. Miller authored
Heiner Kallweit says: ==================== net: core: add helper tcp_v6_gso_csum_prep Several network drivers for chips that support TSO6 share the same code for preparing the TCP header, so let's factor it out to a helper. A difference is that some drivers reset the payload_len whilst others don't do this. This value is overwritten by TSO anyway, therefore the new helper resets it in general. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Use new helper tcp_v6_gso_csum_prep in additional network drivers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Simplify the code by using the new helper tcp_v6_gso_csum_prep. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Several network drivers for chips that support TSO6 share the same code for preparing the TCP header, so let's factor it out to a helper. A difference is that some drivers reset the payload_len whilst others don't do this. This value is overwritten by TSO anyway, therefore the new helper resets it in general. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christian Brauner authored
It is currenty possible to switch the TCP congestion control algorithm in non-initial network namespaces: unshare -U --map-root --net --fork --pid --mount-proc echo "reno" > /proc/sys/net/ipv4/tcp_congestion_control works just fine. But currently non-initial network namespaces have no way of kowing which congestion algorithms are available or allowed other than through trial and error by writing the names of the algorithms into the aforementioned file. Since we already allow changing the congestion algorithm in non-initial network namespaces by exposing the tcp_congestion_control file there is no reason to not also expose the tcp_{allowed,available}_congestion_control files to non-initial network namespaces. After this change a container with a separate network namespace will show: root@f1:~# ls -al /proc/sys/net/ipv4/tcp_* | grep congestion -rw-r--r-- 1 root root 0 Feb 19 11:54 /proc/sys/net/ipv4/tcp_allowed_congestion_control -r--r--r-- 1 root root 0 Feb 19 11:54 /proc/sys/net/ipv4/tcp_available_congestion_control -rw-r--r-- 1 root root 0 Feb 19 11:54 /proc/sys/net/ipv4/tcp_congestion_control Link: https://github.com/lxc/lxc/issues/3267Reported-by: Haw Loeung <haw.loeung@canonical.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Bianconi authored
Introduce "rx" prefix in the name scheme for xdp counters on rx path. Differentiate between XDP_TX and ndo_xdp_xmit counters Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Sunil Goutham says: ==================== octeontx2-af: Cleanup changes These patches cleanup AF driver by removing unnecessary function exports and cleanup repititive logic. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-