- 15 Jun, 2021 15 commits
-
-
Shay Drory authored
Whenever users provided affinity for an EQ creation request, map the EQ to a matching IRQ. Matching IRQ=IRQ with the same affinity and type (completion/control) of the EQ created. This mapping is being done in agressive dedicated IRQ allocation scheme, which described bellow. First, we check whether there is a matching IRQ that his min threshold is not exhausted. - min_eqs_threshold = 3 for control EQ. - min_eqs_threshold = 1 for completion EQ. In case no matching IRQ was found, try to request a new IRQ. In case we can't request a new IRQ, reuse least-used matching IRQ. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Move mlx5_sf_max_functions() and friends from the privete sf/sf.h to the public lib/sf.h. This is done in order to have one direction include paths. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
FW is now supporting more than 256 MSI-X per PF (up to 2K). Hence, enlarge interrupt field in CREATE_EQ to make use of the new MSI-X's. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
SFs (Sub Functions) currently use IRQs from the global IRQ table their parent Physical Function have. In order to better scale, we need to allocate more IRQs and share them between different SFs. Driver will maintain 3 separated irq pools: 1. A pool that serve the PF consumer (PF's netdev, rdma stacks), similar to what the driver had before this patch. i.e, this pool will share irqs between rdma and netev, and will keep the irq indexes and allocation order. The last is important for PF netdev rmap (aRFS). 2. A pool of control IRQs for SFs. The size of this pool is the number of SFs that can be created divided by SFS_PER_IRQ. This pool will serve the control path EQs of the SFs. 3. A pool of completion data path IRQs for SFs transport queues. The size of this pool is: num_irqs_allocated - pf_pool_size - sf_ctrl_pool_size. This pool will served netdev and rdma stacks. Moreover, rmap is not supported on SFs. Sharing methodology of the SFs pools is explained in the next patch. Important note: rmap is not supported on SFs because rmap mapping cannot function correctly for IRQs that are shared for different core/netdev RX rings. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Store newly created IRQs in the xarray DB instead of a static array, so we will be able to store only IRQs which are being used. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
IRQs are being simplified in order to ease their sharing and any feature specific object will be moved to upper layer. Hence we move rmap object into eq_table. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Extend mlx5_irq_request so that IRQs will be requested upon EQ creation, and not on driver boot. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
In next patches, IRQs will be requested according to demand, instead of statically on driver boot. Also, currently, rmap is managed by the IRQ layer. rmap management will move out from the IRQ layer in future patches. Therefore, we want to remove the IRQ from the rmap, when IRQ is destroyed, instead of removing all the IRQs from the rmap when irq_table is destroyed. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Leon Romanovsky authored
The eq.[c|h] files are under major rewrite. so use this opportunity and update their copyright and license texts. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Leon Romanovsky authored
The users of EQ are running their code on different CPUs and with various affinity patterns. Move the cpumask setting close to their actual usage. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Introduce new API that will allow IRQs users to hold a pointer to mlx5_irq. In the end of this series, IRQs will be allocated on demand. Hence, this will allow us to properly manage and use IRQs. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Leon Romanovsky authored
Shared IRQ are consumed by multiple EQ users and in order to properly initialize and later release such IRQs, we add kref counting of IRQ structure. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Mark Bloch authored
Lag is used to combine two PCI functions of the same HCA into a single logical unit. This is a core functionality and as such should be managed by the core driver. Currently this isn't the case. While we store the lag software structure inside the lower device, its lifetime (creation / destruction) is dictated by the mlx5e part. Change the ownership model so lag is tied to the lifetime of the lower level driver instead to the mlx5e part. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Mark Bloch authored
If MLX5_PRIV_FLAGS_DISABLE_ALL_ADEV is set it means the device is going down and mlx5_rescan_drivers_locked() shouldn't be called. With this patch and the previous one in the series, unbinding a PCI function when its netdev is part of a bond works and leaves the system in a working state. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Mark Bloch authored
When a net device is removed (can happen if the PCI function is unbound from the system) it's not enough to destroy the hardware lag. The system should recreate the original devices that were present before the lag. As the same flow is done when a net device is removed from the bond refactor and reuse the code. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
- 14 Jun, 2021 25 commits
-
-
Loic Poulain authored
There is not strong reason to have both WWAN and WWAN_CORE symbols, Let's build the WWAN core framework when WWAN is selected, in the same way as for other subsystems. This fixes issue with mhi_net selecting WWAN_CORE without WWAN and reported by kernel test robot: Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for WWAN_CORE Depends on NETDEVICES && WWAN Selected by - MHI_NET && NETDEVICES && NET_CORE && MHI_BUS Fixes: 9a44c1cc ("net: Add a WWAN subsystem") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
After the blamed patch, __skb_flow_dissect() on the DSA master stopped adjusting for the length of the DSA headers. This is because it was told to adjust only if the needed_headroom is zero, aka if there is no DSA header. Of course, the adjustment should be done only if there _is_ a DSA header. Modify the comment too so it is clearer. Fixes: 4e500251 ("net: dsa: generalize overhead for taggers that use both headers and trailers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The struct sja1105_regs tables are not modified during the runtime of the driver, so they can be made constant. In fact, struct sja1105_info already holds a const pointer to these. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vladimir Oltean says: ==================== Fixes and improvements to TJA1103 PHY driver This series contains: - an erratum workaround for the TJA1103 PHY integrated in SJA1110 - an adaptation of the driver so it prints less unnecessary information when probing on SJA1110 - a PTP RX timestamping bug fix and a clarification patch Targeting net-next since the PHY support is currently in net-next only. Changes in v3: Added one more patch which improves the readability of nxp_c45_reconstruct_ts. Changes in v2: Added a comment to the hardware workaround procedure. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The SJA1110 switch integrates TJA1103 PHYs, but in SJA1110 switch rev B silicon, there is a bug in that the registers for selecting the 100base-T1 autoneg master/slave roles are not writable. To enable write access to the master/slave registers, these additional PHY writes are necessary during initialization. The issue has been corrected in later SJA1110 silicon versions and is not present in the standalone PHY variants, but applying the workaround unconditionally in the driver should not do any harm. Suggested-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The reconstruction procedure for partial timestamps reads the current PTP time and fills in the low 2 bits of the second portion, as well as the nanoseconds portion, from the actual hardware packet timestamp. Critically, the reconstruction procedure works because it assumes that the current PTP time is strictly larger than the hardware timestamp was: it detects a 2-bit wraparound of the 'seconds' portion by checking whether the 'seconds' portion of the partial hardware timestamp is larger than the 'seconds' portion of the current time. That can only happen if the hardware timestamp was captured by the PHY during the last phase of a 'modulo 4 seconds' interval, and the current PTP time was read by the driver during the initial phase of the next 'modulo 4 seconds' interval. The partial RX timestamps are added to priv->rx_queue in nxp_c45_rxtstamp() and they are processed potentially in parallel by the aux worker thread in nxp_c45_do_aux_work(). This means that it is possible for nxp_c45_do_aux_work() to process more than one RX timestamp during the same schedule. There is one premature optimization that will cause issues: for RX timestamping, the driver reads the current time only once, and it uses that to reconstruct all PTP RX timestamps in the queue. For the second and later timestamps, this will be an issue if we are processing two RX timestamps which are to the left and to the right, respectively, of a 4-bit wraparound of the 'seconds' portion of the PTP time, and the current PTP time is also pre-wraparound. 0.000000000 4.000000000 8.000000000 12.000000000 |..................|..................|..................|............> ^ ^ ^ ^ time | | | | | | | process hwts 1 and hwts 2 | | | | | hwts 2 | | | read current PTP time | hwts 1 What will happen in that case is that hwts 2 (post-wraparound) will use a stale current PTP time that is pre-wraparound. But nxp_c45_reconstruct_ts will not detect this condition, because it is not coded up for it, so it will reconstruct hwts 2 with a current time from the previous 4 second interval (i.e. 0.something instead of 4.something). This is solvable by making sure that the full 64-bit current time is always read after the PHY has taken the partial RX timestamp. We do this by reading the current PTP time for every timestamp in the RX queue. Fixes: 514def5d ("phy: nxp-c45-tja11xx: add timestamping support") Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
nxp_c45_reconstruct_ts() takes a partial hardware timestamp in @hwts, with 2 bits of the 'seconds' portion, and a full PTP time in @ts. It patches in the lower bits of @hwts into @ts, and to ensure that the reconstructed timestamp is correct, it checks whether the lower 2 bits of @hwts are not in fact higher than the lower 2 bits of @ts. This is not logically possible because, according to the calling convention, @ts was collected later in time than @hwts, but due to two's complement arithmetic it can actually happen, because the current PTP time might have wrapped around between when @hwts was collected and when @ts was, yielding the lower 2 bits of @ts smaller than those of @hwts. To correct for that situation which is expected to happen under normal conditions, the driver subtracts exactly one wraparound interval from the reconstructed timestamp, since the upper bits of that need to correspond to what the upper bits of @hwts were, not to what the upper bits of @ts were. Readers might be confused because the driver denotes the amount of bits that the partial hardware timestamp has to offer as TS_SEC_MASK (timestamp mask for seconds). But it subtracts a seemingly unrelated BIT(2), which is in fact more subtle: if the hardware timestamp provides 2 bits of partial 'seconds' timestamp, then the wraparound interval is 2^2 == BIT(2). But nonetheless, it is better to express the wraparound interval in terms of a definition we already have, so replace BIT(2) with 1 + GENMASK(1, 0) which produces the same result but is clearer. Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
The SJA1110 switch integrates these PHYs, and they do not have support for timestamping. This message becomes quite overwhelming: [ 10.056596] NXP C45 TJA1103 spi1.0-base-t1:01: the phy does not support PTP [ 10.112625] NXP C45 TJA1103 spi1.0-base-t1:02: the phy does not support PTP [ 10.167461] NXP C45 TJA1103 spi1.0-base-t1:03: the phy does not support PTP [ 10.223510] NXP C45 TJA1103 spi1.0-base-t1:04: the phy does not support PTP [ 10.278239] NXP C45 TJA1103 spi1.0-base-t1:05: the phy does not support PTP [ 10.332663] NXP C45 TJA1103 spi1.0-base-t1:06: the phy does not support PTP [ 15.390828] NXP C45 TJA1103 spi1.2-base-t1:01: the phy does not support PTP [ 15.445224] NXP C45 TJA1103 spi1.2-base-t1:02: the phy does not support PTP [ 15.499673] NXP C45 TJA1103 spi1.2-base-t1:03: the phy does not support PTP [ 15.554074] NXP C45 TJA1103 spi1.2-base-t1:04: the phy does not support PTP [ 15.608516] NXP C45 TJA1103 spi1.2-base-t1:05: the phy does not support PTP [ 15.662996] NXP C45 TJA1103 spi1.2-base-t1:06: the phy does not support PTP So reduce its log level to debug. Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Zhou Yanjie says: ==================== Add Ingenic SoCs MAC support. v2->v3: 1.Add "ingenic,mac.yaml" for Ingenic SoCs. 2.Change tx clk delay and rx clk delay from hardware value to ps. 3.return -EINVAL when a unsupported value is encountered when parsing the binding. 4.Simplify the code of the RGMII part of X2000 SoC according to Andrew Lunn’s suggestion. 5.Follow the example of "dwmac-mediatek.c" to improve the code that handles delays according to Andrew Lunn’s suggestion. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
周琰杰 (Zhou Yanjie) authored
Add support for Ingenic SoC MAC glue layer support for the stmmac device driver. This driver is used on for the MAC ethernet controller found in the JZ4775 SoC, the X1000 SoC, the X1600 SoC, the X1830 SoC, and the X2000 SoC. Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
周琰杰 (Zhou Yanjie) authored
Add the dwmac bindings for the JZ4775 SoC, the X1000 SoC, the X1600 SoC, the X1830 SoC and the X2000 SoC from Ingenic. Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Oleksandr Mazur says: ==================== Marvell Prestera driver implementation of devlink functionality. This patch series implement Prestera Switchdev driver devlink traps, that are registered within the driver, as well as extend current devlink functionality by adding new hard drop statistics counter, that could be retrieved on-demand: the counter shows number of packets that have been dropped by the underlying device and haven't been passed to the devlink subsystem. The core prestera-devlink functionality is implemented in the prestera_devlink.c. The patch series also extends the existing devlink kernel API: - devlink: add trap_drop_counter_get callback for driver to register - make it possible to keep track of how many packets have been dropped (hard) by the switch device, before the packets even made it to the devlink subsystem (e.g. dropped due to RXDMA buffer overflow). The core features that extend current functionality of prestera Switchdev driver: - add logic for driver traps and drops registration (also traps with DROP action). - add documentation for prestera driver traps and drops group. PATCH v2: 1) Rebase whole series on top of latest mater; 2) Remove storm control-related patches, as they're out of devlink scope; ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Add documentation for the devlink feature prestera switchdev driver supports: add description for the support of the driver-specific devlink traps (include both traps with action TRAP and action DROP); Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Add traps that have init_action being set to DROP. Add 'trap_drop_counter_get' (devlink API) callback implementation, that is used to get number of packets that have been dropped by the HW (traps with action 'DROP'). Add new FW command CPU_CODE_COUNTERS_GET. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Add devlink traps registration (with corresponding groups) for all the traffic types that driver traps to the CPU; prestera_rxtx: report each packet trapped to the CPU (RX) to the prestera_devlink; Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Add hard drop counter check testcase, to make sure netdevsim driver properly handles the devlink hard drop counters get/set callbacks. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Whenever query statistics is issued for trap with DROP action, devlink subsystem would also fill-in statistics 'dropped' field. In case if device driver did't register callback for hard drop statistics querying, 'dropped' field will be omitted and not filled. Add trap_drop_counter_get callback implementation to the netdevsim. Add new test cases for netdevsim, to test both the callback functionality, as well as drop statistics alteration check. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
testing: selftests: net: forwarding: add devlink-required functionality to test (hard) dropped stats field Add devlink_trap_drop_packets_get function, as well as test that are used to verify devlink (hard) dropped stats functionality works. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksandr Mazur authored
Whenever query statistics is issued for trap, devlink subsystem would also fill-in statistics 'dropped' field. This field indicates the number of packets HW dropped and failed to report to the device driver, and thus - to the devlink subsystem itself. In case if device driver didn't register callback for hard drop statistics querying, 'dropped' field will be omitted and not filled. Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
Author forgot to remove that flag. Fixes: f7af616c ("net: iosm: infrastructure") Reported-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lijun Pan authored
The 3rd argument is u32 by function definition while it is __be32 by function declaration. Signed-off-by: Lijun Pan <lijunp213@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Oleksij Rempel says: ==================== provide cable test support for the ksz886x switch changes v5: - drop resume() patch - add Reviewed-by tags. - rework dsa_slave_phy_connect() patch changes v4: - use fallthrough; - use EOPNOTSUPP instead of ENOTSUPP - drop flags variable in dsa_slave_phy_connect patch - extend description for the "net: phy: micrel: apply resume errat" patch - fix "use consistent alignments" patch changes v3: - remove RFC tag changes v2: - use generic MII_* defines where possible - rework phylink validate - remove phylink get state function - reorder cabletest patches to make PHY flag patch in the right order - fix MDI-X detection This patches provide support for cable testing on the ksz886x switches. Since it has one special port, we needed to add phylink with validation and extra quirk for the PHY to signal, that one port will not provide valid cable testing reports. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksij Rempel authored
This patch support for cable test for the ksz886x switches and the ksz8081 PHY. The patch was tested on a KSZ8873RLL switch with following results: - port 1: - provides invalid values, thus return -ENOTSUPP (Errata: DS80000830A: "LinkMD does not work on Port 1", http://ww1.microchip.com/downloads/en/DeviceDoc/KSZ8873-Errata-DS80000830A.pdf) - port 2: - can detect distance - can detect open on each wire of pair A (wire 1 and 2) - can detect open only on one wire of pair B (only wire 3) - can detect short between wires of a pair (wires 1 + 2 or 3 + 6) - short between pairs is detected as open. For example short between wires 2 + 3 is detected as open. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksij Rempel authored
The current get_phy_flags() is only processed when we connect to a PHY via a designed phy-handle property via phylink_of_phy_connect(), but if we fallback on the internal MDIO bus created by a switch and take the dsa_slave_phy_connect() path then we would not be processing that flag and using it at PHY connection time. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oleksij Rempel authored
Add mapping for LINK_MD register to enable cable testing functionality. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-