- 30 Nov, 2016 22 commits
-
-
Francis Yan authored
This patch exports the sender chronograph stats via the socket SO_TIMESTAMPING channel. Currently we can instrument how long a particular application unit of data was queued in TCP by tracking SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having these sender chronograph stats exported simultaneously along with these timestamps allows further breaking down the various sender limitations. For example, a video server can tell if a particular chunk of video on a connection takes a long time to deliver because TCP was experiencing a small receive window. Before this patch it was not possible to tell without packet traces. To prepare these stats, the user needs to set the SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags while requesting other SOF_TIMESTAMPING TX timestamps. When the timestamps are available in the error queue, the stats are returned in a separate control message of type SCM_TIMESTAMPING_OPT_STATS, in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME, TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. The unit is microseconds. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
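As an illustration of the TLV list described above, the following is a minimal user-space sketch that walks a buffer of struct nlattr entries (a 16-bit length, a 16-bit type, then the payload, padded to 4 bytes). The numeric attribute types, the assumption that each value is an unsigned 64-bit integer, and the helper names `parse_opt_stats` and `make_attr` are all hypothetical and chosen for the example; the kernel's uapi headers are the authoritative source.

```python
import struct

# Attribute type numbers below are assumptions for illustration only;
# check the kernel uapi headers for the authoritative values.
TCP_NLA_BUSY_TIME = 1
TCP_NLA_RWND_LIMITED = 2
TCP_NLA_SNDBUF_LIMITED = 3

NLA_HDRLEN = 4   # struct nlattr: u16 nla_len, u16 nla_type
NLA_ALIGNTO = 4  # attributes are padded to a 4-byte boundary


def nla_align(length):
    """Round a length up to the next attribute alignment boundary."""
    return (length + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def parse_opt_stats(payload):
    """Return {nla_type: value_in_usec} from a buffer of nlattr TLVs.

    Assumes every value of interest is a native-endian u64.
    """
    stats = {}
    offset = 0
    while offset + NLA_HDRLEN <= len(payload):
        nla_len, nla_type = struct.unpack_from("=HH", payload, offset)
        if nla_len < NLA_HDRLEN or offset + nla_len > len(payload):
            break  # malformed or truncated attribute, stop parsing
        value = payload[offset + NLA_HDRLEN:offset + nla_len]
        if len(value) == 8:
            stats[nla_type] = struct.unpack("=Q", value)[0]
        offset += nla_align(nla_len)
    return stats


def make_attr(nla_type, usec):
    """Build one u64 attribute; used here only to fabricate sample input."""
    return struct.pack("=HHQ", NLA_HDRLEN + 8, nla_type, usec)
```

In a real program the buffer would come from the data of the SCM_TIMESTAMPING_OPT_STATS control message read from the socket error queue; here `make_attr` only fabricates sample input so the parser can be exercised on its own.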
-
Francis Yan authored
This patch exports all the sender chronograph measurements collected in the previous patches to the TCP_INFO interface. Note that the busy time exported includes all the other sending limits (rwnd-limited, sndbuf-limited). Internally the time unit is jiffies, but externally the measurements are in microseconds to allow for future extensions. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
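Because the exported busy time includes the rwnd-limited and sndbuf-limited periods, a consumer that wants the "busy for other reasons" share has to subtract them. The sketch below shows that arithmetic only; the function name and the dictionary keys are hypothetical, and the three inputs are assumed to be the microsecond values read from TCP_INFO.

```python
def busy_breakdown(busy_us, rwnd_limited_us, sndbuf_limited_us):
    """Split the exported busy time into its three components.

    The exported busy time is inclusive, so the remainder after
    removing the two limited periods is time spent busy for other
    reasons. The result is clamped at zero in case the three values
    were not sampled at exactly the same instant.
    """
    other_us = max(busy_us - rwnd_limited_us - sndbuf_limited_us, 0)
    return {
        "other_busy_us": other_us,
        "rwnd_limited_us": rwnd_limited_us,
        "sndbuf_limited_us": sndbuf_limited_us,
    }
```

For example, a connection reporting 1000 us busy, 300 us rwnd-limited, and 200 us sndbuf-limited spent 500 us busy for reasons other than those two limits.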
-
Francis Yan authored
This patch measures the amount of time when TCP runs out of new data to send to the network due to insufficient send buffer, while TCP is still busy delivering (i.e. write queue is not empty). The goal is to indicate that either the send buffer autotuning or the user SO_SNDBUF setting has resulted in network under-utilization. The measurement starts conservatively by checking various conditions to minimize false claims (i.e. under-estimation is more likely). The measurement stops when the SOCK_NOSPACE flag is cleared, but it does not account for the time elapsed until the next application write. Also the measurement only starts if the sender is still busy sending data, so that the limit accounted is part of the total busy time. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Francis Yan authored
This patch measures the total time when TCP stops sending because the receiver's advertised window is not large enough. Note that once the limit is lifted we are likely in the busy status if we have data pending. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Francis Yan authored
This patch measures TCP busy time, which is defined as the period of time when sender has data (or FIN) to send. The time starts when data is buffered and stops when the write queue is flushed by ACKs or error events. Note the busy time does not include SYN time, unless data is included in SYN (i.e. Fast Open). It does include FIN time even if the FIN carries no payload. Excluding pure FIN is possible but would incur one additional test in the fast path, which may not be worth it. Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Francis Yan authored
This patch implements the skeleton of the TCP chronograph instrumentation on sender side limits:

  1) idle (unspec)
  2) busy sending data other than 3-4 below
  3) rwnd-limited
  4) sndbuf-limited

The limits are enumerated in 'tcp_chrono'. Since a connection in theory can idle forever, we do not track the actual length of this uninteresting idle period. For the rest we track how long the sender spends in each limit. At any point during the lifetime of a connection, the sender must be in one of the four states.

If there are multiple conditions worthy of tracking in a chronograph then the highest priority enum takes precedence over the other conditions, so that if something "more interesting" starts happening, we stop the previous chrono and start a new one.

The time unit is jiffy (u32) in order to save space in tcp_sock. This implies that the application must sample the stats at least once every 49 days with 1 ms jiffies.

Signed-off-by: Francis Yan <francisyyan@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
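The priority rule and the u32 jiffy accounting described in this commit can be modelled in a few lines. This is a simplified user-space sketch, not the kernel code: the class and method names are hypothetical, time is passed in explicitly as a jiffy counter, and the "write queue empty" condition is reduced to a boolean argument.

```python
from enum import IntEnum

U32_MASK = 0xFFFFFFFF  # counters are modelled as wrapping 32-bit values


class Chrono(IntEnum):
    """Sender states, ordered so that a higher value has higher priority."""
    UNSPEC = 0          # idle, not timed
    BUSY = 1
    RWND_LIMITED = 2
    SNDBUF_LIMITED = 3


class ChronoTracker:
    def __init__(self):
        self.kind = Chrono.UNSPEC
        self.start = 0
        self.stat = {Chrono.BUSY: 0,
                     Chrono.RWND_LIMITED: 0,
                     Chrono.SNDBUF_LIMITED: 0}

    def _switch(self, new_kind, now):
        # Charge the elapsed time to the state being left, unless idle.
        if self.kind != Chrono.UNSPEC:
            elapsed = (now - self.start) & U32_MASK
            self.stat[self.kind] = (self.stat[self.kind] + elapsed) & U32_MASK
        self.start = now & U32_MASK
        self.kind = new_kind

    def start_chrono(self, kind, now):
        # Only a higher-priority condition preempts the current one.
        if kind > self.kind:
            self._switch(kind, now)

    def stop_chrono(self, kind, now, queue_empty):
        if queue_empty:
            self._switch(Chrono.UNSPEC, now)   # nothing left to send: idle
        elif kind == self.kind:
            self._switch(Chrono.BUSY, now)     # limit lifted, still busy
```

The masked subtraction gives the right elapsed time across a single counter wrap, which is why sampling has to happen more often than the wrap period: with 1 ms jiffies, 2^32 ms is roughly 49.7 days.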
-
Yegor Yefremov authored
Add the ability to query and set Energy Efficient Ethernet parameters via ethtool for applicable devices. This patch doesn't activate full EEE support in cpsw driver, but it enables reading and writing EEE advertising settings. This way one can disable advertising EEE for certain speeds. Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com> Acked-by: Rami Rosen <roszenrami@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vitaly Kuznetsov authored
When we change MTU or the number of channels on a netvsc device we get the following logged:

  hv_netvsc bf5edba8...: net device safe to remove
  hv_netvsc: hv_netvsc channel opened successfully
  hv_netvsc bf5edba8...: Send section size: 6144, Section count:2560
  hv_netvsc bf5edba8...: Device MAC 00:15:5d:1e:91:12 link state up

This information is useful as debug at most.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: couple of enhancements and fixes Couple of enhancements and fixes from Ido. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
We call bus->init() before allocating 'lag.mapping'. Change the order of operations in removal path to reflect that. This makes the error path of mlxsw_core_bus_device_register() symmetric with mlxsw_core_bus_device_unregister(). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Without this rollback, the thermal zone is still registered during the error path, whereas its private data is freed upon the destruction of the underlying bus device due to the use of devm_kzalloc(). This results in use after free. Fix this by calling mlxsw_thermal_fini() from the appropriate place in the error path. Fixes: a50c1e35 ("mlxsw: core: Implement thermal zone") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The shared buffer pools are containers whose size is used to calculate the maximum usage for packets from / to a specific port / {port, PG/TC}, when dynamic threshold is employed. While it's perfectly fine for the sum of the pools to exceed the maximum size of the shared buffer, a single pool cannot. Add a check when the pool size is set and forbid sizes larger than the maximum size of the shared buffer.

Without the patch:

  $ devlink sb pool set pci/0000:03:00.0 pool 0 size 999999999 thtype dynamic
  // No error is returned

With the patch:

  $ devlink sb pool set pci/0000:03:00.0 pool 0 size 999999999 thtype dynamic
  devlink answers: Invalid argument

Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
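The rule stated in this commit (each pool is bounded by the shared buffer size, while the sum of the pools is not) can be sketched as below. The function and parameter names are hypothetical and the sketch is not the driver's code; it only mirrors the check and the "Invalid argument" result shown above.

```python
import errno


def check_pool_size(size, max_buff_size):
    """Return 0 if a single pool size is acceptable, else -EINVAL.

    Only the individual pool is compared against the shared buffer
    size; no check is made on the sum of all pools.
    """
    if size > max_buff_size:
        return -errno.EINVAL
    return 0
```

With a hypothetical shared buffer of 1000 units, two pools of 800 units each both pass the check even though together they exceed the buffer, while a single pool of 1001 units is rejected.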
-
Ido Schimmel authored
We need to be able to limit the size of shared buffer pools, so query the maximum size from the device during init. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The newly added switchib driver fails to link if MLXSW_PCI=m:

  drivers/net/ethernet/mellanox/mlxsw/mlxsw_switchib.o: In function `mlxsw_sib_module_exit':
  switchib.c:(.exit.text+0x8): undefined reference to `mlxsw_pci_driver_unregister'
  switchib.c:(.exit.text+0x10): undefined reference to `mlxsw_pci_driver_unregister'
  drivers/net/ethernet/mellanox/mlxsw/mlxsw_switchib.o: In function `mlxsw_sib_module_init':
  switchib.c:(.init.text+0x28): undefined reference to `mlxsw_pci_driver_register'
  switchib.c:(.init.text+0x38): undefined reference to `mlxsw_pci_driver_register'
  switchib.c:(.init.text+0x48): undefined reference to `mlxsw_pci_driver_unregister'

The other two such sub-drivers have a dependency, so add the same one here. In theory we could allow this driver if MLXSW_PCI is disabled, but it's probably not worth it.

Fixes: d1ba5263 ("mlxsw: switchib: Introduce SwitchIB and SwitchIB silicon driver") Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geert Uytterhoeven authored
If device_release_driver(&phydev->mdio.dev) is called, it releases all resources belonging to the PHY device. Hence the subsequent call to phy_led_triggers_unregister() will access already freed memory when unregistering the LEDs. Move the call to phy_led_triggers_unregister() before the possible call to device_release_driver() to fix this. Fixes: 2e0bc452 ("net: phy: leds: add support for led triggers on phy link state change") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Zach Brown <zach.brown@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Machek authored
Fix comments, add some new, and make debugfs output consistent. Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Mack authored
There's a 'not' missing in one paragraph. Add it. Fixes: 30070984 ("cgroup: add support for eBPF programs") Signed-off-by: Daniel Mack <daniel@zonque.org> Reported-by: Rami Rosen <roszenrami@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jerome Brunet says:

====================
Fix OdroidC2 Gigabit Tx link issue

This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F). The platform seems to enter LPI on the Rx path too often while performing relatively high TX transfer. This eventually breaks the link (both Tx and Rx), and requires bringing the interface down and up again to get the Rx path working again.

The root cause of this issue is not fully understood yet, but disabling EEE advertisement on the PHY prevents this feature from being negotiated. With this change, the link is stable and reliable, with the expected throughput performance.

The patchset adds options in the generic phy driver to disable EEE advertisement, through device tree. The way it is done is very similar to the handling of the max-speed property.

Changes since V2: [2]
 - Rename "eee-advert-disable" to "eee-broken-modes" to make the intended purpose of this option clear (flag broken configuration, not a configuration option)
 - Add DT bindings constants so the DT configuration is more user friendly
 - Submit to net-next instead of net.

Changes since V1: [1]
 - Disable the advertisement of EEE in the generic code instead of the realtek driver.

[1] : http://lkml.kernel.org/r/1479220154-25851-1-git-send-email-jbrunet@baylibre.com
[2] : http://lkml.kernel.org/r/1479742524-30222-1-git-send-email-jbrunet@baylibre.com
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
jbrunet authored
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
jbrunet authored
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Yegor Yefremov <yegorslists@googlemail.com> Tested-by: Andreas Färber <afaerber@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
jbrunet authored
This patch adds an option to disable EEE advertisement in the generic PHY by providing a mask of prohibited modes corresponding to the value found in the MDIO_AN_EEE_ADV register. On some platforms, PHY Low power idle seems to be causing issues, even breaking the link in some cases. The patch provides a convenient way for these platforms to disable EEE advertisement and work around the issue. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Yegor Yefremov <yegorslists@googlemail.com> Tested-by: Andreas Färber <afaerber@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
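The effect of a "mask of prohibited modes" on an advertisement word can be shown with plain bit arithmetic. This is a sketch of the idea only, not the PHY library code: the function name is hypothetical, and the two bit values are assumed to match the 100BASE-TX and 1000BASE-T EEE bits of the MDIO_AN_EEE_ADV register.

```python
# Bit positions assumed for illustration; verify against the kernel's
# MDIO register definitions before relying on them.
MDIO_EEE_100TX = 0x0002
MDIO_EEE_1000T = 0x0004


def eee_advertisement(supported, broken_modes):
    """Clear every prohibited mode from a 16-bit EEE advertisement word."""
    return supported & ~broken_modes & 0xFFFF
```

A PHY that supports EEE at both speeds but has 1000BASE-T flagged as broken would then advertise only the 100BASE-TX bit, and flagging both modes yields an empty advertisement so EEE is never negotiated.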
-
Niklas Cassel authored
The dwmac4 IP can be synthesized with 1-8 tx queues. On an IP synthesized with DWC_EQOS_NUM_TXQ > 1, all tx queues are disabled by default. For these IPs, the bitfield TXQEN is R/W. Always enable tx queue 0. The write will have no effect on IPs synthesized with DWC_EQOS_NUM_TXQ == 1. The driver still does not utilize more than one tx queue in the IP. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 29 Nov, 2016 3 commits
-
-
Peter Robinson authored
Add dependencies on the architectures that support these devices and add compile test to ensure ongoing code build coverage. Signed-off-by: Peter Robinson <pbrobinson@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andreas Färber authored
Also include the netdev list for convenience, as done elsewhere. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Andreas Färber <afaerber@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
mlx4 stats are chaotic because a deferred work queue is responsible for updating them every 250 ms. Even sampling stats every second with "sar -n DEV 1" gives variations like the following:

  lpaa23:~# sar -n DEV 1 10 | grep eth0 | cut -c1-65
  07:39:22 eth0 146877.00 3265554.00 9467.15 4828168.50
  07:39:23 eth0 146587.00 3260329.00 9448.15 4820445.98
  07:39:24 eth0 146894.00 3259989.00 9468.55 4819943.26
  07:39:25 eth0 110368.00 2454497.00 7113.95 3629012.17 <<>>
  07:39:26 eth0 146563.00 3257502.00 9447.25 4816266.23
  07:39:27 eth0 145678.00 3258292.00 9389.79 4817414.39
  07:39:28 eth0 145268.00 3253171.00 9363.85 4809852.46
  07:39:29 eth0 146439.00 3262185.00 9438.97 4823172.48
  07:39:30 eth0 146758.00 3264175.00 9459.94 4826124.13
  07:39:31 eth0 146843.00 3256903.00 9465.44 4815381.97
  Average: eth0 142827.50 3179259.70 9206.30 4700578.16

This patch allows rx/tx bytes/packets counters to be folded at the time we need stats. We can now fetch stats every 1 ms if we want to check NIC behavior on a small time window. It is also easier to detect anomalies.

  lpaa23:~# sar -n DEV 1 10 | grep eth0 | cut -c1-65
  07:42:50 eth0 142915.00 3177696.00 9212.06 4698270.42
  07:42:51 eth0 143741.00 3200232.00 9265.15 4731593.02
  07:42:52 eth0 142781.00 3171600.00 9202.92 4689260.16
  07:42:53 eth0 143835.00 3192932.00 9271.80 4720761.39
  07:42:54 eth0 141922.00 3165174.00 9147.64 4679759.21
  07:42:55 eth0 142993.00 3207038.00 9216.78 4741653.05
  07:42:56 eth0 141394.06 3154335.64 9113.85 4663731.73
  07:42:57 eth0 141850.00 3161202.00 9144.48 4673866.07
  07:42:58 eth0 143439.00 3180736.00 9246.05 4702755.35
  07:42:59 eth0 143501.00 3210992.00 9249.99 4747501.84
  Average: eth0 142835.66 3182165.93 9206.98 4704874.08

Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
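The difference between reading a periodically refreshed snapshot and folding the per-ring counters at read time can be illustrated with a small model. This is a sketch under simplifying assumptions, not the mlx4 driver: the class and method names are hypothetical, and each ring is reduced to a packet counter and a byte counter.

```python
class Ring:
    """One rx or tx ring with its own running counters."""

    def __init__(self):
        self.packets = 0
        self.bytes = 0

    def account(self, nbytes):
        self.packets += 1
        self.bytes += nbytes


class Device:
    def __init__(self, nrings):
        self.rings = [Ring() for _ in range(nrings)]
        self.snapshot = (0, 0)  # (packets, bytes) as of the last refresh

    def fold(self):
        """Sum the per-ring counters right now."""
        return (sum(r.packets for r in self.rings),
                sum(r.bytes for r in self.rings))

    def periodic_refresh(self):
        """What a deferred work item would do every refresh interval."""
        self.snapshot = self.fold()

    def stats_from_snapshot(self):
        return self.snapshot  # may be stale by up to one interval

    def stats_folded(self):
        return self.fold()    # always reflects the rings at read time
```

A reader of the snapshot sees counters that lag by up to one refresh interval, so two samples one second apart can cover slightly more or less than one second of traffic; folding at read time removes that jitter.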
-
- 28 Nov, 2016 15 commits
-
-
Haishuang Yan authored
It should reserve sizeof(ipv6hdr) for geneve in an ipv6 tunnel. Fixes: c3ef5aa5 ('geneve: Merge ipv4 and ipv6 geneve_build_skb()') Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== Documentation: net: phy: Improve documentation This patch series addresses discussions and feedback that was recently received on the mailing-list in the area of: flow control/pause frames, interpretation of phy_interface_t and finally add some links to useful standards documents. Changes in v3: - add Timur's feedback into patch 3 Changes in v2: - clarify a few things in the RGMII section, add a paragraph about common issues with RGMII delay mismatches ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add links to the IEEE 802.3-2008 document, and the RGMII v1.3 and v2.0 revisions of the standard. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
RGMII is a recurring source of pain for people with Gigabit Ethernet hardware since it may require PHY driver and MAC driver level configuration hints. Document what are the expectations from PHYLIB and what options exist. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Describe that the Ethernet MAC controller is ultimately responsible for dealing with proper pause frames/flow control advertisement and enabling, and that it is therefore allowed to have it change phydev->supported/advertising with SUPPORTED_Pause and SUPPORTED_AsymPause. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Remove the function pointers documentation which duplicates information found in include/linux/phy.h. Maintaining documentation in two different locations just does not work, and the code is less likely to be outdated. Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andreas Färber authored
mv88e6xxx_g1_irq_setup() sets up chip->g1_irq.nirqs interrupt mappings, so free the same amount. This will be 8 or 9 in practice, less than 16. Fixes: dc30c35b ("net: dsa: mv88e6xxx: Implement interrupt support.") Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Saeed Mahameed says:

====================
Mellanox 100G mlx5 DCBX and ethtool updates

This series provides the following mlx5 updates:

From Huy: DCBX CEE API and DCBX firmware/host modes support.
 - 1st patch ensures the dcbnl_rtnl_ops is published only when the qos capability bit is on.
 - 2nd patch adds the support for CEE interfaces into mlx5 dcbnl_rtnl_ops.
 - 3rd patch refactors ETS query to read ETS configuration directly from firmware rather than having a software shadow of it. The existing IEEE interfaces stay the same.
 - 4th patch adds the support for MLX5_REG_DCBX_PARAM and MLX5_REG_DCBX_APP firmware commands to manipulate mlx5 DCBX mode.
 - 5th patch adds the driver support for the new DCBX firmware. This ensures backward compatibility between the old and new firmware. With the new DCBX firmware, qos settings can be controlled by either firmware or software depending on the DCBX mode.

From Kamal and Saeed:
 - mlx5 self-test support.

From Shaker:
 - Private flag to give the user the ability to enable/disable mlx5 CQE compression.

V1->V2:
 - Check ETS capability where needed in: ("net/mlx5e: Read ETS settings directly from firmware")
 - Fix return value of mlx5e_dcbnl_switch_to_host_mode in: ("net/mlx5e: ConnectX-4 firmware support for DCBX")
 - Update commit message of: ("net/mlx5e: ConnectX-4 firmware support for DCBX")
 - Fix two sparse static check warnings in en_selftest.c

This series was generated against commit: e5f12b3f ("Merge branch 'mlxsw-trap-groups-and-policers'")
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shaker Daibes authored
The user can now override the automatic driver decision using the rx_cqe_compress flag, which is the preference for CQE compression. The flag is initialized with the automatic driver decision. Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shaker Daibes authored
pflags is a configuration parameter for the netdev, naturally it belongs to priv->params. Also introduce MLX5E_GET_PFLAG Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
Extend the self diagnostic tests to support loopback test. The loopback test doesn't require the offline flag, it will use the generic dev_queue_xmit and a dedicated packet_type to capture and verify mlx5e selftest loopback packets. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
The self diagnostics test implementation includes the following features:

  1. Link Test: Check that link is in up state.
  2. Speed Test: Check that link was negotiated correctly.
  3. Health Test: Check the device health.

Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Use setdcbx interface to set the DCBX mode to firmware or os. If setdcbx is called with mode value of zero, the DCBX mode is set to firmware. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
DCBX by default is controlled by firmware where the dcbx capability bit is set. In this mode, firmware is responsible for reading/sending the TLV packets from/to the remote partner. This patch sets up the infrastructure to move between HOST/FW DCBX control mode. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Add set/query commands for DCBX_PARAM register Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-