- 16 Feb, 2016 8 commits
-
-
David S. Miller authored
Jacob Keller says: ==================== ethtool: correct {GS}CHANNELS and {GS}RXFH conflict This patch series fixes up ethtool_set_channels operation which allowed modifying the RXFH table indirectly by reducing the number of queues below the current max queue used by the Rx flow table. Most drivers incorrectly allowed this to destroy the Rx flow table and would then start by reinitializing it to default settings. However, drivers are not able to correctly handle the conflict since there was no way to differentiate between the default settings and the user requested explicit settings. To fix this, implement a new netdev private flag which we use to indicate whether the RXFH has been user configured. If someone has a better alternative of how to store this information, let me know. I am not sure that priv_flags is the best solution but I have not had any better idea. Secondly, we add a function which just calls the driver's get_rxfh callback to determine the current indirection table. Loop through this and we can determine the current highest queue that will be used by RSS. Now, modify ethtool_set_channels to add a check ensuring that if (a) we have had rxfh configured by user, (b) we can get the maximum RSS queue currently used, then we ensure that the newly requested Rx count (or combined count) is at least as high as this maximum RSS queue. The reasoning here is that we can always safely increase the number of queues. If we decrease the queues we must ensure that the decrease does not go lower than the highest in-use queue for the Rx flow table. Drivers may still need to be patched if they currently overwrite the Rx flow table during channel configuration. If the driver currently always resets Rx flow table when increasing number of queues it must be patched to only do this when netif_is_rxfh_configured returns false. The second patch simply adds a check to ensure that all provided channel counts fit within driver defined maximums. The third patch fixes fm10k to correctly reconfigure the RSS reta table whenever it is still unconfigured. This means that the default state will provide RSS to every queue. Once the user has configured RXFH, then we should maintain it. In addition, since the case where we must reconfigure the RSS table in this case should now no longer occur, add a dev_err message to indicate the user that we did so. I have also supplied an ethtool patch to enable setting the default Rx flow indirection table. Without this, current ethtool does not support sending an indir_size of 0, and thus does not correctly support configuring back to the default. Changes in v2: * fixed compile error * fixed incorrect comparison with max_rx_in_use * adjusted looping over dev_size * removed inline on function * dropped patch about separating combined vs asymmetric channels * verified behavior using fm10k driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Keller, Jacob E authored
Also print an error message incase we do have to reconfigure as this should no longer happen anymore due to ethtool changes. If it somehow does occur, user should be made aware of it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Keller, Jacob E authored
Add a sanity check to ensure that all requested channel sizes are within bounds, which should reduce errors in driver implementation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Keller, Jacob E authored
Ethernet drivers implementing both {GS}RXFH and {GS}CHANNELS ethtool ops incorrectly allow SCHANNELS when it would conflict with the settings from SRXFH. This occurs because it is not possible for drivers to understand whether their Rx flow indirection table has been configured or is in the default state. In addition, drivers currently behave in various ways when increasing the number of Rx channels. Some drivers will always destroy the Rx flow indirection table when this occurs, whether it has been set by the user or not. Other drivers will attempt to preserve the table even if the user has never modified it from the default driver settings. Neither of these situation is desirable because it leads to unexpected behavior or loss of user configuration. The correct behavior is to simply return -EINVAL when SCHANNELS would conflict with the current Rx flow table settings. However, it should only do so if the current settings were modified by the user. If we required that the new settings never conflict with the current (default) Rx flow settings, we would force users to first reduce their Rx flow settings and then reduce the number of Rx channels. This patch proposes a solution implemented in net/core/ethtool.c which ensures that all drivers behave correctly. It checks whether the RXFH table has been configured to non-default settings, and stores this information in a private netdev flag. When the number of channels is requested to change, it first ensures that the current Rx flow table is not going to assign flows to now disabled channels. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bernhard Walle authored
We need that for a custom hardware that needs the reverse reset sequence. Signed-off-by: Bernhard Walle <bernhard@bwalle.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: phy: bcm7xxx: Misc cleanups These two patches are cleanups to the BCM7xxx internal PHY driver: - fix a constant name missing a X (as in BCM7XXX) - add a macro to reduce the amount of code duplication to add new entries ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Introduce a macro which helps adding new 40NM EPHY entries and reduces the amount of boilerplate code. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The driver is BCM7xxx, we were missing an additional X in the constant naming, fix that to be consistent. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 12 Feb, 2016 12 commits
-
-
David S. Miller authored
Edward Cree says: ==================== Local Checksum Offload Re-tested VxLAN; everything else is unchanged from v4. Changes from v4: * Rebased series to fix conflicts with vxlan/vxlan6 merge. Changes from v3: * Fixed inverted checksum values introduced in v3. * Don't mangle zero checksums in GRE. * Clear skb->encapsulation in iptunnel_handle_offloads when not using CHECKSUM_PARTIAL, lest drivers incorrectly interpret that as a request for inner checksum offload. Changes from v2: * Added support for IPv4 GRE. * Split out 'always set up for checksum offload' into its own patch. * Removed csum_help from iptunnel_handle_offloads. * Rewrote LCO callers to only fold once. * Simplified nocheck handling. Changes from v1: * Enabled support in more encapsulation protocols. I think it now covers everything except GRE. * Wrote up some documentation covering TX checksum offload, LCO and RCO. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
All users now pass false, so we can remove it, and remove the code that was conditional upon it. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The only protocol affected at present is Geneve. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
If the dst device doesn't support it, it'll get fixed up later anyway by validate_xmit_skb(). Also, this allows us to take advantage of LCO to avoid summing the payload multiple times. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The arithmetic properties of the ones-complement checksum mean that a correctly checksummed inner packet, including its checksum, has a ones complement sum depending only on whatever value was used to initialise the checksum field before checksumming (in the case of TCP and UDP, this is the ones complement sum of the pseudo header, complemented). Consequently, if we are going to offload the inner checksum with CHECKSUM_PARTIAL, we can compute the outer checksum based only on the packed data not covered by the inner checksum, and the initial value of the inner checksum field. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== tcp/dccp: better use of ephemeral ports Big servers have bloated bind table, making very hard to succeed ephemeral port allocations, without special containers/namespace tricks. This patch series extends the strategy added in commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"). Since ports used by connect() are much likely to be shared among them, we give a hint to both bind() and connect() to keep the crowds separated if possible. Of course, if on a specific host an application needs to allocate ~30000 ports using bind(), it will still be able to do so. Same for ~30000 connect() to a unique 2-tuple (dst addr, dst port) New implemetation is also more friendly to softirqs and reschedules. v2: rebase after TCP SO_REUSEPORT changes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Implement strategy used in __inet_hash_connect() in opposite way : Try to find a candidate using odd ports, then fallback to even ports. We no longer disable BH for whole traversal, but one bucket at a time. We also use cond_resched() to yield cpu to other tasks if needed. I removed one indentation level and tried to mirror the loop we have in __inet_hash_connect() and variable names to ease code maintenance. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), I added a very simple heuristic, so that we got better chances to use even ports, and allow bind() users to have more available slots. It gave nice results, but with more than 200,000 TCP sessions on a typical server, the ~30,000 ephemeral ports are still a rare resource. I chose to go a step further, by looking at all even ports, and if none was available, fallback to odd ports. The companion patch does the same in bind(), but in opposite way. I've seen exec times of up to 30ms on busy servers, so I no longer disable BH for the whole traversal, but only for each hash bucket. I also call cond_resched() to be gentle to other tasks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Feb, 2016 20 commits
-
-
David S. Miller authored
Jesper Dangaard Brouer says: ==================== net: mitigating kmem_cache free slowpath This patchset is the first real use-case for kmem_cache bulk _free_. The use of bulk _alloc_ is NOT included in this patchset. The full use have previously been posted here [1]. The bulk free side have the largest benefit for the network stack use-case, because network stack is hitting the kmem_cache/SLUB slowpath when freeing SKBs, due to the amount of outstanding SKBs. This is solved by using the new API kmem_cache_free_bulk(). Introduce new API napi_consume_skb(), that hides/handles bulk freeing for the caller. The drivers simply need to use this call when freeing SKBs in NAPI context, e.g. replacing their calles to dev_kfree_skb() / dev_consume_skb_any(). Driver ixgbe is the first user of this new API. [1] http://thread.gmane.org/gmane.linux.network/384302/focus=397373 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jesper Dangaard Brouer authored
There is an opportunity to bulk free SKBs during reclaiming of resources after DMA transmit completes in ixgbe_clean_tx_irq. Thus, bulk freeing at this point does not introduce any added latency. Simply use napi_consume_skb() which were recently introduced. The napi_budget parameter is needed by napi_consume_skb() to detect if it is called from netpoll. Benchmarking IPv4-forwarding, on CPU i7-4790K @4.2GHz (no turbo boost) Single CPU/flow numbers: before: 1982144 pps -> after : 2064446 pps Improvement: +82302 pps, -20 nanosec, +4.1% (SLUB and GCC version 5.1.1 20150618 (Red Hat 5.1.1-4)) Joint work with Alexander Duyck. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jesper Dangaard Brouer authored
The network stack defers SKBs free, in-case free happens in IRQ or when IRQs are disabled. This happens in __dev_kfree_skb_irq() that writes SKBs that were free'ed during IRQ to the softirq completion queue (softnet_data.completion_queue). These SKBs are naturally delayed, and cleaned up during NET_TX_SOFTIRQ in function net_tx_action(). Take advantage of this a use the skb defer and flush API, as we are already in softirq context. For modern drivers this rarely happens. Although most drivers do call dev_kfree_skb_any(), which detects the situation and calls __dev_kfree_skb_irq() when needed. This due to netpoll can call from IRQ context. Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jesper Dangaard Brouer authored
Discovered that network stack were hitting the kmem_cache/SLUB slowpath when freeing SKBs. Doing bulk free with kmem_cache_free_bulk can speedup this slowpath. NAPI context is a bit special, lets take advantage of that for bulk free'ing SKBs. In NAPI context we are running in softirq, which gives us certain protection. A softirq can run on several CPUs at once. BUT the important part is a softirq will never preempt another softirq running on the same CPU. This gives us the opportunity to access per-cpu variables in softirq context. Extend napi_alloc_cache (before only contained page_frag_cache) to be a struct with a small array based stack for holding SKBs. Introduce a SKB defer and flush API for accessing this. Introduce napi_consume_skb() as replacement for e.g. dev_consume_skb_any() when running in NAPI context. A small trick to handle/detect if we are called from netpoll is to see if budget is 0. In that case, we need to invoke dev_consume_skb_irq(). Joint work with Alexander Duyck. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Nikolay Aleksandrov says: ==================== virtio_net: better ethtool setting validation This small set is a follow-up for the recent patches that added ethtool get/set settings. Patch 1 changes the speed validation routine to check if the speed is between 0 and INT_MAX (or SPEED_UNKNOWN) and patch 2 adds port validation to virtio_net and better validation comment. This set is on top of Michael's patch which explains that speeds from 0 to INT_MAX are valid: http://patchwork.ozlabs.org/patch/578911/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
We should validate the port setting that we got from the user and check if it's what we've set it to (PORT_OTHER), also add explanation that ignoring advertising is good as long as we don't have autonegotiation. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Devices these days can have any speed and as was recently pointed out any speed from 0 to INT_MAX is valid so adjust speed validation to accept such values. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andrew F. Davis says: ==================== net: phy: dp83848: Add support for TI TLK10x Ethernet PHYs This series is [0] split into its logical components. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew F. Davis authored
Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew F. Davis authored
The TI TLK10x Ethernet PHYs are similar in the interrupt relevant registers and so are compatible with the DP83848x devices already supported. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew F. Davis authored
Reorganize code by moving the desired interrupt mask definition out of function. Also rearrange the enable/disable interrupt function to prevent accidental over-writing of values in registers. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew F. Davis authored
After acquiring National Semiconductor, TI appears to have changed the Vendor Model Number for the DP83848C PHYs, add this new ID to supported IDs. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew F. Davis authored
Add a helper macro for defining dp83848 compatible phy devices. Update copyright info. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Vecera authored
The EVB (virtual bridge) functionality should be disabled on older BE3 and Lancer chips if SR-IOV is disabled in the NIC's BIOS. This setting is identified by the zero value of total VFs reported by the card. The GET_HSW_CONFIG command cannot be used as it is not supported by these older chipset's FW. v2: added the comment Cc: Sathya Perla <sathya.perla@broadcom.com> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com> Cc: Padmanabh Ratnakar <padmanabh.ratnakar@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Helmut Buchsbaum says: ==================== Add support for MICREL KSZ8795CLX 5-port switch This patch series refactors the spi-ks8995 driver to finally add support for the MICREL KSZ8795CLX. Additionally support for controlling a GPIO line for resetting the switch is added. Helmut Changes since v2: - use GPIO_ACTIVE_LOW according to Andrew's remark. - use ePAPR compliant node name in example, thanks to Sergei for pointing out Changes since v1: - removed initializing registers from Device Tree following Florian's advice - fixed GPIO handling for reset according to Andrew's remark. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Helmut Buchsbaum authored
Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Helmut Buchsbaum authored
Add support for MICREL KSZ8795CLX Integrated 5-Port, 10-/100-Managed Ethernet Switch with Gigabit GMII/RGMII and MII/RMII interfaces. Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Helmut Buchsbaum authored
Prepare creating SPI reads and writes for other switch families. The KS8995 family uses the straight forward <8bit CMD><8bit ADDR> sequence. To be able to support KSZ8795 family, which uses <3bit CMD><12bit ADDR><1 bit TR> make the SPI command creation chip variant dependent. Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Helmut Buchsbaum authored
When using device tree it is no more possible to reset the PHY at board level. Furthermore, doing in the driver allows to power down the switch when it is not used any more. The patch introduces a new optional property "reset-gpios" denoting an appropriate GPIO handle, e.g.: reset-gpios = <&gpio0 46 GPIO_ACTIVE_LOW> Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Helmut Buchsbaum authored
Since the chip variant is now determined by spi_device_id, verify family and chip id and determine the revision id. Signed-off-by: Helmut Buchsbaum <helmut.buchsbaum@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-