- 07 Dec, 2015 10 commits
-
-
Bjørn Mork authored
Adding a writable sysfs attribute for the "NDP to end" quirk flag. This makes it easier for end users to test new devices for this firmware bug. We've been lucky so far, but we should not depend on reporters capable of rebuilding the driver. Cc: Enrico Mioso <mrkiko.rs@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Or Gerlitz says: ==================== Add HA and LAG support for mlx4 SRIOV VFs This series is built upon the code added in commit ce388fff "Merge branch 'mlx4-next'" which added HA and LAG support to mlx4 RoCE and SRIOV services. We add HA and Link Aggregation support to single ported mlx4 Ethernet VFs. In this case, the PF Ethernet interfaces are bonded, the VFs see single port HW devices (already supported) -- however, now this port is highly available. This means that all VF HW QPs (both VF Ethernet driver and VF RoCE / RAW QPs) are subject to the V2P (Virtual-To-Physical) mapping which is managed by the PF driver, and hence resilient across link failures and such events. When bonding operates in Dynamic link aggregation (802.3ad) mode, traffic from each VF will go over the VF "base port" (the port this VF is assigned to by the admin) as long as this port is up. When the port fails, traffic from all VFs that are defined on this port will move to the other port, and be back to their base-port when it recovers. Moni and Or. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Moni Shoua authored
When the mlx4 driver runs in HA mode, and all VFs are single ported ones, we make their single port Highly-Available. This is done by taking advantage of the HA mode properties (following bonding changes with programming the port V2P map, etc) and adding the missing parts which are unique to SRIOV such as mirroring VF steering rules on both ports. Due to limits on the MAC and VLAN table this mode is enabled only when number of total VFs is under 64. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Or Gerlitz authored
Under HA mode, it's possible that the VF registered its GID (and expects to get mads through the PV scheme) on a port which is different from the one this mad arrived on, due to HA fail over. Therefore, if the gid is not matched on the port that the packet arrived on, check for a match on the other port if HA mode is active -- and if a match is found on the other port, continue processing the mad using that other port. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Moni Shoua authored
Due to HW limitations, indexes to MAC and VLAN tables are always taken from the table of the actual port. So, if a resource holds an index to a table, it may refer to different values during the lifetime of the resource, unless the tables are mirrored. Also, even when driver is not in HA mode the policy of allocating an index to these tables is such to make sure, as much as possible, that when the time comes the mirroring will be successful. This means that in multifunction mode the allocation of a free index in a port's table tries to make sure that the same index in the other's port table is also free. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Moni Shoua authored
Under HA mode, steering rules set by VFs should be mirrored on both ports of the device so packets will be accepted no matter on which port they arrived. Since getting into HA mode is done dynamically when the user bonds mlx4 Ethernet netdevs, we keep hold of the VF DMFS rule mbox with the port value flipped (1->2,2->1) and execute the mirroring when getting into HA mode. Later, when going out of HA mode, we unset the mirrored rules. In that context note that mirrored rules cannot be removed explicitly. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Moni Shoua authored
Under HA mode, the link down event should be sent to VFs only if both ports are down. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Or Gerlitz authored
In HA mode, the link state for VFs for which the policy is "auto" (i.e. follow the physical link state) should be ORed from both ports. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julia Lawall authored
Remove unneeded variable used to store return value. Generated by: scripts/coccinelle/misc/returnvar.cocci CC: Asias He <asias@redhat.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Julia Lawall <julia.lawall@lip6.fr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjørn Mork authored
The notifier calls were thrown in as a last-minute fix for an imagined "this device could be part of a bridge" problem. That revealed a certain lack of locking. Not to mention testing... Avoid this splat: RTNL: assertion failed at net/core/dev.c (1639) CPU: 0 PID: 4293 Comm: bash Not tainted 4.4.0-rc3+ #358 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000000 ffff8800ad253d60 ffffffff8122f7cf ffff8800ad253d98 ffff8800ad253d88 ffffffff813833ab 0000000000000002 ffff880230f48560 ffff880230a12900 ffff8800ad253da0 ffffffff813833da 0000000000000002 Call Trace: [<ffffffff8122f7cf>] dump_stack+0x4b/0x63 [<ffffffff813833ab>] call_netdevice_notifiers_info+0x3d/0x59 [<ffffffff813833da>] call_netdevice_notifiers+0x13/0x15 [<ffffffffa09be227>] raw_ip_store+0x81/0x193 [qmi_wwan] [<ffffffff8131e149>] dev_attr_store+0x20/0x22 [<ffffffff811d858b>] sysfs_kf_write+0x49/0x50 [<ffffffff811d8027>] kernfs_fop_write+0x10a/0x151 [<ffffffff8117249a>] __vfs_write+0x26/0xa5 [<ffffffff81085ed4>] ? percpu_down_read+0x53/0x7f [<ffffffff81174c9e>] ? __sb_start_write+0x5f/0xb0 [<ffffffff81174c9e>] ? __sb_start_write+0x5f/0xb0 [<ffffffff81172c37>] vfs_write+0xa3/0xe7 [<ffffffff811734ad>] SyS_write+0x50/0x7e [<ffffffff8145c517>] entry_SYSCALL_64_fastpath+0x12/0x6f Fixes: 32f7adf6 ("net: qmi_wwan: support "raw IP" mode") Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Dec, 2015 24 commits
-
-
Singhai, Anjali authored
This reverts commit 8fe26999. The case where VXLAN is a module and i40e driver is inbuilt will not be handled properly with this change since i40e will have an undefined symbol vxlan_get_rx_port in it. v2: Add a signed-off-by. Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2015-12-05 This series contains updates to fm10k only. Jacob provides the remaining fm10k patches in the series. First change ensures that all the logic regarding the setting of netdev features is consolidated in one place of the driver. Fixed an issue where an assumption was being made on how many queues are available, especially when init_hw_vf() errors out. Fixed up an number of issues with init_hw() where failures were not being handled properly or at all, so update the driver to check returned error codes and respond appropriately. Fixed up typecasting issues found where either the incorrect typecast size was used or explicitly typecast values. Added additional debugging statistics and rename statistic to better reflect its true value. Added support for ITR scaling based on PCIe link speed for fm10k. Fixed up code comment where "hardware" was misspelled. v2: Dropped patches #1 and #10 from original submission, patch #1 was from Nick Krause and due to his past kernel interactions, dropping his patch. Patch #10 had questions and concerns from Tom Herbert which cannot be addressed at this time since the author (Jacob Keller) is currently on sabbatical, so dropping this patch for now until we can properly address Tom's questions and concerns. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jacob Keller authored
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The current default ITR for Tx is overly restrictive. Using a simple netperf TCP_STREAM test, we top out at about 10Gb/s for a single thread when running using 1500 byte frames. By reducing the ITR value to 25usec (up to 40K interrupts a second from 10K), we are able to achieve 36Gb/s for a single thread TCP stream test. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The existing adaptive ITR algorithm is overly restrictive. It throttles incorrectly for various traffic rates, and does not produce good performance. The algorithm now allows for more interrupts per second, and does some calculation to help improve for smaller packet loads. In addition, take into account the new itr_scale from the hardware which indicates how much to scale due to PCIe link speed. Reported-by: Matthew Vick <matthew.vick@intel.com> Reported-by: Alex Duyck <alexander.duyck@gmail.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Define a macro for identifying when the itr value is dynamic or adaptive. The concept was taken from i40e. This helps make clear what the check is, and reduces the line length to something more reasonable in a few places. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The Intel Ethernet Switch FM10000 Host Interface interrupt throttle timers are based on the PCIe link speed. Because of this, the value being programmed into the ITR registers must be scaled accordingly. For the PF, this is as simple as reading the PCIe link speed and storing the result. However, in the case of SR-IOV, the VF's interrupt throttle timers are based on the link speed of the PF. However, the VF is unable to get the link speed information from its configuration space, so the PF must inform it of what scale to use. Rather than pass this scale via mailbox message, take advantage of unused bits in the TDLEN register to pass the scale. It is the responsibility of the PF to program this for the VF while setting up the VF queues and the responsibility of the VF to get the information accordingly. This is preferable because it allows the VF to set up the interrupts properly during initialization and matches how the MAC address is passed in the TDBAL/TDBAH registers. Since we're modifying fm10k_type.h, we may as well also update the copyright year. Reported-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Originally this statistic was renamed because the method of dropping was called "drop_oversized_messages", but this logic has changed much, and this counter does actually represent messages which we failed to transmit for a number of reasons. Rename the counter back to tx_dropped since this is when it will increment, and it is less confusing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
A previous bug was uncovered by addition of a debug stat to indicate the actual number of DWORDS we pulled from the mbmem. It turned out this was not the same as the tx_dwords counter. While the previous bug fix should have corrected this in all cases, add some debug stats that count the number of DWORDs pushed or pulled from the mbmem. A future debugger may take advantage of this statistic for debugging purposes. Since we're modifying fm10k_mbx.h, update the copyright year as well. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Since the resultant data type of the mac_update.mac_upper field is u16, it does not make sense to typecast u8 variables to u32 first. Since we're modifying fm10k_pf.c, also update the copyright year. Reported-by: Matthew Vick <matthew.vick@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The init_hw function may fail, and in the case of VFs, it might change the number of maximum queues available. Thus, for every flow which checks init_hw, we need to ensure that we clear the queue scheme before, and initialize it after. The fm10k_io_slot_reset path will end up triggering a reset so fm10k_reinit needs this change. The fm10k_io_error_detected and fm10k_io_resume also need to properly clear and reinitialize the queue scheme. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
A recent change modified init_hw in some flows the function may fail on VF devices. For example, if a VF doesn't yet own its own queues. However, many callers of init_hw didn't bother to check the error code. Other callers checked but only displayed diagnostic messages without actually handling the consequences. Fix this by (a) always returning and preventing the netdevice from going up, and (b) printing the diagnostic in every flow for consistency. This should resolve an issue where VF drivers would attempt to come up before the PF has finished assigning queues. In addition, change the dmesg output to explicitly show the actual function that failed, instead of combining reset_hw and init_hw into a single check, to help for future debugging. Fixes: 1d568b0f6424 ("fm10k: do not assume VF always has 1 queue") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
VF drivers must detect how many queues are available. Previously, the driver assumed that each VF has at minimum 1 queue. This assumption is incorrect, since it is possible that the PF has not yet assigned the queues to the VF by the time the VF checks. To resolve this, we added a check first to ensure that the first queue is infact owned by the VF at init_hw_vf time. However, the code flow did not reset hw->mac.max_queues to 0. In some cases, such as during reinit flows, we call init_hw_vf without clearing the previous value of hw->mac.max_queues. Due to this, when init_hw_vf errors out, if its error code is not properly handled the VF driver may still believe it has queues which no longer belong to it. Fix this by clearing the hw->mac.max_queues on exit due to errors. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
Don't change netdev hw_features later in fm10k_probe, instead set all values inside fm10k_alloc_netdev. To do so, we need to know the MAC type (whether it is PF or VF) in order to determine what to do. This helps ensure that all logic regarding features is co-located. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
David S. Miller authored
Sergei Shtylyov says: ==================== Renesas: read MAC address registers only once Here's 2 patches against DaveM's 'net-next.git' repo. Here we optimize the MAC address register reading (left over from a bootloader). [1/2] ravb: read MAC address registers only once [2/2] sh_eth: read MAC address registers only once ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
The code reading the MAHR/MALR registers in read_mac_address() is terribly ineffective -- it reads MAHR 4 times and MALR 2 times, while it's enough to read each register only once. Use the local variables to achieve that, somewhat beautifying the code while at it... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
The code reading the MAHR/MALR registers in ravb_read_mac_address() is terribly ineffective -- it reads MAHR 4 times and MALR 2 times, while it's enough to read each register only once. Use the local variables to achieve that, somewhat beautifying the code while at it... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Michal Schmidt says: ==================== bnx2x: fewer error messages, simplification This removes one redundant error message in bnx2x and changes another one to WARN_ONCE. The third patch is a small simplification in ethtool stats. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
The 'flags' field in bnx2x_stats_arr[] serves only one purpose - to tell us if the statistic is a per-port stat and thus should not be shown for virtual functions. It's strange that the field can have three different values. A boolean will do just fine. Also remove IS_FUNC_STAT(). It was used only once and it's in fact just a negation of IS_PORT_STAT(). Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
It's supposed to be impossible for TPA to give us anything else than IPv4 or IPv6 here. But in case there is a way to reach this error by some strange received frames, we don't want to flood the kernel log. WARN_ONCE is better for this purpose. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
alloc_pages() already prints a warning when it fails. No need to emit another message. Certainly not at KERN_ERR level, because it is no big deal if this GFP_ATOMIC allocation fails occasionally. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 05 Dec, 2015 6 commits
-
-
Jiri Pirko authored
As suggested by Eric, these helpers should have const dev param. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Pieczko authored
A change in MCFW behaviour means that the net driver must update its record of the warm_boot_count by reading it from the ER_DZ_BIU_MC_SFT_STATUS register. On v4.6.x MCFW the global boot count was incremented when some functions needed to be reset to enable multicast chaining, so all functions saw the same value. In that case, the driver needed to increment its warm_boot_count when other functions were reset, to avoid noticing it later and then trying to reset itself to recover unnecessarily. With v4.7+ MCFW, the boot count in firmware doesn't change as that is unnecessary since the PFs that have been reset will each receive an MC reboot notification. In that case, the driver re-reads the unchanged value. Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
LABBE Corentin authored
The simple_strtol function is obsolete. This patch replace it by kstrtoint. This will simplify code, since some error case not handled by simple_strtol are handled by kstrtoint. Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andrew Lunn says: ==================== Allow BATMAN to use hdlc-eth interfaces BATMAN works over Ethernet like interfaces. hdlc-eth provides the need requirements. However, hdlc devices are often created as raw hdlc devices, which batman cannot use, and are then be transmuted into other types using sethdlc(1). Have the HDLC code emit NETDEV_*_TYPE_CHANGE events when the type changes, and have BATMAN react on these events. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
A network interface can change type. It may change from a type which batman does not support, e.g. hdlc, to one it does, e.g. hdlc-eth. When an interface changes type, it sends two notifications. Handle these notifications. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
An interface changing type may not have IPv6 addresses. Don't call the address configuration type change in this case. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-