- 04 Jun, 2014 31 commits
-
-
Vlad Yasevich authored
To make TLB mode work, the patch allows learning packets to be sent using mac addresses assigned to macvlan devices, also taking into account vlans that may be between the bond and macvlan device. To make RLB work, all we have to do is accept ARP packets for addresses added to the bond dev->uc list. Since RLB mode will take care to update the peers directly with correct mac addresses, learning packets for these addresses do not have to be sent to the switch. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Yasevich authored
Bonding and team drivers generate specific events during failover that trigger switch updates. When a macvlan device is configured on top of bonding, we want switches to learn about the macvlan devices as well. This patch adds a handler to the macvlan driver to propagate these events to all macvlan devices. We let the generic inetdev event handler do the work. This allows macvlan to operate correctly over an active-backup mode bond. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Yasevich authored
Bonding devices manage the unicast filters of the underlying interfaces, but do not turn on the IFF_UNICAST_FLT flag. Thus any time a unicast address is added to the bond, the bond is placed in promiscuous mode. Turn on IFF_UNICAST_FLT on the bond device so that the bond does not go into promiscuous mode needlessly. If an underlying device does not support unicast filtering, that device will automatically enter promiscuous mode already. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sasha Levin authored
This reverts commit 30f38d2f. fib_triestat is surrounded by a big lie: while it claims that it's a seq_file (fib_triestat_seq_open, fib_triestat_seq_show), it isn't: static const struct file_operations fib_triestat_fops = { .owner = THIS_MODULE, .open = fib_triestat_seq_open, .read = seq_read, .llseek = seq_lseek, .release = single_release_net, }; Yes, fib_triestat is just a regular file. A small detail (assuming CONFIG_NET_NS=y) is that while for seq_files you could do seq_file_net() to get the net ptr, doing so for a regular file would be wrong and would dereference an invalid pointer. The fib_triestat lie claimed a victim, and trying to show the file would be bad for the kernel. This patch just reverts the issue and fixes fib_triestat, which still needs a rewrite to either be a seq_file or stop claiming it is. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antonio Ospite authored
Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xiubo Li authored
When building with CONFIG_DEBUG_SECTION_MISMATCH enabled, the following warning occurs: LD drivers/net/built-in.o WARNING: drivers/net/built-in.o(.text+0xcd4c): Section mismatch in reference from the function gfar_probe() to the function .init.text:gfar_init_addr_hash_table() The function gfar_probe() references the function __init gfar_init_addr_hash_table(). This is often because gfar_probe lacks a __init annotation or the annotation of gfar_init_addr_hash_table is wrong. Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Wei Liu says: ==================== This is rebased version of Andrew's V8 patch series. The original cover letter: -------------------- xen-net{back, front}: Multiple transmit and receive queues This patch series implements multiple transmit and receive queues (i.e. multiple shared rings) for the xen virtual network interfaces. The series is split up as follows: - Patch 1 brings the 'grant_copy_op' array back into struct xenvif, in preparation for multi-queue support. See the patch itself for more details. - Patches 2 and 4 factor out the queue-specific data for netback and netfront respectively, and modify the rest of the code to use these as appropriate. - Patches 3 and 5 introduce new XenStore keys to negotiate and use multiple shared rings and event channels, and code to connect these as appropriate. - Patch 6 documents the XenStore keys required for the new feature in include/xen/interface/io/netif.h All other transmit and receive processing remains unchanged, i.e. there is a kthread per queue and a NAPI context per queue. The performance of these patches has been analysed in detail, with results available at: http://wiki.xenproject.org/wiki/Xen-netback_and_xen-netfront_multi-queue_performance_testing To summarise: * Using multiple queues allows a VM to transmit at line rate on a 10 Gbit/s NIC, compared with a maximum aggregate throughput of 6 Gbit/s with a single queue. * For intra-host VM--VM traffic, eight queues provide 171% of the throughput of a single queue; almost 12 Gbit/s instead of 6 Gbit/s. * There is a corresponding increase in total CPU usage, i.e. this is a scaling out over available resources, not an efficiency improvement. * Results depend on the availability of sufficient CPUs, as well as the distribution of interrupts and the distribution of TCP streams across the queues. Queue selection is currently achieved via an L4 hash on the packet (i.e. 
TCP src/dst port, IP src/dst address) and is not negotiated between the frontend and backend, since only one option exists. Future patches to support other frontends (particularly Windows) will need to add some capability to negotiate not only the hash algorithm selection, but also allow the frontend to specify some parameters to this. Note that queue selection is a decision by the transmitting system about which queue to use for a particular packet. In general, the algorithm may differ between the frontend and the backend with no adverse effects. Queue-specific XenStore entries for ring references and event channels are stored hierarchically, i.e. under .../queue-N/... where N varies from 0 to one less than the requested number of queues (inclusive). If only one queue is requested, it falls back to the flat structure where the ring references and event channels are written at the same level as other vif information. V8: - Squash the queue error handling code into patch 3. - Update the documentation (patch 6) according to comments on the equivalent patch to Xen. V7: - Rebase on latest net-next, which includes the netback grant mapping patch series from Zoltan Kiss - Reduce QUEUE_NAME_SIZE by 1 to avoid double-counting the trailing '\0' - Simplify the queue hashing by using (hash % num_queues) instead of multiply & shift. - Add ratelimited warning for invalid queue selection. - Fix error handling to correctly tear down already setup queues. - Use dev->real_num_tx_queues instead of separately maintaining a count of the number of queues. V6: - Use 'max_queues' as the module param. name for both netback and netfront. V5: - Fix bug in xenvif_free() that could lead to an attempt to transmit an skb after the queue structures had been freed. - Improve the XenStore protocol documentation in netif.h. - Fix IRQ_NAME_SIZE double-accounting for null terminator. - Move rx_gso_checksum_fixup stat into struct xenvif_stats (per-queue). 
- Don't initialise a local variable that is set in both branches (xspath). V4: - Add MODULE_PARM_DESC() for the multi-queue parameters for netback and netfront modules. - Move del_timer_sync() in netfront to after unregister_netdev, which restores the order in which these functions were called before applying these patches. V3: - Further indentation and style fixups. V2: - Rebase onto net-next. - Change queue->number to queue->id. - Add atomic operations around the small number of stats variables that are not queue-specific or per-cpu. - Fixup formatting and style issues. - XenStore protocol changes documented in netif.h. - Default max. number of queues to num_online_cpus(). - Check requested number of queues does not exceed maximum. -------------------- I rebased this on top of net-next. No functional change is introduced. The patch that needed some extra care was "xen-netback: Factor queue-specific data into queue struct" because it clashed with a fix introduced in net. A simple test of creating guest, iperf, then shutting down guest worked as expected. The last patch fixes a minor problem that queue name is not initialised in xen-netfront, resulting in names like "-tx" "-rx" in /proc/interrupt. Changes since v9 (no functional change introduced): * include commit summary in the commit message of first patch * fold David Vrabel's Reviewed-by into last patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Liu authored
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew J. Bennieston authored
Document the multi-queue feature in terms of XenStore keys to be written by the backend and by the frontend. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew J. Bennieston authored
Build on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Check XenStore for multi-queue support, and set up the rings and event channels accordingly. Write ring references and event channels to XenStore in a queue hierarchy if appropriate, or flat when using only one queue. Update the xennet_select_queue() function to choose the queue on which to transmit a packet based on the skb hash result. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
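The hierarchical versus flat key layout described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_queue_key_path()` is a hypothetical helper, not a function from xen-netfront, and the base path and key name used in the usage note are placeholders.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of xen-netfront): build the XenStore path
 * for a per-queue key. With a single queue the key sits at the same level
 * as the other vif information; with several queues it sits under
 * .../queue-N/, where N is the queue id.
 *
 * Returns the value of snprintf(), i.e. the length the path needs.
 */
static int demo_queue_key_path(char *buf, size_t len, const char *base,
                               unsigned int num_queues, unsigned int id,
                               const char *key)
{
    if (num_queues == 1)
        return snprintf(buf, len, "%s/%s", base, key);      /* flat layout */

    return snprintf(buf, len, "%s/queue-%u/%s", base, id, key); /* hierarchy */
}
```

For example, calling it with base "device/vif/0", four queues, id 2 and key "tx-ring-ref" (all illustrative values) fills the buffer with "device/vif/0/queue-2/tx-ring-ref".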
-
Andrew J. Bennieston authored
In preparation for multi-queue support in xen-netfront, move the queue-specific data from struct netfront_info to struct netfront_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_etherdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0, selecting the first (and only) queue. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew J. Bennieston authored
Builds on the refactoring of the previous patch to implement multiple queues between xen-netfront and xen-netback. Writes the maximum supported number of queues into XenStore, and reads the values written by the frontend to determine how many queues to use. Ring references and event channels are read from XenStore on a per-queue basis and rings are connected accordingly. Also adds code to handle the cleanup of any already initialised queues if the initialisation of a subsequent queue fails. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
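The cleanup of already initialised queues when a later queue fails follows a common unwind pattern. A minimal userspace sketch, assuming hypothetical `demo_*` names and a `fail_at` parameter that exists only to simulate a failure (none of this is taken from xen-netback):

```c
#include <stdlib.h>

/* Hypothetical stand-in for a per-queue structure. */
struct demo_queue {
    unsigned int id;
    void *ring;
};

/* Initialise one queue; fails artificially when id == fail_at. */
static int demo_queue_init(struct demo_queue *q, unsigned int id,
                           unsigned int fail_at)
{
    if (id == fail_at)
        return -1;
    q->id = id;
    q->ring = malloc(64);
    return q->ring ? 0 : -1;
}

/* Release whatever demo_queue_init() set up. */
static void demo_queue_teardown(struct demo_queue *q)
{
    free(q->ring);
    q->ring = NULL;
}

/*
 * Bring up n queues in order. If queue i fails, tear down queues
 * i-1 .. 0 in reverse order and report the failure, so the caller
 * never sees a half-initialised set.
 */
static int demo_connect(struct demo_queue *queues, unsigned int n,
                        unsigned int fail_at)
{
    for (unsigned int i = 0; i < n; i++) {
        if (demo_queue_init(&queues[i], i, fail_at) != 0) {
            while (i-- > 0)
                demo_queue_teardown(&queues[i]);
            return -1;
        }
    }
    return (int)n;
}
```

Tearing down in reverse order mirrors the order of setup, which is the usual choice when later queues might depend on earlier ones.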
-
Wei Liu authored
In preparation for multi-queue support in xen-netback, move the queue-specific data from struct xenvif into struct xenvif_queue, and update the rest of the code to use this. Also adds loops over queues where appropriate, even though only one is configured at this point, and uses alloc_netdev_mq() and the corresponding multi-queue netif wake/start/stop functions in preparation for multiple active queues. Finally, implements a trivial queue selection function suitable for ndo_select_queue, which simply returns 0 for a single queue and uses skb_get_hash() to compute the queue index otherwise. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
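The selection rule can be sketched in userspace C as follows. This assumes the flow hash is already available as a 32-bit value (in the driver it comes from skb_get_hash()) and uses the (hash % num_queues) reduction mentioned in the V7 notes of the cover letter; `demo_select_queue()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a flow hash to a queue index.
 * With one queue the answer is always 0; otherwise the hash is reduced
 * modulo the number of queues, so the result is always a valid index.
 */
static unsigned int demo_select_queue(uint32_t flow_hash,
                                      unsigned int num_queues)
{
    assert(num_queues > 0);   /* a device always has at least one queue */

    if (num_queues == 1)
        return 0;

    return flow_hash % num_queues;
}
```

Because the hash depends only on the flow, all packets of one TCP stream land on the same queue, which keeps them in order.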
-
Andrew J. Bennieston authored
This array was allocated separately in commit ac3d5ac2 ("xen-netback: fix guest-receive-side array sizes") due to it being very large, and a struct xenvif is allocated as the netdev_priv part of a struct net_device, i.e. via kmalloc() but falling back to vmalloc() if the initial alloc. fails. In preparation for the multi-queue patches, where this array becomes part of struct xenvif_queue and is always allocated through vzalloc(), move this back into the struct xenvif. Signed-off-by: Andrew J. Bennieston <andrew.bennieston@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
David S. Miller authored
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to e1000, igb and ixgbe. Emil provides his version 2 fix for the detection of SFP+ capable interfaces. In cases where the driver is loaded while there are no SFP+ modules in the cage, the interface was not being detected as SFP capable. Resolve the issue by identifying interfaces with no PHY type set as SFP capable, which allows the driver to detect the SFP module when the interface is brought up. In this version 2 of the patch, the 82599 specific check was removed since we have 82598 devices that are SFP capable. Jacob removes the inclusion of the export header in the ixgbe PTP core, since it is not needed. He also renames igb_ptp_enable() to igb_ptp_feature_enable() to better reflect the function's actual purpose. Todd fixes the ethtool loopback test for i354 backplane devices: since we do not know what PHY is to be used for the devices, use MAC loopback for ethtool tests. Todd also sets the packet buffer size register defaults for i210 devices. Yongjian Xu removes the check for skb->len being negative or zero since there is never a case where it would be zero or negative for e1000. Manuel Schölling updates e1000 to use the time_after() helper function. v2: Fix indentation on wrapped line in patch 3 of the series based on feedback from David Miller ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manuel Schölling authored
To be future-proof and for better readability the time comparisons are modified to use time_after() instead of plain, error-prone math. Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
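Why plain comparisons are error-prone can be shown with a userspace sketch. The helper below follows the idea behind the kernel's time_after(): subtract as unsigned, then interpret the difference as signed. The names `ticks_t`, `ticks_after()` and `ticks_after_naive()` are hypothetical and stand in for jiffies and the kernel macro.

```c
#include <stdbool.h>

/* Userspace stand-in for the type of the kernel's jiffies counter. */
typedef unsigned long ticks_t;

/*
 * Wrap-safe "a is later than b". The unsigned subtraction wraps modulo
 * 2^N; reading the result as a signed value gives the right answer as
 * long as the two timestamps are less than half the counter range apart.
 * (The unsigned-to-signed conversion is implementation-defined in C, but
 * behaves as two's complement on the platforms Linux supports.)
 */
static bool ticks_after(ticks_t a, ticks_t b)
{
    return (long)(b - a) < 0;
}

/* Naive comparison, shown only to illustrate the failure at wraparound. */
static bool ticks_after_naive(ticks_t a, ticks_t b)
{
    return a > b;
}
```

If a timestamp is taken 10 ticks before the counter wraps and the deadline is 20 ticks later, the deadline has the small value 10: the naive comparison says it is not later, while the wrap-safe one says it is.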
-
Yongjian Xu authored
There is no case where skb->len would be 0 or 'negative'. Remove the check. Signed-off-by: Yongjian Xu <xuyongjiande@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Todd Fujinaka authored
Set the defaults on probe for the packet buffer size registers for the i210. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Todd Fujinaka authored
We can't know what PHY is to be used for i354 backplane, so use MAC loopback for ethtool tests. Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The name igb_ptp_enable is not synonymous with the purpose of this function, so rename it to better explain its purpose. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
We don't need this header file, so we shouldn't be including it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Emil Tantilov authored
In cases where the driver is loaded while there are no SFP+ modules in the cage, the interface was not being detected as SFP capable. To account for this, the driver called identify_sfp in ixgbe_get_settings to make sure the data is correct. However, when there is no SFP+ module in the cage the driver waits for the I2C reads to time out, which can take more than a second and will cause issues with tools (like net-snmp) that may poll for that information. This patch resolves the issue by identifying interfaces with no PHY type set as SFP capable, which allows the driver to detect the SFP module when the interface is brought up. As a result of this we can also remove the identify_sfp call from ixgbe_get_settings. v2: remove the 82599 specific check since we have 82598 devices that are SFP capable. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller authored
Conflicts: include/net/inetpeer.h net/ipv6/output_core.c Changes in net were fixing bugs in code removed in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
Commit 4a55530f (net: sh_eth: modify the definitions of register) managed to leave out the E-DMAC register entries in sh_eth_offset_fast_sh3_sh2[], thus totally breaking SH7619/771x support. Add the missing entries using the data from before that commit. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ben Dooks authored
The current behaviour of the sh_eth driver is not to use the RNC bit for the receive ring. This means that every packet received not only generates an IRQ but also stops the receive ring DMA until the driver re-enables it after unloading the packet. As a result, a number of the following errors are generated because the receive packet FIFO overflows when there is nowhere to put packets: net eth0: Receive FIFO Overflow Since feedback from Yoshihiro Shimoda shows that every supported LSI for this driver should have the bit enabled, it seems the best way is to remove the RMCR default value from the per-system data and just write it when initialising the RMCR value. This is discussed in the message (http://www.spinics.net/lists/netdev/msg284912.html). I have tested the RMCR_RNC configuration with NFS root filesystem and the driver has not failed yet. There are further test reports from Sergei Shtylyov and others for both the R8A7790 and R8A7791. There is also feedback from Cao Minh Hiep[1] which reports the same issue in (http://comments.gmane.org/gmane.linux.network/316285) showing this fixes issues with losing UDP datagrams under iperf. Tested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Acked-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Acked-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
WANG Cong authored
When we jump to free_pcpu on failure in alloc_netdev_mqs() rx and tx queues are not yet allocated, so no need to free them. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
It is possible that ->newlink() fails before registering the device; in this case we should just free it, and it is safe to call free_netdev(). Fixes: commit 0e0eee24 (net: correct error path in rtnl_newlink()) Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
wenxiong@linux.vnet.ibm.com authored
A rmb() is required to ensure that the CQE is not read before it is written by the adapter DMA. PCI ordering rules will make sure the other fields are written before the marker at the end of struct eth_fast_path_rx_cqe but without rmb() a weakly ordered processor can process stale data. Without the barrier we have observed various crashes including bnx2x_tpa_start being called on queues not stopped (resulting in message start of bin not in stop) and NULL pointer exceptions from bnx2x_rx_int. Signed-off-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
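The ordering problem can be illustrated with a userspace analogue built on C11 atomics. This is a sketch under stated assumptions, not the bnx2x code: `struct demo_cqe` and both helpers are hypothetical, the release store stands in for the ordering that PCI gives the adapter's writes, and the acquire fence plays the role of rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion entry: a payload followed by a marker. */
struct demo_cqe {
    uint32_t payload;
    _Atomic uint32_t marker;
};

/*
 * Writer side (the adapter in the real case): fill in the payload first,
 * then publish the marker. The release store keeps the payload write
 * ordered before the marker write.
 */
static void demo_device_write(struct demo_cqe *cqe, uint32_t payload,
                              uint32_t marker)
{
    cqe->payload = payload;
    atomic_store_explicit(&cqe->marker, marker, memory_order_release);
}

/*
 * Reader side (the driver in the real case): check the marker, then issue
 * a read barrier before touching the payload. Without the fence, a weakly
 * ordered CPU may read the payload early and see stale data.
 */
static bool demo_cpu_read(struct demo_cqe *cqe, uint32_t expected_marker,
                          uint32_t *out)
{
    if (atomic_load_explicit(&cqe->marker, memory_order_relaxed) !=
        expected_marker)
        return false;                           /* entry not written yet */

    atomic_thread_fence(memory_order_acquire);  /* analogue of rmb() */
    *out = cqe->payload;
    return true;
}
```

The barrier sits between the marker check and the payload read, since that is the pair of loads that must not be reordered.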
-
wenxiong@linux.vnet.ibm.com authored
When injecting an EEH error into a bnx2x adapter, the adapter could not be recovered, which caused recursive EEH errors. The patch fixes the issue. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Balakumaran Kannan authored
As the smsc driver supports carrier detection, it should unset the NOCARRIER flag only after carrier state determination. By default that flag is off, so the driver should set it before starting auto-negotiation. Signed-off-by: Balakumaran <Balakumaran.Kannan@ap.sony.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
The uuid structure could be managed as a const in several places. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 03 Jun, 2014 9 commits
-
-
Benoit Taine authored
This issue was reported by coccicheck using the semantic patch at scripts/coccinelle/api/resource_size.cocci Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Kubecek authored
The xfrm_user module registers its pernet init/exit after xfrm itself so that its net exit function xfrm_user_net_exit() is executed before xfrm_net_exit() which calls xfrm_state_fini() to clean up the SAs (xfrm states). This opens a window between zeroing the net->xfrm.nlsk pointer and deleting all xfrm_state instances which may access it (via the timer). If an xfrm state expires in this window, xfrm_exp_state_notify() will pass a null pointer as the socket to nlmsg_multicast(). As the notifications are called inside an rcu_read_lock() block, it is sufficient to retrieve the nlsk socket with rcu_dereference() and check it for null. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Zhangfei Gao says: ==================== add hix5hd2 mac driver v4: Update indent Use usleep_range instead of udelay v3: Remove .ndo_get_stats as mentioned by Tobias Add __le32 conversion pointed out by Mark v2: Update binding according to Sergei's comments Update descriptor as per Arnd's suggestion ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhangfei Gao authored
Add support for the hix5hd2 XGMAC 1Gb ethernet device. The controller requires two queues for tx and two queues for rx. The controller fetches buffers from the free queue and then pushes them to the used queue. The driver should prepare the free queue and free buffers from the used queue. Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhangfei Gao authored
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Michael Chan says: ==================== cnic fixes Fix 2 sleeping function from invalid context bugs and 1 missing iscsi netlink message bug. v2: Fixed typo in rcu_dereference_protected() and tested with CONFIG_PROVE_RCU ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
The iSCSI netlink message needs to be sent before the ulp_ops is cleared as it is sent through a function pointer in the ulp_ops. This bug causes iscsid to not get the message when the bnx2i driver is unloaded. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
We are allocating memory with GFP_KERNEL under spinlock. Since this is the only call manipulating the cnic_udev_list and it is always under rtnl_lock, cnic_dev_lock can be safely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Because the called function, such as bnx2fc_indicate_netevent(), can sleep, we cannot take rcu_lock(). To prevent the rcu protected ulp_ops from going away, we use the cnic_lock mutex and set the ULP_F_CALL_PENDING flag. The code already waits for ULP_F_CALL_PENDING flag to clear in cnic_unregister_device(). Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-