- 10 Dec, 2013 10 commits
-
-
Zhi Yong Wu authored
By checking related codes, it is impossible that ret > len or total_len, so we should remove some useless coeds in both above functions. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Durrant authored
The way that flow control works without this patch is that, in start_xmit() the code uses xenvif_count_skb_slots() to predict how many slots xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek' counter which it then uses to determine if the shared ring has that amount of space available by checking whether 'req_prod' has passed that value. If the ring doesn't have space the tx queue is stopped. xenvif_gop_skb() will then consume slots and update 'req_cons' and issue responses, updating 'rsp_prod' as it goes. The frontend will consume those responses and post new requests, by updating req_prod. So, req_prod chases req_cons which chases rsp_prod, and can never exceed that value. Thus if xenvif_count_skb_slots() ever returns a number of slots greater than xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot possibly achieve (since it's limited by the 'real' req_cons) and, if this happens enough times, req_cons_peek gets more than a ring size ahead of req_cons and the tx queue then remains stopped forever waiting for an unachievable amount of space to become available in the ring. Having two routines trying to calculate the same value is always going to be fragile, so this patch does away with that. All we essentially need to do is make sure that we have 'enough stuff' on our internal queue without letting it build up uncontrollably. So start_xmit() makes a cheap optimistic check of how much space is needed for an skb and only turns the queue off if that is unachievable. net_rx_action() is the place where we could do with an accurate predicition but, since that has proven tricky to calculate, a cheap worse-case (but not too bad) estimate is all we really need since the only thing we *must* prevent is xenvif_gop_skb() consuming more slots than are available. Without this patch I can trivially stall netback permanently by just doing a large guest to guest file copy between two Windows Server 2008R2 VMs on a single host. Patch tested with frontends in: - Windows Server 2008R2 - CentOS 6.0 - Debian Squeeze - Debian Wheezy - SLES11 Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Annie Li <annie.li@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Erik Hugne authored
struct 'tipc_bearer' is a generic representation of the underlying media type, and exists in a one-to-one relationship to each interface TIPC is using. The struct contains a 'blocked' flag that mirrors the operational and execution state of the represented interface, and is updated through notification calls from the latter. The users of tipc_bearer are checking this flag before each attempt to send a packet via the interface. This state mirroring serves no purpose in the current code base. TIPC links will not discover a media failure any faster through this mechanism, and in reality the flag only adds overhead at packet sending and reception. Furthermore, the fact that the flag needs to be protected by a spinlock aggregated into tipc_bearer has turned out to cause a serious and completely unnecessary deadlock problem. CPU0 CPU1 ---- ---- Time 0: bearer_disable() link_timeout() Time 1: spin_lock_bh(&b_ptr->lock) tipc_link_push_queue() Time 2: tipc_link_delete() tipc_bearer_blocked(b_ptr) Time 3: k_cancel_timer(&req->timer) spin_lock_bh(&b_ptr->lock) Time 4: del_timer_sync(&req->timer) I.e., del_timer_sync() on CPU0 never returns, because the timer handler on CPU1 is waiting for the bearer lock. We eliminate the 'blocked' flag from struct tipc_bearer, along with all tests on this flag. This not only resolves the deadlock, but also simplifies and speeds up the data path execution of TIPC. It also fits well into our ongoing effort to make the locking policy simpler and more manageable. An effect of this change is that we can get rid of functions such as tipc_bearer_blocked(), tipc_continue() and tipc_block_bearer(). We replace the latter with a new function, tipc_reset_bearer(), which resets all links associated to the bearer immediately after an interface goes down. A user might notice one slight change in link behaviour after this change. When an interface goes down, (e.g. through a NETDEV_DOWN event) all attached links will be reset immediately, instead of leaving it to each link to detect the failure through a timer-driven mechanism. We consider this an improvement, and see no obvious risks with the new behavior. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Paul Gortmaker <Paul.Gortmaker@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
wangweidong authored
use pr_<level> instead of printk(LEVEL) Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
This patch introduces a PACKET_QDISC_BYPASS socket option, that allows for using a similar xmit() function as in pktgen instead of taking the dev_queue_xmit() path. This can be very useful when PF_PACKET applications are required to be used in a similar scenario as pktgen, but with full, flexible packet payload that needs to be provided, for example. On default, nothing changes in behaviour for normal PF_PACKET TX users, so everything stays as is for applications. New users, however, can now set PACKET_QDISC_BYPASS if needed to prevent own packets from i) reentering packet_rcv() and ii) to directly push the frame to the driver. In doing so we can increase pps (here 64 byte packets) for PF_PACKET a bit: # CPUs -- QDISC_BYPASS -- qdisc path -- qdisc path[**] 1 CPU == 1,509,628 pps -- 1,208,708 -- 1,247,436 2 CPUs == 3,198,659 pps -- 2,536,012 -- 1,605,779 3 CPUs == 4,787,992 pps -- 3,788,740 -- 1,735,610 4 CPUs == 6,173,956 pps -- 4,907,799 -- 1,909,114 5 CPUs == 7,495,676 pps -- 5,956,499 -- 2,014,422 6 CPUs == 9,001,496 pps -- 7,145,064 -- 2,155,261 7 CPUs == 10,229,776 pps -- 8,190,596 -- 2,220,619 8 CPUs == 11,040,732 pps -- 9,188,544 -- 2,241,879 9 CPUs == 12,009,076 pps -- 10,275,936 -- 2,068,447 10 CPUs == 11,380,052 pps -- 11,265,337 -- 1,578,689 11 CPUs == 11,672,676 pps -- 11,845,344 -- 1,297,412 [...] 20 CPUs == 11,363,192 pps -- 11,014,933 -- 1,245,081 [**]: qdisc path with packet_rcv(), how probably most people seem to use it (hopefully not anymore if not needed) The test was done using a modified trafgen, sending a simple static 64 bytes packet, on all CPUs. The trick in the fast "qdisc path" case, is to avoid reentering packet_rcv() by setting the RAW socket protocol to zero, like: socket(PF_PACKET, SOCK_RAW, 0); Tradeoffs are documented as well in this patch, clearly, if queues are busy, we will drop more packets, tc disciplines are ignored, and these packets are not visible to taps anymore. For a pktgen like scenario, we argue that this is acceptable. The pointer to the xmit function has been placed in packet socket structure hole between cached_dev and prot_hook that is hot anyway as we're working on cached_dev in each send path. Done in joint work together with Jesper Dangaard Brouer. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
As we need it elsewhere, move the inline helper function of skb_needs_linearize() over to skbuff.h include file. While at it, also convert the return to 'bool' instead of 'int' and add a proper kernel doc. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Merge 'net' into 'net-next' to get the AF_PACKET bug fix that Daniel's direct transmit changes depend upon. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Commit e40526cb introduced a cached dev pointer, that gets hooked into register_prot_hook(), __unregister_prot_hook() to update the device used for the send path. We need to fix this up, as otherwise this will not work with sockets created with protocol = 0, plus with sll_protocol = 0 passed via sockaddr_ll when doing the bind. So instead, assign the pointer directly. The compiler can inline these helper functions automagically. While at it, also assume the cached dev fast-path as likely(), and document this variant of socket creation as it seems it is not widely used (seems not even the author of TX_RING was aware of that in his reference example [1]). Tested with reproducer from e40526cb. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap#Example Fixes: e40526cb ("packet: fix use after free race in send path when dev is released") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Salam Noureddine <noureddine@aristanetworks.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Commit 6da7c8fc ("qdisc: allow setting default queuing discipline") added the ability to change default qdisc from pfifo_fast to say fq But as most modern ethernet devices are multiqueue, we cant really see all the statistics from "tc -s qdisc show", as the default root qdisc is mq. This patch adds the calls to qdisc_list_add() to mq and mqprio Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-nextDavid S. Miller authored
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to i40e only. Jacob provides a i40e patch to get 1588 work correctly by separating TSYNVALID and TSYNINDX fields in the receive descriptor. Jesse provides several i40e patches, first to correct the checking of the multi-bit state. The hash is reported correctly in the RSS field if and only if the filter status is 3. Other values of the filter status mean different things and we should not depend on a bitwise result. Then provides a patch to enable a couple of workarounds based on revision ID that allow the driver to work more fully on early hardware. Shannon provides several i40e patches as well. First sets the media type in the hardware structure based on the external connection type. Then provides a patch to only setup the rings that will be used. Lastly provides a fix where the TESTING state was still set when exiting the ethtool diagnostics. Kevin Scott provides one i40e patch to add a new flag to the i40e_add_veb() which allows the driver to request the hardware to filter on layer 2 parameters. Anjali provides four i40e patches, first refactors the reset code in order to re-size queues and vectors while the interface is still up. Then provides a patch to enable all PCTYPEs expect FCoE for RSS. Adds a message to notify the user of how many VFs are initialized on each port. Lastly adds a new variable to track the number of PF instances, this is a global counter on purpose so that each PF loaded has a unique ID. Catherine bumps the driver version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Dec, 2013 12 commits
-
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
The driver core clears the driver data to NULL after device_release or on probe failure. Thus, it is not needed to manually clear the device driver data to NULL. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Dec, 2013 14 commits
-
-
Anjali Singhai Jain authored
Track the number of physical functions (PFs) found, this is a global counter on purpose so that each pf loaded has a unique ID. Change-Id: I74d618520afbce4a774d0235449e3b5f97ff6d4a Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Anjali Singhai Jain authored
Print a message to notify the user of how many VFs are initialized on each port. Change-Id: I29ac2acc478ee4e588fd6ffcc35133d4c6607ca9 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Shannon Nelson authored
Put the print and reset statements in the actual test functions to make them more self-contained, and only run the reset for tests that need it. Change-Id: Ic70f49b11bf8bae82e59d8fd25b46215c90c4510 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Shannon Nelson authored
Fix a bug where the TESTING state was still set when exiting the ethtool diagnostics. Change-Id: Ic47950d2e86a67167d1d282256d477cecd86d820 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Shannon Nelson authored
The VSI may be allocated more queues (alloc_queue_pairs) than actually are to be used (num_queue_pairs), so only allocate rings for the queues to be used. The numbers will likely be the same for most VSIs, but can be different based on how TCs are assigned and enabled. Change-Id: Ie40f7ad0affbc4b45d6f049bcf02ee2fa24edc74 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Anjali Singhai Jain authored
RSS can steer packets based on recognition of all sorts of different headers. Enable some more of them. Change-Id: I2264dedae66fb0bceca6fb6e772e050e3ca8efc8 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Anjali Singhai Jain authored
In order to re-size queues and vectors while the interface is still up, we need to be able to call functions to free and re-allocate without bringing down the VSI. We also need to reset the existing setup, update the configuration and then rebuild again. This requires us to have the reset flow broken down into two parts. Change-Id: I374dd25aabf769decda69b676491c7b7730a4635 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Catherine Sullivan authored
Update the driver version to 0.3.12-k Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jeff Kirsher authored
Whitespace fixes Change-Id: I95f4d02e4a2a92d6b6fca3ae2b7865c4b916a9bb Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com>
-
Jesse Brandeburg authored
Enable a couple of workarounds based on revision ID that allow the driver to work more fully on early hardware. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Kevin Scott authored
Add a new flag to the add VEB command which allows the driver to request the hardware to filter on L2 parameters. This is an implementation of the driver access to a new firmware feature. Change-Id: Id61d3cad4125bdc68b8fd9d555c448a10c344b6b Signed-off-by: Kevin Scott <kevin.c.scott@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jesse Brandeburg authored
Set the media type in the hardware structure, based on the external connection type. Add Direct Attach to the type of media reported by ethtool. Change-Id: I4ad2f5bf882766d6e737fac4477abf049491b3b3 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jesse Brandeburg authored
The hash is reported correctly in the rss field if and only if the filter status is 3. Other values of filter status mean different things and we shouldn't depend on a bitwise result. The issue was that a & b --> returns true for b={1,2,3} the fix is a & b == b Also refactor this function to use constant operations because we are in fast path. Change-Id: I4e29be87439c1cf8b60bc31bea29dff89596c013 Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
In order to get 1588 to work correctly the defines need a bit of a tweak. Change-Id: Ie50ce2a18e1593441f1560411e5a4f51c6d48aaa Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
- 06 Dec, 2013 4 commits
-
-
Joe Perches authored
Add a new check for CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to reduce the number of or's used in the ether_addr_equal comparison to very slightly improve function performance. Simplify the ether_addr_equal_64bits implementation. Integrate and remove the zap_last_2bytes helper as it's now used only once. Remove the now unused compare_ether_addr function. Update the unaligned-memory-access documentation to remove the compare_ether_addr description and show how unaligned accesses could occur with ether_addr_equal. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
wangweidong authored
use pr_<level> instead of printk(LEVEL) Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Creating an address with this flag set will result in kernel taking care of temporary addresses in the same way as if the address was created by kernel itself (after RA receive). This allows userspace applications implementing the autoconfiguration (NetworkManager for example) to implement ipv6 addresses privacy. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
There is no more space in u8 ifa_flags. So do what davem suffested and add another netlink attr called IFA_FLAGS for carry more flags. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Thomas Haller <thaller@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-