- 20 Aug, 2020 7 commits
-
-
Eric Dumazet authored
There are significant gains using huge pages when available, as shown in [1]. This patch adds mmap_large_buffer() and uses it on the client side (tx path of this reference tool). A following patch will use the feature on the server side. [1] https://patchwork.ozlabs.org/project/netdev/patch/20200820154359.1806305-1-edumazet@google.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Arjun Roy <arjunroy@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
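A minimal userspace sketch of the idea behind mmap_large_buffer(), assuming a 2 MiB huge page size (the `HUGE_PAGE_SIZE` constant is an assumption for illustration, and the body is not copied from the patch): try a MAP_HUGETLB mapping first and fall back to ordinary pages when no huge pages are available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed huge page size (2 MiB is common on x86-64); a real tool
 * would query the system instead of hard-coding it. */
#define HUGE_PAGE_SIZE (2UL * 1024 * 1024)

/* Map at least `need` bytes, preferring huge pages.
 * Returns MAP_FAILED on failure; *allocated receives the mapped size. */
static void *mmap_large_buffer(size_t need, size_t *allocated)
{
	/* Round the request up to a whole number of huge pages. */
	size_t sz = (need + HUGE_PAGE_SIZE - 1) & ~(HUGE_PAGE_SIZE - 1);
	void *buf = MAP_FAILED;

	assert(allocated != NULL);
#ifdef MAP_HUGETLB
	buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
	if (buf == MAP_FAILED) {
		/* No huge pages available: fall back to normal pages. */
		sz = need;
		buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	}
	*allocated = (buf == MAP_FAILED) ? 0 : sz;
	return buf;
}
```

The caller is expected to pass the returned size, not the requested one, to munmap(), since a huge-page mapping may be larger than requested.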
-
Eric Dumazet authored
When the TCP_ZEROCOPY_RECEIVE operation was added, I made the mistake of automatically un-mapping prior content before mapping new pages. This has the unfortunate effect of adding potentially long MMU operations (like TLB flushes) while the socket lock is held. Using madvise(MADV_DONTNEED) right after pages have been used has two benefits: 1) This releases pages sooner, allowing pages to be recycled if they were part of a page pool in a NIC driver. 2) No more long unmap operations that prevent immediate processing of incoming packets. The cost of the added system call is small enough. Arjun will submit a kernel patch allowing applications to opt out of the unmap attempt in tcp_zerocopy_receive(). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Arjun Roy <arjunroy@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
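The madvise(MADV_DONTNEED) step can be sketched in userspace as follows. The helper name `release_consumed_pages` is hypothetical, and the sketch operates on a plain anonymous mapping rather than on a range filled by TCP_ZEROCOPY_RECEIVE, so it only illustrates the system call, not the receive path itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel the pages backing
 * [addr, addr + len) are no longer needed, so they can be released
 * right after the application has consumed their content.
 * addr must be page-aligned. Returns 0 on success, -1 on error. */
static int release_consumed_pages(void *addr, size_t len)
{
	assert(addr != NULL);
	return madvise(addr, len, MADV_DONTNEED);
}
```

On Linux, a private anonymous range reads back as zero-filled pages after this call, which is a convenient way to check that the old pages really were dropped.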
-
Eric Dumazet authored
Currently, tcp sendmsg(MSG_ZEROCOPY) is building skbs with order-0 fragments. Compared to standard sendmsg(), these skbs usually contain up to 16 fragments on arches with 4KB page sizes, instead of two. This adds considerable costs on various ndo_start_xmit() handlers, especially when an IOMMU is in the picture. As high performance applications are often using huge pages, we can try to combine adjacent pages belonging to the same compound page. Tested on AMD Rome platform, with IOMMU, nominal single TCP flow speed is roughly doubled (~55Gbit -> ~100Gbit), when the user application is using hugepages. For reference, nominal single TCP flow speed on this platform without MSG_ZEROCOPY is ~65Gbit. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
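The merging of adjacent pages can be modelled in userspace roughly like this. It is not the kernel code: `struct frag`, `coalesce_pages` and the 4 KiB / 2 MiB constants are assumptions for illustration, and page addresses stand in for the kernel's page structures. Pages are merged into one fragment only when they are contiguous and fall inside the same huge page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u                 /* assumed base page size */
#define HUGE_SZ (2u * 1024 * 1024)    /* assumed compound (huge) page size */

struct frag {
	uintptr_t addr;  /* start address of the fragment */
	size_t len;      /* length in bytes */
};

/* Build fragments from a list of page addresses, extending the last
 * fragment when the next page is adjacent to it and lies in the same
 * huge page. Returns the number of fragments written (<= max_frags). */
static size_t coalesce_pages(const uintptr_t *pages, size_t n,
			     struct frag *out, size_t max_frags)
{
	size_t nfrags = 0;

	assert(pages != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (nfrags > 0) {
			struct frag *last = &out[nfrags - 1];
			int adjacent = last->addr + last->len == pages[i];
			int same_huge = last->addr / HUGE_SZ == pages[i] / HUGE_SZ;

			if (adjacent && same_huge) {
				last->len += PAGE_SZ;
				continue;
			}
		}
		if (nfrags == max_frags)
			break;  /* no room for another fragment */
		out[nfrags].addr = pages[i];
		out[nfrags].len = PAGE_SZ;
		nfrags++;
	}
	return nfrags;
}
```

With 16 contiguous 4 KiB pages inside one huge page, this model produces a single 64 KiB fragment instead of 16, which is the reduction in per-fragment work the commit message describes.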
-
David S. Miller authored
Simon Horman says: ==================== nfp: flower: add support for QinQ matching Louis says: Add new feature to the Netronome flower driver to enable QinQ offload. This needed a bit of gymnastics in order to not break compatibility with older firmware as the flow key sent to the firmware had to be updated in order to make space for the extra field. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Louis Peens authored
When both the driver and the firmware support QinQ, the flow key structure that is sent to the firmware is updated, as the old method of matching on VLAN did not allow for space to add another VLAN tag. VLAN flows can now also match on the tpid field, not constrained to just 0x8100 as before. Signed-off-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Louis Peens authored
Add a check to make sure the total length of the flow key sent to the firmware stays within the supported limit. Signed-off-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rahul Kundu authored
IPv6 filters can occupy up to 4 slots and will exhaust HPFILTER region much sooner. So, continue searching for free slots in the HASH or NORMAL filter regions, as long as the rule's priority does not conflict with existing rules in those regions. Signed-off-by: Rahul Kundu <rahul.kundu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Aug, 2020 19 commits
-
-
David S. Miller authored
Kurt Kanzenbach says: ==================== ptp: Add generic helper functions in order to reduce code duplication (and cut'n'paste errors) in the ptp code of DSA, Ethernet and Phy drivers, create helper functions and move them to ptp_classify. This way all drivers can share the same implementation. This is version four and contains bugfixes. Implemented as discussed [1] [2] [3] [4]. Previous versions can be found here: * https://lkml.kernel.org/netdev/20200723074946.14253-1-kurt@linutronix.de/ * https://lkml.kernel.org/netdev/20200727090601.6500-1-kurt@linutronix.de/ * https://lkml.kernel.org/netdev/20200730080048.32553-1-kurt@linutronix.de/ Thanks, Kurt Changes since v3: * Coding style issues (Richard Cochran, Petr Machata) * Add better documentation (Grygorii Strashko) * Fix cpts code (Grygorii Strashko) * Use ntohs() for TI code (Grygorii Strashko) * Add tags Changes since v2: * Make ptp_parse_header() work in all scenarios (Russell King) * Fix msgtype offset for ptp v1 packets Changes since v1: * Fix Kconfig (Richard Cochran) * Include more drivers (Richard Cochran) [1] - https://lkml.kernel.org/netdev/20200713140112.GB27934@hoboy/ [2] - https://lkml.kernel.org/netdev/20200720142146.GB16001@hoboy/ [3] - https://lkml.kernel.org/netdev/20200723074946.14253-1-kurt@linutronix.de/ [4] - https://lkml.kernel.org/netdev/20200729100257.GX1551@shell.armlinux.org.uk/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
The offset for the control field is not needed anymore. Remove it. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Suggested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-and-tested-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
In order to reduce code duplication between ptp drivers, generic helper functions were introduced. Use them. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kurt Kanzenbach authored
The message type is located at different offsets within the ptp header depending on the ptp version (v1 or v2). Therefore, drivers which also deal with ptp v1 have some code for it. Extract this into a helper function for drivers to be used. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
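A rough userspace model of the version-dependent lookup, working on raw header bytes. The function name and the constants are illustrative assumptions, not the kernel helper's interface; the offsets reflect the usual IEEE 1588 layout, where v2 carries the message type in the low nibble of the first byte and v1 drivers read the control field at byte offset 32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTP_VERSION_1 1
#define PTP_VERSION_2 2
#define OFF_V1_CONTROL 32   /* assumed offset of the control field */

/* Return the message type from a raw PTP header, or -1 if the buffer
 * is too short or the version is unknown. */
static int ptp_msgtype_from_bytes(const uint8_t *hdr, size_t len, int version)
{
	assert(hdr != NULL);
	switch (version) {
	case PTP_VERSION_1:
		/* v1: the message type is taken from the control field. */
		if (len <= OFF_V1_CONTROL)
			return -1;
		return hdr[OFF_V1_CONTROL];
	case PTP_VERSION_2:
		/* v2: low nibble of the first header byte. */
		if (len < 1)
			return -1;
		return hdr[0] & 0x0f;
	default:
		return -1;
	}
}
```

Keeping this branch in one shared function is what lets individual drivers drop their own copies of the v1 special case.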
-
Kurt Kanzenbach authored
Reason: A lot of the ptp drivers - which implement hardware time stamping - need specific fields such as the sequence id from the ptp v2 header. Currently all drivers implement that themselves. Introduce a generic function to retrieve a pointer to the start of the ptp v2 header. Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cristobal Forno authored
Currently the driver reads RX and TX subCRQ handle array directly from a DMA-mapped buffer address when it needs to make a H_SEND_SUBCRQ hcall. This patch stores that information in the ibmvnic_sub_crq_queue structure instead of reading from the buffer received at login. The overall goal of this patch is to parse relevant information from the login response buffer and store it in the driver's private data structures so that we don't need to read directly from the buffer and can then free up that memory. Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Heiner Kallweit says: ==================== r8169: use napi_complete_done return value Consider the return value of napi_complete_done(), this allows users to use the gro_flush_timeout sysfs attribute as an alternative to classic interrupt coalescing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
After making use of the gro_flush_timeout attribute I once got a tx timeout due to an interrupt that wasn't handled. Seems using irq_enabled can be racy, and it's not needed any longer anyway, so remove it. I've never seen a report about such a race before, therefore treat the change as an improvement. There's just one small drawback: If a legacy PCI interrupt is used, and if this interrupt is shared with a device with high interrupt rate, then we may handle interrupts even if NAPI disabled them, and we may see a certain performance decrease under high network load. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Consider the return value of napi_complete_done(), this allows users to use the gro_flush_timeout sysfs attribute as an alternative to classic interrupt coalescing. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
James Chapman authored
Kernel documentation of L2TP has not been kept up to date and lacks coverage of some L2TP APIs. While addressing this, refactor to improve readability, separating the parts which focus on user APIs and internal implementation into sections. Changes in v2: - fix checkpatch warnings about trailing whitespace and long lines Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Miaohe Lin authored
We've been warning about SO_BSDCOMPAT usage for many years. We may remove this code completely now. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: dsa: loop: Expose VLAN table through devlink Changes in v2: - set the DSA configure_vlan_while_not_filtering boolean - return the actual occupancy ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
We return the VLAN table size through devlink as a simple parameter; we do not support altering it at runtime:
    devlink resource show mdio_bus/fixed-0:1f
    mdio_bus/fixed-0:1f:
      name VTU size 4096 occ 0 unit entry dpipe_tables none
and after configuring a bridge with VLAN filtering:
    devlink resource show mdio_bus/fixed-0:1f
    mdio_bus/fixed-0:1f:
      name VTU size 4096 occ 1 unit entry dpipe_tables none
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Since this is a mock-up driver with no real data path for now, but we will have one at some point, enable VLANs while not filtering. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 18 Aug, 2020 9 commits
-
-
Miaohe Lin authored
The frags of skb_shared_info of the data is assigned in following loop. It is meaningless to do a memcpy of frags here. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Dewar authored
Remove a couple of unused #defines in cs89x0.h. Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Miaohe Lin authored
Convert the uses of fallthrough comments to fallthrough macro. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Johannes Berg says: ==================== netlink: allow NLA_BINARY length range validation In quite a few places (perhaps particularly in wireless) we need to validate an NLA_BINARY attribute with both a minimum and a maximum length. Currently, we can do either of the two, but not both, given that we have NLA_MIN_LEN (minimum length) and NLA_BINARY (maximum). Extend the range mechanisms that we use for integer validation to apply to NLA_BINARY as well. After converting everything to use NLA_POLICY_MIN_LEN() we can thus get rid of the NLA_MIN_LEN type since that's now a special case of NLA_BINARY with a minimum length validation. Similarly, NLA_EXACT_LEN can be specified using NLA_POLICY_EXACT_LEN() and also maps to the new NLA_BINARY validation (min == max == desired length). Finally, NLA_POLICY_EXACT_LEN_WARN() also gets to be a somewhat special case of this. I haven't included the patch here now that converts nl80211 to use this because it doesn't apply without another cleanup patch, but we can remove a number of hand-coded min/max length checks and get better error messages from the general validation code while doing that. As I had originally built the netlink policy export to userspace in a way that has min/max length for NLA_BINARY (for the types that we used to call NLA_MIN_LEN, NLA_BINARY and NLA_EXACT_LEN) anyway, it doesn't really change anything there except that now there's a chance that userspace sees min length < max length, which previously wasn't possible. v2: * fix the min<max comment to correctly say min<=max ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
Add range validation for NLA_BINARY, allowing validation of any combination of minimum or maximum lengths, using the existing NLA_POLICY_RANGE()/NLA_POLICY_FULL_RANGE() macros, just like for integers where the value is checked. Also make NLA_POLICY_EXACT_LEN(), NLA_POLICY_EXACT_LEN_WARN() and NLA_POLICY_MIN_LEN() special cases of this, removing the old types NLA_EXACT_LEN and NLA_MIN_LEN. This allows us to save some code where both minimum and maximum lengths are required. Currently the policy only allows maximum (NLA_BINARY), minimum (NLA_MIN_LEN) or exact (NLA_EXACT_LEN), so a range of lengths cannot be accepted and must be checked by the code that consumes the attributes later. Also, this allows advertising the correct ranges in the policy export to userspace. Here, NLA_MIN_LEN and NLA_EXACT_LEN already were special cases of NLA_BINARY with min and min/max length respectively. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
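The relationship between the range, exact-length and minimum-length cases can be illustrated with a small standalone model. `struct bin_policy`, `validate_binary_len` and the `POLICY_*` macros below are hypothetical names for this sketch and are not the kernel's nla_policy definitions; the point is only that exact and minimum length fall out as special cases of one min/max check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a binary-attribute policy: an inclusive
 * [min_len, max_len] range on the payload length. */
struct bin_policy {
	size_t min_len;
	size_t max_len;
};

/* Illustrative initializers: exact and minimum length are special
 * cases of the general range. */
#define POLICY_RANGE(min, max)  { (min), (max) }
#define POLICY_EXACT_LEN(len)   { (len), (len) }
#define POLICY_MIN_LEN(len)     { (len), SIZE_MAX }

/* Returns 0 if len is inside the policy range, -ERANGE otherwise. */
static int validate_binary_len(const struct bin_policy *p, size_t len)
{
	assert(p != NULL && p->min_len <= p->max_len);
	if (len < p->min_len || len > p->max_len)
		return -ERANGE;
	return 0;
}
```

With a single check of this shape, a consumer that previously validated the maximum in the policy and the minimum by hand could, under these assumptions, express both bounds in the policy alone.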
-
Johannes Berg authored
Change places that open-code NLA_POLICY_MIN_LEN() to use the macro instead, giving us flexibility in how we handle the details of the macro. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
Change places that open-code NLA_POLICY_EXACT_LEN() to use the macro instead, giving us flexibility in how we handle the details of the macro. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Linus Torvalds authored
Pull mailmap update from Kees Cook: "This was originally part of my pstore tree, but when I realized that mailmap needed re-alphabetizing, I decided to wait until -rc1 to send this, as I saw a lot of mailmap additions pending in -next for the merge window. It's a programmatic reordering and the addition of a pstore contributor's preferred email address" * tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mailmap: Add WeiXiong Liao mailmap: Restore dictionary sorting
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds authored
Pull networking fixes from David Miller: "Another batch of fixes: 1) Remove nft_compat counter flush optimization, it generates warnings from the refcount infrastructure. From Florian Westphal. 2) Fix BPF to search for build id more robustly, from Jiri Olsa. 3) Handle bogus getopt lengths in ebtables, from Florian Westphal. 4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and Oleksij Rempel. 5) Reset iter properly on mptcp sendmsg() error, from Florian Westphal. 6) Show a saner speed in bonding broadcast mode, from Jarod Wilson. 7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones. 8) Fix double unregister in bonding during namespace tear down, from Cong Wang. 9) Disable RP filter during icmp_redirect selftest, from David Ahern" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits) otx2_common: Use devm_kcalloc() in otx2_config_npa() net: qrtr: fix usage of idr in port assignment to socket selftests: disable rp_filter for icmp_redirect.sh Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" phylink: <linux/phylink.h>: fix function prototype kernel-doc warning mptcp: sendmsg: reset iter on error redux net: devlink: Remove overzealous WARN_ON with snapshots tipc: not enable tipc when ipv6 works as a module tipc: fix uninit skb->data in tipc_nl_compat_dumpit() net: Fix potential wrong skb->protocol in skb_vlan_untag() net: xdp: pull ethernet header off packet after computing skb->protocol ipvlan: fix device features bonding: fix a potential double-unregister can: j1939: add rxtimer for multipacket broadcast session can: j1939: abort multipacket broadcast session when timeout occurs can: j1939: cancel rxtimer on multipacket broadcast session complete can: j1939: fix support for multipacket broadcast message net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs' net: fddi: skfp: cfm: Remove set but unused variable 'oldstate' net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs' ...
-
- 17 Aug, 2020 5 commits
-
-
Xu Wang authored
A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "devm_kcalloc". Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Necip Fazil Yildiran authored
Passing large uint32 sockaddr_qrtr.port numbers for port allocation triggers a warning within idr_alloc() since the port number is cast to int, and thus interpreted as a negative number. This leads to the rejection of such valid port numbers in qrtr_port_assign() as idr_alloc() fails. To avoid the problem, switch to idr_alloc_u32() instead. Fixes: bdabad3e ("net: Add Qualcomm IPC router") Reported-by: syzbot+f31428628ef672716ea8@syzkaller.appspotmail.com Signed-off-by: Necip Fazil Yildiran <necip@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
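The underlying signedness problem can be reproduced in plain C. The helper names are hypothetical, and the sketch assumes a 32-bit int with the wrap-around conversion that common compilers such as GCC and Clang implement; it only demonstrates why a large u32 port looks negative to an int-based allocator, not the idr API itself.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* What an int-based interface sees when handed a u32 port number
 * (conversion result assumed to wrap, as on GCC and Clang). */
static int port_as_int(uint32_t port)
{
	return (int)port;
}

/* A port is safe for an int-based interface only if it fits in the
 * non-negative half of the int range. */
static int port_fits_in_int(uint32_t port)
{
	return port <= (uint32_t)INT_MAX;
}
```

Ports above INT_MAX are valid for the protocol but fail this check, which is why an allocator taking an unsigned 32-bit id avoids the problem.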
-
David Ahern authored
h1 is initially configured to reach h2 via r1 rather than the more direct path through r2. If rp_filter is set and inherited for r2, forwarding fails since the source address of h1 is reachable from eth0 vs the packet coming to it via r1 and eth1. Since rp_filter setting affects the test, explicitly reset it. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kees Cook authored
WeiXiong Liao noted to me offlist that his preference for email address had changed and that he'd like it updated in the mailmap so people discussing pstore/blk would be able to reach him. Cc: WeiXiong Liao <gmpy.liaowx@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Several names had been recently appended (instead of inserted). While git-shortlog doesn't need this file to be sorted, it helps humans to keep it organized this way. Sort the entire file (which includes some minor shuffling for dictionary order). Done with the following commands:
    grep -E '^(#|$)' .mailmap > .mailmap.head
    grep -Ev '^(#|$)' .mailmap > .mailmap.body
    sort -f .mailmap.body > .mailmap.body.sort
    cat .mailmap.head .mailmap.body.sort > .mailmap
    rm .mailmap.head .mailmap.body.sort
Signed-off-by: Kees Cook <keescook@chromium.org>
-