- 07 Aug, 2018 11 commits
-
-
Pieter Jansen van Vuuren authored
Allow matching on options in Geneve tunnel headers. This makes use of existing tunnel metadata support.

The options can be described in the form CLASS:TYPE:DATA/CLASS_MASK:TYPE_MASK:DATA_MASK, where CLASS is represented as a 16bit hexadecimal value, TYPE as an 8bit hexadecimal value and DATA as a variable length hexadecimal value.

e.g.
    # ip link add name geneve0 type geneve dstport 0 external
    # tc qdisc add dev geneve0 ingress
    # tc filter add dev geneve0 protocol ip parent ffff: \
          flower \
            enc_src_ip 10.0.99.192 \
            enc_dst_ip 10.0.99.193 \
            enc_key_id 11 \
            geneve_opts 0102:80:1122334421314151/ffff:ff:ffffffffffffffff \
            ip_proto udp \
            action mirred egress redirect dev eth1

This patch adds support for matching Geneve options in the order supplied by the user. This leads to an efficient implementation in the software datapath (and in our opinion hardware datapaths that offload this feature). It is also compatible with Geneve options matching provided by the Open vSwitch kernel datapath which is relevant here as the Flower classifier may be used as a mechanism to program flows into hardware as a form of Open vSwitch datapath offload (sometimes referred to as OVS-TC).

The netlink Kernel/Userspace API may be extended, for example by adding a flag, if other matching options are desired, for example matching given options in any order. This would require an implementation in the TC software datapath. And be done in a way that drivers that facilitate offload of the Flower classifier can reject or accept such flows based on hardware datapath capabilities.

This approach was discussed and agreed on at Netconf 2017 in Seoul.

Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
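The CLASS:TYPE:DATA notation described above can be illustrated with a small user-space parser. This is only a sketch and not the code used by tc or the kernel: the struct geneve_opt_spec type, the parse_geneve_opt() and hex_nibble() helpers and the SKETCH_OPT_MAX_DATA cap are all hypothetical names introduced here. It handles one half of the match string; a caller would split the value from the mask on '/' and parse each half the same way.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_OPT_MAX_DATA 128 /* hypothetical cap chosen for this sketch */

/* Hypothetical container for one parsed CLASS:TYPE:DATA triple. */
struct geneve_opt_spec {
	uint16_t opt_class;
	uint8_t type;
	uint8_t data[SKETCH_OPT_MAX_DATA];
	size_t data_len;
};

/* Convert one hexadecimal digit to its value, or return -1. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Parse "CLASS:TYPE:DATA" (all hexadecimal). Returns 0 on success, -1 on
 * malformed input. */
static int parse_geneve_opt(const char *s, struct geneve_opt_spec *out)
{
	unsigned int opt_class, type;
	int consumed = 0;
	const char *hex;
	size_t hexlen, i;

	/* %n is only reached if both fields and both ':' separators matched. */
	if (sscanf(s, "%4x:%2x:%n", &opt_class, &type, &consumed) != 2 ||
	    consumed == 0)
		return -1;

	hex = s + consumed;
	hexlen = strlen(hex);
	if (hexlen == 0 || hexlen % 2 != 0 || hexlen / 2 > SKETCH_OPT_MAX_DATA)
		return -1;

	/* DATA is variable length: two hex digits per byte. */
	for (i = 0; i < hexlen / 2; i++) {
		int hi = hex_nibble(hex[2 * i]);
		int lo = hex_nibble(hex[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out->data[i] = (uint8_t)(hi << 4 | lo);
	}

	out->opt_class = (uint16_t)opt_class;
	out->type = (uint8_t)type;
	out->data_len = hexlen / 2;
	return 0;
}
```

Called on the value half of the example filter, "0102:80:1122334421314151", the sketch yields class 0x0102, type 0x80 and eight data bytes.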
-
Simon Horman authored
Allow the existing 'dissection' of tunnel metadata to 'dissect' options already present in tunnel metadata. This dissection is controlled by a new dissector key, FLOW_DISSECTOR_KEY_ENC_OPTS. This dissection only occurs when skb_flow_dissect_tunnel_info() is called, currently only the Flower classifier makes that call. So there should be no impact on other users of the flow dissector. This is in preparation for allowing the flower classifier to match on Geneve options. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Hurley authored
The addition of FLOW_DISSECTOR_KEY_ENC_IP to TC flower means that the ToS and TTL of the tunnel header can now be matched on. Extend the NFP tunnel match function to include these new fields. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
John Hurley authored
The TTL for encapsulating headers in IPv4 UDP tunnels is taken from a route lookup. Modify this to first check if a user has specified a TTL to be used in the TC action. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says:

====================
net: Support Wake-on-LAN using filters

This is technically a v2, but this patch series builds on your feedback and defines the following:

- a new WAKE_* bit: WAKE_FILTER which can be enabled alongside other types of Wake-on-LAN to support waking up on a programmed filter (match + action)
- a new RX_CLS_FLOW_WAKE flow action which can be specified by a user when inserting a flow using ethtool::rxnfc, similar to the existing RX_CLS_FLOW_DISC

The bcm_sf2 and bcmsysport drivers are updated accordingly to work in concert to allow matching packets at the switch level, identified by their filter location to be used as a match by the SYSTEM PORT (CPU/management controller) during Wake-on-LAN.

Let me know if this looks better than the previous incarnation of the patch series. Attached is also the ethtool patch that I would be submitting once the uapi changes are committed. Thank you!

Changes in v2:

- bail out earlier in bcm_sf2_cfp's get_rxnfc if an error is encountered (Andrew)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The SYSTEMPORT MAC allows up to 8 filters to be programmed to wake-up from LAN. Verify that we have up to 8 filters and program them to the appropriate RXCHK entries to be matched (along with their masks). We need to update the entry and exit to Wake-on-LAN mode to keep the RXCHK engine running to match during suspend, but this is otherwise fairly similar to Magic Packet detection. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Allow propagating ethtool::rxnfc programming to the CPU/management port such that it is possible for such a CPU to perform e.g. Wake-on-LAN using filters configured by the switch. We need a tiny bit of cooperation between the switch driver, which is able to do the full flow matching, and the CPU/management port driver, which might not be. The CPU/management driver needs to return -EOPNOTSUPP to indicate a non-critical error, and any other error code otherwise. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add the ability to specify through ethtool::rxnfc that a rule location is special and will be used to participate in Wake-on-LAN, by e.g: having a specific pattern be matched. When this is the case, fs->ring_cookie must be set to the special value RX_CLS_FLOW_WAKE. We also define an additional ethtool::wolinfo flag: WAKE_FILTER which can be used to configure an Ethernet adapter to allow Wake-on-LAN using previously programmed filters. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller authored
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-08-07

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add cgroup local storage for BPF programs, which provides a fast accessible memory for storing various per-cgroup data like number of transmitted packets, etc, from Roman.
2) Support bpf_get_socket_cookie() BPF helper in several more program types that have a full socket available, from Andrey.
3) Significantly improve the performance of perf events which are reported from BPF offload. Also convert a couple of BPF AF_XDP samples over to use libbpf, both from Jakub.
4) seg6local LWT provides the End.DT6 action, which allows to decapsulate an outer IPv6 header containing a Segment Routing Header. Adds this action now to the seg6local BPF interface, from Mathieu.
5) Do not mark dst register as unbounded in MOV64 instruction when both src and dst register are the same, from Arthur.
6) Define u_smp_rmb() and u_smp_wmb() to their respective barrier instructions on arm64 for the AF_XDP sample code, from Brian.
7) Convert the tcp_client.py and tcp_server.py BPF selftest scripts over from Python 2 to Python 3, from Jeremy.
8) Enable BTF build flags to the BPF sample code Makefile, from Taeung.
9) Remove an unnecessary rcu_read_lock() in run_lwt_bpf(), from Taehee.
10) Several improvements to the README.rst from the BPF documentation to make it more consistent with RST format, from Tobin.
11) Replace all occurrences of strerror() by calls to strerror_r() in libbpf and fix a FORTIFY_SOURCE build error along with it, from Thomas.
12) Fix a bug in bpftool's get_btf() function to correctly propagate an error via PTR_ERR(), from Yue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Aring authored
This patch fixes the following sparse warning about mismatch rcu attribute for address space annotation: ... error: incompatible types in comparison expression (different modifiers) error: incompatible types in comparison expression (different address spaces) ... Some __rcu annotation was at non-pointers list head structures and one was missing in edge information which is used by rcu_assign_pointer() to update edge setting information. Cc: Stefan Schmidt <stefan@datenfreihafen.org> Fixes: f25da51f ("ieee802154: hwsim: add replacement for fakelb") Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Roman Gushchin authored
The __cgroup_bpf_attach() and __cgroup_bpf_detach() functions have a good amount of duplicated code, which can be eliminated by introducing the update_effective_progs() helper function. update_effective_progs() calls compute_effective_progs() and then, in case of success, calls activate_effective_progs() for each descendant cgroup. In case of failure (OOM), it releases the allocated prog arrays and returns the error code. Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 06 Aug, 2018 29 commits
-
-
David S. Miller authored
Merge branch 'ieee802154-for-davem-2018-08-06' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2018-08-06 An update from ieee802154 for *net-next* Romuald added a socket option to get the LQI value of the received datagram. Alexander added a new hardware simulation driver modelled after hwsim of the wireless people. It allows runtime configuration for new nodes and edges over a netlink interface (a config utility is making its way into wpan-tools). We also have three fixes in here. One from Colin which is more of a cleanup and two from Alex fixing tailroom and single frame space problems. I would normally put the last two into my fixes tree, but given we are already in -rc8 I simply put them here and added a cc: stable to them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We accidentally removed the parentheses here, but they are required because '!' has higher precedence than '&'. Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
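The precedence issue described above is easy to reproduce in isolation. The following is a minimal sketch, not the code touched by the patch: SKETCH_FLAG_LAST_IN and both helper functions are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

#define SKETCH_FLAG_LAST_IN 0x1 /* hypothetical flag bit for illustration */

/* Buggy form: '!' binds tighter than '&', so this is parsed as
 * (!flags) & SKETCH_FLAG_LAST_IN. */
static bool last_in_missing_buggy(unsigned int flags)
{
	return !flags & SKETCH_FLAG_LAST_IN;
}

/* Intended form: test the bit first, then negate the result. */
static bool last_in_missing(unsigned int flags)
{
	return !(flags & SKETCH_FLAG_LAST_IN);
}
```

With flags set to 0x2 (some other bit set, the tested bit clear), the intended form returns true while the buggy form returns false, because !0x2 is already 0 before the mask is applied.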
-
Yangbo Lu authored
This is a fix-up patch for below build issue with multi_v7_defconfig. drivers/ptp/ptp_qoriq.o: In function `qoriq_ptp_probe': ptp_qoriq.c:(.text+0xd0c): undefined reference to `__aeabi_uldivmod' Fixes: 91305f28 ("ptp_qoriq: support automatic configuration for ptp timer") Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yafang Shao authored
The sock_flag() check is already inside sock_enable_timestamp(), so it is unnecessary to check it in the caller.

    void sock_enable_timestamp(struct sock *sk, int flag)
    {
        if (!sock_flag(sk, flag)) {
            ...
        }
    }

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
We used to have a message like:

    struct vhost_msg {
        int type;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

Unfortunately, there will be a hole of 32bit on a 64bit machine because of the alignment. This leads to different formats between the 32bit API and the 64bit API. What's more, it will break 32bit programs running on a 64bit machine. So fix this by introducing a new message type with an explicit 32bit reserved field after type, like:

    struct vhost_msg_v2 {
        __u32 type;
        __u32 reserved;
        union {
            struct vhost_iotlb_msg iotlb;
            __u8 padding[64];
        };
    };

We will have a consistent ABI after switching to use this. To enable this capability, introduce a new ioctl (VHOST_SET_BACKEND_FEATURES) for userspace to enable this feature (VHOST_BACKEND_F_IOTLB_V2).

Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
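The alignment hole can be made visible with offsetof(). The sketch below uses stand-in types rather than the real uapi header: struct iotlb_msg_sketch only approximates struct vhost_iotlb_msg (its exact field layout is an assumption, and only the presence of 64-bit members matters here), and the *_sketch names and helper functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct vhost_iotlb_msg; the field layout is assumed. */
struct iotlb_msg_sketch {
	uint64_t iova;
	uint64_t size;
	uint64_t uaddr;
	uint8_t perm;
	uint8_t type;
};

/* Old layout: the union's offset depends on the alignment of uint64_t,
 * which differs between common 32-bit and 64-bit ABIs. */
struct msg_v1_sketch {
	int type;
	union {
		struct iotlb_msg_sketch iotlb;
		uint8_t padding[64];
	};
};

/* New layout: the explicit reserved field fills the would-be hole. */
struct msg_v2_sketch {
	uint32_t type;
	uint32_t reserved;
	union {
		struct iotlb_msg_sketch iotlb;
		uint8_t padding[64];
	};
};

/* The payload of the v2 layout sits at byte 8 whether uint64_t is
 * aligned to 4 or to 8 bytes. */
_Static_assert(offsetof(struct msg_v2_sketch, iotlb) == 8,
	       "v2 payload offset must not depend on the ABI");

/* ABI dependent: typically 8 on x86-64 and 4 on i386. */
static size_t v1_payload_offset(void)
{
	return offsetof(struct msg_v1_sketch, iotlb);
}

static size_t v2_payload_offset(void)
{
	return offsetof(struct msg_v2_sketch, iotlb);
}
```

Compiling the sketch once with -m32 and once with -m64 and printing v1_payload_offset() is a quick way to observe the mismatch that a 32-bit program would hit against a 64-bit kernel.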
-
zhong jiang authored
The err variable is not modified after initialization, so remove it and make the function return void. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Al Viro authored
__inet6_lookup_established() expects th->dport to be passed in host-endian, not net-endian. The reason is a microoptimization in __inet6_lookup(), but if you use the lower-level helpers, you have to play by their rules... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
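The byte-order contract described above can be sketched in user space. Both functions below are hypothetical stand-ins, not kernel helpers: port_is_established_sketch() plays the role of a low-level lookup that wants the port in host byte order, and lookup_from_header() shows the conversion a caller holding a wire-format port has to perform first.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical low-level helper: expects the destination port in HOST
 * byte order. Port 8080 stands in for "a matching socket exists". */
static int port_is_established_sketch(uint16_t hnum)
{
	return hnum == 8080;
}

/* Caller side: a port read from a TCP header is in NETWORK byte order
 * and must be converted before it reaches the low-level helper. */
static int lookup_from_header(uint16_t dport_net)
{
	return port_is_established_sketch(ntohs(dport_net));
}
```

On a big-endian machine ntohs() is a no-op, so skipping the conversion goes unnoticed there, while on little-endian machines the lookup would silently miss.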
-
Alexander Aring authored
With the arrival of mac802154_hwsim, the fakelb driver will be deprecated. This patch notifies all users of fakelb to switch to the new mac802154_hwsim driver. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
-
Alexander Aring authored
This patch adds a new virtual driver mac802154_hwsim which is based on the fakelb driver. The fakelb driver will get deprecated and hopefully removed someday. The main reason for doing this step is to rename the driver to mac802154_hwsim to have a similar naming scheme as mac80211_hwsim, which is more popular in the 802.11 wireless world, and the idea behind this driver is the same. The new feature of this driver is knowledge about connected edges, which can be changed during runtime. This offers a testing environment for routing protocols, e.g. RPL. The default behaviour is still the same as fakelb: two radios connected to each other. Radios newly added during runtime will not be connected to other wpan_hwsim instances. The netlink API is not namespace aware on purpose, only the registered wpan_phy's can be moved to namespaces. The physical layer, corresponding to wireless "air" communication, can be handled across namespaces. Furthermore, the edges can be weighted with the LQI value according to IEEE 802.15.4, which offers additional handling to mark bad or good connection indicators to other connected virtual phys. Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
-
Colin Ian King authored
Pointers fq and net are being assigned but are never used hence they are redundant and can be removed. Cleans up clang warnings: warning: variable 'fq' set but not used [-Wunused-but-set-variable] warning: variable 'net' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
-
Alexander Aring authored
This patch is necessary in case AF_PACKET, or another socket interface which I am not aware of, didn't allocate the necessary room. Reported-by: David Palma <david.palma@ntnu.no> Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
-
Alexander Aring authored
This patch adds handling to take care of tailroom and headroom for single 6lowpan frames. We need to be sure we have a skb with the right headroom and tailroom for single frames. This patch does so by using skb_copy_expand() if the upper layer did not allocate enough headroom and tailroom. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195059 Reported-by: David Palma <david.palma@ntnu.no> Reported-by: Rabi Narayan Sahoo <rabinarayans0828@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
-
Stefan Schmidt authored
-
Vlad Buslov authored
Match patterns for some skbedit tests contain duplicate whitespace that is not present in actual tc output. This causes tests to fail because they can't match required action, even when it was successfully created. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
Match patterns for some connmark tests contain duplicate whitespace that is not present in actual tc output. This causes tests to fail because they can't match required action, even when it was successfully created. Fixes: 1dad0f9f ("tc-testing: add connmark action tests") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
Test 6fb4 creates one mirred and one pipe action, but only flushes mirred on teardown. Leaking pipe action causes failures in other tests. Add additional teardown command to also flush gact actions. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vlad Buslov authored
Fix expected ip address to actually match configured ip address. Fix test to expect single matched filter. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'wireless-drivers-next-for-davem-2018-08-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.19 This time a bigger pull request as we have two new Mediatek drivers MT76x2u (CONFIG_MT76x2U) and MT76x0U (CONFIG_MT76x0U). Also iwlwifi got support for the new IEEE 802.11ax standard, the successor for 802.11ac. And naturally smaller new features and bugfixes all over. Major changes: wcn36xx * fix WEP in client mode wil6210 * add support for Talyn-MB (Talyn ver 2.0) device * add support for enhanced DMA firmware feature iwlwifi * implement 802.11ax D2.0 * support for the new 22560 device family * new PCI IDs for 22000 and 22560 qtnfmac * implement cfg80211 power management callback * enable multiple SSIDs scan support * qtnfmac: implement basic WoWLAN support mt7601u * fall back to software encryption for hw unsupported ciphers * enable 802.11 Management Frame Protection (MFP) mt76 * support setting RTS threshold * add USB support * add support for MT76x2u devices * add support for MT76x0U devices mwifiex * allow user space to set all other IEs except WMM IE rsi * add firmware support for AP+BT dual mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2018-08-05 Here's the main bluetooth-next pull request for the 4.19 kernel. - Added support for Bluetooth Advertising Extensions - Added vendor driver support to hci_h5 HCI driver - Added serdev support to hci_h5 driver - Added support for Qualcomm wcn3990 controller - Added support for RTL8723BS and RTL8723DS controllers - btusb: Added new ID for Realtek 8723DE - Several other smaller fixes & cleanups Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: Enable MC-aware mode for mlxsw ports Petr says: Due to an issue in Spectrum chips, when unicast traffic shares the same queue as BUM traffic, and there is a congestion, the BUM traffic is admitted to the queue anyway, thus pushing out all UC traffic. In order to give unicast traffic precedence over BUM traffic, configure multicast-aware mode on all ports. Under multicast-aware regime, when assigning traffic class to a packet, the switch doesn't merely take the value prescribed by the QTCT register. For BUM traffic, it instead assigns that value plus 8. That limits the number of available TCs, but since mlxsw currently only uses the lower eight anyway, it is no real loss. The two TCs (UC and MC one) are then mapped to the same subgroup and strictly prioritized so that UC traffic is preferred in case of congestion. In patch #1, introduce a new register, QTCTM, which enables the multicast-aware mode. In patch #2, fix a typo in related code. In patch #3, set up TCs and QTCTM to enable multicast-aware mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
In order to give unicast traffic precedence over BUM traffic, configure multicast-aware mode on all ports. Under multicast-aware regime, when assigning traffic class to a packet, the switch doesn't merely take the value prescribed by the QTCT register. For BUM traffic, it instead assigns that value plus 8. ETS elements for TCs 8..15 thus need to be configured as well. Extend mlxsw_sp_port_ets_init() so that it maps each of them to the same subgroup as their corresponding TC from the range 0..7, such that TCs X and X+8 map to the same subgroup. The existing code configures TCs with strict priority. So far this was immaterial, because each TC had its own subgroup. Now that two TCs share a subgroup it becomes important. TCs are prioritized in order of 7, 6, ..., 0, 15, 14, ..., 8: the higher TCs used for BUM traffic end up being deprioritized. Since that's what's needed, keep that configuration as it is, and configure the new TCs likewise. Finally in mlxsw_sp_port_create(), invoke configuration of QTCTM to enable MC-aware mode on each port. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
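The traffic-class arithmetic described above can be written down as three small functions. This is an illustrative sketch only, not driver code: SKETCH_UC_TC_COUNT and the function names are hypothetical, and the rank function simply encodes the stated order 7, 6, ..., 0, 15, 14, ..., 8.

```c
#define SKETCH_UC_TC_COUNT 8 /* TCs 0..7 carry UC, TCs 8..15 carry BUM */

/* In MC-aware mode, BUM traffic gets the configured TC plus 8. */
static int bum_tclass(int uc_tclass)
{
	return uc_tclass + SKETCH_UC_TC_COUNT;
}

/* TCs X and X+8 are mapped to the same subgroup. */
static int tclass_subgroup(int tclass)
{
	return tclass % SKETCH_UC_TC_COUNT;
}

/* Position in the strict-priority order 7, 6, ..., 0, 15, 14, ..., 8.
 * Rank 0 is served first. */
static int tclass_rank(int tclass)
{
	if (tclass < SKETCH_UC_TC_COUNT)
		return 7 - tclass;
	return 8 + (15 - tclass);
}
```

For example, unicast TC 3 and its BUM counterpart TC 11 share subgroup 3, and TC 3 has the lower rank, so unicast is preferred under congestion.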
-
Petr Machata authored
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
This register configures if the Switch Priority to Traffic Class mapping is based on Multicast packet indication. If so, then multicast packets will get a Traffic Class that is plus (cap_max_tclass_data/2) the value configured by QTCT. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1402059 ("Missing break in switch") Addresses-Coverity-ID: 1402060 ("Missing break in switch") Addresses-Coverity-ID: 1402061 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We forgot to set the error code on this path, so we return NULL instead of an error pointer. In the current code kzalloc() won't fail for small allocations so this doesn't really affect runtime. Fixes: b95ec7eb ("net: sched: cls_flower: implement chain templates") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
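The difference between returning NULL and returning an error pointer can be sketched in user space. The helpers below are stand-ins written for this example, modelled on the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); struct tmplt_sketch and tmplt_create_sketch() are hypothetical and do not reproduce the cls_flower code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* User-space stand-ins for the kernel's error-pointer helpers. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct tmplt_sketch {
	int chain_index;
};

/* Hypothetical constructor whose callers test the result with
 * sketch_is_err(), so every failure path must set an error code. */
static void *tmplt_create_sketch(int simulate_oom)
{
	struct tmplt_sketch *t;
	int err;

	t = simulate_oom ? NULL : calloc(1, sizeof(*t));
	if (!t) {
		err = -ENOMEM; /* the kind of assignment that was missing */
		goto errout;
	}
	return t;

errout:
	return sketch_err_ptr(err);
}
```

A caller that only checks sketch_is_err() would treat a plain NULL as success and dereference it later, which is why the allocation-failure path needs its own error code.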
-
Li RongQing authored
dev_set_mtu_ext() can fail even with a valid MTU value. In that case extack._msg is not set and, since it lives on the stack, holds a random value, so the kernel will crash when printing it. Fixes: 7a4c53be ("net: report invalid mtu value via netlink extack") Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
don't bother with pathological cases, they only waste cycles. IPv6 requires a minimum MTU of 1280 so we should never see fragments smaller than this (except last frag). v3: don't use awkward "-offset + len" v2: drop IPv4 part, which added same check w. IPV4_MIN_MTU (68). There were concerns that there could be even smaller frags generated by intermediate nodes, e.g. on radio networks. Cc: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
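The rule stated above (every fragment except the last must be at least as large as the IPv6 minimum MTU) reduces to a one-line predicate. This is a sketch under assumptions, not the kernel's check: frag_too_small() and SKETCH_IPV6_MIN_MTU are hypothetical names, and pkt_len is taken to be the length of the IPv6 packet carrying the fragment.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_IPV6_MIN_MTU 1280 /* minimum MTU required by IPv6 */

/* A fragment is suspicious if more fragments follow it (it is not the
 * last one) and the packet carrying it is smaller than the minimum MTU. */
static bool frag_too_small(size_t pkt_len, bool more_fragments)
{
	return more_fragments && pkt_len < SKETCH_IPV6_MIN_MTU;
}
```

The last fragment is exempt because it carries whatever remains of the datagram and can legitimately be short.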
-
David S. Miller authored
Peter Oskolkov says: ==================== ip: Use rb trees for IP frag queue. This patchset * changes IPv4 defrag behavior to match that of IPv6: overlapping fragments now cause the whole IP datagram to be discarded (suggested by David Miller): there are no legitimate use cases for overlapping fragments; * changes IPv4 defrag queue from a list to a rb tree (suggested by Eric Dumazet): this change removes a potential attack vector. Upcoming patches will contain similar changes for IPv6 frag queue, as well as a comprehensive IP defrag self-test (temporarily delayed). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
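The discard-on-overlap policy described above can be sketched with half-open byte ranges. Everything here is hypothetical and simplified: struct frag_sketch, frags_overlap() and check_new_frag() are names invented for this example, and a linear scan over an array stands in for the rb tree lookup that the patchset introduces.

```c
#include <stdbool.h>
#include <stddef.h>

/* One fragment, covering bytes [offset, end) of the datagram. */
struct frag_sketch {
	unsigned int offset;
	unsigned int end;
};

enum frag_verdict {
	FRAG_ACCEPT,
	FRAG_DISCARD_DATAGRAM,
};

/* Two half-open ranges overlap if each starts before the other ends. */
static bool frags_overlap(const struct frag_sketch *a,
			  const struct frag_sketch *b)
{
	return a->offset < b->end && b->offset < a->end;
}

/* Any overlap with an already queued fragment condemns the whole
 * datagram, not just the incoming fragment. */
static enum frag_verdict check_new_frag(const struct frag_sketch *queued,
					size_t n,
					const struct frag_sketch *incoming)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (frags_overlap(&queued[i], incoming))
			return FRAG_DISCARD_DATAGRAM;
	}
	return FRAG_ACCEPT;
}
```

The scan above costs time linear in the number of queued fragments per insertion. Keeping the fragments in a balanced tree ordered by offset reduces that to a logarithmic search, which is the motivation given for the list-to-rb-tree change.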
-
Peter Oskolkov authored
Similar to TCP OOO RX queue, it makes sense to use rb trees to store IP fragments, so that OOO fragments are inserted faster.

Tested:

- a follow-up patch contains a rather comprehensive ip defrag self-test (functional)
- ran neper `udp_stream -c -H <host> -F 100 -l 300 -T 20`:

    netstat --statistics
    Ip:
        282078937 total packets received
        0 forwarded
        0 incoming packets discarded
        946760 incoming packets delivered
        18743456 requests sent out
        101 fragments dropped after timeout
        282077129 reassemblies required
        944952 packets reassembled ok
        262734239 packet reassembles failed

(The numbers/stats above are somewhat better re: reassemblies vs a kernel without this patchset. More comprehensive performance testing TBD).

Reported-by: Jann Horn <jannh@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-