- 24 Aug, 2017 33 commits
-
-
Matan Barak authored
When adding a flow table entry (fte) to a flow group (fg), we first need to check whether this fte exist. In such a case we just merge the destinations (if possible). Currently, this is done by traversing the fte list available in a fg. This could take a lot of time when using large flow groups. Speeding this up by using rhashtable, which is much faster. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Matan Barak authored
The current code stores fte_match_param in the software representation of FTEs and FGs. fte_match_param contains a large reserved area at the bottom of the struct. Since downstream patches are going to hash this part, we would like to avoid doing so on a reserved part. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Matan Barak authored
When allocating a flow table entry, we need to allocate a free index in the flow group. Currently, this is done by traversing the existing flow table entries in the flow group, until a free index is found. Replacing this by using a ida, which allows us to find a free index much faster. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Gal Pressman authored
Fix the following checkpatch warning in en_ethtool.c: WARNING: suspect code indent for conditional statements (8, 9) + for (i = 0; i < NUM_PCIE_PERF_STALL_COUNTERS(priv); i++) + strcpy(data + (idx++) * ETH_GSTRING_LEN, Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Gal Pressman authored
mlx5_core_wq is no longer being used and should be removed from the code. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Saeed Mahameed authored
The blank line should be after u32 val = ... and not after __be32 __iomem *addr = ... Fixes: ad5b39a9 ("net/mlx5: Add a blank line after declarations") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Joe Perches <joe@perches.com>
-
Daniel Borkmann authored
No need to test for it in fast-path, every dev in bpf_dtab_netdev is guaranteed to be non-NULL, otherwise dev_map_update_elem() will fail in the first place. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shubham Bansal authored
As eBPF JIT support for arm32 was added recently with commit 39c13c20, it seems appropriate to add arm32 as arch with support for eBPF JIT in bpf and sysctl docs as well. Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Edward Cree says: ==================== bpf: verifier fixes Fix a couple of bugs introduced in my recent verifier patches. Patch #2 does slightly increase the insn count on bpf_lxc.o, but only by about a hundred insns (i.e. 0.2%). v2: added test for write-marks bug (patch #1); reworded comment on propagate_liveness() for clarity. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The liveness tracking algorithm is quite subtle; add comments to explain it. Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The optimisation it does is broken when the 'new' register value has a variable offset and the 'old' was constant. I broke it with my pointer types unification (see Fixes tag below), before which the 'new' value would have type PTR_TO_MAP_VALUE_ADJ and would thus not compare equal; other changes in that patch mean that its original behaviour (ignore min/max values) cannot be restored. Tests on a sample set of cilium programs show no change in count of processed instructions. Fixes: f1174f77 ("bpf/verifier: rework value tracking") Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexei Starovoitov authored
The test makes a read through a map value pointer, then considers pruning a branch where the register holds an adjusted map value pointer. It should not prune, but currently it does. Signed-off-by: Alexei Starovoitov <ast@fb.com> [ecree@solarflare.com: added test-name and patch description] Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
The fact that writes occurred in reaching the continuation state does not screen off its reads from us, because we're not really its parent. So detect 'not really the parent' in do_propagate_liveness, and ignore write marks in that case. Fixes: dc503a8a ("bpf/verifier: track liveness for pruning") Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Writes in straight-line code should not prevent reads from propagating along jumps. With current verifier code, the jump from 3 to 5 does not add a read mark on 3:R0 (because 5:R0 has a write mark), meaning that the jump from 1 to 3 gets pruned as safe even though R0 is NOT_INIT. Verifier output: 0: (61) r2 = *(u32 *)(r1 +0) 1: (35) if r2 >= 0x0 goto pc+1 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 2: (b7) r0 = 0 3: (35) if r2 >= 0x0 goto pc+1 R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 4: (b7) r0 = 0 5: (95) exit from 3 to 5: safe from 1 to 3: safe processed 8 insns, stack depth 0 Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
iph is being assigned the same value twice; remove the redundant first assignment. (Thanks to Nikolay Aleksandrov for pointing out that the first asssignment should be removed and not the second) Fixes warning: net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arvind Yadav authored
genl_ops are not supposed to change at runtime. All functions working with genl_ops provided by <net/genetlink.h> work with const genl_ops. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
The functions set_ctrl0 and set_ctrl1 are local to the source and do not need to be in global scope, so make them static. Cleans up sparse warnings: symbol 'set_ctrl0' was not declared. Should it be static? symbol 'set_ctrl1' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This is necessary to allow the user to disable peeking with offset once it's enabled. Unix sockets already allow the above, with this patch we permit it for udp[6] sockets, too. Fixes: 627d2d6b ("udp: enable MSG_PEEK at non-zero offset") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: spectrum: Introduce multichain TC offload This patchset introduces offloading of rules added to chain with non-zero index, which was previously forbidden. Also, goto_chain termination action is offloaded allowing to jump to processing of desired chain. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
If action is gact goto_chain, offload it to HW by jumping to another ruleset. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
We need to lookup ruleset in order to offload goto_chain termination action. This patch adds it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
For goto_chain action we need to know group_id of a ruleset to jump to. Provide infrastructure in order to get it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add helpers to find out if a gact instance is goto_chain termination action and to get chain index. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Reflect chain index coming down from TC core and create a ruleset per chain. Note that only chain 0, being the implicit chain, is bound to the device for processing. The rest of chains have to be "jumped-to" by actions. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Antoine Tenart says: ==================== net: mvpp2: software TSO support This series adds the s/w TSO support in the PPv2 driver, in addition to two cosmetic commits. As stated in patch 3/3: Using iperf and 10G ports, using TSO shows a significant performance improvement by a factor 2 to reach around 9.5Gbps in TX; as well as a significant CPU usage drop (from 25% to 15%). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
The patch uses the tso API to implement the tso functionality in Marvell PPv2 driver. Using iperf and 10G ports, using TSO shows a significant performance improvement by a factor 2 to reach around 9.5Gbps in TX; as well as a significant CPU usage drop (from 25% to 15%). Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
The txq size is defined by MVPP2_AGGR_TXQ_SIZE, which is sometime not used directly but through variables. As it is a fixed value use the define everywhere in the driver. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Antoine Ténart authored
The TSO header size was defined in many drivers. Factorize the code and define its size in net/tso.h. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
Now when ipv4 route inserts a fib_info, it memcmp fib_metrics. It means ipv4 route identifies one route also with metrics. But when removing a route, it tries to find the route without caring about the metrics. It will cause that the route with right metrics can't be removed. Thomas noticed this issue when doing the testing: 1. add: # ip route append 192.168.7.0/24 dev v window 1000 # ip route append 192.168.7.0/24 dev v window 1001 # ip route append 192.168.7.0/24 dev v window 1002 # ip route append 192.168.7.0/24 dev v window 1003 2. delete: # ip route delete 192.168.7.0/24 dev v window 1002 3. show: 192.168.7.0/24 proto boot scope link window 1001 192.168.7.0/24 proto boot scope link window 1002 192.168.7.0/24 proto boot scope link window 1003 The one with window 1002 wasn't deleted but the first one was. This patch is to do metrics match when looking up and deleting one route. Reported-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Mike Maloney says: ==================== net: Add software rx timestamp for TCP. Add software rx timestamps for TCP, and a test to ensure consistency of behavior between IP, UDP, and TCP implementation. Changes since v1: -Initialize tss->ts[1] to 0 if caller requested any timestamps. -Fix test case to validate that tss->ts[1] is zero. -Fix tests to actually use a raw socket. -Fix --tcp flag to work on the test. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mike Maloney authored
Validate the behavior of the combination of various timestamp socket options, and ensure consistency across ip, udp, and tcp. Signed-off-by: Mike Maloney <maloney@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mike Maloney authored
When SOF_TIMESTAMPING_RX_SOFTWARE is enabled for tcp sockets, return the timestamp corresponding to the highest sequence number data returned. Previously the skb->tstamp is overwritten when a TCP packet is placed in the out of order queue. While the packet is in the ooo queue, save the timestamp in the TCB_SKB_CB. This space is shared with the gso_* options which are only used on the tx path, and a previously unused 4 byte hole. When skbs are coalesced either in the sk_receive_queue or the out_of_order_queue always choose the timestamp of the appended skb to maintain the invariant of returning the timestamp of the last byte in the recvmsg buffer. Signed-off-by: Mike Maloney <maloney@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Manlunas authored
In the NIC firmware, the 1-bit flag indicating "firmware is loaded" moved from SLI_SCRATCH_1 to SLI_SCRATCH_2 (these are Octeon general-purpose scratch registers). Make the PF driver conform to this change. Remove code that sets the "firmware is loaded" flag because it's now the firmware's job to do that. In the code that detects whether or not the firmware is loaded, don't just rely on checking the "firmware is loaded" flag because that may cause a rare false negative. Add code that deduces whether or not the firmware is loaded; that will never give a false negative. Also bump up driver version to match newer NIC firmware. Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2017 4 commits
-
-
William Tu authored
Fix typo: pnet_tap_faied. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Daniel Borkmann says: ==================== Two minor BPF cleanups Two minor cleanups on devmap and redirect I still had in my queue. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Some minor code cleanups, while going over it I also noticed that we're accounting the bitmap only for one CPU currently, so fix that up as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Few cleanups including: bpf_redirect_map() is really XDP only due to the return code. Move it to a more appropriate location where we do the XDP redirect handling and change it's name into bpf_xdp_redirect_map() to make it consistent to the bpf_xdp_redirect() helper. xdp_do_redirect_map() helper can be static since only used out of filter.c file. Drop the goto in xdp_do_generic_redirect() and only return errors directly. In xdp_do_flush_map() only clear ri->map_to_flush which is the arg we're using in that function, ri->map is cleared earlier along with ri->ifindex. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 Aug, 2017 3 commits
-
-
Daniel Borkmann authored
Currently, iproute2's BPF ELF loader works fine with array of maps when retrieving the fd from a pinned node and doing a selfcheck against the provided map attributes from the object file, but we fail to do the same for hash of maps and thus refuse to get the map from pinned node. Reason is that when allocating hash of maps, fd_htab_map_alloc() will set the value size to sizeof(void *), and any user space map creation requests are forced to set 4 bytes as value size. Thus, selfcheck will complain about exposed 8 bytes on 64 bit archs vs. 4 bytes from object file as value size. Contract is that fdinfo or BPF_MAP_GET_FD_BY_ID returns the value size used to create the map. Fix it by handling it the same way as we do for array of maps, which means that we leave value size at 4 bytes and in the allocation phase round up value size to 8 bytes. alloc_htab_elem() needs an adjustment in order to copy rounded up 8 bytes due to bpf_fd_htab_map_update_elem() calling into htab_map_update_elem() with the pointer of the map pointer as value. Unlike array of maps where we just xchg(), we're using the generic htab_map_update_elem() callback also used from helper calls, which published the key/value already on return, so we need to ensure to memcpy() the right size. Fixes: bcc6b1b7 ("bpf: Add hash of maps support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
There is a missing break causing a fall-through and setting ctx.use_bbit_insns to the wrong value. Fix this by adding the missing break. Detected with cppcheck: "Variable 'ctx.use_bbit_insns' is reassigned a value before the old one has been used. 'break;' missing?" Fixes: 8d8d18c3 ("MIPS,bpf: Fix using smp_processor_id() in preemptible splat.") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
High order GFP_KERNEL allocations can stress the host badly. Use modern kvmalloc_array()/kvfree() instead of custom allocations. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-