- 23 Jan, 2020 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-01-22 The following pull-request contains BPF updates for your *net-next* tree. We've added 92 non-merge commits during the last 16 day(s) which contain a total of 320 files changed, 7532 insertions(+), 1448 deletions(-). The main changes are: 1) function by function verification and program extensions from Alexei. 2) massive cleanup of selftests/bpf from Toke and Andrii. 3) batched bpf map operations from Brian and Yonghong. 4) tcp congestion control in bpf from Martin. 5) bulking for non-map xdp_redirect form Toke. 6) bpf_send_signal_thread helper from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexei Starovoitov authored
Martin KaFai Lau says: ==================== This set adds bpf_cubic.c example. It was separated from the earlier BPF STRUCT_OPS series. Some highlights since the last post: 1. It is based on EricD recent fixes to the kernel tcp_cubic. [1] 2. The bpf jiffies reading helper is inlined by the verifier. Different from the earlier version, it only reads jiffies alone and does not do usecs/jiffies conversion. 3. The bpf .kconfig map is used to read CONFIG_HZ. [1]: https://patchwork.ozlabs.org/cover/1215066/ v3: - Remove __weak from CONFIG_HZ in patch 3. (Andrii) v2: - Move inlining to fixup_bpf_calls() in patch 1. (Daniel) - It is inlined for 64 BITS_PER_LONG and jit_requested as the map_gen_lookup(). Other cases could be considered together with map_gen_lookup() if needed. - Use usec resolution in bictcp_update() calculation in patch 3. usecs_to_jiffies() is then removed(). (Eric) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch adds a bpf_cubic example. Some highlights: 1. CONFIG_HZ .kconfig map is used. 2. In bictcp_update(), calculation is changed to use usec resolution (i.e. USEC_PER_JIFFY) instead of using jiffies. Thus, usecs_to_jiffies() is not used in the bpf_cubic.c. 3. In bitctcp_update() [under tcp_friendliness], the original "while (ca->ack_cnt > delta)" loop is changed to the equivalent "ca->ack_cnt / delta" operation. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200122233658.903774-1-kafai@fb.com
-
Martin KaFai Lau authored
This patch sync uapi bpf.h to tools/. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com
-
Martin KaFai Lau authored
This patch adds a helper to read the 64bit jiffies. It will be used in a later patch to implement the bpf_cubic.c. The helper is inlined for jit_requested and 64 BITS_PER_LONG as the map_gen_lookup(). Other cases could be considered together with map_gen_lookup() if needed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200122233646.903260-1-kafai@fb.com
-
- 22 Jan, 2020 9 commits
-
-
Daniel Borkmann authored
Alexei Starovoitov says: ==================== The last few month BPF community has been discussing an approach to call chaining, since exiting bpt_tail_call() mechanism used in production XDP programs has plenty of downsides. The outcome of these discussion was a conclusion to implement dynamic re-linking of BPF programs. Where rootlet XDP program attached to a netdevice can programmatically define a policy of execution of other XDP programs. Such rootlet would be compiled as normal XDP program and provide a number of placeholder global functions which later can be replaced with future XDP programs. BPF trampoline, function by function verification were building blocks towards that goal. The patch 1 is a final building block. It introduces dynamic program extensions. A number of improvements like more flexible function by function verification and better libbpf api will be implemented in future patches. v1->v2: - addressed Andrii's comments - rebase ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Alexei Starovoitov authored
Add program extension tests that build on top of fexit_bpf2bpf tests. Replace three global functions in previously loaded test_pkt_access.c program with three new implementations: int get_skb_len(struct __sk_buff *skb); int get_constant(long val); int get_skb_ifindex(int val, struct __sk_buff *skb, int var); New function return the same results as original only if arguments match. new_get_skb_ifindex() demonstrates that 'skb' argument doesn't have to be first and only argument of BPF program. All normal skb based accesses are available. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200121005348.2769920-4-ast@kernel.org
-
Alexei Starovoitov authored
Add minimal support for program extensions. bpf_object_open_opts() needs to be called with attach_prog_fd = target_prog_fd and BPF program extension needs to have in .c file section definition like SEC("freplace/func_to_be_replaced"). libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and will pass it in attach_btf_id to the kernel. This approach works for tests, but more compex use case may need to request function name (and attach_btf_id that kernel sees) to be more dynamic. Such API will be added in future patches. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org
-
Alexei Starovoitov authored
Introduce dynamic program extensions. The users can load additional BPF functions and replace global functions in previously loaded BPF programs while these programs are executing. Global functions are verified individually by the verifier based on their types only. Hence the global function in the new program which types match older function can safely replace that corresponding function. This new function/program is called 'an extension' of old program. At load time the verifier uses (attach_prog_fd, attach_btf_id) pair to identify the function to be replaced. The BPF program type is derived from the target program into extension program. Technically bpf_verifier_ops is copied from target program. The BPF_PROG_TYPE_EXT program type is a placeholder. It has empty verifier_ops. The extension program can call the same bpf helper functions as target program. Single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB and all other program types. The verifier allows only one level of replacement. Meaning that the extension program cannot recursively extend an extension. That also means that the maximum stack size is increasing from 512 to 1024 bytes and maximum function nesting level from 8 to 16. The programs don't always consume that much. The stack usage is determined by the number of on-stack variables used by the program. The verifier could have enforced 512 limit for combined original plus extension program, but it makes for difficult user experience. The main use case for extensions is to provide generic mechanism to plug external programs into policy program or function call chaining. BPF trampoline is used to track both fentry/fexit and program extensions because both are using the same nop slot at the beginning of every BPF function. Attaching fentry/fexit to a function that was replaced is not allowed. The opposite is true as well. Replacing a function that currently being analyzed with fentry/fexit is not allowed. The executable page allocated by BPF trampoline is not used by program extensions. This inefficiency will be optimized in future patches. Function by function verification of global function supports scalars and pointer to context only. Hence program extensions are supported for such class of global functions only. In the future the verifier will be extended with support to pointers to structures, arrays with sizes, etc. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200121005348.2769920-2-ast@kernel.org
-
Heiner Kallweit authored
The first batch of driver conversions missed a few cases where we can use phy_do_ioctl too. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chris Down authored
When trying to compile with CONFIG_DEBUG_INFO_BTF enabled, I got this error: % make -s Failed to generate BTF for vmlinux Try to disable CONFIG_DEBUG_INFO_BTF make[3]: *** [vmlinux] Error 1 Compiling again without -s shows the true error (that pahole is missing), but since this is fatal, we should show the error unconditionally on stderr as well, not silence it using the `info` function. With this patch: % make -s BTF: .tmp_vmlinux.btf: pahole (pahole) is not available Failed to generate BTF for vmlinux Try to disable CONFIG_DEBUG_INFO_BTF make[3]: *** [vmlinux] Error 1 Signed-off-by: Chris Down <chris@chrisdown.name> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200122000110.GA310073@chrisdown.name
-
Daniel Díaz authored
During cross-compilation, it was discovered that LDFLAGS and LDLIBS were not being used while building binaries, leading to defaults which were not necessarily correct. OpenEmbedded reported this kind of problem: ERROR: QA Issue: No GNU_HASH in the ELF binary [...], didn't pass LDFLAGS? Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com>
-
Alexei Starovoitov authored
Restore the 'if (env->cur_state)' check that was incorrectly removed during code move. Under memory pressure env->cur_state can be freed and zeroed inside do_check(). Hence the check is necessary. Fixes: 51c39bb1 ("bpf: Introduce function-by-function verification") Reported-by: syzbot+b296579ba5015704d9fa@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200122024138.3385590-1-ast@kernel.org
-
Alexei Starovoitov authored
Though the second half of trampoline page is unused a task could be preempted in the middle of the first half of trampoline and two updates to trampoline would change the code from underneath the preempted task. Hence wait for tasks to voluntarily schedule or go to userspace. Add similar wait before freeing the trampoline. Fixes: fec56f58 ("bpf: Introduce BPF trampoline") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/bpf/20200121032231.3292185-1-ast@kernel.org
-
- 21 Jan, 2020 26 commits
-
-
Björn Töpel authored
XDP sockets use the default implementation of struct sock's sk_data_ready callback, which is sock_def_readable(). This function is called in the XDP socket fast-path, and involves a retpoline. By letting sock_def_readable() have external linkage, and being called directly, the retpoline can be avoided. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200120092917.13949-1-bjorn.topel@gmail.com
-
Al Viro authored
kernel/bpf/inode.c misuses kern_path...() - it's much simpler (and more efficient, on top of that) to use user_path...() counterparts rather than bothering with doing getname() manually. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200120232858.GF8904@ZenIV.linux.org.uk
-
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-nextDavid S. Miller authored
Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-01-21 1) Add support for TCP encapsulation of IKE and ESP messages, as defined by RFC 8229. Patchset from Sabrina Dubroca. Please note that there is a merge conflict in: net/unix/af_unix.c between commit: 3c32da19 ("unix: Show number of pending scm files of receive queue in fdinfo") from the net-next tree and commit: b50b0580 ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram") from the ipsec-next tree. The conflict can be solved as done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chen Zhou authored
Fixes coccicheck warning: ./drivers/net/ethernet/amd/declance.c:611:14-15: WARNING comparing pointer to 0 Replace "skb == 0" with "!skb". Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Shi authored
After commit 079096f1 ("tcp/dccp: install syn_recv requests into ehash table") the macro isn't used anymore. remove it. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Shi authored
It's never used after introduced. So maybe better to remove. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Arvid Brodin <arvid.brodin@alten.se> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
drivers/net/wan/hdlc_x25.c: In function ‘x25_ioctl’: drivers/net/wan/hdlc_x25.c:256:7: warning: suggest parentheses around assignment used as truth value [-Wparentheses] 256 | if (ifr->ifr_settings.size = 0) { | ^~~ Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Huazhong Tan says: ==================== net: hns3: misc updates for -net-next This series includes some misc updates for the HNS3 ethernet driver. [patch 1] adds a limitation for the error log in the hns3_clean_tx_ring(). [patch 2] adds a check for pfmemalloc flag before reusing pages since these pages may be used some special case. [patch 3] assigns a default reset type 'HNAE3_NONE_RESET' to VF's reset_type after initializing or reset. [patch 4] unifies macro HCLGE_DFX_REG_TYPE_CNT's definition into header file. [patch 5] refines the parameter 'size' of snprintf() in the hns3_init_module(). [patch 6] rewrites a debug message in hclge_put_vector(). [patch 7~9] adds some cleanups related to coding style. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
This patch removes some unnecessary return value assignments, some duplicated printing in the caller, refines the judgment of 0 and uses le16_to_cpu to replace __le16_to_cpu. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
All kmalloc-based functions print enough information on failures. So this patch removes the log in hclge_get_dfx_reg() when returns ENOMEM. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guangbin Huang authored
This patch deletes some unnecessary blank lines and spaces to clean up code, and in hclgevf_set_vlan_filter() moves the comment to the front of hclgevf_send_mbx_msg(). Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yonglong Liu authored
When gets vector fails, hclge_put_vector() should print out the vector instead of vector_id in the log and return the wrong vector_id to its caller. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guojia Liao authored
The function snprintf() writes at most size bytes (including the terminating null byte ('\0') to str. Now, We can guarantee that the parameter of size is lager than the length of str to be formatting including its terminating null byte. So it's unnecessary to minus 1 for the input parameter 'size'. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guojia Liao authored
Macro HCLGE_GET_DFX_REG_TYPE_CNT in hclge_dbg_get_dfx_bd_num() and macro HCLGE_DFX_REG_BD_NUM in hclge_get_dfx_reg_bd_num() have the same meaning, so just defines HCLGE_GET_DFX_REG_TYPE_CNT in hclge_main.h. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huazhong Tan authored
reset_type means what kind of reset the driver is handling now, so after initializing or reset, the reset_type of VF should be set to HNAE3_NONE_RESET, otherwise, this unknown default value may be a little misleading when the device is running. Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yunsheng Lin authored
HNS3 driver allocates pages for DMA with dev_alloc_pages(), which calls alloc_pages_node() with the __GFP_MEMALLOC flag. So, in case of OOM condition, HNS3 can get pages with pfmemalloc flag set. So do not reuse the pages with pfmemalloc flag set because those pages are reserved for special cases, such as low memory case. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yunsheng Lin authored
The error log printed by netdev_err() in the hns3_clean_tx_ring() may spam the kernel log. This patch uses hns3_rl_err() to ratelimit the error log in the hns3_clean_tx_ring(). Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin Schiller authored
o call skb_reset_network_header() before hdlc->xmit() o change skb proto to HDLC (0x0019) before hdlc->xmit() o call dev_queue_xmit_nit() before hdlc->xmit() This changes make it possible to trace (tcpdump) outgoing layer2 (ETH_P_HDLC) packets Additionally call skb_reset_network_header() after each skb_push() / skb_pull(). Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin Schiller authored
This enables you to configure mode (DTE/DCE), Modulo, Window, T1, T2, N2 via sethdlc (which needs to be patched as well). Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hans Wippel authored
The current flags of the SMC_PNET_GET command only allow privileged users to retrieve entries from the pnet table via netlink. The content of the pnet table may be useful for all users though, e.g., for debugging smc connection problems. This patch removes the GENL_ADMIN_PERM flag so that unprivileged users can read the pnet table. Signed-off-by: Hans Wippel <ndev@hwipl.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Heiner Kallweit says: ==================== net: phy: add new version of phy_do_ioctl and convert suitable drivers We just added phy_do_ioctl, but it turned out that we need another version of this function that doesn't check whether net_device is running. So rename phy_do_ioctl to phy_do_ioctl_running and add a new version of phy_do_ioctl. Eventually convert suitable drivers to use phy_do_ioctl. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Convert suitable network drivers to use phy_do_ioctl. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Add a new version of phy_do_ioctl that doesn't check whether net_device is running. It will typically be used if suitable drivers attach the PHY in probe already. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
We just added phy_do_ioctl, but it turned out that we need another version of this function that doesn't check whether net_device is running. So rename phy_do_ioctl to phy_do_ioctl_running. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chen Zhou authored
snprintf returns the number of bytes that would be written, which may be greater than the the actual length to be written. Here use extra code to handle this. scnprintf returns the number of bytes that was actually written, just use scnprintf to simplify the code. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chen Zhou authored
The return value of snprintf may be greater than the size of HNS3_DBG_READ_LEN, use scnprintf instead in hns3_dbg_cmd_read. Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-