- 16 Sep, 2022 8 commits
-
-
Xin Liu authored
Legacy BPF map declarations are no longer supported in libbpf v1.0 [0]. Only BTF-defined maps are supported starting from v1.0, so it is time to remove the definition of bpf_map_def in bpf_helpers.h. [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0 Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20220913073643.19960-1-liuxin350@huawei.com
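For illustration, a minimal sketch of why a BTF-defined map needs no bpf_map_def: the attributes are encoded in the field types and can be read back from type information. The macros mirror the ones in libbpf's bpf_helpers.h but are redefined locally so the snippet builds as ordinary user-space C on Linux; the map name my_map, its key/value types, the EXAMPLE_ enum and the MAP_ATTR helper are assumptions made for this example.

```c
/* Local stand-ins mirroring libbpf's bpf_helpers.h macros (assumption:
 * redefined here so the snippet compiles without libbpf headers). */
#define SEC(name) __attribute__((section(name), used))
#define __uint(name, val) int (*name)[val]
#define __type(name, val) __typeof__(val) *name

/* Stand-in for the UAPI enum value BPF_MAP_TYPE_HASH. */
enum { EXAMPLE_BPF_MAP_TYPE_HASH = 1 };

/* BTF-defined map: every attribute lives in the *type* of a field, so a
 * loader can recover it from BTF. No struct bpf_map_def is involved. */
struct {
	__uint(type, EXAMPLE_BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, int);
	__type(value, long);
} my_map SEC(".maps");

/* Hypothetical helper: recover an integer attribute from the
 * pointer-to-array encoding used by __uint(). */
#define MAP_ATTR(map, field) (sizeof(*(map).field) / sizeof(int))
```

In a real BPF program the same struct would be compiled with clang for the BPF target and the macros would come from bpf_helpers.h.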
-
Andrii Nakryiko authored
Add a small tool, veristat, that allows mass-verification of a set of *libbpf-compatible* BPF ELF object files. For each such object file, veristat will attempt to verify each BPF program *individually*. Regardless of success or failure, it parses BPF verifier stats and outputs them in human-readable table format. In the future we can also add CSV and JSON output for more scriptable post-processing, if necessary.
veristat allows specifying the set of stats that should be output and the ordering between multiple objects and files (e.g., so that one can easily order by total instructions processed, instead of the default file name, prog name, verdict, total instructions order).
This tool should be useful for validating various BPF verifier changes or even validating different kernel versions for regressions.
Here's an example for some of the heaviest selftests/bpf BPF object files:
  $ sudo ./veristat -s insns,file,prog {pyperf,loop,test_verif_scale,strobemeta,test_cls_redirect,profiler}*.linked3.o
File                                 Program                              Verdict Duration, us Total insns Total states Peak states
------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ -----------
loop3.linked3.o                      while_true                           failure       350990     1000001         9663        9663
test_verif_scale3.linked3.o          balancer_ingress                     success       115244      845499         8636        2141
test_verif_scale2.linked3.o          balancer_ingress                     success        77688      773445         3048         788
pyperf600.linked3.o                  on_event                             success      2079872      624585        30335       30241
pyperf600_nounroll.linked3.o         on_event                             success       353972      568128        37101        2115
strobemeta.linked3.o                 on_event                             success       455230      557149        15915       13537
test_verif_scale1.linked3.o          balancer_ingress                     success        89880      554754         8636        2141
strobemeta_nounroll2.linked3.o       on_event                             success       433906      501725        17087        1912
loop6.linked3.o                      trace_virtqueue_add_sgs              success       282205      398057         8717         919
loop1.linked3.o                      nested_loops                         success       125630      361349         5504        5504
pyperf180.linked3.o                  on_event                             success      2511740      160398        11470       11446
pyperf100.linked3.o                  on_event                             success       744329       87681         6213        6191
test_cls_redirect.linked3.o          cls_redirect                         success        54087       78925         4782         903
strobemeta_subprogs.linked3.o        on_event                             success        57898       65420         1954         403
test_cls_redirect_subprogs.linked3.o cls_redirect                         success        54522       64965         4619         958
strobemeta_nounroll1.linked3.o       on_event                             success        43313       57240         1757         382
pyperf50.linked3.o                   on_event                             success       194355       46378         3263        3241
profiler2.linked3.o                  tracepoint__syscalls__sys_enter_kill success        23869       43372         1423         542
pyperf_subprogs.linked3.o            on_event                             success        29179       36358         2499        2499
profiler1.linked3.o                  tracepoint__syscalls__sys_enter_kill success        13052       27036         1946         936
profiler3.linked3.o                  tracepoint__syscalls__sys_enter_kill success        21023       26016         2186         915
profiler2.linked3.o                  kprobe__vfs_link                     success         5255       13896          303         271
profiler1.linked3.o                  kprobe__vfs_link                     success         7792       12687         1042        1041
profiler3.linked3.o                  kprobe__vfs_link                     success         7332       10601          865         865
profiler2.linked3.o                  kprobe_ret__do_filp_open             success         3417        8900          216         199
profiler2.linked3.o                  kprobe__vfs_symlink                  success         3548        8775          203         186
pyperf_global.linked3.o              on_event                             success        10007        7563          520         520
profiler3.linked3.o                  kprobe_ret__do_filp_open             success         4708        6464          532         532
profiler1.linked3.o                  kprobe_ret__do_filp_open             success         3090        6445          508         508
profiler3.linked3.o                  kprobe__vfs_symlink                  success         4477        6358          521         521
profiler1.linked3.o                  kprobe__vfs_symlink                  success         3381        6347          507         507
profiler2.linked3.o                  raw_tracepoint__sched_process_exec   success         2464        5874          292         189
profiler3.linked3.o                  raw_tracepoint__sched_process_exec   success         2677        4363          397         283
profiler2.linked3.o                  kprobe__proc_sys_write               success         1800        4355          143         138
profiler1.linked3.o                  raw_tracepoint__sched_process_exec   success         1649        4019          333         240
pyperf600_bpf_loop.linked3.o         on_event                             success         2711        3966          306         306
profiler2.linked3.o                  raw_tracepoint__sched_process_exit   success         1234        3138           83          66
profiler3.linked3.o                  kprobe__proc_sys_write               success         1755        2623          223         223
profiler1.linked3.o                  kprobe__proc_sys_write               success         1222        2456          193         193
loop2.linked3.o                      while_true                           success          608        1783           57          30
profiler3.linked3.o                  raw_tracepoint__sched_process_exit   success          789        1680          146         146
profiler1.linked3.o                  raw_tracepoint__sched_process_exit   success          592        1526          133         133
strobemeta_bpf_loop.linked3.o        on_event                             success         1015        1512          106         106
loop4.linked3.o                      combinations                         success          165         524           18          17
profiler3.linked3.o                  raw_tracepoint__sched_process_fork   success          196         299           25          25
profiler1.linked3.o                  raw_tracepoint__sched_process_fork   success          109         265           19          19
profiler2.linked3.o                  raw_tracepoint__sched_process_fork   success          111         265           19          19
loop5.linked3.o                      while_true                           success           47          84            9           9
------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ -----------
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-4-andrii@kernel.org
-
Andrii Nakryiko authored
Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF for freplace programs. It's wrong to search in vmlinux BTF and libbpf doesn't even mark vmlinux BTF as required for freplace programs. So trying to search anything in obj->vmlinux_btf might cause a NULL dereference if nothing else in the BPF object requires vmlinux BTF. Instead, error out if a freplace (EXT) program doesn't specify attach_prog_fd at load time. Fixes: 91abb4a6 ("libbpf: Support attachment of BPF tracing programs to kernel modules") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org
-
Andrii Nakryiko authored
Use proper SEC("tc") for test_verif_scale{1,3} programs. It's not a problem for selftests right now because we manually set the type programmatically, but not having correct SEC() definitions makes it harder to generically load BPF object files. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-2-andrii@kernel.org
-
Jiri Olsa authored
The dispatcher function is attached to/detached from the trampoline by the dispatcher update function. At the same time it's available as an ftrace-attachable function. After discussion [1] the proposed solution is to use compiler attributes to alter the bpf_dispatcher_##name##_func function:
- remove it from being instrumented with the __no_instrument_function__ attribute, so ftrace has no track of it
- but still generate 5 nop instructions with the patchable_function_entry(5) attribute, which are expected by bpf_arch_text_poke used by the dispatcher update function
Enabling the HAVE_DYNAMIC_FTRACE_NO_PATCHABLE option for x86, so __patchable_function_entries functions are not part of ftrace/mcount locations. Adding attributes to the bpf_dispatcher_XXX function on x86_64 so it's kept out of ftrace locations and has a 5 byte nop generated at entry. These attributes need to be arch specific as pointed out by Ilya Leoshkevich in [2]. The dispatcher image is generated only for the x86_64 arch, so the code can stay as is for other archs.
[1] https://lore.kernel.org/bpf/20220722110811.124515-1-jolsa@kernel.org/
[2] https://lore.kernel.org/bpf/969a14281a7791c334d476825863ee449964dd0c.camel@linux.ibm.com/
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20220903131154.420467-3-jolsa@kernel.org
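As a rough sketch of the two attributes named above, the following applies them to an ordinary user-space function. It assumes a compiler that knows patchable_function_entry (GCC 8 or later, Clang 10 or later); example_dispatcher_func is a hypothetical stand-in, not the kernel's dispatcher.

```c
/* Hypothetical stand-in for a bpf_dispatcher_##name##_func-style function.
 * no_instrument_function: keep -finstrument-functions/mcount-style hooks
 *                         away from this function.
 * patchable_function_entry(5): still reserve 5 NOPs at the entry point so
 *                         the entry can be patched at run time. */
__attribute__((no_instrument_function, patchable_function_entry(5)))
unsigned int example_dispatcher_func(unsigned int ctx)
{
	return ctx + 1;
}
```

Disassembling the object file (for example with objdump -d) should show the NOP padding at the start of the function on x86_64.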
-
Peter Zijlstra (Intel) authored
x86 will shortly start using -fpatchable-function-entry for purposes other than ftrace; make sure the __patchable_function_entry section isn't merged into the mcount_loc section. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220903131154.420467-2-jolsa@kernel.org
-
Yauheni Kaliuta authored
The full CAP_SYS_ADMIN requirement for blinding looks too strict nowadays. Given that unprivileged BPF is disabled by default these days, the main users of constant blinding come from unprivileged contexts, in particular via cBPF -> eBPF migration (e.g. old-style socket filters). Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831090655.156434-1-ykaliuta@redhat.com Link: https://lore.kernel.org/bpf/20220905090149.61221-1-ykaliuta@redhat.com
-
Wang Yufen authored
Use kvmemdup_bpfptr helper instead of open-coding to simplify the code. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/1663058433-14089-1-git-send-email-wangyufen@huawei.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
- 15 Sep, 2022 1 commit
-
-
Dave Marchevsky authored
BPF_PTR_POISON was added in commit c0a5a21c ("bpf: Allow storing referenced kptr in map") to denote a bpf_func_proto btf_id which the verifier will replace with a dynamically-determined btf_id at verification time. This patch adds verifier 'poison' functionality to BPF_PTR_POISON in order to prepare for expanded use of the value to poison ret- and arg-btf_id in ongoing work, namely rbtree and linked list patchsets [0, 1]. Specifically, when the verifier checks helper calls, it assumes that BPF_PTR_POISON'ed ret type will be replaced with a valid type before - or in lieu of - the default ret_btf_id logic. Similarly for arg btf_id. If poisoned btf_id reaches default handling block for either, consider this a verifier internal error and fail verification. Otherwise a helper w/ poisoned btf_id but no verifier logic replacing the type will cause a crash as the invalid pointer is dereferenced. Also move BPF_PTR_POISON to existing include/linux/poison.h header and remove unnecessary shift. [0]: lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com [1]: lore.kernel.org/bpf/20220904204145.3089-1-memxor@gmail.com Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220912154544.1398199-1-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
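The poison check described above can be sketched in user-space C as follows. This is only an illustration of the idea, not the verifier code: the poison value, the function name, the override parameter and the -EINVAL error code are all assumptions chosen for the example.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical poison value: a non-NULL address that must never be
 * dereferenced (BPF_PTR_POISON plays this role in the kernel). */
#define EXAMPLE_PTR_POISON ((void *)(uintptr_t)0xdeadbeefUL)

/* Resolve the btf_id for a helper's return type.
 * override: id picked by helper-specific logic, or 0 if there is none. */
static int resolve_ret_btf_id(const int *proto_ret_btf_id, int override,
			      int *out_id)
{
	if (override) {
		/* Helper-specific logic replaced the type first. */
		*out_id = override;
		return 0;
	}
	if (proto_ret_btf_id == EXAMPLE_PTR_POISON)
		return -EINVAL;	/* internal error: fail instead of crashing */

	*out_id = *proto_ret_btf_id;	/* default handling block */
	return 0;
}
```

The point of the check is ordering: the poison comparison has to happen before the first dereference in the default path.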
-
- 11 Sep, 2022 9 commits
-
-
Dave Marchevsky authored
Verifier logic to confirm that a callback function returns 0 or 1 was added in commit 69c087ba ("bpf: Add bpf_for_each_map_elem() helper"). At the time, callback return value was only used to continue or stop iteration. In order to support callbacks with a broader return value range, such as those added in rbtree series [0] and others, add a callback_ret_range to bpf_func_state. Verifier's helpers which set in_callback_fn will also set the new field, which the verifier will later use to check return value bounds. Default to tnum_range(0, 0) instead of using tnum_unknown as a sentinel value as the latter would prevent the valid range (0, U64_MAX) being used. Previous global default tnum_range(0, 1) is explicitly set for extant callback helpers. The change to global default was made after discussion around this patch in rbtree series [1], goal here is to make it more obvious that callback_ret_range should be explicitly set. [0]: lore.kernel.org/bpf/20220830172759.4069786-1-davemarchevsky@fb.com/ [1]: lore.kernel.org/bpf/20220830172759.4069786-2-davemarchevsky@fb.com/ Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220908230716.2751723-1-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
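A simplified model of the per-function return range may help. The sketch below replaces the verifier's tnum with a plain [min, max] pair, which is an assumption made to keep it short; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a tnum-based range. */
struct ret_range { uint64_t min, max; };

struct example_func_state {
	bool in_callback_fn;
	struct ret_range callback_ret_range;
};

/* New default: [0, 0]. Anything wider must be requested explicitly. */
static void init_func_state(struct example_func_state *st)
{
	st->in_callback_fn = false;
	st->callback_ret_range = (struct ret_range){ 0, 0 };
}

/* What an existing iteration-style callback helper would do: keep the
 * previous global default of [0, 1] by setting it explicitly. */
static void set_iter_callback_state(struct example_func_state *st)
{
	st->in_callback_fn = true;
	st->callback_ret_range = (struct ret_range){ 0, 1 };
}

/* Bounds check applied to a callback's return value. */
static bool callback_ret_ok(const struct example_func_state *st, uint64_t ret)
{
	return ret >= st->callback_ret_range.min &&
	       ret <= st->callback_ret_range.max;
}
```

With a [0, 0] default, a helper that forgets to set its range rejects any non-zero return, which makes the omission visible quickly.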
-
Lorenzo Bianconi authored
Properly check the connection tracking entry status configured by running the bpf_ct_change_status kfunc. Remove the unnecessary IPS_CONFIRMED status configuration since it is already done during entry allocation. Fixes: 6eb7fba0 ("selftests/bpf: Add tests for new nf_conntrack kfuncs") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/813a5161a71911378dfac8770ec890428e4998aa.1662623574.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Daniel Xu says:
====================
Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code.
Past discussion:
- v4: https://lore.kernel.org/bpf/cover.1661192455.git.dxu@dxuuu.xyz/
- v3: https://lore.kernel.org/bpf/cover.1660951028.git.dxu@dxuuu.xyz/
- v2: https://lore.kernel.org/bpf/CAP01T74Sgn354dXGiFWFryu4vg+o8b9s9La1d9zEbC4LGvH4qg@mail.gmail.com/T/
- v1: https://lore.kernel.org/bpf/cover.1660592020.git.dxu@dxuuu.xyz/
Changes since v4:
- Use exported function pointer + mutex to handle CONFIG_NF_CONNTRACK=m case
Changes since v3:
- Use a mutex to protect module load/unload critical section
Changes since v2:
- Remove use of NOT_INIT for btf_struct_access write path
- Disallow nf_conn writing when nf_conntrack module not loaded
- Support writing to nf_conn___init:mark
Changes since v1:
- Add unimplemented stub for when !CONFIG_BPF_SYSCALL
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
Add a simple extension to the existing selftest to write to nf_conn:mark. Also add a failure test for writing to an unsupported field. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/f78966b81b9349d2b8ebb4cee2caf15cb6b38ee2.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
Support direct writes to nf_conn:mark from TC and XDP prog types. This is useful when applications want to store per-connection metadata. This is also particularly useful for applications that run both bpf and iptables/nftables because the latter can trivially access this metadata. One example use case would be if a bpf prog is responsible for advanced packet classification and iptables/nftables is later used for routing due to pre-existing/legacy code. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/ebca06dea366e3e7e861c12f375a548cc4c61108.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
These symbols will be used in nf_conntrack.ko to support direct writes to `nf_conn`. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/3c98c19dc50d3b18ea5eca135b4fc3a5db036060.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
Returning a bpf_reg_type only makes sense in the context of a BPF_READ. For writes, prefer to explicitly return 0 for clarity. Note that this is a non-functional change, as it just so happened that NOT_INIT == 0. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/01772bc1455ae16600796ac78c6cc9fff34f95ff.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
Add corresponding unimplemented stub for when CONFIG_BPF_SYSCALL=n. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/4021398e884433b1fef57a4d28361bb9fcf1bd05.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Daniel Xu authored
Since commit 27ae7997 ("bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS") there has existed bpf_verifier_ops:btf_struct_access. When btf_struct_access is _unset_ for a prog type, the verifier runs the default implementation, which is to enforce read only:
	if (env->ops->btf_struct_access) {
		[...]
	} else {
		if (atype != BPF_READ) {
			verbose(env, "only read is supported\n");
			return -EACCES;
		}
		[...]
	}
When btf_struct_access is _set_, the expectation is that btf_struct_access has full control over accesses, including if writes are allowed. Rather than carve out an exception for each prog type that may write to BTF ptrs, delete the redundant check and give full control to btf_struct_access.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/962da2bff1238746589e332ff1aecc49403cd7ce.1662568410.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>
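The resulting dispatch can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: the ops struct is trimmed down to one callback, and the callback's signature and policy (writes allowed only at offset 8) are invented for the example.

```c
#include <errno.h>
#include <stddef.h>

enum example_access_type { EXAMPLE_READ, EXAMPLE_WRITE };

/* Hypothetical, trimmed-down verifier ops table. */
struct example_verifier_ops {
	int (*btf_struct_access)(enum example_access_type atype, int off);
};

/* Example callback: reads allowed anywhere, writes only at offset 8. */
static int example_struct_access(enum example_access_type atype, int off)
{
	if (atype == EXAMPLE_WRITE && off != 8)
		return -EACCES;
	return 0;
}

static int check_ptr_to_btf_access(const struct example_verifier_ops *ops,
				   enum example_access_type atype, int off)
{
	/* Callback set: it has full control, including over writes. */
	if (ops->btf_struct_access)
		return ops->btf_struct_access(atype, off);

	/* Callback unset: default implementation, read only. */
	if (atype != EXAMPLE_READ)
		return -EACCES;
	return 0;
}
```

Note that there is no extra write check in front of the callback; whether a write is acceptable is decided entirely inside it.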
-
- 10 Sep, 2022 2 commits
-
-
Punit Agrawal authored
In the percpu freelist code, it is a common pattern to iterate over the possible CPUs mask starting with the current CPU. The pattern is implemented using a hand rolled while loop with the loop variable increment being open-coded. Simplify the code by using for_each_cpu_wrap() helper to iterate over the possible cpus starting with the current CPU. As a result, some of the special-casing in the loop also gets simplified. No functional change intended. Signed-off-by: Punit Agrawal <punit.agrawal@bytedance.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20220907155746.1750329-1-punit.agrawal@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
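To illustrate the iteration order, here is a user-space analogue of a wrap-around loop. It assumes every index in [0, nr) is a possible CPU (the kernel helper walks a cpumask instead) and that nr is greater than zero; the macro and function names are hypothetical.

```c
/* Visit every index in [0, nr) exactly once, starting at `start` and
 * wrapping around at the end. `n` counts the visits made so far. */
#define example_for_each_wrap(i, n, nr, start)			\
	for ((n) = 0, (i) = (start) % (nr); (n) < (nr);		\
	     (n)++, (i) = ((i) + 1) % (nr))

/* Record the visiting order into out[]; returns the number of visits. */
static int visit_order(int nr, int start, int *out)
{
	int i, n, count = 0;

	example_for_each_wrap(i, n, nr, start)
		out[count++] = i;
	return count;
}
```

Starting at the current index and wrapping removes the special case for "reached the end, go back to the first entry" from the loop body.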
-
Tetsuo Handa authored
syzbot is reporting ODEBUG bug in htab_map_alloc() [1], for commit 86fe28f7 ("bpf: Optimize element count in non-preallocated hash map.") added percpu_counter_init() to htab_map_alloc() but forgot to add percpu_counter_destroy() to the error path. Link: https://syzkaller.appspot.com/bug?extid=5d1da78b375c3b5e6c2b [1] Reported-by: syzbot <syzbot+5d1da78b375c3b5e6c2b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 86fe28f7 ("bpf: Optimize element count in non-preallocated hash map.") Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/e2e4cc0e-9d36-4ca1-9bfa-ce23e6f8310b@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>
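The class of bug is an init call without a matching destroy on a later failure path. A generic goto-cleanup sketch, with a toy counter type and a fail_late flag that are assumptions for the example (this is not the htab code):

```c
#include <stdlib.h>

static int live_counters;	/* init/destroy balance, for the demo only */

struct example_counter { long *slots; };

static int example_counter_init(struct example_counter *c)
{
	c->slots = calloc(4, sizeof(*c->slots));
	if (!c->slots)
		return -1;
	live_counters++;
	return 0;
}

static void example_counter_destroy(struct example_counter *c)
{
	free(c->slots);
	c->slots = NULL;
	live_counters--;
}

/* `fail_late` simulates a failure after the counter was initialised. */
static int example_map_alloc(struct example_counter *c, int fail_late)
{
	int err = example_counter_init(c);

	if (err)
		return err;
	if (fail_late) {
		err = -1;
		goto free_counter;	/* the step a buggy error path skips */
	}
	return 0;

free_counter:
	example_counter_destroy(c);
	return err;
}
```

Every resource acquired before the failing step gets a label, and the labels unwind in reverse order of acquisition.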
-
- 09 Sep, 2022 5 commits
-
-
Martin KaFai Lau authored
YiFei Zhu says:
====================
Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This series adds the invocation for cgroup/connect{4,6} programs to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW). This also adds a test to verify that the hooks are being called and invoking bpf_bind() from within the hook actually binds the socket.
Patch 1 adds the invocation of the hook.
Patch 2 deduplicates write_sysctl in BPF test_progs.
Patch 3 adds the tests for this hook.
v1 -> v2:
* Added static to bindaddr_v6 in prog_tests/connect_ping.c
* Deduplicated much of the test logic in prog_tests/connect_ping.c
* Deduplicated write_sysctl() to test_progs.c
v2 -> v3:
* Renamed variable "obj" to "skel" for the BPF skeleton object in prog_tests/connect_ping.c
v3 -> v4:
* Fixed error path to destroy skel in prog_tests/connect_ping.c
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
YiFei Zhu authored
This tests that when an unprivileged ICMP ping socket connects, the hooks are actually invoked. We also ensure that if the hook does not call bpf_bind(), the bound address is unmodified, and if the hook calls bpf_bind(), the bound address is exactly what we provided to the helper. A new netns is used to enable ping_group_range in the test without affecting anything outside of the test, because by default, not even root is permitted to use unprivileged ICMP ping... Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/086b227c1b97f4e94193e58aae7576d0261b68a4.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
YiFei Zhu authored
This helper is needed in multiple tests. Instead of copying it over and over, better to deduplicate this helper to test_progs.c. test_progs.c is chosen over testing_helpers.c because of this helper's use of CHECK / ASSERT_*, and the CHECK was modified to use ASSERT_* so it does not rely on a duration variable. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/9b4fc9a27bd52f771b657b4c4090fc8d61f3a6b5.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
YiFei Zhu authored
Usually when a TCP/UDP connection is initiated, we can bind the socket to a specific IP attached to an interface in a cgroup/connect hook. But for pings, this is impossible, as the hook is not being called. This adds the hook invocation to unprivileged ICMP ping (i.e. ping sockets created with SOCK_DGRAM IPPROTO_ICMP(V6) as opposed to SOCK_RAW). Logic is mirrored from UDP sockets where the hook is invoked during pre_connect, after a check for sufficiently sized addr_len. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Link: https://lore.kernel.org/r/5764914c252fad4cd134fb6664c6ede95f409412.1662682323.git.zhuyifei@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Daniel Borkmann authored
This reverts commit 14e5ce79 ("libbpf: Add GCC support for bpf_tail_call_static"). The reason is that gcc invented its own BPF asm which does not conform to the LLVM one, and going forward this would be more painful to maintain here and in other areas of the library. Thus remove it; the ask to the gcc folks is to align with the LLVM one and use the exact same syntax. Fixes: 14e5ce79 ("libbpf: Add GCC support for bpf_tail_call_static") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: James Hilliard <james.hilliard1@gmail.com> Cc: Jose E. Marchesi <jose.marchesi@oracle.com>
-
- 07 Sep, 2022 15 commits
-
-
Kumar Kartikeya Dwivedi authored
For a lot of use cases in future patches, we will want to modify the state of registers that are part of the same 'group' (e.g. same ref_obj_id). It won't just be limited to releasing reference state, but setting a type flag dynamically based on certain actions, etc. Hence, we need a way to easily pass a callback to the function that iterates over all registers in current bpf_verifier_state in all frames up to (and including) the curframe. While in C++ we would be able to easily use a lambda to pass state and the callback together, sadly we aren't using C++ in the kernel. The next best thing to avoid defining a function for each case seems like statement expressions in GNU C. The kernel already uses them heavily, hence they can be passed to the macro in the style of a lambda. The statement expression will then be substituted in the for loop bodies. Variables __state and __reg are set to current bpf_func_state and reg for each invocation of the expression inside the passed in verifier state. Then, convert mark_ptr_or_null_regs, clear_all_pkt_pointers, release_reference, find_good_pkt_pointers, find_equal_scalars to use bpf_for_each_reg_in_vstate. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-16-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
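The lambda-style use of a GNU C statement expression can be sketched on a flat array. This is a toy version, assuming GCC or Clang with GNU extensions; it iterates a single array rather than frames of a verifier state, and example_for_each_reg, struct example_reg and release_group are hypothetical names.

```c
struct example_reg { int ref_obj_id; int released; };

/* Iterate over regs[0..n) and substitute the caller's statement
 * expression as the loop body; __reg names the current element. */
#define example_for_each_reg(regs, n, __reg, __expr)		\
	do {							\
		for (int __i = 0; __i < (n); __i++) {		\
			(__reg) = &(regs)[__i];			\
			(void)(__expr);				\
		}						\
	} while (0)

/* Mark every register sharing ref_obj_id as released; returns the count. */
static int release_group(struct example_reg *regs, int n, int ref_obj_id)
{
	struct example_reg *reg;
	int hits = 0;

	/* The ({ ... }) block sees the caller's locals (hits, ref_obj_id),
	 * which is what makes it behave like a capturing lambda. */
	example_for_each_reg(regs, n, reg, ({
		if (reg->ref_obj_id == ref_obj_id) {
			reg->released = 1;
			hits++;
		}
	}));
	return hits;
}
```

Because the expression is pasted into the loop body, no separate callback function or context struct is needed for each use.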
-
Kumar Kartikeya Dwivedi authored
We need this helper to skip over special fields (bpf_spin_lock, bpf_timer, kptrs) while zeroing a map value. Use the same logic as copy_map_value but memset instead of memcpy. Currently, the code zeroing map value memory does not have to deal with special fields, hence this is a prerequisite for introducing such support. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
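One way to picture "memset everything except the special fields" is the sketch below. It assumes the special fields are described by an offset-sorted, non-overlapping array; struct example_field and example_zero_map_value are hypothetical names, not the kernel helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one special field inside a map value. */
struct example_field { size_t off, size; };

/* Zero dst[0..value_size) but leave the special fields untouched.
 * Assumes fields[] is sorted by offset and does not overlap. */
static void example_zero_map_value(unsigned char *dst, size_t value_size,
				   const struct example_field *fields,
				   size_t nr_fields)
{
	size_t cur = 0;

	for (size_t i = 0; i < nr_fields; i++) {
		/* Zero the gap in front of this field, then skip over it. */
		memset(dst + cur, 0, fields[i].off - cur);
		cur = fields[i].off + fields[i].size;
	}
	/* Zero whatever follows the last special field. */
	memset(dst + cur, 0, value_size - cur);
}
```

Swapping memset for memcpy over the same gaps gives the matching copy routine, which is the symmetry the commit message refers to.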
-
Kumar Kartikeya Dwivedi authored
Enable support for kptrs in percpu BPF arraymap by wiring up the freeing of these kptrs from percpu map elements. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Kumar Kartikeya Dwivedi authored
bpf_long_memcpy is used while copying to remote percpu regions from BPF syscall and helpers, so that the copy is atomic at word size granularity. This might not be possible when you copy from a map value hosting kptrs from or to percpu maps, as the alignment or size of the disjoint regions may not be a multiple of the word size. Hence, to avoid complicating the copy loop, we only use bpf_long_memcpy when special fields are not present, otherwise use normal memcpy to copy the disjoint regions. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220904204145.3089-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jules Irenge authored
Sparse reported a warning at bpf_map_free_kptrs() "warning: Using plain integer as NULL pointer". During the process of fixing this warning, it was discovered that the current code erroneously writes to the pointer variable instead of dereferencing it and writing to the actual kptr. Hence, the Sparse tool accidentally helped to uncover this problem. Fix this by doing WRITE_ONCE(*p, 0) instead of WRITE_ONCE(p, 0). Note that the effect of this bug is that unreferenced kptrs will not be cleared during check_and_free_fields. It is not a problem if the clearing is not done during map_free stage, as there is nothing to free for them. Fixes: 14a324f6 ("bpf: Wire up freeing of referenced kptr") Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Link: https://lore.kernel.org/r/Yxi3pJaK6UDjVJSy@playground Signed-off-by: Alexei Starovoitov <ast@kernel.org>
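The difference between the two forms is easy to reproduce in user space. In this sketch EXAMPLE_WRITE_ONCE is a simplified stand-in for the kernel macro (an assumption), and the two function names are hypothetical.

```c
/* Simplified user-space stand-in for the kernel's WRITE_ONCE(). */
#define EXAMPLE_WRITE_ONCE(x, val) \
	(*(volatile __typeof__(x) *)&(x) = (val))

/* Buggy form: only the local pointer variable p is overwritten (with a
 * plain integer 0, hence the Sparse warning); the slot keeps its value. */
static void clear_kptr_buggy(unsigned long *p)
{
	EXAMPLE_WRITE_ONCE(p, 0);
}

/* Fixed form: the slot that p points to is cleared. */
static void clear_kptr_fixed(unsigned long *p)
{
	EXAMPLE_WRITE_ONCE(*p, 0);
}
```

Calling the buggy variant on a non-zero slot leaves it unchanged, which matches the described effect that the kptr is not cleared.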
-
Alexei Starovoitov authored
Benjamin Tissoires says: ==================== Hi, well, given that the HID changes haven't moved a lot in the past revisions and that I am cc-ing a bunch of people, I have dropped them while we focus on the last 2 requirements in bpf-core changes. I'll submit a HID targeted series when we get these in tree, which would make things a lot more independent. For reference, the whole reasons for these 2 main changes are at https://lore.kernel.org/bpf/20220902132938.2409206-1-benjamin.tissoires@redhat.com/ Compared to v10 (in addition to dropping the HID changes), I have changed the selftests so we can test both light skeletons and libbpf calls. Cheers, Benjamin ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
We add 2 new kfuncs that follow the RET_PTR_TO_MEM capability from the previous commit. Then we test them in selftests: the first tests cover valid cases and do not fail, and the later ones actually prevent the program from being loaded because they are wrong. To work around that, we mark the failing ones as not autoloaded (with SEC("?tc")), and we manually enable them one by one, ensuring the verifier rejects them. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-8-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
For drivers (outside of network), the incoming data is not statically defined in a struct. Most of the time the data buffer is kzalloc-ed and thus we cannot rely on eBPF and BTF to explore the data. This commit allows returning arbitrary memory previously allocated by the driver. An interesting extra point is that the kfunc can mark the exported memory region as read only or read/write. So, when a kfunc is not returning a pointer to a struct but to a plain type, we can consider it valid allocated memory assuming that: - one of the arguments is called either rdonly_buf_size or rdwr_buf_size - and this argument is a const from the caller's point of view. We can then use this parameter as the size of the allocated memory. The memory is either read-only or read-write based on the name of the size parameter. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-7-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
net/bpf/test_run.c already presents 20 kfuncs. net/netfilter/nf_conntrack_bpf.c also presents an extra 10 kfuncs. Given that all the kfuncs are regrouped into one unique set, having only 2 spaces left prevents us from adding more selftests. Bump it to 256. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-6-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
We need to also export the kfunc set to the syscall program type, and then add a couple of eBPF programs that are testing those calls. The first one checks for valid access, and the second one is OK from a static analysis point of view but fails at run time because we are trying to access outside of the allocated memory. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-5-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
When a function was trying to access data from context in a syscall eBPF program, the verifier was rejecting the call unless it was accessing the first element. This is because the syscall context is not known at compile time, and so we need to check this when actually accessing it. Check for the valid memory access if there is no convert_ctx callback, and allow such a situation to happen. Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220906151303.2780789-4-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
btf_check_subprog_arg_match() was used twice in verifier.c:
- when checking for the type mismatches between a (sub)prog declaration and BTF
- when checking the call of a subprog to see if the provided arguments are correct and valid
This is problematic when we check if the first argument of a program (pointer to ctx) is correctly accessed: To be able to ensure we access a valid memory in the ctx, the verifier assumes the pointer to context is not null. This has the side effect of marking the program accessing the entire context, even if the context is never dereferenced. For example, by checking the context access with the current code, the following eBPF program would fail with -EINVAL if the ctx is set to null from the userspace:
```
SEC("syscall")
int prog(struct my_ctx *args)
{
	return 0;
}
```
In that particular case, we do not want to actually check that the memory is correct while checking for the BTF validity, but we just want to ensure that the (sub)prog definition matches the BTF we have. So split btf_check_subprog_arg_match() in two so we can actually check for the memory used when in a call, and ignore that part when not. Note that a further patch is in preparation to disentangle btf_check_func_arg_match() from these two purposes, and so right now we just add a new hack around that by adding a boolean to this function.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220906151303.2780789-3-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Benjamin Tissoires authored
Similar to tools/testing/selftests/bpf/prog_tests/dynptr.c: we declare an array of tests that we run one by one in a for loop. Followup patches will add more similar-ish tests, so avoid a lot of copy paste by grouping the declaration in an array. For light skeletons, we have to rely on the offsetof() macro so we can statically declare which program we are using. In the libbpf case, we can rely on bpf_object__find_program_by_name(). So also change the Makefile to generate both light skeletons and normal ones. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220906151303.2780789-2-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Yonghong Song says:
====================
Currently struct arguments are not supported for trampoline based progs. One major reason is that a struct argument may be passed by value, which may use more than one register. This breaks trampoline progs where each argument is assumed to take one register. The bcc community reported the issue ([1]) where a struct argument is not supported for an fentry program.
  typedef struct { uid_t val; } kuid_t;
  typedef struct { gid_t val; } kgid_t;
  int security_path_chown(struct path *path, kuid_t uid, kgid_t gid);
Inside Meta, we also have a use case to attach to tcp_setsockopt()
  typedef struct {
          union {
                  void *kernel;
                  void __user *user;
          };
          bool is_kernel : 1;
  } sockptr_t;
  int tcp_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval, unsigned int optlen);
This patch adds struct value support for bpf tracing programs which use a trampoline. Only <= 16 byte struct size is supported for now, which covers the use cases above. For x86/arm64/bpf, a <= 16 byte struct value will be passed in registers instead of by reference. Only x86_64 is supported in this patch. arm64 support can be added later.
[1] https://github.com/iovisor/bcc/issues/3657
Changelog:
v3 -> v4:
- fix a test failure where no casting for bpf_get_func_arg() value as the value type is 'int'.
- add tracing_struct test in DENYLIST.s390x
- simplify macro BPF_REG_CNT for BPF_PROG2.
v2 -> v3:
- previously struct arguments (<= 16 bytes) are passed by reference for bpf programs. Suggested by Alexei, it is passed by value now.
- in order to support passing <= 16 struct value, a new macro BPF_PROG2 is invented.
rfc v1 -> v2:
- changed bpf_func_model struct info fields to arg_flags[] to make it easy to iterate arguments in arch specific {save|restore}_regs() functions.
- added fexit tests to test return values with struct arguments.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
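A quick user-space check of the size limit for the two struct types quoted above. This is a sketch, assuming a 64-bit target with 8-byte general-purpose registers; __user is defined empty here, uid_t is modelled as unsigned int, and EXAMPLE_MAX_STRUCT_ARG_SIZE and example_struct_arg_regs are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define __user	/* kernel address-space annotation; empty in user space */

typedef struct { unsigned int val; } kuid_t;	/* uid_t modelled as 32-bit */

typedef struct {
	union {
		void *kernel;
		void __user *user;
	};
	bool is_kernel : 1;
} sockptr_t;

/* Limit taken from the cover letter: only structs of <= 16 bytes. */
#define EXAMPLE_MAX_STRUCT_ARG_SIZE 16

/* Number of 8-byte registers a by-value struct of this size occupies
 * (assumption: 64-bit target such as x86_64). */
static inline size_t example_struct_arg_regs(size_t size)
{
	return (size + 7) / 8;
}
```

On x86_64, kuid_t fits in one register while sockptr_t needs two, which is why a trampoline that assumes one register per argument misreads everything after it.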
-
Yonghong Song authored
Add tracing_struct test in DENYLIST.s390x since s390x does not support trampoline now. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152723.2081551-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-