- 26 Oct, 2022 21 commits
-
-
Daniel Müller authored
When running tests, we should probably accept any help we can get when it comes to detecting issues early or making them more debuggable. We have seen a few cases where a test_progs_noalu32 run, for example, encountered a soft lockup and stopped making progress. It was only interrupted once we hit the overall test timeout [0]. We can not and do not want to necessarily rely on test timeouts, because those rely on infrastructure provided by the environment we run in (and which is not present in tools/testing/selftests/bpf/vmtest.sh, for example). To that end, let's enable panics on soft as well as hard lockups to fail fast should we encounter one. That's happening in the configuration indented to be used for selftests (including when using vmtest.sh or when running in BPF CI). [0] https://github.com/kernel-patches/bpf/runs/7844499997Signed-off-by: Daniel Müller <deso@posteo.net> Link: https://lore.kernel.org/r/20221025231546.811766-1-deso@posteo.netSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Yonghong Song says: ==================== There already exists a local storage implementation for cgroup-attached bpf programs. See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper bpf_get_local_storage(). But there are use cases such that non-cgroup attached bpf progs wants to access cgroup local storage data. For example, tc egress prog has access to sk and cgroup. It is possible to use sk local storage to emulate cgroup local storage by storing data in socket. But this is a waste as it could be lots of sockets belonging to a particular cgroup. Alternatively, a separate map can be created with cgroup id as the key. But this will introduce additional overhead to manipulate the new map. A cgroup local storage, similar to existing sk/inode/task storage, should help for this use case. This patch implemented new cgroup local storage available to non-cgroup-attached bpf programs. In the patch series, Patches 1 and 2 are preparation patches. Patch 3 implemented new cgroup local storage kernel support. Patches 4 and 5 implemented libbpf and bpftool support. Patches 6-8 fixed one existing test and added four new tests to validate kernel/libbpf implementations. Patch 9 added documentation for new BPF_MAP_TYPE_CGRP_STORAGE map type and comparison of the old and new cgroup local storage maps. Changelogs: v5 -> v6: . fix selftest test_libbpf_str/bpf_map_type_str due to marking BPF_MAP_TYPE_CGROUP_STORAGE as deprecated. . add cgrp_local_storage test in s390x denylist since the test has some fentry/fexit programs. v4 -> v5: . additional refactoring in patch 2 . fix the call site for bpf_cgrp_storage_free() in kernel/cgroup/cgroup.c. . add a test for progs attaching to cgroups . add a negative test (the helper key is a task instead of expected cgroup) . some spelling fixes v3 -> v4: . fix a config guarding problem in kernel/cgroup/cgroup.c when cgrp_storage is deleted (CONFIG_CGROUP_BPF => CONFIG_BPF_SYSCALL). . rename selftest from cgroup_local_storage.c to cgrp_local_storage.c so the name can better align with map name. . fix a few misspellings. v2 -> v3: . fix a config caused kernel test complaint. . better description/comments in uapi bpf.h and bpf_cgrp_storage.c. . factor code for better resue for map_alloc/map_free. . improved explanation in map documentation. v1 -> v2: . change map name from BPF_MAP_TYPE_CGROUP_LOCAL_STORAGE to BPF_MAP_TYPE_CGRP_STORAGE. . removed support of sleepable programs. . changed the place of freeing cgrp local storage from put_css_set_locked() to css_free_rwork_fn(). . added map documentation. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Add some descriptions and examples for BPF_MAP_TYPE_CGRP_STORAGE. Also illustate the major difference between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE and recommend to use BPF_MAP_TYPE_CGRP_STORAGE instead of BPF_MAP_TYPE_CGROUP_STORAGE in the end. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042922.676383-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Test cgrp_local_storage have some programs utilizing trampoline. Arch s390x does not support trampoline so add the test to the corresponding DENYLIST file. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042917.675685-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Add four tests for new cgroup local storage, (1) testing bpf program helpers and user space map APIs, (2) testing recursive fentry triggering won't deadlock, (3) testing progs attached to cgroups, and (4) a negative test if the bpf_cgrp_storage_get() helper key is not a cgroup btf id. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042911.675546-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Previous bpf patch made a change to uapi bpf.h like @@ -922,7 +922,14 @@ enum bpf_map_type { BPF_MAP_TYPE_SOCKHASH, - BPF_MAP_TYPE_CGROUP_STORAGE, + BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, + BPF_MAP_TYPE_CGROUP_STORAGE = BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, where BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED and BPF_MAP_TYPE_CGROUP_STORAGE have the same enum value. This will cause selftest test_libbpf_str/bpf_map_type_str failing. This patch fixed the issue by avoid the check for BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED in the test. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042906.674830-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Add support for new cgroup local storage Acked-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042901.674177-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Add support for new cgroup local storage. Acked-by: David Vernet <void@manifault.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042856.673989-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Similar to sk/inode/task storage, implement similar cgroup local storage. There already exists a local storage implementation for cgroup-attached bpf programs. See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper bpf_get_local_storage(). But there are use cases such that non-cgroup attached bpf progs wants to access cgroup local storage data. For example, tc egress prog has access to sk and cgroup. It is possible to use sk local storage to emulate cgroup local storage by storing data in socket. But this is a waste as it could be lots of sockets belonging to a particular cgroup. Alternatively, a separate map can be created with cgroup id as the key. But this will introduce additional overhead to manipulate the new map. A cgroup local storage, similar to existing sk/inode/task storage, should help for this use case. The life-cycle of storage is managed with the life-cycle of the cgroup struct. i.e. the storage is destroyed along with the owning cgroup with a call to bpf_cgrp_storage_free() when cgroup itself is deleted. The userspace map operations can be done by using a cgroup fd as a key passed to the lookup, update and delete operations. Typically, the following code is used to get the current cgroup: struct task_struct *task = bpf_get_current_task_btf(); ... task->cgroups->dfl_cgrp ... and in structure task_struct definition: struct task_struct { .... struct css_set __rcu *cgroups; .... } With sleepable program, accessing task->cgroups is not protected by rcu_read_lock. So the current implementation only supports non-sleepable program and supporting sleepable program will be the next step together with adding rcu_read_lock protection for rcu tagged structures. Since map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for old cgroup local storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE is used for cgroup storage available to non-cgroup-attached bpf programs. The old cgroup storage supports bpf_get_local_storage() helper to get the cgroup data. The new cgroup storage helper bpf_cgrp_storage_get() can provide similar functionality. While old cgroup storage pre-allocates storage memory, the new mechanism can also pre-allocate with a user space bpf_map_update_elem() call to avoid potential run-time memory allocation failure. Therefore, the new cgroup storage can provide all functionality w.r.t. the old one. So in uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is alias to BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate the old cgroup storage can be deprecated since the new one can provide the same functionality. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Refactor codes so that inode/task/sk storage implementation can maximally share the same code. I also added some comments in new function bpf_local_storage_unlink_nolock() to make codes easy to understand. There is no functionality change. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042845.672944-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Make struct cgroup btf id global so later patch can reuse the same btf id. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042840.672602-1-yhs@fb.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Martin KaFai Lau says: ==================== From: Martin KaFai Lau <martin.lau@kernel.org> The commit bc235cdb ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]") added deadlock detection to avoid a tracing program from recurring on the bpf_task_storage_{get,delete}() helpers. These helpers acquire a spin lock and it will lead to deadlock. It is unnecessary for the bpf_lsm and bpf_iter programs which do not recur. The situation is the same as the existing bpf_pid_task_storage_{lookup,delete}_elem() which are used in the syscall and they also do not have deadlock detection. This set is to add new bpf_task_storage_{get,delete}() helper proto without the deadlock detection. The set also removes the prog->active check from the bpf_lsm and bpf_iter program. Please see the individual patch for details. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch modifies the task_ls_recursion test to check that the first bpf_task_storage_get(&map_a, ...) in BPF_PROG(on_update) can still do the lockless lookup even it cannot acquire the percpu busy lock. If the lookup succeeds, it will increment the value by 1 and the value in the task storage map_a will become 200+1=201. After that, BPF_PROG(on_update) tries to delete from map_a and should get -EBUSY because it cannot acquire the percpu busy lock after finding the data. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-10-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch adds a test to check for deadlock failure in bpf_task_storage_{get,delete} when called by a sleepable bpf_lsm prog. It also checks if the prog_info.recursion_misses is non zero. The test starts with 32 threads and they are affinitized to one cpu. In my qemu setup, with CONFIG_PREEMPT=y, I can reproduce it within one second if it is run without the previous patches of this set. Here is the test error message before adding the no deadlock detection version of the bpf_task_storage_{get,delete}: test_nodeadlock:FAIL:bpf_task_storage_get busy unexpected bpf_task_storage_get busy: actual 2 != expected 0 test_nodeadlock:FAIL:bpf_task_storage_delete busy unexpected bpf_task_storage_delete busy: actual 2 != expected 0 Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-9-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
The bpf_lsm and bpf_iter do not recur that will cause a deadlock. The situation is similar to the bpf_pid_task_storage_delete_elem() which is called from the syscall map_delete_elem. It does not need deadlock detection. Otherwise, it will cause unnecessary failure when calling the bpf_task_storage_delete() helper. This patch adds bpf_task_storage_delete proto that does not do deadlock detection. It will be used by bpf_lsm and bpf_iter program. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-8-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
Similar to the earlier change in bpf_task_storage_get_recur. This patch changes bpf_task_storage_delete_recur such that it does the lookup first. It only returns -EBUSY if it needs to take the spinlock to do the deletion when potential deadlock is detected. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-7-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
The bpf_lsm and bpf_iter do not recur that will cause a deadlock. The situation is similar to the bpf_pid_task_storage_lookup_elem() which is called from the syscall map_lookup_elem. It does not need deadlock detection. Otherwise, it will cause unnecessary failure when calling the bpf_task_storage_get() helper. This patch adds bpf_task_storage_get proto that does not do deadlock detection. It will be used by bpf_lsm and bpf_iter programs. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-6-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
bpf_task_storage_get() does a lookup and optionally inserts new data if BPF_LOCAL_STORAGE_GET_F_CREATE is present. During lookup, it will cache the lookup result and caching requires to acquire a spinlock. When potential deadlock is detected (by the bpf_task_storage_busy pcpu-counter added in commit bc235cdb ("bpf: Prevent deadlock from recursive bpf_task_storage_[get|delete]")), the current behavior is returning NULL immediately to avoid deadlock. It is too pessimistic. This patch will go ahead to do a lookup (which is a lockless operation) but it will avoid caching it in order to avoid acquiring the spinlock. When lookup fails to find the data and BPF_LOCAL_STORAGE_GET_F_CREATE is set, an insertion is needed and this requires acquiring a spinlock. This patch will still return NULL when a potential deadlock is detected. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-5-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch creates a new function __bpf_task_storage_get() and moves the core logic of the existing bpf_task_storage_get() into this new function. This new function will be shared by another new helper proto in the latter patch. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-4-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
This patch adds the "_recur" naming to the bpf_task_storage_{get,delete} proto. In a latter patch, they will only be used by the tracing programs that requires a deadlock detection because a tracing prog may use bpf_task_storage_{get,delete} recursively and cause a deadlock. Another following patch will add a different helper proto for the non tracing programs because they do not need the deadlock prevention. This patch does this rename to prepare for this future proto additions. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-3-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Martin KaFai Lau authored
The commit 64696c40 ("bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampoline") removed prog->active check for struct_ops prog. The bpf_lsm and bpf_iter is also using trampoline. Like struct_ops, the bpf_lsm and bpf_iter have fixed hooks for the prog to attach. The kernel does not call the same hook in a recursive way. This patch also removes the prog->active check for bpf_lsm and bpf_iter. A later patch has a test to reproduce the recursion issue for a sleepable bpf_lsm program. This patch appends the '_recur' naming to the existing enter and exit functions that track the prog->active counter. New __bpf_prog_{enter,exit}[_sleepable] function are added to skip the prog->active tracking. The '_struct_ops' version is also removed. It also moves the decision on picking the enter and exit function to the new bpf_trampoline_{enter,exit}(). It returns the '_recur' ones for all tracing progs to use. For bpf_lsm, bpf_iter, struct_ops (no prog->active tracking after 64696c40), and bpf_lsm_cgroup (no prog->active tracking after 69fd337a), it will return the functions that don't track the prog->active. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20221025184524.3526117-2-martin.lau@linux.devSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 25 Oct, 2022 19 commits
-
-
Alan Maguire authored
When examining module BTF, it is common to see core kernel structures such as sk_buff, net_device duplicated in the module. After adding debug messaging to BTF it turned out that much of the problem was down to the identical struct test failing during deduplication; sometimes the compiler adds identical structs. However it turns out sometimes that type ids of identical struct members can also differ, even when the containing structs are still identical. To take an example, for struct sk_buff, debug messaging revealed that the identical struct matching was failing for the anon struct "headers"; specifically for the first field: __u8 __pkt_type_offset[0]; /* 128 0 */ Looking at the code in BTF deduplication, we have code that guards against the possibility of identical struct definitions, down to type ids, and identical array definitions. However in this case we have a struct which is being defined twice but does not have identical type ids since each duplicate struct has separate type ids for the above array member. A similar problem (though not observed) could occur for struct-in-struct. The solution is to make the "identical struct" test check members not just for matching ids, but to also check if they in turn are identical structs or arrays. The results of doing this are quite dramatic (for some modules at least); I see the number of type ids drop from around 10000 to just over 1000 in one module for example. For testing use latest pahole or apply [1], otherwise dedups can fail for the reasons described there. Also fix return type of btf_dedup_identical_arrays() as suggested by Andrii to match boolean return type used elsewhere. Fixes: efdd3eb8 ("libbpf: Accommodate DWARF/compiler bug with duplicated structs") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1666622309-22289-1-git-send-email-alan.maguire@oracle.com [1] https://lore.kernel.org/bpf/1666364523-9648-1-git-send-email-alan.maguire
-
Alexei Starovoitov authored
Jiri Olsa says: ==================== hi, Martynas reported kprobe _multi link does not resolve symbols from kernel modules, which attach by address works. In addition while fixing that I realized we do not take module reference if the module has kprobe_multi link on top of it and can be removed. There's mo crash related to this, it will silently disappear from ftrace tables, while kprobe_multi link stays up with no data. This patchset has fixes for both issues. v3 changes: - reorder fields in struct bpf_kprobe_multi_link [Andrii] - added ack [Andrii] v2 changes: - added acks (Song) - added comment to kallsyms_callback (Song) - change module_callback realloc logic (Andrii) - get rid of macros in tests (Andrii) thanks, jirka ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding kprobe_multi kmod attach api tests that attach bpf_testmod functions via bpf_program__attach_kprobe_multi_opts. Running it as serial test, because we don't want other tests to reload bpf_testmod while it's running. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-9-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding test that makes sure the kernel module won't be removed if there's kprobe multi link defined on top of it. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-8-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding 3 bpf_testmod_fentry_* functions to have a way to test kprobe multi link on kernel module. They follow bpf_fentry_test* functions prototypes/code. Adding equivalent functions to all bpf_fentry_test* does not seems necessary at the moment, could be added later. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-7-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Adding load_kallsyms_refresh function to re-read symbols from /proc/kallsyms file. This will be needed to get proper functions addresses from bpf_testmod.ko module, which is loaded/unloaded several times during the tests run, so symbols might be already old when we need to use them. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-6-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Currently we allow to create kprobe multi link on function from kernel module, but we don't take the module reference to ensure it's not unloaded while we are tracing it. The multi kprobe link is based on fprobe/ftrace layer which takes different approach and releases ftrace hooks when module is unloaded even if there's tracer registered on top of it. Adding code that gathers all the related modules for the link and takes their references before it's attached. All kernel module references are released after link is unregistered. Note that we do it the same way already for trampoline probes (but for single address). Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-5-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Renaming __bpf_kprobe_multi_cookie_cmp to bpf_kprobe_multi_addrs_cmp, because it's more suitable to current and upcoming code. Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-4-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Currently ftrace_lookup_symbols iterates only over core symbols, adding module_kallsyms_on_each_symbol call to check on modules symbols as well. Also removing 'args.found == args.cnt' condition, because it's already checked in kallsyms_callback function. Also removing 'err < 0' check, because both *kallsyms_on_each_symbol functions do not return error. Reported-by: Martynas Pumputis <m@lambda.lt> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-3-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiri Olsa authored
Making module_kallsyms_on_each_symbol generally available, so it can be used outside CONFIG_LIVEPATCH option in following changes. Rather than adding another ifdef option let's make the function generally available (when CONFIG_KALLSYMS and CONFIG_MODULES options are defined). Cc: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20221025134148.3300700-2-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Quentin Monnet says: ==================== To disassemble instructions for JIT-ed programs, bpftool has relied on the libbfd library. This has been problematic in the past: libbfd's interface is not meant to be stable and has changed several times, hence the detection of the two related features from the Makefile (disassembler-four-args and disassembler-init-styled). When it comes to shipping bpftool, this has also caused issues with several distribution maintainers unwilling to support the feature (for example, Debian's page for binutils-dev, libbfd's package, says: "Note that building Debian packages which depend on the shared libbfd is Not Allowed."). This patchset adds support for LLVM as the primary library for disassembling instructions for JIT-ed programs. We keep libbfd as a fallback. One reason for this is that currently it works well, we have all we need in terms of features detection in the Makefile, so it provides a fallback for disassembling JIT-ed programs if libbfd is installed but LLVM is not. The other reason is that libbfd supports nfp instruction for Netronome's SmartNICs and can be used to disassemble offloaded programs, something that LLVM cannot do (Niklas confirmed that the feature is still in use). However, if libbfd's interface breaks again in the future, we might reconsider keeping support for it. v4: - Rebase to address a conflict with commit 2c76238e ("bpftool: Add "bootstrap" feature to version output"). v3: - Extend commit description (patch 6) with notes on llvm-dev and LLVM's disassembler stability. v2: - Pass callback when creating the LLVM disassembler, so that the branch targets are printed as addresses (instead of byte offsets). - Add last commit to "support" other arch with LLVM, although we don't know any supported triple yet. - Use $(LLVM_CONFIG) instead of llvm-config in Makefile. - Pass components to llvm-config --libs to limit the number of libraries to pass on the command line, in Makefile. - Rebase split of FEATURE_TESTS and FEATURE_DISPLAY in Makefile. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
Similarly to "libbfd", add a "llvm" feature to the output of command "bpftool version" to indicate that LLVM is used for disassembling JIT-ed programs. This feature is mutually exclusive (from Makefile definitions) with "libbfd". Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221025150329.97371-9-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
For offloaded BPF programs, instead of failing to create the LLVM disassembler without even looking for a triple at all, do run the function that attempts to retrieve a valid architecture name for the device. It will still fail for the LLVM disassembler, because currently we have no valid triple to return (NFP disassembly is not supported by LLVM). But failing in that function is more logical than to assume in jit_disasm.c that passing an "arch" name is simply not supported. Suggested-by: Song Liu <song@kernel.org> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/r/20221025150329.97371-8-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
To disassemble instructions for JIT-ed programs, bpftool has relied on the libbfd library. This has been problematic in the past: libbfd's interface is not meant to be stable and has changed several times. For building bpftool, we have to detect how the libbfd version on the system behaves, which is why we have to handle features disassembler-four-args and disassembler-init-styled in the Makefile. When it comes to shipping bpftool, this has also caused issues with several distribution maintainers unwilling to support the feature (see for example Debian's page for binutils-dev, which ships libbfd: "Note that building Debian packages which depend on the shared libbfd is Not Allowed." [0]). For these reasons, we add support for LLVM as an alternative to libbfd for disassembling instructions of JIT-ed programs. Thanks to the preparation work in the previous commits, it's easy to add the library by passing the relevant compilation options in the Makefile, and by adding the functions for setting up the LLVM disassembler in file jit_disasm.c. The LLVM disassembler requires the LLVM development package (usually llvm-dev or llvm-devel). The expectation is that the interface for this disassembler will be more stable. There is a note in LLVM's Developer Policy [1] stating that the stability for the C API is "best effort" and not guaranteed, but at least there is some effort to keep compatibility when possible (which hasn't really been the case for libbfd so far). Furthermore, the Debian page for the related LLVM package does not caution against linking to the lib, as binutils-dev page does. Naturally, the display of disassembled instructions comes with a few minor differences. Here is a sample output with libbfd (already supported before this patch): # bpftool prog dump jited id 56 bpf_prog_6deef7357e7b4530: 0: nopl 0x0(%rax,%rax,1) 5: xchg %ax,%ax 7: push %rbp 8: mov %rsp,%rbp b: push %rbx c: push %r13 e: push %r14 10: mov %rdi,%rbx 13: movzwq 0xb4(%rbx),%r13 1b: xor %r14d,%r14d 1e: or $0x2,%r14d 22: mov $0x1,%eax 27: cmp $0x2,%r14 2b: jne 0x000000000000002f 2d: xor %eax,%eax 2f: pop %r14 31: pop %r13 33: pop %rbx 34: leave 35: ret LLVM supports several variants that we could set when initialising the disassembler, for example with: LLVMSetDisasmOptions(*ctx, LLVMDisassembler_Option_AsmPrinterVariant); but the default printer is used for now. Here is the output with LLVM: # bpftool prog dump jited id 56 bpf_prog_6deef7357e7b4530: 0: nopl (%rax,%rax) 5: nop 7: pushq %rbp 8: movq %rsp, %rbp b: pushq %rbx c: pushq %r13 e: pushq %r14 10: movq %rdi, %rbx 13: movzwq 180(%rbx), %r13 1b: xorl %r14d, %r14d 1e: orl $2, %r14d 22: movl $1, %eax 27: cmpq $2, %r14 2b: jne 0x2f 2d: xorl %eax, %eax 2f: popq %r14 31: popq %r13 33: popq %rbx 34: leave 35: retq The LLVM disassembler comes as the default choice, with libbfd as a fall-back. Of course, we could replace libbfd entirely and avoid supporting two different libraries. One reason for keeping libbfd is that, right now, it works well, we have all we need in terms of features detection in the Makefile, so it provides a fallback for disassembling JIT-ed programs if libbfd is installed but LLVM is not. The other motivation is that libbfd supports nfp instruction for Netronome's SmartNICs and can be used to disassemble offloaded programs, something that LLVM cannot do. If libbfd's interface breaks again in the future, we might reconsider keeping support for it. [0] https://packages.debian.org/buster/binutils-dev [1] https://llvm.org/docs/DeveloperPolicy.html#c-api-changesSigned-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221025150329.97371-7-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
Refactor disasm_print_insn() to extract the code specific to libbfd and move it to dedicated functions. There is no functional change. This is in preparation for supporting an alternative library for disassembling the instructions. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20221025150329.97371-6-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
Bpftool uses libbfd for disassembling JIT-ed programs. But the feature is optional, and the tool can be compiled without libbfd support. The Makefile sets the relevant variables accordingly. It also sets variables related to libbfd's interface, given that it has changed over time. Group all those libbfd-related definitions so that it's easier to understand what we are testing for, and only use variables related to libbfd's interface if we need libbfd in the first place. In addition to make the Makefile clearer, grouping the definitions related to disassembling JIT-ed programs will help support alternatives to libbfd. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20221025150329.97371-5-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
Make FEATURE_TESTS and FEATURE_DISPLAY easier to read and less likely to be subject to conflicts on updates by having one feature per line. Suggested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20221025150329.97371-4-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
The JIT disassembler in bpftool is the only components (with the JSON writer) using asserts to check the return values of functions. But it does not do so in a consistent way, and diasm_print_insn() returns no value, although sometimes the operation failed. Remove the asserts, and instead check the return values, print messages on errors, and propagate the error to the caller from prog.c. Remove the inclusion of assert.h from jit_disasm.c, and also from map.c where it is unused. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20221025150329.97371-3-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-
Quentin Monnet authored
_GNU_SOURCE is defined in several source files for bpftool, but only one of them takes the precaution of checking whether the value is already defined. Add #ifndef for other occurrences too. This is in preparation for the support of disassembling JIT-ed programs with LLVM, with $(llvm-config --cflags) passing -D_GNU_SOURCE as a compilation argument. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20221025150329.97371-2-quentin@isovalent.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>
-