Commits · e7cf9a48f8d634a3c65a28715158b6f6c0540e71 · Kirill Smelkov / linux

21 Aug, 2023 39 commits

selftests/bpf: Add uprobe_multi cookie test · e7cf9a48

Jiri Olsa authored Aug 09, 2023

Adding test for cookies setup/retrieval in uprobe_link uprobes
and making sure bpf_get_attach_cookie works properly.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-27-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi usdt bench test · 85209e83

Jiri Olsa authored Aug 09, 2023

Adding test that attaches 50k usdt probes in usdt_multi binary.

After the attach is done we run the binary and make sure we get
the expected number of hits.

With current uprobes:

  # perf stat --null ./test_progs -n 254/6
  #254/6   uprobe_multi_test/bench_usdt:OK
  #254     uprobe_multi_test:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

   Performance counter stats for './test_progs -n 254/6':

      1353.659680562 seconds time elapsed

With uprobe_multi link:

  # perf stat --null ./test_progs -n 254/6
  #254/6   uprobe_multi_test/bench_usdt:OK
  #254     uprobe_multi_test:OK
  Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

   Performance counter stats for './test_progs -n 254/6':

         0.322046364 seconds time elapsed
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-26-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi usdt test code · 4cde2d8a

Jiri Olsa authored Aug 09, 2023

Adding code in uprobe_multi test binary that defines 50k usdts
and will serve as attach point for uprobe_multi usdt bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-25-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi bench test · 3706919e

Jiri Olsa authored Aug 09, 2023

Adding test that attaches 50k uprobes in uprobe_multi binary.

After the attach is done we run the binary and make sure we
get the expected number of hits.

The resulting attach/detach times on my setup:

  test_bench_attach_uprobe:PASS:uprobe_multi__open 0 nsec
  test_bench_attach_uprobe:PASS:uprobe_multi__attach 0 nsec
  test_bench_attach_uprobe:PASS:uprobes_count 0 nsec
  test_bench_attach_uprobe: attached in   0.346s
  test_bench_attach_uprobe: detached in   0.419s
  #262/5   uprobe_multi_test/bench_uprobe:OK
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-24-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi test program · 519dfeaf

Jiri Olsa authored Aug 09, 2023

Adding uprobe_multi test program that defines 50k uprobe_multi_func_*
functions and will serve as attach point for uprobe_multi bench test
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-23-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi link test · a93d22ea

Jiri Olsa authored Aug 09, 2023

Adding uprobe_multi test for bpf_link_create attach function.

Testing attachment using the struct bpf_link_create_opts.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-22-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi api test · ffc68903

Jiri Olsa authored Aug 09, 2023

Adding uprobe_multi test for bpf_program__attach_uprobe_multi
attach function.

Testing attachment using glob patterns and via bpf_uprobe_multi_opts
paths/syms fields.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-21-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Add uprobe_multi skel test · 75b37157

Jiri Olsa authored Aug 09, 2023

Adding uprobe_multi test for skeleton load/attach functions,
to test skeleton auto attach for uprobe_multi link.

Test that bpf_get_func_ip works properly for uprobe_multi
attachment.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-20-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

selftests/bpf: Move get_time_ns to testing_helpers.h · 3830d04a

Jiri Olsa authored Aug 09, 2023

We'd like to have a single copy of get_time_ns used by bench and test_progs,
but we can't just include bench.h, because of conflicting 'struct env'
objects.

Moving get_time_ns to testing_helpers.h which is being included by both
bench and test_progs objects.
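
As an illustration only (not part of the patch), a shared helper of this kind can be sketched as below. The body assumes a CLOCK_MONOTONIC based implementation; the exact code in testing_helpers.h may differ.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() under strict -std= modes */
#endif

#include <stdint.h>
#include <time.h>

/* Sketch (assumed implementation): monotonic timestamp in nanoseconds. */
static inline uint64_t get_time_ns(void)
{
	struct timespec t;

	clock_gettime(CLOCK_MONOTONIC, &t);
	return (uint64_t)t.tv_sec * 1000000000ULL + (uint64_t)t.tv_nsec;
}
```

Taking one timestamp before and one after an attach call and subtracting them gives the elapsed time in nanoseconds.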
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-19-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add uprobe multi link support to bpf_program__attach_usdt · 5902da6d

Jiri Olsa authored Aug 09, 2023

Adding support for usdt_manager_attach_usdt to use uprobe_multi
link to attach to usdt probes.

The uprobe_multi support is detected before the usdt program is
loaded and its expected_attach_type is set accordingly.

If uprobe_multi support is detected the usdt_manager_attach_usdt
gathers uprobes info and calls bpf_program__attach_uprobe_multi to
create all needed uprobes.

If uprobe_multi support is not detected the old behaviour stays.

Also adding usdt.s program section for sleepable usdt probes.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add uprobe multi link detection · 7e1b4681

Jiri Olsa authored Aug 09, 2023

Adding uprobe-multi link detection. It will be used later in
bpf_program__attach_usdt function to check and use uprobe_multi
link over standard uprobe links.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add support for u[ret]probe.multi[.s] program sections · 5bfdd32d

Jiri Olsa authored Aug 09, 2023

Adding support for several uprobe_multi program sections
to allow auto attach of multi_uprobe programs.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add bpf_program__attach_uprobe_multi function · 3140cf12

Jiri Olsa authored Aug 09, 2023

Adding bpf_program__attach_uprobe_multi function that
allows to attach multiple uprobes with uprobe_multi link.

The user can specify uprobes with direct arguments:

  binary_path/func_pattern/pid

or with struct bpf_uprobe_multi_opts opts argument fields:

  const char **syms;
  const unsigned long *offsets;
  const unsigned long *ref_ctr_offsets;
  const __u64 *cookies;

User can specify 2 mutually exclusive sets of inputs:

 1) use only path/func_pattern/pid arguments

 2) use path/pid with allowed combinations of:
    syms/offsets/ref_ctr_offsets/cookies/cnt

    - syms and offsets are mutually exclusive
    - ref_ctr_offsets and cookies are optional

Any other usage results in error.
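
The rules above can be sketched as a small validation routine. This is an illustration only: struct multi_opts_sketch and check_multi_inputs are hypothetical names that mirror the fields listed above, not the libbpf definitions, and the real function may apply further checks.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the option fields listed above (not the libbpf struct). */
struct multi_opts_sketch {
	const char **syms;
	const unsigned long *offsets;
	const unsigned long *ref_ctr_offsets;
	const uint64_t *cookies;
	size_t cnt;
};

/* Returns 0 if the input combination is allowed, -EINVAL otherwise. */
static int check_multi_inputs(const char *func_pattern,
			      const struct multi_opts_sketch *o)
{
	int has_arrays = o && (o->syms || o->offsets || o->ref_ctr_offsets ||
			       o->cookies || o->cnt);

	/* Set 1: a function pattern excludes every array-based field. */
	if (func_pattern)
		return has_arrays ? -EINVAL : 0;

	/* Set 2: arrays need a non-zero count ... */
	if (!o || !o->cnt)
		return -EINVAL;

	/* ... and exactly one of syms/offsets; the other arrays are optional. */
	if (!!o->syms == !!o->offsets)
		return -EINVAL;

	return 0;
}
```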
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add bpf_link_create support for multi uprobes · 5054a303

Jiri Olsa authored Aug 09, 2023

Adding new uprobe_multi struct to bpf_link_create_opts object
to pass multiple uprobe data to link_create attr uapi.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add elf_resolve_pattern_offsets function · e613d1d0

Jiri Olsa authored Aug 09, 2023

Adding elf_resolve_pattern_offsets function that looks up
offsets for symbols specified by pattern argument.

The 'pattern' argument allows wildcards ('*' and '?' are supported).

Offsets are returned in an allocated array together with its
size, and the array needs to be released by the caller.
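
To make the wildcard semantics concrete, here is a minimal matcher for '*' and '?'. It is a sketch under the assumption that '*' matches any run of characters (including none) and '?' matches exactly one; glob_match_sketch is a hypothetical name and this is not the matcher libbpf uses.

```c
#include <stdbool.h>

/* Minimal glob matcher: '*' matches any run of characters, '?' exactly one. */
static bool glob_match_sketch(const char *str, const char *pat)
{
	while (*pat) {
		if (*pat == '*') {
			/* Collapse consecutive '*' into one. */
			while (*pat == '*')
				pat++;
			/* A trailing '*' matches whatever is left. */
			if (!*pat)
				return true;
			/* Try the rest of the pattern at every position. */
			for (; *str; str++)
				if (glob_match_sketch(str, pat))
					return true;
			return false;
		}
		if (!*str)
			return false;
		if (*pat != '?' && *pat != *str)
			return false;
		pat++;
		str++;
	}
	return *str == '\0';
}
```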
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add elf_resolve_syms_offsets function · 7ace84c6

Jiri Olsa authored Aug 09, 2023

Adding elf_resolve_syms_offsets function that looks up
offsets for symbols specified in syms array argument.

Offsets are returned in allocated array with the 'cnt' size,
that needs to be released by the caller.
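
The ownership contract (callee allocates, caller frees) can be sketched with an in-memory symbol table standing in for the ELF file. Everything here is hypothetical: sym_entry_sketch and resolve_syms_offsets_sketch are illustrative names, and the error codes are assumptions rather than the ones libbpf returns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one ELF symbol. */
struct sym_entry_sketch {
	const char *name;
	unsigned long offset;
};

/*
 * Look up each name in syms[] and return the offsets in a newly allocated
 * array of 'cnt' entries. On success the caller owns *poffsets and must
 * free() it; on failure *poffsets is left untouched.
 */
static int resolve_syms_offsets_sketch(const struct sym_entry_sketch *tab,
				       size_t tab_cnt, const char **syms,
				       size_t cnt, unsigned long **poffsets)
{
	unsigned long *offsets;
	size_t i, j;

	if (!cnt)
		return -EINVAL;

	offsets = calloc(cnt, sizeof(*offsets));
	if (!offsets)
		return -ENOMEM;

	for (i = 0; i < cnt; i++) {
		for (j = 0; j < tab_cnt; j++)
			if (!strcmp(tab[j].name, syms[i]))
				break;
		if (j == tab_cnt) {
			/* Unknown symbol: release the partial result. */
			free(offsets);
			return -ENOENT;
		}
		offsets[i] = tab[j].offset;
	}

	*poffsets = offsets;
	return 0;
}
```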
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add elf symbol iterator · 3774705d

Jiri Olsa authored Aug 09, 2023

Adding elf symbol iterator object (and some functions) that follow
the open-coded iterator pattern, to ease iterating over elf object
symbols.

The idea is to iterate single symbol section with:

  struct elf_sym_iter iter;
  struct elf_sym *sym;

  if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM))
        goto error;

  while ((sym = elf_sym_iter_next(&iter))) {
        ...
  }

I considered opening the elf inside the iterator and iterate all symbol
sections, but then it gets more complicated wrt user checks for when
the next section is processed.

The plus side is that we don't need an 'exit' function, because the
caller/user is in charge of that.

The returned iterated symbol object from elf_sym_iter_next function
is placed inside the struct elf_sym_iter, so no extra allocation or
argument is needed.
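
The design described above (no allocation, no 'exit' step, current element stored inside the iterator) can be sketched over a plain array. This is an illustration only: sym_sketch, sym_iter_sketch and both functions are hypothetical stand-ins, not the elf_sym_iter code.

```c
#include <stddef.h>

/* Hypothetical stand-in for an ELF symbol; not the libbpf definition. */
struct sym_sketch {
	const char *name;
	unsigned long offset;
};

struct sym_iter_sketch {
	const struct sym_sketch *tab;	/* borrowed, the caller owns it */
	size_t cnt;
	size_t next_idx;
	struct sym_sketch cur;		/* the returned element lives here */
};

/* Initialize the iterator; nothing is allocated, so no 'exit' is needed. */
static int sym_iter_new(struct sym_iter_sketch *it,
			const struct sym_sketch *tab, size_t cnt)
{
	if (!it || (!tab && cnt))
		return -1;
	it->tab = tab;
	it->cnt = cnt;
	it->next_idx = 0;
	return 0;
}

/* Return a pointer to the next symbol, or NULL when the table is exhausted. */
static struct sym_sketch *sym_iter_next(struct sym_iter_sketch *it)
{
	if (it->next_idx >= it->cnt)
		return NULL;
	it->cur = it->tab[it->next_idx++];
	return &it->cur;
}
```

Because the returned pointer refers to storage inside the iterator, it is only valid until the next call to sym_iter_next.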
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add elf_open/elf_close functions · f90eb70d

Jiri Olsa authored Aug 09, 2023

Adding elf_open/elf_close functions and using it in
elf_find_func_offset_from_file function. It will be
used in following changes to save some common code.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Move elf_find_func_offset* functions to elf object · 5c742725

Jiri Olsa authored Aug 09, 2023

Adding new elf object that will contain elf related functions.
There's no functional change.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

libbpf: Add uprobe_multi attach type and link names · 8097e460

Jiri Olsa authored Aug 09, 2023

Adding new uprobe_multi attach type and link names,
so the functions can resolve the new values.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Add bpf_get_func_ip helper support for uprobe link · 686328d8

Jiri Olsa authored Aug 09, 2023

Adding support for bpf_get_func_ip helper being called from
ebpf program attached by uprobe_multi link.

It returns the ip of the uprobe.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-7-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Add pid filter support for uprobe_multi link · b733eead

Jiri Olsa authored Aug 09, 2023

Adding support to specify pid for uprobe_multi link and the uprobes
are created only for task with given pid value.

Using the consumer.filter filter callback for that, so the task gets
filtered during the uprobe installation.

We still need to check the task during runtime in the uprobe handler,
because the handler could get executed if there's another system
wide consumer on the same uprobe (thanks Oleg for the insight).

Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Add cookies support for uprobe_multi link · 0b779b61

Jiri Olsa authored Aug 09, 2023

Adding support to specify cookies array for uprobe_multi link.

The cookies array share indexes and length with other uprobe_multi
arrays (offsets/ref_ctr_offsets).

The cookies[i] value defines the cookie for the i-th uprobe and will be
returned by bpf_get_attach_cookie helper when called from ebpf
program hooked to that specific uprobe.
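
The index sharing can be modeled in userspace as below. This is only a model of the parallel arrays: uprobe_set_sketch and cookie_for_offset are hypothetical names, and returning 0 when no cookies array was supplied is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the parallel arrays; names are illustrative only. */
struct uprobe_set_sketch {
	const unsigned long *offsets;
	const uint64_t *cookies;	/* optional, may be NULL */
	size_t cnt;			/* shared length of both arrays */
};

/*
 * Cookie associated with the uprobe at 'offset': cookies[i] belongs to
 * offsets[i]. Returns 0 if the offset is unknown or no cookies were given.
 */
static uint64_t cookie_for_offset(const struct uprobe_set_sketch *s,
				  unsigned long offset)
{
	for (size_t i = 0; i < s->cnt; i++)
		if (s->offsets[i] == offset)
			return s->cookies ? s->cookies[i] : 0;
	return 0;
}
```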
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Add multi uprobe link · 89ae89f5

Jiri Olsa authored Aug 09, 2023

Adding new multi uprobe link that allows to attach bpf program
to multiple uprobes.

Uprobes to attach are specified via new link_create uprobe_multi
union:

  struct {
    __aligned_u64   path;
    __aligned_u64   offsets;
    __aligned_u64   ref_ctr_offsets;
    __u32           cnt;
    __u32           flags;
  } uprobe_multi;

Uprobes are defined for single binary specified in path and multiple
calling sites specified in offsets array with optional reference
counters specified in ref_ctr_offsets array. All specified arrays
have length of 'cnt'.

The 'flags' supports single bit for now that marks the uprobe as
return probe.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Add attach_type checks under bpf_prog_attach_check_attach_type · 3505cb9f

Jiri Olsa authored Aug 09, 2023

Add extra attach_type checks from link_create under
bpf_prog_attach_check_attach_type.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf: Switch BPF_F_KPROBE_MULTI_RETURN macro to enum · c5487f8d

Jiri Olsa authored Aug 09, 2023

Switching BPF_F_KPROBE_MULTI_RETURN macro to anonymous enum,
so it'd show up in vmlinux.h. There's no functional change
compared to having this as a macro.
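
The difference can be illustrated with two hypothetical flags (the names below are made up for this sketch). A macro disappears during preprocessing, while an enum constant is a real declaration that type information, and therefore a generated header such as vmlinux.h, can carry.

```c
/* Gone after preprocessing: nothing about it reaches the type info. */
#define FLAG_AS_MACRO_SKETCH (1U << 0)

/* A real declaration: the name and value survive into the type info. */
enum {
	FLAG_AS_ENUM_SKETCH = (1U << 0),
};

/* Callers use the enum constant exactly as they used the macro. */
static unsigned int set_return_flag(unsigned int flags)
{
	return flags | FLAG_AS_ENUM_SKETCH;
}
```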
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230809083440.3209381-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge branch 'samples-bpf-make-bpf-programs-more-libbpf-aware' · acfadf25

Alexei Starovoitov authored Aug 21, 2023

Daniel T. Lee says:

====================
samples/bpf: make BPF programs more libbpf aware

The existing tracing programs have been developed for a considerable
period of time and, as a result, do not properly incorporate the
features of the current libbpf, such as CO-RE. This is evident in
frequent usage of functions like PT_REGS* and the persistence of "hack"
methods using underscore-style bpf_probe_read_kernel from the past.
These programs are far behind the current level of libbpf and can
potentially confuse users.

The kernel has undergone significant changes, and some of these changes
have broken these programs, but on the other hand, more robust APIs have
been developed for increased stability.

To list some of the kernel changes that this patch set is focusing on,
- symbol mismatch occurs due to compiler optimization [1]
- inline of blk_account_io* breaks BPF kprobe program [2]
- new tracepoints for the block_io_start/done are introduced [3]
- map lookup probes can't be triggered (bpf_disable_instrumentation)[4]
- BPF_KSYSCALL has been introduced to simplify argument fetching [5]
- convert to vmlinux.h and use tp argument structure within it
- make tracing programs to be more CO-RE centric

In this regard, this patch set aims not only to integrate the latest
features of libbpf into BPF programs but also to reduce confusion and
clarify the BPF programs. This will help with the potential confusion
among users and make the programs more intuitive.

[1]: https://github.com/iovisor/bcc/issues/1754
[2]: https://github.com/iovisor/bcc/issues/4261
[3]: commit 5a80bd07 ("block: introduce block_io_start/block_io_done tracepoints")
[4]: commit 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock")
[5]: commit 6f5d467d ("libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL")
====================

Link: https://lore.kernel.org/r/20230818090119.477441-1-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: simplify spintest with kprobe.multi · 456d5355

Daniel T. Lee authored Aug 18, 2023

With the introduction of kprobe.multi, it is now possible to attach
multiple kprobes to a single BPF program without the need for multiple
definitions. Additionally, this method supports wildcard-based
matching, allowing for further simplification of BPF programs. Here,
an asterisk (*) wildcard is used to map to all symbols relevant to
spin_{lock|unlock}.

Furthermore, since kprobe.multi handles symbol matching, this commit
eliminates the need for the previous logic of reading the ksym table to
verify the existence of symbols.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-10-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: refactor syscall tracing programs using BPF_KSYSCALL macro · 8dc80551

Daniel T. Lee authored Aug 18, 2023

This commit refactors the syscall tracing programs by adopting the
BPF_KSYSCALL macro. This change aims to enhance the clarity and
simplicity of the BPF programs by reducing the complexity of argument
parsing from pt_regs.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-9-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: fix broken map lookup probe · d93a7cf6

Daniel T. Lee authored Aug 18, 2023

In the commit 7c4cd051 ("bpf: Fix syscall's stackmap lookup
potential deadlock"), a potential deadlock issue was addressed, which
resulted in *_map_lookup_elem not triggering BPF programs.
(prior to lookup, bpf_disable_instrumentation() is used)

To resolve the broken map lookup probe using "htab_map_lookup_elem",
this commit introduces an alternative approach. Instead, it utilizes
"bpf_map_copy_value" and applies a filter specifically for the hash table
with map_type.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Fixes: 7c4cd051 ("bpf: Fix syscall's stackmap lookup potential deadlock")
Link: https://lore.kernel.org/r/20230818090119.477441-8-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: fix bio latency check with tracepoint · 92632115

Daniel T. Lee authored Aug 18, 2023

Recently, a new tracepoint for the block layer, specifically the
block_io_start/done tracepoints, was introduced in commit 5a80bd07
("block: introduce block_io_start/block_io_done tracepoints").

Previously, the kprobe entry used for this purpose was quite unstable
and inherently broke relevant probes [1]. Now that a stable tracepoint
is available, this commit replaces the bio latency check with it.

One of the changes made during this replacement is the key used for the
hash table. Since 'struct request' cannot be used as a hash key, the
approach taken follows that which was implemented in bcc/biolatency [2].
(uses dev:sector for the key)
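
A dev:sector key can be sketched as a small fixed-layout struct. This is an illustration in plain userspace C: start_key_sketch, make_key and key_equal are hypothetical names, and the field widths are assumptions. The explicit padding field and the memset matter because a bytewise key comparison, as modeled here with memcmp, would otherwise see uninitialized padding bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical key layout: device number plus starting sector. */
struct start_key_sketch {
	uint32_t dev;
	uint32_t pad;		/* explicit padding, always zero */
	uint64_t sector;
};

/* Build a fully zero-initialized key so bytewise comparison is reliable. */
static struct start_key_sketch make_key(uint32_t dev, uint64_t sector)
{
	struct start_key_sketch k;

	memset(&k, 0, sizeof(k));
	k.dev = dev;
	k.sector = sector;
	return k;
}

/* Keys are compared byte by byte, as a hash table lookup would do. */
static int key_equal(const struct start_key_sketch *a,
		     const struct start_key_sketch *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

The key built when a request starts and the key built when it completes compare equal as long as both describe the same device and sector, which is what lets the completion side find the start timestamp.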

[1]: https://github.com/iovisor/bcc/issues/4261
[2]: https://github.com/iovisor/bcc/pull/4691

Fixes: 450b7879 ("block: move blk_account_io_{start,done} to blk-mq.c")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-7-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: make tracing programs to be more CO-RE centric · 11430421

Daniel T. Lee authored Aug 18, 2023

The existing tracing programs have been developed for a considerable
period of time and, as a result, do not properly incorporate the
features of the current libbpf, such as CO-RE. This is evident in
frequent usage of functions like PT_REGS* and the persistence of "hack"
methods using underscore-style bpf_probe_read_kernel from the past.

These programs are far behind the current level of libbpf and can
potentially confuse users. Therefore, this commit aims to convert the
outdated BPF programs to be more CO-RE centric.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-6-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: fix symbol mismatch by compiler optimization · 02dabc24

Daniel T. Lee authored Aug 18, 2023

Currently, multiple kprobe programs are suffering from symbol mismatch
due to compiler optimization. These optimizations might append an
additional suffix to the symbol name such as '.isra' or '.constprop'.

    # egrep ' finish_task_switch| __netif_receive_skb_core' /proc/kallsyms
    ffffffff81135e50 t finish_task_switch.isra.0
    ffffffff81dd36d0 t __netif_receive_skb_core.constprop.0
    ffffffff8205cc0e t finish_task_switch.isra.0.cold
    ffffffff820b1aba t __netif_receive_skb_core.constprop.0.cold

To avoid this, this commit replaces the original kprobe section with
kprobe.multi in order to match symbols with wildcard characters. Here,
an asterisk is used to avoid the symbol mismatch.
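
The effect of the trailing asterisk can be checked in userspace with POSIX fnmatch(), used here only as a stand-in for the kernel-side matching. symbol_matches is a hypothetical helper name.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* True if 'symbol' matches the shell-style 'pattern'. */
static bool symbol_matches(const char *pattern, const char *symbol)
{
	return fnmatch(pattern, symbol, 0) == 0;
}
```

With this helper, "finish_task_switch*" matches "finish_task_switch.isra.0" while the exact name "finish_task_switch" does not. The same pattern also matches the ".cold" variant shown in the listing above.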
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-5-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: unify bpf program suffix to .bpf with tracing programs · 4a0ee788

Daniel T. Lee authored Aug 18, 2023

Currently, BPF programs typically have a suffix of .bpf.c. However,
some programs still use the _kern.c suffix alongside this naming
convention. In order to achieve consistency, this commit unifies the
naming convention of BPF kernel programs.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: convert to vmlinux.h with tracing programs · e7e6c774

Daniel T. Lee authored Aug 18, 2023

This commit replaces separate headers with a single vmlinux.h in
tracing programs. Thanks to that, we no longer need to define the
argument structure for tracing programs directly. For example, the argument
for the sched_switch tracepoint (sched_switch_args) can be replaced
with the vmlinux.h provided trace_event_raw_sched_switch.

Additional defines have been added to the BPF program either directly
or through the inclusion of net_shared.h. Defined values are
PERF_MAX_STACK_DEPTH, IFNAMSIZ constants and __stringify() macro. This
change enables the BPF program to access internal structures with BTF
generated "vmlinux.h" header.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-3-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

samples/bpf: fix warning with ignored-attributes · 34f6e38f

Daniel T. Lee authored Aug 18, 2023

Currently, compiling the bpf programs results in the following warning
about an ignored attribute. This commit fixes the warning by adding
the cf-protection option.

    In file included from ./arch/x86/include/asm/linkage.h:6:
    ./arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
    extern __noendbr u64 ibt_save(bool disable);
           ^
    ./arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
                                       ^
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-2-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Merge branch 'remove-unnecessary-synchronizations-in-cpumap' · 5bebd3e3

Alexei Starovoitov authored Aug 21, 2023

Hou Tao says:

====================
Remove unnecessary synchronizations in cpumap

From: Hou Tao <houtao1@huawei.com>

Hi,

This is the formal patchset to remove unnecessary synchronizations in
cpu-map after addressing comments and collecting Rvb tags from Toke
Høiland-Jørgensen (Big thanks to Toke). Patch #1 removes the unnecessary
rcu_barrier() when freeing bpf_cpu_map_entry and replaces it by
queue_rcu_work(). Patch #2 removes the unnecessary call_rcu() and
queue_work() when destroying cpu-map and does the freeing directly.

The patchset was tested using xdp_redirect_cpu and virtio-net. Both
xdp-mode and skb-mode have been exercised and no issues were reported.
As usual, comments and suggestions are always welcome.

Change Log:
v1:
  * address comments from Toke Høiland-Jørgensen
  * add Rvb tags from Toke Høiland-Jørgensen
  * update outdated comment in cpu_map_delete_elem()

RFC: https://lore.kernel.org/bpf/20230728023030.1906124-1-houtao@huaweicloud.com
====================

Link: https://lore.kernel.org/r/20230816045959.358059-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf, cpumask: Clean up bpf_cpu_map_entry directly in cpu_map_free · c2e42ddf

Hou Tao authored Aug 16, 2023

After synchronize_rcu(), both the detached XDP program and
xdp_do_flush() are completed, and the only user of bpf_cpu_map_entry
will be cpu_map_kthread_run(), so instead of calling
__cpu_map_entry_replace() to stop the kthread and clean up the entry after
an RCU grace period, do these things directly.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230816045959.358059-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

bpf, cpumap: Use queue_rcu_work() to remove unnecessary rcu_barrier() · 8f8500a2

Hou Tao authored Aug 16, 2023

Currently, __cpu_map_entry_replace() uses call_rcu() to wait for the
in-flight xdp program to exit the RCU read-side critical section, and then
launches the kworker cpu_map_kthread_stop() to call kthread_stop() to flush
all pending xdp frames or skbs.

But it is unnecessary to use rcu_barrier() in cpu_map_kthread_stop() to
wait for the completion of __cpu_map_entry_free(), because rcu_barrier()
will wait for all pending RCU callbacks and cpu_map_kthread_stop() only
needs to wait for the completion of a specific __cpu_map_entry_free().

So use queue_rcu_work() to replace call_rcu(), schedule_work() and
rcu_barrier(). queue_rcu_work() will queue a __cpu_map_entry_free()
kworker after an RCU grace period. Because __cpu_map_entry_free() runs
in kworker context, it is OK to do all of the freeing procedures,
including kthread_stop(), in it.

After the update, there is no need to do reference-counting for
bpf_cpu_map_entry, because bpf_cpu_map_entry is freed directly in
__cpu_map_entry_free(), so just remove it.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230816045959.358059-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

18 Aug, 2023 1 commit

selftests/bpf: Fix a selftest compilation error · 0a55264c

Yonghong Song authored Aug 18, 2023

When building the kernel and selftest with clang compiler (llvm17 or llvm18),
I hit the following compilation failure:
  In file included from progs/test_lwt_redirect.c:3:
  In file included from /usr/include/linux/ip.h:21:
  In file included from /usr/include/asm/byteorder.h:5:
  In file included from /usr/include/linux/byteorder/little_endian.h:13:
  /usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline'
    136 | static __always_inline unsigned long __swab(const unsigned long y)
        |        ^
  /usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline'
    171 | static __always_inline __u16 __swab16p(const __u16 *p)
  ...

The bpf_helpers.h file provides a definition for __always_inline.
Putting 'ip.h' after 'bpf_helpers.h' fixes the issue.
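
The ordering dependency can be reproduced in miniature: a function declared with __always_inline only compiles if the macro is defined before its first use. The sketch below is an illustration only; the fallback definition is written in the spirit of what bpf_helpers.h provides and swab16_sketch is a hypothetical function.

```c
#include <stdint.h>

/* Must come before any code that uses __always_inline. */
#ifndef __always_inline
#define __always_inline inline __attribute__((always_inline))
#endif

/* Byte-swap a 16-bit value; declared the way the failing header declares its helpers. */
static __always_inline uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

Moving the macro definition below the function would bring back the "unknown type name" error on toolchains whose headers do not already define it.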

Fixes: 43a7c3ef ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT")
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230818174312.1883381-1-yonghong.song@linux.dev
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
