- 20 Apr, 2024 5 commits
-
-
Geliang Tang authored
This patch uses the public helper connect_to_addr() exported in network_helpers.h instead of the locally defined function connect_to_server() in prog_tests/cls_redirect.c. This avoids duplicate code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/4a03ac92d2d392f8721f398fa449a83ac75577bc.1713427236.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
Move the third argument "int type" of connect_to_addr() to the first position, which is closer to how the socket() syscall takes its arguments, and add a network_helper_opts argument as the fourth one. Then update its usages in sock_addr.c too. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/088ea8a95055f93409c5f57d12f0e58d43059ac4.1713427236.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
Include network_helpers.h in prog_tests/sk_assign.c, and use the newly added public helper start_server_addr() instead of the locally defined function start_server(). This avoids duplicate code. The code that sets the SO_RCVTIMEO timeout to timeo_sec (3s) can be dropped, since start_server_addr() sets the default timeout to 3s. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/2af706ffbad63b4f7eaf93a426ed1076eadf1a05.1713427236.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
Include network_helpers.h in prog_tests/cls_redirect.c, and use the newly added public helper start_server_addr() instead of the locally defined function start_server(). This avoids duplicate code. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/13f336cb4c6680175d50bb963d9532e11528c758.1713427236.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
In order to pair up with connect_to_addr(), this patch adds a new helper start_server_addr(), which is a wrapper around __start_server(). It accepts an 'addr' argument of type 'struct sockaddr_storage' instead of a string argument like start_server(), and takes a network_helper_opts argument as the last parameter. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/2f01d48fa026467926738debe554ac452c19b86f.1713427236.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
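For illustration, a minimal sketch of how the paired client/server helpers could be declared after this series; the parameter names and order are assumptions based on the descriptions above, not copied from the patches:

  /* hypothetical declarations of the two helpers, mirroring each other */
  int start_server_addr(int type, const struct sockaddr_storage *addr,
                        socklen_t len, const struct network_helper_opts *opts);
  int connect_to_addr(int type, const struct sockaddr_storage *addr,
                      socklen_t len, const struct network_helper_opts *opts);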
-
- 18 Apr, 2024 1 commit
-
-
Alexei Starovoitov authored
The codegen for the is_mov_percpu_addr instruction works only for the rax/r8 registers. Fix it to generate proper x86 byte code for other registers. Fixes: 7bdbf744 ("bpf: add special internal-only MOV instruction to resolve per-CPU addrs") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240417214406.15788-1-alexei.starovoitov@gmail.com
-
- 17 Apr, 2024 2 commits
-
-
Quentin Deslandes authored
When dumping a character array, libbpf will watch for a '\0' and set is_array_terminated=true if found. This prevents libbpf from printing the remaining characters of the array, treating it as a nul-terminated string. However, once this flag is set, it's never reset, leading to subsequent character arrays not being printed properly:

  .str_multi = (__u8[2][16])[
      [
          'H',
          'e',
          'l',
      ],
  ],

This patch saves the is_array_terminated flag and restores its default (false) value before looping over the elements of an array, then restores it afterward. This way, libbpf's behavior is unchanged when dumping the characters of an array, but subsequent arrays are printed properly:

  .str_multi = (__u8[2][16])[
      [
          'H',
          'e',
          'l',
      ],
      [
          'l',
          'o',
      ],
  ],

Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413211258.134421-3-qde@naccy.de
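A rough, self-contained sketch of the save/restore pattern described above; the struct and function names are placeholders, not the actual libbpf internals:

  #include <stdbool.h>
  #include <stdio.h>

  struct dump_state { bool is_array_terminated; };

  /* hypothetical sketch: reset the flag before dumping each nested array,
   * then restore the caller's value so sibling arrays are unaffected */
  static void dump_char_array(struct dump_state *st, const char *data, int len)
  {
      bool saved = st->is_array_terminated;

      st->is_array_terminated = false;
      for (int i = 0; i < len && !st->is_array_terminated; i++) {
          if (data[i] == '\0')
              st->is_array_terminated = true;
          else
              printf("'%c', ", data[i]);
      }
      st->is_array_terminated = saved;
  }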
-
Quentin Deslandes authored
In btf_dump_array_data(), libbpf will call btf_dump_dump_type_data() for each element. For an array of characters, each element will be processed the following way:
- btf_dump_dump_type_data() is called to print the character
- btf_dump_data_pfx() prefixes the current line with the proper number of indentations
- btf_dump_int_data() is called to print the character
- After the last character is printed, btf_dump_dump_type_data() calls btf_dump_data_pfx() before writing the closing bracket

However, for an array containing characters, btf_dump_int_data() won't print any '\0' and subsequent characters. This leads to situations where the line prefix is written, no character is added, then the prefix is written again before adding the closing bracket:

  (struct sk_metadata){
      .str_array = (__u8[14])[
          'H',
          'e',
          'l',
          'l',
          'o',
      ],

This change solves this issue by printing the '\0' character, which has two benefits:
- The bracket closing the array is properly aligned
- It's clear from a user point of view that libbpf uses '\0' as a terminator for arrays of characters.

Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413211258.134421-2-qde@naccy.de
-
- 16 Apr, 2024 5 commits
-
-
Quentin Monnet authored
This commit contains a series of clean-ups and fixes for bpftool's bash completion file:
- Make sure all local variables are declared as such.
- Make sure variables are initialised before being read.
- Update ELF section ("maps" -> ".maps") for looking up map names in object files.
- Fix call to _init_completion.
- Move definitions for MAP_TYPE and PROG_TYPE higher up in the scope to avoid defining them multiple times, reuse MAP_TYPE where relevant.
- Simplify completion for "duration" keyword in "bpftool prog profile".
- Fix completion for "bpftool struct_ops register" and "bpftool link (pin|detach)" where we would repeatedly suggest file names instead of suggesting just one name.
- Fix completion for "bpftool iter pin ... map MAP" to account for the "map" keyword.
- Add missing "detach" suggestion for "bpftool link".

Signed-off-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413011427.14402-3-qmo@kernel.org
-
Quentin Monnet authored
When using references to BPF programs, bpftool supports passing programs by name on the command line. The manual pages for "bpftool prog" and "bpftool map" (for prog_array updates) mention it, but we have a few additional subcommands that support referencing programs by name but do not mention it in their documentation. Let's update the pages for subcommands "btf", "cgroup", and "net". Similarly, we can reference maps by name when passing them to "bpftool prog load", so we update the page for "bpftool prog" as well. Signed-off-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413011427.14402-2-qmo@kernel.org
-
Harishankar Vishwanathan authored
This patch addresses a latent unsoundness issue in the scalar(32)_min_max_and/or/xor functions. While it is not a bugfix, it ensures that the functions produce sound outputs for all inputs. The issue occurs in these functions when setting signed bounds. The following example illustrates the issue for scalar_min_max_and(), but it applies to the other functions as well. In scalar_min_max_and() the following clause is executed when ANDing positive numbers:

  /* ANDing two positives gives a positive, so safe to
   * cast result into s64.
   */
  dst_reg->smin_value = dst_reg->umin_value;
  dst_reg->smax_value = dst_reg->umax_value;

However, if umin_value and umax_value of dst_reg cross the sign boundary (i.e., if (s64)dst_reg->umin_value > (s64)dst_reg->umax_value), then we will end up with smin_value > smax_value, which is unsound. Previous works [1, 2] have discovered and reported this issue. Our tool Agni [2, 3] considers it a false positive. This is because, during the verification of the abstract operator scalar_min_max_and(), Agni restricts its inputs to those passing through reg_bounds_sync(). This mimics real-world verifier behavior, as reg_bounds_sync() is invariably executed at the tail of every abstract operator. Therefore, such behavior is unlikely in an actual verifier execution. However, it is still unsound for an abstract operator to set signed bounds such that smin_value > smax_value. This patch fixes it, making the abstract operator sound for all (well-formed) inputs.

It is worth noting that while the previous code updated the signed bounds (using the output unsigned bounds) only when the *input signed* bounds were positive, the new code updates them whenever the *output unsigned* bounds do not cross the sign boundary. An alternative approach to fix this latent unsoundness would be to unconditionally set the signed bounds to unbounded [S64_MIN, S64_MAX], and let reg_bounds_sync() refine the signed bounds using the unsigned bounds and the tnum. We found that our approach produces more precise (tighter) bounds. For example, consider these inputs to BPF_AND:

  /* dst_reg */
  var_off.value: 8608032320201083347
  var_off.mask:  615339716653692460
  smin_value:    8070450532247928832
  smax_value:    8070450532247928832
  umin_value:    13206380674380886586
  umax_value:    13206380674380886586
  s32_min_value: -2110561598
  s32_max_value: -133438816
  u32_min_value: 4135055354
  u32_max_value: 4135055354

  /* src_reg */
  var_off.value: 8584102546103074815
  var_off.mask:  9862641527606476800
  smin_value:    2920655011908158522
  smax_value:    7495731535348625717
  umin_value:    7001104867969363969
  umax_value:    8584102543730304042
  s32_min_value: -2097116671
  s32_max_value: 71704632
  u32_min_value: 1047457619
  u32_max_value: 4268683090

After going through tnum_and() -> scalar32_min_max_and() -> scalar_min_max_and() -> reg_bounds_sync(), our patch produces the following bounds for s32:

  s32_min_value: -1263875629
  s32_max_value: -159911942

Whereas, setting the signed bounds to unbounded in scalar_min_max_and() produces:

  s32_min_value: -1263875629
  s32_max_value: -1

As observed, our patch produces a tighter s32 bound. We also confirmed using Agni and SMT verification that our patch always produces signed bounds that are equal to or more precise than setting the signed bounds to unbounded in scalar_min_max_and().
[1] https://sanjit-bhat.github.io/assets/pdf/ebpf-verifier-range-analysis22.pdf
[2] https://link.springer.com/chapter/10.1007/978-3-031-37709-9_12
[3] https://github.com/bpfverif/agni

Co-developed-by: Matan Shachnai <m.shachnai@rutgers.edu> Signed-off-by: Matan Shachnai <m.shachnai@rutgers.edu> Co-developed-by: Srinivas Narayana <srinivas.narayana@rutgers.edu> Signed-off-by: Srinivas Narayana <srinivas.narayana@rutgers.edu> Co-developed-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu> Signed-off-by: Santosh Nagarakatte <santosh.nagarakatte@rutgers.edu> Signed-off-by: Harishankar Vishwanathan <harishankar.vishwanathan@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240402212039.51815-1-harishankar.vishwanathan@gmail.com Link: https://lore.kernel.org/bpf/20240416115303.331688-1-harishankar.vishwanathan@gmail.com
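A rough sketch of the sign-boundary check the new code performs when deriving signed bounds from the output unsigned bounds, as described above; the exact code in the patch may differ:

  /* hypothetical sketch: only reuse unsigned bounds as signed bounds when
   * they do not cross the sign boundary, otherwise fall back to unbounded
   * and let reg_bounds_sync() refine them later
   */
  if ((s64)dst_reg->umin_value <= (s64)dst_reg->umax_value) {
      dst_reg->smin_value = dst_reg->umin_value;
      dst_reg->smax_value = dst_reg->umax_value;
  } else {
      dst_reg->smin_value = S64_MIN;
      dst_reg->smax_value = S64_MAX;
  }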
-
Chen Pei authored
Currently, there are two comments with the same name "64-bit ATOMIC magnitudes"; based on the context, the second one should be "32-bit ATOMIC magnitudes". Signed-off-by: Chen Pei <cp0613@linux.alibaba.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240415081928.17440-1-cp0613@linux.alibaba.com
-
Ard Biesheuvel authored
If the BTF code is enabled in the build configuration, the start/stop BTF markers are guaranteed to exist. Only when CONFIG_DEBUG_INFO_BTF=n do the references in btf_parse_vmlinux() remain unsatisfied, relying on the weak linkage of the external references to avoid breaking the build. Avoid GOT based relocations to these markers in the final executable by dropping the weak attribute and instead making btf_parse_vmlinux() return ERR_PTR(-ENOENT) directly if CONFIG_DEBUG_INFO_BTF is not enabled to begin with. The compiler will drop any subsequent references to __start_BTF and __stop_BTF in that case, allowing the link to succeed. Note that Clang will notice that taking the address of __start_BTF can no longer yield NULL, so testing for that condition becomes unnecessary. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240415162041.2491523-8-ardb+git@google.com
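A minimal sketch of the shape of the change described above (not the actual diff; the surrounding parsing code is elided):

  /* hypothetical sketch: strong references plus an early bail-out */
  extern char __start_BTF[], __stop_BTF[];   /* previously declared __weak */

  struct btf *btf_parse_vmlinux(void)
  {
      if (!IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
          return ERR_PTR(-ENOENT);
      /* ... parse the [__start_BTF, __stop_BTF) region as before ... */
  }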
-
- 12 Apr, 2024 2 commits
-
-
Jiri Olsa authored
We have two printk tests reading trace_pipe in a non-blocking way, with the very same code. Move that into a new read_trace_pipe_iter() function. The current read_trace_pipe() is used from samples/bpf and needs to do a blocking read and printf of the trace_pipe data; use the new read_trace_pipe_iter() to implement that. Both printk tests do early checks for the number of found messages and could bail out earlier, but I did not find any speed difference without that condition, so I did not complicate the change further for that. Some of the samples/bpf programs use the read_trace_pipe() function, so I kept that interface untouched. I did not see any issues with the affected samples/bpf programs other than a slight change in read_trace_pipe() output: the current code uses puts(), which adds a newline after the printed string, so we would occasionally see an extra newline. With this patch we read the output per line, so there's no need to use puts() and we can just use printf() instead, without the extra newline. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240410140952.292261-1-jolsa@kernel.org
-
Thorsten Blum authored
s/at at/at a/ Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240411164258.533063-3-thorsten.blum@toblux.com
-
- 11 Apr, 2024 11 commits
-
-
Martin KaFai Lau authored
Geliang Tang says:
====================
v5:
 - address Martin's comments for v4 (thanks).
 - update patch 2, use 'return err' instead of 'return -1/0'.
 - drop patch 3 in v4.

v4:
 - fix a bug in v3, it should be 'if (err)', not 'if (!err)'.
 - move "selftests/bpf: Use log_err in network_helpers" out of this series.

v3:
 - add two more patches.
 - use log_err instead of ASSERT in v3.
 - let send_recv_data return int as Martin suggested.

v2:
Address Martin's comments for v1 (thanks.)
 - drop patch 1, "export send_byte helper".
 - drop "WRITE_ONCE(arg.stop, 0)".
 - rebased.

send_recv_data will be re-used in MPTCP bpf tests, but not included in this set because it depends on other patches that have not been in the bpf-next yet. It will be sent as another set soon.
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
This patch extracts the code to send and receive data into a new helper named send_recv_data() in network_helpers.c and exports it in network_helpers.h. This helper will be used for MPTCP BPF selftests. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/5231103be91fadcce3674a589542c63b6a5eedd4.1712813933.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Geliang Tang authored
To avoid keeping total_bytes and stop as global variables, this patch adds a new struct named send_recv_arg to pass arguments between threads. Put these two variables, together with fd, into this struct and pass it to the server thread, so that the server thread can access them without their being global. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/ca1dd703b796f6810985418373e750f7068b4186.1712813933.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
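A minimal sketch of such an argument struct, following the description above; the field types and comments are assumptions, not copied from the patch:

  /* hypothetical sketch of the per-connection argument block passed to the server thread */
  struct send_recv_arg {
      int fd;               /* connection to serve */
      uint32_t total_bytes; /* bytes expected to be transferred */
      int stop;             /* set to tell the server thread to exit */
  };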
-
Geliang Tang authored
This patch fixes the following "umount cgroup2" error in test_sockmap.c:

  (cgroup_helpers.c:353: errno: Device or resource busy) umount cgroup2

The cgroup fd cg_fd should be closed before cleanup_cgroup_environment(). Fixes: 13a5f3ff ("bpf: Selftests, sockmap test prog run without setting cgroup") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/0399983bde729708773416b8488bac2cd5e022b8.1712639568.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Yonghong Song authored
When looking at Alexei's patch ([1]) which added tests for atomics, I noticed that the tests will be skipped with cpuv4. For example, with latest llvm19, I see:

  [root@arch-fb-vm1 bpf]# ./test_progs -t arena_atomics
  #3/1 arena_atomics/add:OK
  ...
  #3/7 arena_atomics/xchg:OK
  #3 arena_atomics:OK
  Summary: 1/7 PASSED, 0 SKIPPED, 0 FAILED
  [root@arch-fb-vm1 bpf]# ./test_progs-cpuv4 -t arena_atomics
  #3 arena_atomics:SKIP
  Summary: 1/0 PASSED, 1 SKIPPED, 0 FAILED
  [root@arch-fb-vm1 bpf]#

It is perfectly fine to enable atomics-related tests for cpuv4. With this patch, I have

  [root@arch-fb-vm1 bpf]# ./test_progs-cpuv4 -t arena_atomics
  #3/1 arena_atomics/add:OK
  ...
  #3/7 arena_atomics/xchg:OK
  #3 arena_atomics:OK
  Summary: 1/7 PASSED, 0 SKIPPED, 0 FAILED

[1] https://lore.kernel.org/r/20240405231134.17274-2-alexei.starovoitov@gmail.com

Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410153326.1851055-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Alexei Starovoitov authored
Yonghong Song says:
====================
bpf: Add bpf_link support for sk_msg and sk_skb progs

One of our internal services started to use the sk_msg program and currently it uses the existing prog attach/detach2 as demonstrated in selftests. But attach/detach of all other bpf programs are based on bpf_link. Consistent attach/detach APIs for all programs will make things easier to understand and less error prone. So this patch set adds bpf_link support for BPF_PROG_TYPE_SK_MSG. Based on comments from the previous RFC patch, I added BPF_PROG_TYPE_SK_SKB support as well, as both program types have similar treatment w.r.t. bpf_link handling. For the patch series, patch 1 added kernel support. Patch 2 added libbpf support. Patch 3 added bpftool support and patches 4/5 added some new tests.

Changelogs:
v6 -> v7:
 - fix a missing mutex_unlock error.
v5 -> v6:
 - resolve libbpf conflict due to recent upstream change.
 - add a bpf_link_create() test.
 - some code refactoring for better code quality.
v4 -> v5:
 - increase scope of mutex protection in link_release.
 - remove previous-leftover entry in libbpf.map.
 - make some code changes for better understanding.
v3 -> v4:
 - use a single mutex lock to protect both attach/detach/update and release/fill_info/show_fdinfo.
 - simplify code for sock_map_link_lookup().
 - fix a few bugs.
 - add more tests.
v2 -> v3:
 - consolidate link types of sk_msg and sk_skb to a single link type BPF_PROG_TYPE_SOCKMAP.
 - fix a bpf_link lifecycle issue. In v2, after a bpf_link was attached, a subsequent prog_attach could change that bpf_link. This patch makes the bpf_link behave correctly, such that it won't go away in the face of other prog/link attaches for the same map and the same attach type.
====================
Link: https://lore.kernel.org/r/20240410043522.3736912-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Add a few more tests in sockmap_basic.c and sockmap_listen.c to test bpf_link based APIs for SK_MSG and SK_SKB programs. Link attach/detach/update are all tested. All tests pass. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043547.3738448-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
These helper functions will be used in later new tests as well. There are no functional changes. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043542.3738166-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
An example output looks like:

  $ bpftool link
  1776: sk_skb  prog 49730  map_id 0  attach_type sk_skb_verdict  pids test_progs(8424)
  1777: sk_skb  prog 49755  map_id 0  attach_type sk_skb_stream_verdict  pids test_progs(8424)
  1778: sk_msg  prog 49770  map_id 8208  attach_type sk_msg_verdict  pids test_progs(8424)

Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043537.3737928-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Yonghong Song authored
Introduce a libbpf API function bpf_program__attach_sockmap(), which allows users to get a bpf_link for their corresponding programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043532.3737722-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
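A rough usage sketch of the new attach API; the exact prototype, the skeleton field names, and the error-handling convention shown here are assumptions for illustration:

  /* hypothetical sketch: attach an sk_msg program to a sockmap via bpf_link */
  struct bpf_link *link;

  link = bpf_program__attach_sockmap(skel->progs.msg_verdict_prog,   /* assumed names */
                                     bpf_map__fd(skel->maps.sock_map));
  if (!link)
      fprintf(stderr, "attach failed: %d\n", -errno);
  /* ... traffic runs while the link pins the attachment ... */
  bpf_link__destroy(link);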
-
Yonghong Song authored
Add bpf_link support for sk_msg and sk_skb programs. We have an internal request to support bpf_link for sk_msg programs so that user space can handle them uniformly with bpf_link based libbpf APIs. Using the bpf_link based libbpf API also has the benefit of making the system more robust by decoupling the prog life cycle from the attachment life cycle. Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043527.3737160-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 09 Apr, 2024 2 commits
-
-
Alexei Starovoitov authored
Add selftests for atomic instructions in bpf_arena. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240405231134.17274-2-alexei.starovoitov@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
Alexei Starovoitov authored
Support atomics in bpf_arena that can be JITed as a single x86 instruction. Instructions that are JITed as loops are not supported at the moment, since they require more complex extable and loop logic. JITs can choose to do smarter things with bpf_jit_supports_insn(); for example, arm64 may decide to support all bpf atomics instructions when emit_lse_atomic is available, and none in ll_sc mode. bpf_jit_supports_percpu_insn(), bpf_jit_supports_ptr_xchg() and other such callbacks can be replaced with bpf_jit_supports_insn() in the future. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240405231134.17274-1-alexei.starovoitov@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
- 08 Apr, 2024 1 commit
-
-
Jason Xing authored
The output goes like this when I run 'make samples/bpf':

  ...warning: no previous prototype for ‘get_cgroup_id_from_path’...

Making this function static solves the warning, since no one outside of the file calls it. Signed-off-by: Jason Xing <kernelxing@tencent.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240406144613.4434-1-kerneljasonxing@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
-
- 06 Apr, 2024 4 commits
-
-
Andrii Nakryiko authored
Andrea Righi says: ==================== libbpf: API to partially consume items from ringbuffer Introduce ring__consume_n() and ring_buffer__consume_n() API to partially consume items from one (or more) ringbuffer(s). This can be useful, for example, to consume just a single item or when we need to copy multiple items to a limited user-space buffer from the ringbuffer callback. Practical example (where this API can be used): https://github.com/sched-ext/scx/blob/b7c06b9ed9f72cad83c31e39e9c4e2cfd8683a55/rust/scx_rustland_core/src/bpf.rs#L217 See also: https://lore.kernel.org/lkml/20240310154726.734289-1-andrea.righi@canonical.com/T/#u v4: - open a new 1.5.0 cycle v3: - rename ring__consume_max() -> ring__consume_n() and ring_buffer__consume_max() -> ring_buffer__consume_n() - add new API to a new 1.5.0 cycle - fixed minor nits / comments v2: - introduce a new API instead of changing the callback's retcode behavior ==================== Link: https://lore.kernel.org/r/20240406092005.92399-1-andrea.righi@canonical.comSigned-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Andrea Righi authored
Introduce a new API to consume items from a ring buffer, limited to a specified amount, and return to the caller the actual number of items consumed. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/lkml/20240310154726.734289-1-andrea.righi@canonical.com/T Link: https://lore.kernel.org/bpf/20240406092005.92399-4-andrea.righi@canonical.com
-
Andrea Righi authored
In some cases, instead of always consuming all items from ring buffers in a greedy way, we may want to consume up to a certain amount of items, for example when we need to copy items from the BPF ring buffer to a limited user buffer. This change allows setting an upper limit on the number of items consumed from one or more ring buffers. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240406092005.92399-3-andrea.righi@canonical.com
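A rough usage sketch of the capped-consume API described in this series; the exact signature and return-value convention are assumptions based on the function names above:

  /* hypothetical sketch: drain at most 16 records from all registered rings */
  int n = ring_buffer__consume_n(rb, 16);   /* rb: a previously created struct ring_buffer * */
  if (n < 0)
      fprintf(stderr, "consume failed: %d\n", n);
  else
      printf("consumed %d records\n", n);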
-
Andrea Righi authored
Bump libbpf.map to v1.5.0 to start a new libbpf version cycle. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240406092005.92399-2-andrea.righi@canonical.com
-
- 05 Apr, 2024 7 commits
-
-
Andrii Nakryiko authored
David Vernet says:
====================
bpf: Allow invoking kfuncs from BPF_PROG_TYPE_SYSCALL progs

Currently, a set of core BPF kfuncs (e.g. bpf_task_*, bpf_cgroup_*, bpf_cpumask_*, etc) cannot be invoked from BPF_PROG_TYPE_SYSCALL programs. The whitelist approach taken for enabling kfuncs makes sense: it is not safe to call these kfuncs from every program type. For example, it may not be safe to call bpf_task_acquire() in an fentry to free_task(). BPF_PROG_TYPE_SYSCALL, on the other hand, is a perfectly safe program type from which to invoke these kfuncs, as it's a very controlled environment, and we should never be able to run into any of the typical problems such as recursive invocations, acquiring references on freeing kptrs, etc. Being able to invoke these kfuncs would be useful, as BPF_PROG_TYPE_SYSCALL can be invoked with BPF_PROG_RUN, and would therefore enable user space programs to synchronously call into BPF to manipulate these kptrs.

---
v1: https://lore.kernel.org/all/20240404010308.334604-1-void@manifault.com/
v1 -> v2:
 - Create new verifier_kfunc_prog_types testcase meant to specifically validate calling core kfuncs from various program types. Remove the macros and testcases that had been added to the task, cgrp, and cpumask kfunc testcases (Andrii and Yonghong)
====================
Link: https://lore.kernel.org/r/20240405143041.632519-1-void@manifault.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
David Vernet authored
Now that we can call some kfuncs from BPF_PROG_TYPE_SYSCALL progs, let's add some selftests that verify as much. As a bonus, let's also verify that we can't call the kfuncs from raw tracepoints. To do this, we add a new selftest suite called verifier_kfunc_prog_types. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240405143041.632519-3-void@manifault.com
-
David Vernet authored
Currently, a set of core BPF kfuncs (e.g. bpf_task_*, bpf_cgroup_*, bpf_cpumask_*, etc) cannot be invoked from BPF_PROG_TYPE_SYSCALL programs. The whitelist approach taken for enabling kfuncs makes sense: it is not safe to call these kfuncs from every program type. For example, it may not be safe to call bpf_task_acquire() in an fentry to free_task(). BPF_PROG_TYPE_SYSCALL, on the other hand, is a perfectly safe program type from which to invoke these kfuncs, as it's a very controlled environment, and we should never be able to run into any of the typical problems such as recursive invocations, acquiring references on freeing kptrs, etc. Being able to invoke these kfuncs would be useful, as BPF_PROG_TYPE_SYSCALL can be invoked with BPF_PROG_RUN, and would therefore enable user space programs to synchronously call into BPF to manipulate these kptrs. This patch therefore enables invoking the aforementioned core kfuncs from BPF_PROG_TYPE_SYSCALL progs. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240405143041.632519-2-void@manifault.com
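For illustration, a minimal sketch of a BPF_PROG_TYPE_SYSCALL program calling one of the core task kfuncs; the kfunc declarations and program layout follow the usual selftest conventions and are assumptions, not taken from this series:

  /* hypothetical sketch (BPF side): a "syscall" program acquiring and releasing a task kptr */
  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  struct task_struct *bpf_task_from_pid(s32 pid) __ksym;
  void bpf_task_release(struct task_struct *p) __ksym;

  SEC("syscall")
  int lookup_init_task(void *ctx)
  {
      struct task_struct *t = bpf_task_from_pid(1);   /* acquires a reference */

      if (!t)
          return 0;
      bpf_task_release(t);                            /* reference must be released */
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";

Such a program would be run synchronously from user space via BPF_PROG_RUN (bpf_prog_test_run), which is what makes this program type useful here.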
-
Dave Thaler authored
This patch addresses a number of editorial nits including spelling, punctuation, grammar, and wording consistency issues in instruction-set.rst. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240405155245.3618-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Kui-Feng Lee authored
The verifier in the kernel ensures that the struct_ops operators behave correctly by checking that they access parameters and context appropriately. The verifier will approve a program as long as it correctly accesses the context/parameters, regardless of its function signature. In contrast, libbpf should not verify the signature of function pointers and functions, to enable flexibility in loading various implementations of an operator even if the signature of the function pointer does not match those in the implementations or the kernel. With this flexibility, user space applications can adapt to different kernel versions by loading a specific implementation of an operator based on feature detection. This is a follow-up to commit c911fc61 ("libbpf: Skip zeroed or null fields if not found in the kernel type.") Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240404232342.991414-1-thinker.li@gmail.com
-
Alexei Starovoitov authored
Philo Lu says:
====================
bpf: allow bpf_for_each_map_elem() helper with different input maps

Currently, taking different maps within a single bpf_for_each_map_elem call is not allowed. For example, the following code cannot pass the verifier (with the error "tail_call abusing map_ptr"):

```
static void test_by_pid(int pid)
{
	if (pid <= 100)
		bpf_for_each_map_elem(&map1, map_elem_cb, NULL, 0);
	else
		bpf_for_each_map_elem(&map2, map_elem_cb, NULL, 0);
}
```

This is because during bpf_for_each_map_elem verification, bpf_insn_aux_data->map_ptr_state is expected to be a map_ptr (instead of the poison state), which is then needed by set_map_elem_callback_state. However, as there are two different map_ptr inputs, map_ptr_state is marked as BPF_MAP_PTR_POISON, and thus the second map_ptr would be lost. BPF_MAP_PTR_POISON is also needed by bpf_for_each_map_elem to skip the retpoline optimization in do_misc_fixups(). Therefore, map_ptr_state and map_ptr are both needed for bpf_for_each_map_elem.

This patchset solves it by turning bpf_insn_aux_data->map_ptr_state into a new struct, storing the poison/unpriv state and the map pointer together without additional memory overhead. Then bpf_for_each_map_elem works well with different input maps. It also makes the map_ptr_state logic clearer. A test case is added to selftests, which would fail to load without this patchset.

Changelogs
-> v1:
- PATCH 1/3:
  - make the commit log clearer
  - change poison and unpriv to bool in struct bpf_map_ptr_state, also the return value in bpf_map_ptr_poisoned() and bpf_map_ptr_unpriv()
- PATCH 2/3:
  - change the comments in set_map_elem_callback_state()
- PATCH 3/3:
  - remove the "skipping the last element" logic during map updating
  - change if() to ASSERT_OK()

Please review, thanks.
====================
Link: https://lore.kernel.org/r/20240405025536.18113-1-lulie@linux.alibaba.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
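For reference, a sketch of the callback shape that a map_elem_cb like the one above would typically have; the name is a placeholder and the exact key/value types depend on the maps being iterated:

  /* hypothetical callback sketch for bpf_for_each_map_elem() */
  static long map_elem_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *ctx)
  {
      /* inspect or update *val; return 0 to continue, non-zero to stop iterating */
      return 0;
  }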
-
Philo Lu authored
A test is added for bpf_for_each_map_elem() with either an arraymap or a hashmap.

  $ tools/testing/selftests/bpf/test_progs -t for_each
  #93/1 for_each/hash_map:OK
  #93/2 for_each/array_map:OK
  #93/3 for_each/write_map_key:OK
  #93/4 for_each/multi_maps:OK
  #93 for_each:OK
  Summary: 1/4 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240405025536.18113-4-lulie@linux.alibaba.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-