Commits · eed807f626101f6a4227bd53942892c5983b95a7 · Kirill Smelkov / linux

22 Sep, 2022 20 commits

bpf: Tweak definition of KF_TRUSTED_ARGS · eed807f6

Kumar Kartikeya Dwivedi authored Sep 21, 2022

Instead of forcing all arguments to be referenced pointers with non-zero
reg->ref_obj_id, tweak the definition of KF_TRUSTED_ARGS to mean that
only PTR_TO_BTF_ID (and socket types translated to PTR_TO_BTF_ID) have
that constraint, and require their offset to be set to 0.

The rest of pointer types are also accomodated in this definition of
trusted pointers, but with more relaxed rules regarding offsets.

The inherent meaning of setting this flag is that all kfunc pointer
arguments have a guranteed lifetime, and kernel object pointers
(PTR_TO_BTF_ID, PTR_TO_CTX) are passed in their unmodified form (with
offset 0). In general, this is not true for PTR_TO_BTF_ID as it can be
obtained using pointer walks.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/cdede0043c47ed7a357f0a915d16f9ce06a1d589.1663778601.git.lorenzo@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

eed807f6

bpf: Always use raw spinlock for hash bucket lock · 1d8b82c6

Hou Tao authored Sep 21, 2022

For a non-preallocated hash map on RT kernel, regular spinlock instead
of raw spinlock is used for bucket lock. The reason is that on RT kernel
memory allocation is forbidden under atomic context and regular spinlock
is sleepable under RT.

Now hash map has been fully converted to use bpf_map_alloc, and there
will be no synchronous memory allocation for non-preallocated hash map,
so it is safe to always use raw spinlock for bucket lock on RT. So
removing the usage of htab_use_raw_lock() and updating the comments
accordingly.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220921073826.2365800-1-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

1d8b82c6

bpf: Prevent bpf program recursion for raw tracepoint probes · 05b24ff9

Jiri Olsa authored Sep 16, 2022

We got report from sysbot [1] about warnings that were caused by
bpf program attached to contention_begin raw tracepoint triggering
the same tracepoint by using bpf_trace_printk helper that takes
trace_printk_lock lock.

 Call Trace:
  <TASK>
  ? trace_event_raw_event_bpf_trace_printk+0x5f/0x90
  bpf_trace_printk+0x2b/0xe0
  bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
  bpf_trace_run2+0x26/0x90
  native_queued_spin_lock_slowpath+0x1c6/0x2b0
  _raw_spin_lock_irqsave+0x44/0x50
  bpf_trace_printk+0x3f/0xe0
  bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
  bpf_trace_run2+0x26/0x90
  native_queued_spin_lock_slowpath+0x1c6/0x2b0
  _raw_spin_lock_irqsave+0x44/0x50
  bpf_trace_printk+0x3f/0xe0
  bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
  bpf_trace_run2+0x26/0x90
  native_queued_spin_lock_slowpath+0x1c6/0x2b0
  _raw_spin_lock_irqsave+0x44/0x50
  bpf_trace_printk+0x3f/0xe0
  bpf_prog_a9aec6167c091eef_prog+0x1f/0x24
  bpf_trace_run2+0x26/0x90
  native_queued_spin_lock_slowpath+0x1c6/0x2b0
  _raw_spin_lock_irqsave+0x44/0x50
  __unfreeze_partials+0x5b/0x160
  ...

The can be reproduced by attaching bpf program as raw tracepoint on
contention_begin tracepoint. The bpf prog calls bpf_trace_printk
helper. Then by running perf bench the spin lock code is forced to
take slow path and call contention_begin tracepoint.

Fixing this by skipping execution of the bpf program if it's
already running, Using bpf prog 'active' field, which is being
currently used by trampoline programs for the same reason.

Moving bpf_prog_inc_misses_counter to syscall.c because
trampoline.c is compiled in just for CONFIG_BPF_JIT option.
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Reported-by: syzbot+2251879aa068ad9c960d@syzkaller.appspotmail.com
[1] https://lore.kernel.org/bpf/YxhFe3EwqchC%2FfYf@krava/T/#tSigned-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220916071914.7156-1-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

05b24ff9

Merge branch 'bpf: Add kfuncs for PKCS#7 signature verification' · 66d6a4bf

Alexei Starovoitov authored Sep 21, 2022

Roberto Sassu says:

====================
One of the desirable features in security is the ability to restrict import
of data to a given system based on data authenticity. If data import can be
restricted, it would be possible to enforce a system-wide policy based on
the signing keys the system owner trusts.

This feature is widely used in the kernel. For example, if the restriction
is enabled, kernel modules can be plugged in only if they are signed with a
key whose public part is in the primary or secondary keyring.

For eBPF, it can be useful as well. For example, it might be useful to
authenticate data an eBPF program makes security decisions on.

After a discussion in the eBPF mailing list, it was decided that the stated
goal should be accomplished by introducing four new kfuncs:
bpf_lookup_user_key() and bpf_lookup_system_key(), for retrieving a keyring
with keys trusted for signature verification, respectively from its serial
and from a pre-determined ID; bpf_key_put(), to release the reference
obtained with the former two kfuncs, bpf_verify_pkcs7_signature(), for
verifying PKCS#7 signatures.

Other than the key serial, bpf_lookup_user_key() also accepts key lookup
flags, that influence the behavior of the lookup. bpf_lookup_system_key()
accepts pre-determined IDs defined in include/linux/verification.h.

bpf_key_put() accepts the new bpf_key structure, introduced to tell whether
the other structure member, a key pointer, is valid or not. The reason is
that verify_pkcs7_signature() also accepts invalid pointers, set with the
pre-determined ID, to select a system-defined keyring. key_put() must be
called only for valid key pointers.

Since the two key lookup functions allocate memory and one increments a key
reference count, they must be used in conjunction with bpf_key_put(). The
latter must be called only if the lookup functions returned a non-NULL
pointer. The verifier denies the execution of eBPF programs that don't
respect this rule.

The two key lookup functions should be used in alternative, depending on
the use case. While bpf_lookup_user_key() provides great flexibility, it
seems suboptimal in terms of security guarantees, as even if the eBPF
program is assumed to be trusted, the serial used to obtain the key pointer
might come from untrusted user space not choosing one that the system
administrator approves to enforce a mandatory policy.

bpf_lookup_system_key() instead provides much stronger guarantees,
especially if the pre-determined ID is not passed by user space but is
hardcoded in the eBPF program, and that program is signed. In this case,
bpf_verify_pkcs7_signature() will always perform signature verification
with a key that the system administrator approves, i.e. the primary,
secondary or platform keyring.

Nevertheless, key permission checks need to be done accurately. Since
bpf_lookup_user_key() cannot determine how a key will be used by other
kfuncs, it has to defer the permission check to the actual kfunc using the
key. It does it by calling lookup_user_key() with KEY_DEFER_PERM_CHECK as
needed permission. Later, bpf_verify_pkcs7_signature(), if called,
completes the permission check by calling key_validate(). It does not need
to call key_task_permission() with permission KEY_NEED_SEARCH, as it is
already done elsewhere by the key subsystem. Future kfuncs using the
bpf_key structure need to implement the proper checks as well.

Finally, the last kfunc, bpf_verify_pkcs7_signature(), accepts the data and
signature to verify as eBPF dynamic pointers, to minimize the number of
kfunc parameters, and the keyring with keys for signature verification as a
bpf_key structure, returned by one of the two key lookup functions.

bpf_lookup_user_key() and bpf_verify_pkcs7_signature() can be called only
from sleepable programs, because of memory allocation and crypto
operations. For example, the lsm.s/bpf attach point is suitable,
fexit/array_map_update_elem is not.

The correctness of implementation of the new kfuncs and of their usage is
checked with the introduced tests.

The patch set includes a patch from another author (dependency) for sake of
completeness. It is organized as follows.

Patch 1 from KP Singh allows kfuncs to be used by LSM programs. Patch 2
exports the bpf_dynptr definition through BTF. Patch 3 splits
is_dynptr_reg_valid_init() and introduces is_dynptr_type_expected(), to
know more precisely the cause of a negative result of a dynamic pointer
check. Patch 4 allows dynamic pointers to be used as kfunc parameters.
Patch 5 exports bpf_dynptr_get_size(), to obtain the real size of data
carried by a dynamic pointer. Patch 6 makes available for new eBPF kfuncs
and programs some key-related definitions. Patch 7 introduces the
bpf_lookup_*_key() and bpf_key_put() kfuncs. Patch 8 introduces the
bpf_verify_pkcs7_signature() kfunc. Patch 9 changes the testing kernel
configuration to compile everything as built-in. Finally, patches 10-13
introduce the tests.

Changelog

v17:
 - Remove unnecessary typedefs in test_verify_pkcs7_sig.c (suggested by KP)
 - Add patch to export bpf_dynptr through BTF (reported by KP)
 - Rename u{8,16,32,64} variables to __u{8,16,32,64} in the tests, for
   consistency with other eBPF programs (suggested by Yonghong)

v16:
 - Remove comments in include/linux/key.h for KEY_LOOKUP_*
 - Change kmalloc() flag from GFP_ATOMIC to GFP_KERNEL in
   bpf_lookup_user_key(), as the kfunc needs anyway to be sleepable
   (suggested by Kumar)
 - Test passing a dynamic pointer with NULL data to
   bpf_verify_pkcs7_signature() (suggested by Kumar)

v15:
 - Add kfunc_dynptr_param test to deny list for s390x

v14:
 - Explain that is_dynptr_type_expected() will be useful also for BTF
   (suggested by Joanne)
 - Rename KEY_LOOKUP_FLAGS_ALL to KEY_LOOKUP_ALL (suggested by Jarkko)
 - Swap declaration of spi and dynptr_type in is_dynptr_type_expected()
   (suggested by Joanne)
 - Reimplement kfunc dynptr tests with a regular eBPF program instead of
   executing them with test_verifier (suggested by Joanne)
 - Make key lookup flags as enum so that they are automatically exported
   through BTF (suggested by Alexei)

v13:
 - Split is_dynptr_reg_valid_init() and introduce is_dynptr_type_expected()
   to see if the dynamic pointer type passed as argument to a kfunc is
   supported (suggested by Kumar)
 - Add forward declaration of struct key in include/linux/bpf.h (suggested
   by Song)
 - Declare mask for key lookup flags, remove key_lookup_flags_check()
   (suggested by Jarkko and KP)
 - Allow only certain dynamic pointer types (currently, local) to be passed
   as argument to kfuncs (suggested by Kumar)
 - For each dynamic pointer parameter in kfunc, additionally check if the
   passed pointer is to the stack (suggested by Kumar)
 - Split the validity/initialization and dynamic pointer type check also in
   the verifier, and adjust the expected error message in the test (a test
   for an unexpected dynptr type passed to a helper cannot be added due to
   missing suitable helpers, but this case has been tested manually)
 - Add verifier tests to check the dynamic pointers passed as argument to
   kfuncs (suggested by Kumar)

v12:
 - Put lookup_key and verify_pkcs7_sig tests in deny list for s390x (JIT
   does not support calling kernel function)

v11:
 - Move stringify_struct() macro to include/linux/btf.h (suggested by
   Daniel)
 - Change kernel configuration options in
   tools/testing/selftests/bpf/config* from =m to =y

v10:
 - Introduce key_lookup_flags_check() and system_keyring_id_check() inline
   functions to check parameters (suggested by KP)
 - Fix descriptions and comment of key-related kfuncs (suggested by KP)
 - Register kfunc set only once (suggested by Alexei)
 - Move needed kernel options to the architecture-independent configuration
   for testing

v9:
 - Drop patch to introduce KF_SLEEPABLE kfunc flag (already merged)
 - Rename valid_ptr member of bpf_key to has_ref (suggested by Daniel)
 - Check dynamic pointers in kfunc definition with bpf_dynptr_kern struct
   definition instead of string, to detect structure renames (suggested by
   Daniel)
 - Explicitly say that we permit initialized dynamic pointers in kfunc
   definition (suggested by Daniel)
 - Remove noinline __weak from kfuncs definition (reported by Daniel)
 - Simplify key lookup flags check in bpf_lookup_user_key() (suggested by
   Daniel)
 - Explain the reason for deferring key permission check (suggested by
   Daniel)
 - Allocate memory with GFP_ATOMIC in bpf_lookup_system_key(), and remove
   KF_SLEEPABLE kfunc flag from kfunc declaration (suggested by Daniel)
 - Define only one kfunc set and remove the loop for registration
   (suggested by Alexei)

v8:
 - Define the new bpf_key structure to carry the key pointer and whether
   that pointer is valid or not (suggested by Daniel)
 - Drop patch to mark a kfunc parameter with the __maybe_null suffix
 - Improve documentation of kfuncs
 - Introduce bpf_lookup_system_key() to obtain a key pointer suitable for
   verify_pkcs7_signature() (suggested by Daniel)
 - Use the new kfunc registration API
 - Drop patch to test the __maybe_null suffix
 - Add tests for bpf_lookup_system_key()

v7:
 - Add support for using dynamic and NULL pointers in kfunc (suggested by
   Alexei)
 - Add new kfunc-related tests

v6:
 - Switch back to key lookup helpers + signature verification (until v5),
   and defer permission check from bpf_lookup_user_key() to
   bpf_verify_pkcs7_signature()
 - Add additional key lookup test to illustrate the usage of the
   KEY_LOOKUP_CREATE flag and validate the flags (suggested by Daniel)
 - Make description of flags of bpf_lookup_user_key() more user-friendly
   (suggested by Daniel)
 - Fix validation of flags parameter in bpf_lookup_user_key() (reported by
   Daniel)
 - Rename bpf_verify_pkcs7_signature() keyring-related parameters to
   user_keyring and system_keyring to make their purpose more clear
 - Accept keyring-related parameters of bpf_verify_pkcs7_signature() as
   alternatives (suggested by KP)
 - Replace unsigned long type with u64 in helper declaration (suggested by
   Daniel)
 - Extend the bpf_verify_pkcs7_signature() test by calling the helper
   without data, by ensuring that the helper enforces the keyring-related
   parameters as alternatives, by ensuring that the helper rejects
   inaccessible and expired keyrings, and by checking all system keyrings
 - Move bpf_lookup_user_key() and bpf_key_put() usage tests to
   ref_tracking.c (suggested by John)
 - Call bpf_lookup_user_key() and bpf_key_put() only in sleepable programs

v5:
 - Move KEY_LOOKUP_ to include/linux/key.h
   for validation of bpf_verify_pkcs7_signature() parameter
 - Remove bpf_lookup_user_key() and bpf_key_put() helpers, and the
   corresponding tests
 - Replace struct key parameter of bpf_verify_pkcs7_signature() with the
   keyring serial and lookup flags
 - Call lookup_user_key() and key_put() in bpf_verify_pkcs7_signature()
   code, to ensure that the retrieved key is used according to the
   permission requested at lookup time
 - Clarified keyring precedence in the description of
   bpf_verify_pkcs7_signature() (suggested by John)
 - Remove newline in the second argument of ASSERT_
 - Fix helper prototype regular expression in bpf_doc.py

v4:
 - Remove bpf_request_key_by_id(), don't return an invalid pointer that
   other helpers can use
 - Pass the keyring ID (without ULONG_MAX, suggested by Alexei) to
   bpf_verify_pkcs7_signature()
 - Introduce bpf_lookup_user_key() and bpf_key_put() helpers (suggested by
   Alexei)
 - Add lookup_key_norelease test, to ensure that the verifier blocks eBPF
   programs which don't decrement the key reference count
 - Parse raw PKCS#7 signature instead of module-style signature in the
   verify_pkcs7_signature test (suggested by Alexei)
 - Parse kernel module in user space and pass raw PKCS#7 signature to the
   eBPF program for signature verification

v3:
 - Rename bpf_verify_signature() back to bpf_verify_pkcs7_signature() to
   avoid managing different parameters for each signature verification
   function in one helper (suggested by Daniel)
 - Use dynamic pointers and export bpf_dynptr_get_size() (suggested by
   Alexei)
 - Introduce bpf_request_key_by_id() to give more flexibility to the caller
   of bpf_verify_pkcs7_signature() to retrieve the appropriate keyring
   (suggested by Alexei)
 - Fix test by reordering the gcc command line, always compile sign-file
 - Improve helper support check mechanism in the test

v2:
 - Rename bpf_verify_pkcs7_signature() to a more generic
   bpf_verify_signature() and pass the signature type (suggested by KP)
 - Move the helper and prototype declaration under #ifdef so that user
   space can probe for support for the helper (suggested by Daniel)
 - Describe better the keyring types (suggested by Daniel)
 - Include linux/bpf.h instead of vmlinux.h to avoid implicit or
   redeclaration
 - Make the test selfcontained (suggested by Alexei)

v1:
 - Don't define new map flag but introduce simple wrapper of
   verify_pkcs7_signature() (suggested by Alexei and KP)
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

66d6a4bf

selftests/bpf: Add tests for dynamic pointers parameters in kfuncs · b94fa9f9

Roberto Sassu authored Sep 20, 2022

Add tests to ensure that only supported dynamic pointer types are accepted,
that the passed argument is actually a dynamic pointer, that the passed
argument is a pointer to the stack, and that bpf_verify_pkcs7_signature()
correctly handles dynamic pointers with data set to NULL.

The tests are currently in the deny list for s390x (JIT does not support
calling kernel function).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-14-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

b94fa9f9

selftests/bpf: Add test for bpf_verify_pkcs7_signature() kfunc · fc975906

Roberto Sassu authored Sep 20, 2022

Perform several tests to ensure the correct implementation of the
bpf_verify_pkcs7_signature() kfunc.

Do the tests with data signed with a generated testing key (by using
sign-file from scripts/) and with the tcp_bic.ko kernel module if it is
found in the system. The test does not fail if tcp_bic.ko is not found.

First, perform an unsuccessful signature verification without data.

Second, perform a successful signature verification with the session
keyring and a new one created for testing.

Then, ensure that permission and validation checks are done properly on the
keyring provided to bpf_verify_pkcs7_signature(), despite those checks were
deferred at the time the keyring was retrieved with bpf_lookup_user_key().
The tests expect to encounter an error if the Search permission is removed
from the keyring, or the keyring is expired.

Finally, perform a successful and unsuccessful signature verification with
the keyrings with pre-determined IDs (the last test fails because the key
is not in the platform keyring).

The test is currently in the deny list for s390x (JIT does not support
calling kernel function).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/20220920075951.929132-13-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

fc975906

selftests/bpf: Add additional tests for bpf_lookup_*_key() · ecce368d

Roberto Sassu authored Sep 20, 2022

Add a test to ensure that bpf_lookup_user_key() creates a referenced
special keyring when the KEY_LOOKUP_CREATE flag is passed to this function.

Ensure that the kfunc rejects invalid flags.

Ensure that a keyring can be obtained from bpf_lookup_system_key() when one
of the pre-determined keyring IDs is provided.

The test is currently blacklisted for s390x (JIT does not support calling
kernel function).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://lore.kernel.org/r/20220920075951.929132-12-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ecce368d

selftests/bpf: Add verifier tests for bpf_lookup_*_key() and bpf_key_put() · 7c036ed9

Roberto Sassu authored Sep 20, 2022

Add verifier tests for bpf_lookup_*_key() and bpf_key_put(), to ensure that
acquired key references stored in the bpf_key structure are released, that
a non-NULL bpf_key pointer is passed to bpf_key_put(), and that key
references are not leaked.

Also, slightly modify test_verifier.c, to find the BTF ID of the attach
point for the LSM program type (currently, it is done only for TRACING).
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-11-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

7c036ed9

selftests/bpf: Compile kernel with everything as built-in · 94fd7420

Roberto Sassu authored Sep 20, 2022

Since the eBPF CI does not support kernel modules, change the kernel config
to compile everything as built-in.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Daniel Müller <deso@posteo.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-10-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

94fd7420

bpf: Add bpf_verify_pkcs7_signature() kfunc · 865b0566

Roberto Sassu authored Sep 20, 2022

Add the bpf_verify_pkcs7_signature() kfunc, to give eBPF security modules
the ability to check the validity of a signature against supplied data, by
using user-provided or system-provided keys as trust anchor.

The new kfunc makes it possible to enforce mandatory policies, as eBPF
programs might be allowed to make security decisions only based on data
sources the system administrator approves.

The caller should provide the data to be verified and the signature as eBPF
dynamic pointers (to minimize the number of parameters) and a bpf_key
structure containing a reference to the keyring with keys trusted for
signature verification, obtained from bpf_lookup_user_key() or
bpf_lookup_system_key().

For bpf_key structures obtained from the former lookup function,
bpf_verify_pkcs7_signature() completes the permission check deferred by
that function by calling key_validate(). key_task_permission() is already
called by the PKCS#7 code.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20220920075951.929132-9-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

865b0566

bpf: Add bpf_lookup_*_key() and bpf_key_put() kfuncs · f3cf4134

Roberto Sassu authored Sep 20, 2022

Add the bpf_lookup_user_key(), bpf_lookup_system_key() and bpf_key_put()
kfuncs, to respectively search a key with a given key handle serial number
and flags, obtain a key from a pre-determined ID defined in
include/linux/verification.h, and cleanup.

Introduce system_keyring_id_check() to validate the keyring ID parameter of
bpf_lookup_system_key().
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20220920075951.929132-8-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f3cf4134

KEYS: Move KEY_LOOKUP_ to include/linux/key.h and define KEY_LOOKUP_ALL · 90fd8f26

Roberto Sassu authored Sep 20, 2022

In preparation for the patch that introduces the bpf_lookup_user_key() eBPF
kfunc, move KEY_LOOKUP_ definitions to include/linux/key.h, to be able to
validate the kfunc parameters. Add them to enum key_lookup_flag, so that
all the current ones and the ones defined in the future are automatically
exported through BTF and available to eBPF programs.

Also, add KEY_LOOKUP_ALL to the enum, with the logical OR of currently
defined flags as value, to facilitate checking whether a variable contains
only those flags.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Link: https://lore.kernel.org/r/20220920075951.929132-7-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

90fd8f26

bpf: Export bpf_dynptr_get_size() · 51df4865

Roberto Sassu authored Sep 20, 2022

Export bpf_dynptr_get_size(), so that kernel code dealing with eBPF dynamic
pointers can obtain the real size of data carried by this data structure.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-6-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

51df4865

btf: Allow dynamic pointer parameters in kfuncs · b8d31762

Roberto Sassu authored Sep 20, 2022

Allow dynamic pointers (struct bpf_dynptr_kern *) to be specified as
parameters in kfuncs. Also, ensure that dynamic pointers passed as argument
are valid and initialized, are a pointer to the stack, and of the type
local. More dynamic pointer types can be supported in the future.

To properly detect whether a parameter is of the desired type, introduce
the stringify_struct() macro to compare the returned structure name with
the desired name. In addition, protect against structure renames, by
halting the build with BUILD_BUG_ON(), so that developers have to revisit
the code.

To check if a dynamic pointer passed to the kfunc is valid and initialized,
and if its type is local, export the existing functions
is_dynptr_reg_valid_init() and is_dynptr_type_expected().

Cc: Joanne Koong <joannelkoong@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-5-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

b8d31762

bpf: Move dynptr type check to is_dynptr_type_expected() · e9e315b4

Roberto Sassu authored Sep 20, 2022

Move dynptr type check to is_dynptr_type_expected() from
is_dynptr_reg_valid_init(), so that callers can better determine the cause
of a negative result (dynamic pointer not valid/initialized, dynamic
pointer of the wrong type). It will be useful for example for BTF, to
restrict which dynamic pointer types can be passed to kfuncs, as initially
only the local type will be supported.

Also, splitting makes the code more readable, since checking the dynamic
pointer type is not necessarily related to validity and initialization.

Split the validity/initialization and dynamic pointer type check also in
the verifier, and adjust the expected error message in the test (a test for
an unexpected dynptr type passed to a helper cannot be added due to missing
suitable helpers, but this case has been tested manually).

Cc: Joanne Koong <joannelkoong@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-4-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

e9e315b4

btf: Export bpf_dynptr definition · 00f14641

Roberto Sassu authored Sep 20, 2022

eBPF dynamic pointers is a new feature recently added to upstream. It binds
together a pointer to a memory area and its size. The internal kernel
structure bpf_dynptr_kern is not accessible by eBPF programs in user space.
They instead see bpf_dynptr, which is then translated to the internal
kernel structure by the eBPF verifier.

The problem is that it is not possible to include at the same time the uapi
include linux/bpf.h and the vmlinux BTF vmlinux.h, as they both contain the
definition of some structures/enums. The compiler complains saying that the
structures/enums are redefined.

As bpf_dynptr is defined in the uapi include linux/bpf.h, this makes it
impossible to include vmlinux.h. However, in some cases, e.g. when using
kfuncs, vmlinux.h has to be included. The only option until now was to
include vmlinux.h and add the definition of bpf_dynptr directly in the eBPF
program source code from linux/bpf.h.

Solve the problem by using the same approach as for bpf_timer (which also
follows the same scheme with the _kern suffix for the internal kernel
structure).

Add the following line in one of the dynamic pointer helpers,
bpf_dynptr_from_mem():

BTF_TYPE_EMIT(struct bpf_dynptr);

Cc: stable@vger.kernel.org
Cc: Joanne Koong <joannelkoong@gmail.com>
Fixes: 97e03f52 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Tested-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/r/20220920075951.929132-3-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

00f14641

bpf: Allow kfuncs to be used in LSM programs · d15bf150

KP Singh authored Sep 20, 2022

In preparation for the addition of new kfuncs, allow kfuncs defined in the
tracing subsystem to be used in LSM programs by mapping the LSM program
type to the TRACING hook.
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220920075951.929132-2-roberto.sassu@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

d15bf150

libbpf: Support raw BTF placed in the default search path · 01f2e36c

Tao Chen authored Sep 13, 2022

Currently, the default vmlinux files at '/boot/vmlinux-*',
'/lib/modules/*/vmlinux-*' etc. are parsed with 'btf__parse_elf()' to
extract BTF. It is possible that these files are actually raw BTF files
similar to /sys/kernel/btf/vmlinux. So parse these files with
'btf__parse' which tries both raw format and ELF format.

This might be useful in some scenarios where users put their custom BTF
into known locations and don't want to specify btf_custom_path option.
Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/3f59fb5a345d2e4f10e16fe9e35fbc4c03ecaa3e.1662999860.git.chentao.kernel@linux.alibaba.com

01f2e36c

selftests: bpf: test_kmod.sh: Pass parameters to the module · 272d1f4c

Yauheni Kaliuta authored Sep 08, 2022

It's possible to specify particular tests for test_bpf.ko with
module parameters. Make it possible to pass the module parameters,
example:

test_kmod.sh test_range=1,3

Since magnitude tests take long time it can be reasonable to skip
them.
Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220908120146.381218-1-ykaliuta@redhat.com

272d1f4c

libbpf: Improve BPF_PROG2 macro code quality and description · 9f2f5d78

Yonghong Song authored Sep 09, 2022

Commit 34586d29 ("libbpf: Add new BPF_PROG2 macro") added BPF_PROG2
macro for trampoline based programs with struct arguments. Andrii
made a few suggestions to improve code quality and description.
This patch implemented these suggestions including better internal
macro name, consistent usage pattern for __builtin_choose_expr(),
simpler macro definition for always-inline func arguments and
better macro description.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20220910025214.1536510-1-yhs@fb.com

9f2f5d78

21 Sep, 2022 8 commits

Merge branch 'bpf: Add user-space-publisher ring buffer map type' · c12a0376

Andrii Nakryiko authored Sep 21, 2022

David Vernet says:

====================
This patch set defines a new map type, BPF_MAP_TYPE_USER_RINGBUF, which
provides single-user-space-producer / single-kernel-consumer semantics over
a ring buffer.  Along with the new map type, a helper function called
bpf_user_ringbuf_drain() is added which allows a BPF program to specify a
callback with the following signature, to which samples are posted by the
helper:

void (struct bpf_dynptr *dynptr, void *context);

The program can then use the bpf_dynptr_read() or bpf_dynptr_data() helper
functions to safely read the sample from the dynptr. There are currently no
helpers available to determine the size of the sample, but one could easily
be added if required.

On the user-space side, libbpf has been updated to export a new
'struct ring_buffer_user' type, along with the following symbols:

struct ring_buffer_user *
ring_buffer_user__new(int map_fd,
                      const struct ring_buffer_user_opts *opts);
void ring_buffer_user__free(struct ring_buffer_user *rb);
void *ring_buffer_user__reserve(struct ring_buffer_user *rb,
				uint32_t size);
void *ring_buffer_user__poll(struct ring_buffer_user *rb, uint32_t size,
			     int timeout_ms);
void ring_buffer_user__discard(struct ring_buffer_user *rb, void *sample);
void ring_buffer_user__submit(struct ring_buffer_user *rb, void *sample);

These symbols are exported for inclusion in libbpf version 1.0.0.
Signed-off-by: David Vernet <void@manifault.com>
---
v5 -> v6:
- Fixed s/BPF_MAP_TYPE_RINGBUF/BPF_MAP_TYPE_USER_RINGBUF typo in the
  libbpf user ringbuf doxygen header comment for ring_buffer_user__new()
  (Andrii).
- Specify that pointer returned from ring_buffer_user__reserve() and its
  blocking counterpart is 8-byte aligned (Andrii).
- Renamed user_ringbuf__commit() to user_ringbuf_commit(), as it's static
  (Andrii).
- Another slight reworking of user_ring_buffer__reserve_blocking() to
  remove some extraneous nanosecond variables + checking (Andrii).
- Add a final check of user_ring_buffer__reserve() in
  user_ring_buffer__reserve_blocking().
- Moved busy bit lock / unlock logic from __bpf_user_ringbuf_peek() to
  bpf_user_ringbuf_drain() (Andrii).
- -ENOSPC -> -ENODATA for an empty ring buffer in
  __bpf_user_ringbuf_peek() (Andrii).
- Updated BPF_RB_FORCE_WAKEUP to only force a wakeup notification to be
  sent if even if no sample was drained.
- Changed a bit of the wording in the UAPI header for
  bpf_user_ringbuf_drain() to mention the BPF_RB_FORCE_WAKEUP behavior.
- Remove extra space after return in ringbuf_map_poll_user() (Andrii).
- Removed now-extraneous paragraph from the commit summary of patch 2/4
  (Andrii).
v4 -> v5:
- DENYLISTed the user-ringbuf test suite on s390x. We have a number of
  functions in the progs/user_ringbuf_success.c prog that user-space
  fires by invoking a syscall. Not all of these syscalls are available
  on s390x. If and when we add the ability to kick the kernel from
  user-space, or if we end up using iterators for that per Hao's
  suggestion, we could re-enable this test suite on s390x.
- Fixed a few more places that needed ringbuffer -> ring buffer.
v3 -> v4:
- Update BPF_MAX_USER_RINGBUF_SAMPLES to not specify a bit, and instead
  just specify a number of samples. (Andrii)
- Update "ringbuffer" in comments and commit summaries to say "ring
  buffer". (Andrii)
- Return -E2BIG from bpf_user_ringbuf_drain() both when a sample can't
  fit into the ring buffer, and when it can't fit into a dynptr. (Andrii)
- Don't loop over samples in __bpf_user_ringbuf_peek() if a sample was
  discarded. Instead, return -EAGAIN so the caller can deal with it. Also
  updated the caller to detect -EAGAIN and skip over it when iterating.
  (Andrii)
- Removed the heuristic for notifying user-space when a sample is drained,
  causing the ring buffer to no longer be full. This may be useful in the
  future, but is being removed now because it's strictly a heuristic.
- Re-add BPF_RB_FORCE_WAKEUP flag to bpf_user_ringbuf_drain(). (Andrii)
- Remove helper_allocated_dynptr tracker from verifier. (Andrii)
- Add libbpf function header comments to tools/lib/bpf/libbpf.h, so that
  they will be included in rendered libbpf docs. (Andrii)
- Add symbols to a new LIBBPF_1.1.0 section in linker version script,
  rather than including them in LIBBPF_1.0.0. (Andrii)
- Remove libbpf_err() calls from static libbpf functions. (Andrii)
- Check user_ring_buffer_opts instead of ring_buffer_opts in
  user_ring_buffer__new(). (Andrii)
- Avoid an extra if in the hot path in user_ringbuf__commit(). (Andrii)
- Use ENOSPC rather than ENODATA if no space is available in the ring
  buffer. (Andrii)
- Don't round sample size in header to 8, but still round size that is
  reserved and written to 8, and validate positions are multiples of 8
  (Andrii).
- Use nanoseconds for most calculations in
  user_ring_buffer__reserve_blocking(). (Andrii)
- Don't use CHECK() in testcases, instead use ASSERT_*. (Andrii)
- Use SEC("?raw_tp") instead of SEC("?raw_tp/sys_nanosleep") in negative
  test. (Andrii)
- Move test_user_ringbuf.h header to live next to BPF program instead of
  a directory up from both it and the user-space test program. (Andrii)
- Update bpftool help message / docs to also include user_ringbuf.
v2 -> v3:
- Lots of formatting fixes, such as keeping things on one line if they fit
  within 100 characters, and removing some extraneous newlines. Applies
  to all diffs in the patch-set. (Andrii)
- Renamed ring_buffer_user__* symbols to user_ring_buffer__*. (Andrii)
- Added a missing smb_mb__before_atomic() in
  __bpf_user_ringbuf_sample_release(). (Hao)
- Restructure how and when notification events are sent from the kernel to
  the user-space producers via the .map_poll() callback for the
  BPF_MAP_TYPE_USER_RINGBUF map. Before, we only sent a notification when
  the ringbuffer was fully drained. Now, we guarantee user-space that
  we'll send an event at least once per bpf_user_ringbuf_drain(), as long
  as at least one sample was drained, and BPF_RB_NO_WAKEUP was not passed.
  As a heuristic, we also send a notification event any time a sample being
  drained causes the ringbuffer to no longer be full. (Andrii)
- Continuing on the above point, updated
  user_ring_buffer__reserve_blocking() to loop around epoll_wait() until a
  sufficiently large sample is found. (Andrii)
- Communicate BPF_RINGBUF_BUSY_BIT and BPF_RINGBUF_DISCARD_BIT in sample
  headers. The ringbuffer implementation still only supports
  single-producer semantics, but we can now add synchronization support in
  user_ring_buffer__reserve(), and will automatically get multi-producer
  semantics. (Andrii)
- Updated some commit summaries, specifically adding more details where
  warranted. (Andrii)
- Improved function documentation for bpf_user_ringbuf_drain(), more
  clearly explaining all function arguments and return types, as well as
  the semantics for waking up user-space producers.
- Add function header comments for user_ring_buffer__reserve{_blocking}().
  (Andrii)
- Rounding-up all samples to 8-bytes in the user-space producer, and
  enforcing that all samples are properly aligned in the kernel. (Andrii)
- Added testcases that verify that bpf_user_ringbuf_drain() properly
  validates samples, and returns error conditions if any invalid samples
  are encountered. (Andrii)
- Move atomic_t busy field out of the consumer page, and into the
  struct bpf_ringbuf. (Andrii)
- Split ringbuf_map_{mmap, poll}_{kern, user}() into separate
  implementations. (Andrii)
- Don't silently consume errors in bpf_user_ringbuf_drain(). (Andrii)
- Remove magic number of samples (4096) from bpf_user_ringbuf_drain(),
  and instead use BPF_MAX_USER_RINGBUF_SAMPLES macro, which allows
  128k samples. (Andrii)
- Remove MEM_ALLOC modifier from PTR_TO_DYNPTR register in verifier, and
  instead rely solely on the register being PTR_TO_DYNPTR. (Andrii)
- Move freeing of atomic_t busy bit to before we invoke irq_work_queue() in
  __bpf_user_ringbuf_sample_release(). (Andrii)
- Only check for BPF_RB_NO_WAKEUP flag in bpf_ringbuf_drain().
- Remove libbpf function names from kernel smp_{load, store}* comments in
  the kernel. (Andrii)
- Don't use double-underscore naming convention in libbpf functions.
  (Andrii)
- Use proper __u32 and __u64 for types where we need to guarantee their
  size. (Andrii)

v1 -> v2:
- Following Joanne landing 88374342 ("bpf: Fix ref_obj_id for dynptr
  data slices in verifier") [0], removed [PATCH 1/5] bpf: Clear callee
  saved regs after updating REG0 [1]. (Joanne)
- Following the above adjustment, updated check_helper_call() to not store
  a reference for bpf_dynptr_data() if the register containing the dynptr
  is of type MEM_ALLOC. (Joanne)
- Fixed casting issue pointed out by kernel test robot by adding a missing
  (uintptr_t) cast. (lkp)

[0] https://lore.kernel.org/all/20220809214055.4050604-1-joannelkoong@gmail.com/
[1] https://lore.kernel.org/all/20220808155341.2479054-1-void@manifault.com/
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

c12a0376

selftests/bpf: Add selftests validating the user ringbuf · e5a9df51

David Vernet authored Sep 19, 2022

This change includes selftests that validate the expected behavior and
APIs of the new BPF_MAP_TYPE_USER_RINGBUF map type.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-5-void@manifault.com

e5a9df51

bpf: Add libbpf logic for user-space ring buffer · b66ccae0

David Vernet authored Sep 19, 2022

Now that all of the logic is in place in the kernel to support user-space
produced ring buffers, we can add the user-space logic to libbpf. This
patch therefore adds the following public symbols to libbpf:

struct user_ring_buffer *
user_ring_buffer__new(int map_fd,
		      const struct user_ring_buffer_opts *opts);
void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                         __u32 size, int timeout_ms);
void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample);
void user_ring_buffer__discard(struct user_ring_buffer *rb,
void user_ring_buffer__free(struct user_ring_buffer *rb);

A user-space producer must first create a struct user_ring_buffer * object
with user_ring_buffer__new(), and can then reserve samples in the
ring buffer using one of the following two symbols:

void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size);
void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb,
                                         __u32 size, int timeout_ms);

With user_ring_buffer__reserve(), a pointer to a 'size' region of the ring
buffer will be returned if sufficient space is available in the buffer.
user_ring_buffer__reserve_blocking() provides similar semantics, but will
block for up to 'timeout_ms' in epoll_wait if there is insufficient space
in the buffer. This function has the guarantee from the kernel that it will
receive at least one event-notification per invocation to
bpf_ringbuf_drain(), provided that at least one sample is drained, and the
BPF program did not pass the BPF_RB_NO_WAKEUP flag to bpf_ringbuf_drain().

Once a sample is reserved, it must either be committed to the ring buffer
with user_ring_buffer__submit(), or discarded with
user_ring_buffer__discard().
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-4-void@manifault.com

b66ccae0

bpf: Add bpf_user_ringbuf_drain() helper · 20571567

David Vernet authored Sep 19, 2022

In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which
will allow user-space applications to publish messages to a ring buffer
that is consumed by a BPF program in kernel-space. In order for this
map-type to be useful, it will require a BPF helper function that BPF
programs can invoke to drain samples from the ring buffer, and invoke
callbacks on those samples. This change adds that capability via a new BPF
helper function:

bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx,
                       u64 flags)

BPF programs may invoke this function to run callback_fn() on a series of
samples in the ring buffer. callback_fn() has the following signature:

long callback_fn(struct bpf_dynptr *dynptr, void *context);

Samples are provided to the callback in the form of struct bpf_dynptr *'s,
which the program can read using BPF helper functions for querying
struct bpf_dynptr's.

In order to support bpf_ringbuf_drain(), a new PTR_TO_DYNPTR register
type is added to the verifier to reflect a dynptr that was allocated by
a helper function and passed to a BPF program. Unlike PTR_TO_STACK
dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR
dynptrs need not use reference tracking, as the BPF helper is trusted to
properly free the dynptr before returning. The verifier currently only
supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL.

Note that while the corresponding user-space libbpf logic will be added
in a subsequent patch, this patch does contain an implementation of the
.map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This
.map_poll() callback guarantees that an epoll-waiting user-space
producer will receive at least one event notification whenever at least
one sample is drained in an invocation of bpf_user_ringbuf_drain(),
provided that the function is not invoked with the BPF_RB_NO_WAKEUP
flag. If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup
notification is sent even if no sample was drained.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-3-void@manifault.com

20571567

bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type · 583c1f42

David Vernet authored Sep 19, 2022

We want to support a ringbuf map type where samples are published from
user-space, to be consumed by BPF programs. BPF currently supports a
kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF
map type.  We'll need to define a new map type for user-space -> kernel,
as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply
to a user-space producer ring buffer, and we'll want to add one or
more helper functions that would not apply for a kernel-producer
ring buffer.

This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
definition. The map type is useless in its current form, as there is no
way to access or use it for anything until we one or more BPF helpers. A
follow-on patch will therefore add a new helper function that allows BPF
programs to run callbacks on samples that are published to the ring
buffer.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com

583c1f42

bpf: simplify code in btf_parse_hdr · 3a74904c

William Dean authored Sep 17, 2022

It could directly return 'btf_check_sec_info' to simplify code.
Signed-off-by: William Dean <williamsukatube@163.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220917084248.3649-1-williamsukatube@163.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

3a74904c

libbpf: Fix NULL pointer exception in API btf_dump__dump_type_data · 7620bffb

Xin Liu authored Sep 17, 2022

We found that function btf_dump__dump_type_data can be called by the
user as an API, but in this function, the `opts` parameter may be used
as a null pointer.This causes `opts->indent_str` to trigger a NULL
pointer exception.

Fixes: 2ce8450e ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts")
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Weibin Kong <kongweibin2@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220917084809.30770-1-liuxin350@huawei.com

7620bffb

samples/bpf: Replace blk_account_io_done() with __blk_account_io_done() · bc069da6

Rong Tao authored Sep 11, 2022

Since commit be6bfe36 ("block: inline hot paths of blk_account_io_*()")
blk_account_io_*() become inline functions.
Signed-off-by: Rong Tao <rtoax@foxmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/tencent_1CC476835C219FACD84B6715F0D785517E07@qq.com

bc069da6

20 Sep, 2022 5 commits

Merge branch 'bpf: Small nf_conn cleanups' · bfa8fe95

Martin KaFai Lau authored Sep 20, 2022

Daniel Xu says:

====================

This patchset cleans up a few small things:

* Delete unused stub
* Rename variable to be more descriptive
* Fix some `extern` declaration warnings

Past discussion:
- v2: https://lore.kernel.org/bpf/cover.1663616584.git.dxu@dxuuu.xyz/

Changes since v2:
- Remove unused #include's
- Move #include <linux/filter.h> to .c
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

bfa8fe95

bpf: Move nf_conn extern declarations to filter.h · fdf21497

Daniel Xu authored Sep 20, 2022

We're seeing the following new warnings on netdev/build_32bit and
netdev/build_allmodconfig_warn CI jobs:

    ../net/core/filter.c:8608:1: warning: symbol
    'nf_conn_btf_access_lock' was not declared. Should it be static?
    ../net/core/filter.c:8611:5: warning: symbol 'nfct_bsa' was not
    declared. Should it be static?

Fix by ensuring extern declaration is present while compiling filter.o.

Fixes: 864b656f ("bpf: Add support for writing to nf_conn:mark")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/2bd2e0283df36d8a4119605878edb1838d144174.1663683114.git.dxu@dxuuu.xyzSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

fdf21497

bpf: Rename nfct_bsa to nfct_btf_struct_access · 5a090aa3

Daniel Xu authored Sep 20, 2022

The former name was a little hard to guess.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/73adc72385c8b162391fbfb404f0b6d4c5cc55d7.1663683114.git.dxu@dxuuu.xyzSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

5a090aa3

bpf: Remove unused btf_struct_access stub · 52bdae37

Daniel Xu authored Sep 20, 2022

This stub was not being used anywhere.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/590e7bd6172ffe0f3d7b51cd40e8ded941aaf7e8.1663683114.git.dxu@dxuuu.xyzSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

52bdae37

bpf: Check whether or not node is NULL before free it in free_bulk · c31b38cb

Hou Tao authored Sep 19, 2022

llnode could be NULL if there are new allocations after the checking of
c-free_cnt > c->high_watermark in bpf_mem_refill() and before the
calling of __llist_del_first() in free_bulk (e.g. a PREEMPT_RT kernel
or allocation in NMI context). And it will incur oops as shown below:

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT_RT SMP
 CPU: 39 PID: 373 Comm: irq_work/39 Tainted: G        W          6.0.0-rc6-rt9+ #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 RIP: 0010:bpf_mem_refill+0x66/0x130
 ......
 Call Trace:
  <TASK>
  irq_work_single+0x24/0x60
  irq_work_run_list+0x24/0x30
  run_irq_workd+0x18/0x20
  smpboot_thread_fn+0x13f/0x2c0
  kthread+0x121/0x140
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Simply fixing it by checking whether or not llnode is NULL in free_bulk().

Fixes: 8d5a8011 ("bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220919144811.3570825-1-houtao@huaweicloud.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c31b38cb

19 Sep, 2022 1 commit

selftests/bpf: Add test result messages for test_task_storage_map_stress_lookup · a7e85406

Hou Tao authored Sep 19, 2022

Add test result message when test_task_storage_map_stress_lookup()
succeeds or is skipped. The test case can be skipped due to the choose
of preemption model in kernel config, so export skips in test_maps.c and
increase it when needed.

The following is the output of test_maps when the test case succeeds or
is skipped:

  test_task_storage_map_stress_lookup:PASS
  test_maps: OK, 0 SKIPPED

  test_task_storage_map_stress_lookup SKIP (no CONFIG_PREEMPT)
  test_maps: OK, 1 SKIPPED

Fixes: 73b97bc7 ("selftests/bpf: Test concurrent updates on bpf_task_storage_busy")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20220919035714.2195144-1-houtao@huaweicloud.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

a7e85406

17 Sep, 2022 1 commit

bpf/btf: Use btf_type_str() whenever possible · 571f9738

Peilin Ye authored Sep 16, 2022

We have btf_type_str(). Use it whenever possible in btf.c, instead of
"btf_kind_str[BTF_INFO_KIND(t->info)]".
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Link: https://lore.kernel.org/r/20220916202800.31421-1-yepeilin.cs@gmail.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

571f9738

16 Sep, 2022 5 commits

libbpf: Clean up legacy bpf maps declaration in bpf_helpers · dc567045

Xin Liu authored Sep 13, 2022

Legacy BPF map declarations are no longer supported in libbpf v1.0 [0].
Only BTF-defined maps are supported starting from v1.0, so it is time to
remove the definition of bpf_map_def in bpf_helpers.h.

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20220913073643.19960-1-liuxin350@huawei.com

dc567045

selftests/bpf: Add veristat tool for mass-verifying BPF object files · c8bc5e05

Andrii Nakryiko authored Sep 09, 2022

Add a small tool, veristat, that allows mass-verification of
a set of *libbpf-compatible* BPF ELF object files. For each such object
file, veristat will attempt to verify each BPF program *individually*.
Regardless of success or failure, it parses BPF verifier stats and
outputs them in human-readable table format. In the future we can also
add CSV and JSON output for more scriptable post-processing, if necessary.

veristat allows to specify a set of stats that should be output and
ordering between multiple objects and files (e.g., so that one can
easily order by total instructions processed, instead of default file
name, prog name, verdict, total instructions order).

This tool should be useful for validating various BPF verifier changes
or even validating different kernel versions for regressions.

Here's an example for some of the heaviest selftests/bpf BPF object
files:

$ sudo ./veristat -s insns,file,prog {pyperf,loop,test_verif_scale,strobemeta,test_cls_redirect,profiler}*.linked3.o
File Program Verdict Duration, us Total insns Total states Peak states
------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ -----------
loop3.linked3.o while_true failure 350990 1000001 9663 9663
test_verif_scale3.linked3.o balancer_ingress success 115244 845499 8636 2141
test_verif_scale2.linked3.o balancer_ingress success 77688 773445 3048 788
pyperf600.linked3.o on_event success 2079872 624585 30335 30241
pyperf600_nounroll.linked3.o on_event success 353972 568128 37101 2115
strobemeta.linked3.o on_event success 455230 557149 15915 13537
test_verif_scale1.linked3.o balancer_ingress success 89880 554754 8636 2141
strobemeta_nounroll2.linked3.o on_event success 433906 501725 17087 1912
loop6.linked3.o trace_virtqueue_add_sgs success 282205 398057 8717 919
loop1.linked3.o nested_loops success 125630 361349 5504 5504
pyperf180.linked3.o on_event success 2511740 160398 11470 11446
pyperf100.linked3.o on_event success 744329 87681 6213 6191
test_cls_redirect.linked3.o cls_redirect success 54087 78925 4782 903
strobemeta_subprogs.linked3.o on_event success 57898 65420 1954 403
test_cls_redirect_subprogs.linked3.o cls_redirect success 54522 64965 4619 958
strobemeta_nounroll1.linked3.o on_event success 43313 57240 1757 382
pyperf50.linked3.o on_event success 194355 46378 3263 3241
profiler2.linked3.o tracepoint__syscalls__sys_enter_kill success 23869 43372 1423 542
pyperf_subprogs.linked3.o on_event success 29179 36358 2499 2499
profiler1.linked3.o tracepoint__syscalls__sys_enter_kill success 13052 27036 1946 936
profiler3.linked3.o tracepoint__syscalls__sys_enter_kill success 21023 26016 2186 915
profiler2.linked3.o kprobe__vfs_link success 5255 13896 303 271
profiler1.linked3.o kprobe__vfs_link success 7792 12687 1042 1041
profiler3.linked3.o kprobe__vfs_link success 7332 10601 865 865
profiler2.linked3.o kprobe_ret__do_filp_open success 3417 8900 216 199
profiler2.linked3.o kprobe__vfs_symlink success 3548 8775 203 186
pyperf_global.linked3.o on_event success 10007 7563 520 520
profiler3.linked3.o kprobe_ret__do_filp_open success 4708 6464 532 532
profiler1.linked3.o kprobe_ret__do_filp_open success 3090 6445 508 508
profiler3.linked3.o kprobe__vfs_symlink success 4477 6358 521 521
profiler1.linked3.o kprobe__vfs_symlink success 3381 6347 507 507
profiler2.linked3.o raw_tracepoint__sched_process_exec success 2464 5874 292 189
profiler3.linked3.o raw_tracepoint__sched_process_exec success 2677 4363 397 283
profiler2.linked3.o kprobe__proc_sys_write success 1800 4355 143 138
profiler1.linked3.o raw_tracepoint__sched_process_exec success 1649 4019 333 240
pyperf600_bpf_loop.linked3.o on_event success 2711 3966 306 306
profiler2.linked3.o raw_tracepoint__sched_process_exit success 1234 3138 83 66
profiler3.linked3.o kprobe__proc_sys_write success 1755 2623 223 223
profiler1.linked3.o kprobe__proc_sys_write success 1222 2456 193 193
loop2.linked3.o while_true success 608 1783 57 30
profiler3.linked3.o raw_tracepoint__sched_process_exit success 789 1680 146 146
profiler1.linked3.o raw_tracepoint__sched_process_exit success 592 1526 133 133
strobemeta_bpf_loop.linked3.o on_event success 1015 1512 106 106
loop4.linked3.o combinations success 165 524 18 17
profiler3.linked3.o raw_tracepoint__sched_process_fork success 196 299 25 25
profiler1.linked3.o raw_tracepoint__sched_process_fork success 109 265 19 19
profiler2.linked3.o raw_tracepoint__sched_process_fork success 111 265 19 19
loop5.linked3.o while_true success 47 84 9 9
------------------------------------ ------------------------------------ ------- ------------ ----------- ------------ -----------
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220909193053.577111-4-andrii@kernel.org

c8bc5e05

libbpf: Fix crash if SEC("freplace") programs don't have attach_prog_fd set · 749c202c

Andrii Nakryiko authored Sep 09, 2022

Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF
for freplace programs. It's wrong to search in vmlinux BTF and libbpf
doesn't even mark vmlinux BTF as required for freplace programs. So
trying to search anything in obj->vmlinux_btf might cause NULL
dereference if nothing else in BPF object requires vmlinux BTF.

Instead, error out if freplace (EXT) program doesn't specify
attach_prog_fd during at the load time.

Fixes: 91abb4a6 ("libbpf: Support attachment of BPF tracing programs to kernel modules")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org

749c202c

selftests/bpf: Fix test_verif_scale{1,3} SEC() annotations · cf060c2c

Andrii Nakryiko authored Sep 09, 2022

Use proper SEC("tc") for test_verif_scale{1,3} programs. It's not
a problem for selftests right now because we manually set type
programmatically, but not having correct SEC() definitions makes it
harded to generically load BPF object files.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220909193053.577111-2-andrii@kernel.org

cf060c2c

bpf: Move bpf_dispatcher function out of ftrace locations · ceea991a

Jiri Olsa authored Sep 03, 2022

The dispatcher function is attached/detached to trampoline by
dispatcher update function. At the same time it's available as
ftrace attachable function.

After discussion [1] the proposed solution is to use compiler
attributes to alter bpf_dispatcher_##name##_func function:

  - remove it from being instrumented with __no_instrument_function__
    attribute, so ftrace has no track of it

  - but still generate 5 nop instructions with patchable_function_entry(5)
    attribute, which are expected by bpf_arch_text_poke used by
    dispatcher update function

Enabling HAVE_DYNAMIC_FTRACE_NO_PATCHABLE option for x86, so
__patchable_function_entries functions are not part of ftrace/mcount
locations.

Adding attributes to bpf_dispatcher_XXX function on x86_64 so it's
kept out of ftrace locations and has 5 byte nop generated at entry.

These attributes need to be arch specific as pointed out by Ilya
Leoshkevic in here [2].

The dispatcher image is generated only for x86_64 arch, so the
code can stay as is for other archs.

  [1] https://lore.kernel.org/bpf/20220722110811.124515-1-jolsa@kernel.org/
  [2] https://lore.kernel.org/bpf/969a14281a7791c334d476825863ee449964dd0c.camel@linux.ibm.com/Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20220903131154.420467-3-jolsa@kernel.org

ceea991a