Commits · 89bbc53a4dbbbdd65962d4bbaeba6c7775ea0bf7 · Kirill Smelkov / linux

23 Jan, 2023 4 commits

bpf: Reshuffle some parts of bpf/offload.c · 89bbc53a

Stanislav Fomichev authored Jan 19, 2023

To avoid adding forward declarations in the main patch, shuffle
some code around. No functional changes.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-5-sdf@google.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

89bbc53a

bpf: Move offload initialization into late_initcall · f1fc43d0

Stanislav Fomichev authored Jan 19, 2023

So we don't have to initialize it manually from several paths.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-4-sdf@google.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

f1fc43d0

bpf: Rename bpf_{prog,map}_is_dev_bound to is_offloaded · 9d03ebc7

Stanislav Fomichev authored Jan 19, 2023

BPF offloading infra will be reused to implement
bound-but-not-offloaded bpf programs. Rename existing
helpers for clarity. No functional changes.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-3-sdf@google.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

9d03ebc7

bpf: Document XDP RX metadata · a4aeb9d6

Stanislav Fomichev authored Jan 19, 2023

Document all current use-cases and assumptions.

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-2-sdf@google.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

a4aeb9d6

21 Jan, 2023 13 commits

Merge branch 'Dynptr fixes' · 84150795

Alexei Starovoitov authored Jan 20, 2023

Kumar Kartikeya Dwivedi says:

====================

This is part 2 of https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com.

Changelog:
----------
v4 -> v5
v5: https://lore.kernel.org/bpf/20230120070355.1983560-1-memxor@gmail.com

 * Add comments, tests from Joanne
 * Add Joanne's acks

v3 -> v4
v3: https://lore.kernel.org/bpf/20230120034314.1921848-1-memxor@gmail.com

 * Adopt BPF ASM tests to more readable style (Alexei)

v2 -> v3
v2: https://lore.kernel.org/bpf/20230119021442.1465269-1-memxor@gmail.com

 * Fix slice invalidation logic for unreferenced dynptrs (Joanne)
 * Add selftests for precise slice invalidation on destruction
 * Add Joanne's acks

v1 -> v2
v1: https://lore.kernel.org/bpf/20230101083403.332783-1-memxor@gmail.com

 * Return error early in case of overwriting referenced dynptr slots (Andrii, Joanne)
 * Rename destroy_stack_slots_dynptr to destroy_if_dynptr_stack_slot (Joanne)
 * Invalidate dynptr slices associated with dynptr in destroy_if_dynptr_stack_slot (Joanne)
 * Combine both dynptr_get_spi and is_spi_bounds_valid (Joanne)
 * Compute spi once in process_dynptr_func and pass it as parameter instead of recomputing (Joanne)
 * Add comments expanding REG_LIVE_WRITTEN marking in unmark_stack_slots_dynptr (Joanne)
 * Add comments explaining why destroy_if_dynptr_stack_slot call needs to be done for both spi
   and spi - 1 (Joanne)
 * Port BPF assembly tests from test_verifier to test_progs framework (Andrii)
 * Address misc feedback, rebase to bpf-next

Old v1 -> v1
Old v1: https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com

 * Allow overwriting dynptr stack slots from dynptr init helpers
 * Fix a bug in alignment check where reg->var_off.value was still not included
 * Address other minor nits

Eduard Zingerman (1):
  selftests/bpf: convenience macro for use with 'asm volatile' blocks
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

84150795

selftests/bpf: Add dynptr helper tests · ae8e354c

Kumar Kartikeya Dwivedi authored Jan 21, 2023

First test that we allow overwriting dynptr slots and reinitializing
them in unreferenced case, and disallow overwriting for referenced case.
Include tests to ensure slices obtained from destroyed dynptrs are being
invalidated on their destruction. The destruction needs to be scoped, as
in slices of dynptr A should not be invalidated when dynptr B is
destroyed. Next, test that MEM_UNINIT doesn't allow writing dynptr stack
slots.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-13-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ae8e354c

selftests/bpf: Add dynptr partial slot overwrite tests · 011edc8e

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Try creating a dynptr, then overwriting second slot with first slot of
another dynptr. Then, the first slot of first dynptr should also be
invalidated, but without our fix that does not happen. As a consequence,
the unfixed case allows passing first dynptr (as the kernel check only
checks for slot_type and then first_slot == true).
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-12-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

011edc8e

selftests/bpf: Add dynptr var_off tests · ef481013

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Ensure that variable offset is handled correctly, and verifier takes
both fixed and variable part into account. Also ensures that only
constant var_off is allowed.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-11-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ef481013

selftests/bpf: Add dynptr pruning tests · f4d24edf

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Add verifier tests that verify the new pruning behavior for STACK_DYNPTR
slots, and ensure that state equivalence takes into account changes to
the old and current verifier state correctly. Also ensure that the
stacksafe changes are actually enabling pruning in case states are
equivalent from pruning PoV.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-10-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f4d24edf

selftests/bpf: convenience macro for use with 'asm volatile' blocks · 91b875a5

Eduard Zingerman authored Jan 21, 2023

A set of macros useful for writing naked BPF functions using inline
assembly. E.g. as follows:

struct map_struct {
	...
} map SEC(".maps");

SEC(...)
__naked int foo_test(void)
{
	asm volatile(
		"r0 = 0;"
		"*(u64*)(r10 - 8) = r0;"
		"r1 = %[map] ll;"
		"r2 = r10;"
		"r2 += -8;"
		"call %[bpf_map_lookup_elem];"
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_map_lookup_elem),
		  __imm_addr(map)
		: __clobber_all);
}
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
[ Kartikeya: Add acks, include __clobber_common from Andrii ]
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-9-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

91b875a5

bpf: Avoid recomputing spi in process_dynptr_func · 1ee72bcb

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Currently, process_dynptr_func first calls dynptr_get_spi and then
is_dynptr_reg_valid_init and is_dynptr_reg_valid_uninit have to call it
again to obtain the spi value. Instead of doing this twice, reuse the
already obtained value (which is by default 0, and is only set for
PTR_TO_STACK, and only used in that case in aforementioned functions).
The input value for these two functions will either be -ERANGE or >= 1,
and can either be permitted or rejected based on the respective check.
Suggested-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-8-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

1ee72bcb

bpf: Combine dynptr_get_spi and is_spi_bounds_valid · f5b625e5

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Currently, a check on spi resides in dynptr_get_spi, while others
checking its validity for being within the allocated stack slots happens
in is_spi_bounds_valid. Almost always barring a couple of cases (where
being beyond allocated stack slots is not an error as stack slots need
to be populated), both are used together to make checks. Hence, subsume
the is_spi_bounds_valid check in dynptr_get_spi, and return -ERANGE to
specially distinguish the case where spi is valid but not within
allocated slots in the stack state.

The is_spi_bounds_valid function is still kept around as it is a generic
helper that will be useful for other objects on stack similar to dynptr
in the future.
Suggested-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-7-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f5b625e5

bpf: Allow reinitializing unreferenced dynptr stack slots · 379d4ba8

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Consider a program like below:

void prog(void)
{
	{
		struct bpf_dynptr ptr;
		bpf_dynptr_from_mem(...);
	}
	...
	{
		struct bpf_dynptr ptr;
		bpf_dynptr_from_mem(...);
	}
}

Here, the C compiler based on lifetime rules in the C standard would be
well within in its rights to share stack storage for dynptr 'ptr' as
their lifetimes do not overlap in the two distinct scopes. Currently,
such an example would be rejected by the verifier, but this is too
strict. Instead, we should allow reinitializing over dynptr stack slots
and forget information about the old dynptr object.

The destroy_if_dynptr_stack_slot function already makes necessary checks
to avoid overwriting referenced dynptr slots. This is done to present a
better error message instead of forgetting dynptr information on stack
and preserving reference state, leading to an inevitable but
undecipherable error at the end about an unreleased reference which has
to be associated back to its allocating call instruction to make any
sense to the user.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-6-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

379d4ba8

bpf: Invalidate slices on destruction of dynptrs on stack · f8064ab9

Kumar Kartikeya Dwivedi authored Jan 21, 2023

The previous commit implemented destroy_if_dynptr_stack_slot. It
destroys the dynptr which given spi belongs to, but still doesn't
invalidate the slices that belong to such a dynptr. While for the case
of referenced dynptr, we don't allow their overwrite and return an error
early, we still allow it and destroy the dynptr for unreferenced dynptr.

To be able to enable precise and scoped invalidation of dynptr slices in
this case, we must be able to associate the source dynptr of slices that
have been obtained using bpf_dynptr_data. When doing destruction, only
slices belonging to the dynptr being destructed should be invalidated,
and nothing else. Currently, dynptr slices belonging to different
dynptrs are indistinguishible.

Hence, allocate a unique id to each dynptr (CONST_PTR_TO_DYNPTR and
those on stack). This will be stored as part of reg->id. Whenever using
bpf_dynptr_data, transfer this unique dynptr id to the returned
PTR_TO_MEM_OR_NULL slice pointer, and store it in a new per-PTR_TO_MEM
dynptr_id register state member.

Finally, after establishing such a relationship between dynptrs and
their slices, implement precise invalidation logic that only invalidates
slices belong to the destroyed dynptr in destroy_if_dynptr_stack_slot.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-5-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f8064ab9

bpf: Fix partial dynptr stack slot reads/writes · ef8fc7a0

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Currently, while reads are disallowed for dynptr stack slots, writes are
not. Reads don't work from both direct access and helpers, while writes
do work in both cases, but have the effect of overwriting the slot_type.

While this is fine, handling for a few edge cases is missing. Firstly,
a user can overwrite the stack slots of dynptr partially.

Consider the following layout:
spi: [d][d][?]
2 1 0

First slot is at spi 2, second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.

This will essentially either write STACK_MISC for all slot_types or
STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write
of zeroes). The end result is that slot is scrubbed.

Now, the layout is:
spi: [d][m][?]
2 1 0

Suppose if user initializes spi = 1 as dynptr.
We get:
spi: [d][d][d]
2 1 0

But this time, both spi 2 and spi 1 have first_slot = true.

Now, when passing spi 2 to dynptr helper, it will consider it as
initialized as it does not check whether second slot has first_slot ==
false. And spi 1 should already work as normal.

This effectively replaced size + offset of first dynptr, hence allowing
invalid OOB reads and writes.

Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
STACK_DYNPTR type, mark both first and second slot (regardless of which
slot we touch) as STACK_INVALID. Reads are already prevented.

Second, prevent writing to stack memory from helpers if the range may
contain any STACK_DYNPTR slots. Reads are already prevented.

For helpers, we cannot allow it to destroy dynptrs from the writes as
depending on arguments, helper may take uninit_mem and dynptr both at
the same time. This would mean that helper may write to uninit_mem
before it reads the dynptr, which would be bad.

PTR_TO_MEM: [?????dd]

Depending on the code inside the helper, it may end up overwriting the
dynptr contents first and then read those as the dynptr argument.

Verifier would only simulate destruction when it does byte by byte
access simulation in check_helper_call for meta.access_size, and
fail to catch this case, as it happens after argument checks.

The same would need to be done for any other non-trivial objects created
on the stack in the future, such as bpf_list_head on stack, or
bpf_rb_root on stack.

A common misunderstanding in the current code is that MEM_UNINIT means
writes, but note that writes may also be performed even without
MEM_UNINIT in case of helpers, in that case the code after handling meta
&& meta->raw_mode will complain when it sees STACK_DYNPTR. So that
invalid read case also covers writes to potential STACK_DYNPTR slots.
The only loophole was in case of meta->raw_mode which simulated writes
through instructions which could overwrite them.

A future series sequenced after this will focus on the clean up of
helper access checks and bugs around that.

Fixes: 97e03f52 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-4-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

ef8fc7a0

bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR · 79168a66

Kumar Kartikeya Dwivedi authored Jan 21, 2023

Currently, the dynptr function is not checking the variable offset part
of PTR_TO_STACK that it needs to check. The fixed offset is considered
when computing the stack pointer index, but if the variable offset was
not a constant (such that it could not be accumulated in reg->off), we
will end up a discrepency where runtime pointer does not point to the
actual stack slot we mark as STACK_DYNPTR.

It is impossible to precisely track dynptr state when variable offset is
not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
simply reject the case where reg->var_off is not constant. Then,
consider both reg->off and reg->var_off.value when computing the stack
pointer index.

A new helper dynptr_get_spi is introduced to hide over these details
since the dynptr needs to be located in multiple places outside the
process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
need to enforce these checks in all places.

Note that it is disallowed for unprivileged users to have a non-constant
var_off, so this problem should only be possible to trigger from
programs having CAP_PERFMON. However, its effects can vary.

Without the fix, it is possible to replace the contents of the dynptr
arbitrarily by making verifier mark different stack slots than actual
location and then doing writes to the actual stack address of dynptr at
runtime.

Fixes: 97e03f52 ("bpf: Add verifier support for dynptrs")
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-3-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

79168a66

bpf: Fix state pruning for STACK_DYNPTR stack slots · d6fefa11

Kumar Kartikeya Dwivedi authored Jan 21, 2023

The root of the problem is missing liveness marking for STACK_DYNPTR
slots. This leads to all kinds of problems inside stacksafe.

The verifier by default inside stacksafe ignores spilled_ptr in stack
slots which do not have REG_LIVE_READ marks. Since this is being checked
in the 'old' explored state, it must have already done clean_live_states
for this old bpf_func_state. Hence, it won't be receiving any more
liveness marks from to be explored insns (it has received REG_LIVE_DONE
marking from liveness point of view).

What this means is that verifier considers that it's safe to not compare
the stack slot if was never read by children states. While liveness
marks are usually propagated correctly following the parentage chain for
spilled registers (SCALAR_VALUE and PTR_* types), the same is not the
case for STACK_DYNPTR.

clean_live_states hence simply rewrites these stack slots to the type
STACK_INVALID since it sees no REG_LIVE_READ marks.

The end result is that we will never see STACK_DYNPTR slots in explored
state. Even if verifier was conservatively matching !REG_LIVE_READ
slots, very next check continuing the stacksafe loop on seeing
STACK_INVALID would again prevent further checks.

Now as long as verifier stores an explored state which we can compare to
when reaching a pruning point, we can abuse this bug to make verifier
prune search for obviously unsafe paths using STACK_DYNPTR slots
thinking they are never used hence safe.

Doing this in unprivileged mode is a bit challenging. add_new_state is
only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges)
or when jmps_processed difference is >= 2 and insn_processed difference
is >= 8. So coming up with the unprivileged case requires a little more
work, but it is still totally possible. The test case being discussed
below triggers the heuristic even in unprivileged mode.

However, it no longer works since commit
8addbfc7 ("bpf: Gate dynptr API behind CAP_BPF").

Let's try to study the test step by step.

Consider the following program (C style BPF ASM):

0  r0 = 0;
1  r6 = &ringbuf_map;
3  r1 = r6;
4  r2 = 8;
5  r3 = 0;
6  r4 = r10;
7  r4 -= -16;
8  call bpf_ringbuf_reserve_dynptr;
9  if r0 == 0 goto pc+1;
10 goto pc+1;
11 *(r10 - 16) = 0xeB9F;
12 r1 = r10;
13 r1 -= -16;
14 r2 = 0;
15 call bpf_ringbuf_discard_dynptr;
16 r0 = 0;
17 exit;

We know that insn 12 will be a pruning point, hence if we force
add_new_state for it, it will first verify the following path as
safe in straight line exploration:
0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17

Then, when we arrive at insn 12 from the following path:
0 1 3 4 5 6 7 8 9 -> 11 (12)

We will find a state that has been verified as safe already at insn 12.
Since register state is same at this point, regsafe will pass. Next, in
stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped
seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true.
Next, refsafe is also true as reference state is unchanged in both
states.

The states are considered equivalent and search is pruned.

Hence, we are able to construct a dynptr with arbitrary contents and use
the dynptr API to operate on this arbitrary pointer and arbitrary size +
offset.

To fix this, first define a mark_dynptr_read function that propagates
liveness marks whenever a valid initialized dynptr is accessed by dynptr
helpers. REG_LIVE_WRITTEN is marked whenever we initialize an
uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows
screening off mark_reg_read and not propagating marks upwards from that
point.

This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or
none, so clean_live_states either sets both slots to STACK_INVALID or
none of them. This is the invariant the checks inside stacksafe rely on.

Next, do a complete comparison of both stack slots whenever they have
STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and
also whether both form the same first_slot. Only then is the later path
safe.

Fixes: 97e03f52 ("bpf: Add verifier support for dynptrs")
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-2-memxor@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

d6fefa11

20 Jan, 2023 4 commits

Merge branch 'kallsyms: Optimize the search for module symbols by livepatch and bpf' · 00b8f39f

Alexei Starovoitov authored Jan 19, 2023

Jiri Olsa says:

====================

hi,
sending new version of [1] patchset posted originally by Zhen Lei.
It contains 2 changes that improove search performance for livepatch
and bpf.

v3 changes:
  - fixed off by 1 issue, simplified condition, added acks [Song]
  - added module attach as subtest [Andrii]

v2 changes:
  - reworked the bpf change and meassured the performance
  - adding new selftest to benchmark kprobe multi module attachment
  - skipping patch 3 as requested by Zhen Lei
  - added Reviewed-by for patch 1 [Petr Mladek]

thanks,
jirka

[1] https://lore.kernel.org/bpf/20221230112729.351-1-thunder.leizhen@huawei.com/
---
Jiri Olsa (2):
      selftests/bpf: Add serial_test_kprobe_multi_bench_attach_kernel/module tests
      bpf: Change modules resolving for kprobe multi link
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

00b8f39f

bpf: Change modules resolving for kprobe multi link · 6a5f2d6e

Jiri Olsa authored Jan 16, 2023

We currently use module_kallsyms_on_each_symbol that iterates all
modules/symbols and we try to lookup each such address in user
provided symbols/addresses to get list of used modules.

This fix instead only iterates provided kprobe addresses and calls
__module_address on each to get list of used modules. This turned
out to be simpler and also bit faster.

On my setup with workload (executed 10 times):

   # test_progs -t kprobe_multi_bench_attach/modules

Current code:

 Performance counter stats for './test.sh' (5 runs):

    76,081,161,596      cycles:k                   ( +-  0.47% )

           18.3867 +- 0.0992 seconds time elapsed  ( +-  0.54% )

With the fix:

 Performance counter stats for './test.sh' (5 runs):

    74,079,889,063      cycles:k                   ( +-  0.04% )

           17.8514 +- 0.0218 seconds time elapsed  ( +-  0.12% )
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20230116101009.23694-4-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

6a5f2d6e

selftests/bpf: Add serial_test_kprobe_multi_bench_attach_kernel/module tests · edac4b5b

Jiri Olsa authored Jan 16, 2023

Add bench test for module portion of the symbols as well.

  # ./test_progs -v -t kprobe_multi_bench_attach_module
  bpf_testmod.ko is already unloaded.
  Loading bpf_testmod.ko...
  Successfully loaded bpf_testmod.ko.
  test_kprobe_multi_bench_attach:PASS:get_syms 0 nsec
  test_kprobe_multi_bench_attach:PASS:kprobe_multi_empty__open_and_load 0 nsec
  test_kprobe_multi_bench_attach:PASS:bpf_program__attach_kprobe_multi_opts 0 nsec
  test_kprobe_multi_bench_attach: found 26620 functions
  test_kprobe_multi_bench_attach: attached in   0.182s
  test_kprobe_multi_bench_attach: detached in   0.082s
  #96      kprobe_multi_bench_attach_module:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
  Successfully unloaded bpf_testmod.ko.

It's useful for testing kprobe multi link modules resolving.
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230116101009.23694-3-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

edac4b5b

livepatch: Improve the search performance of module_kallsyms_on_each_symbol() · 07cc2c93

Zhen Lei authored Jan 16, 2023

Currently we traverse all symbols of all modules to find the specified
function for the specified module. But in reality, we just need to find
the given module and then traverse all the symbols in it.

Let's add a new parameter 'const char *modname' to function
module_kallsyms_on_each_symbol(), then we can compare the module names
directly in this function and call hook 'fn' after matching. If 'modname'
is NULL, the symbols of all modules are still traversed for compatibility
with other usage cases.

Phase1: mod1-->mod2..(subsequent modules do not need to be compared)
                |
Phase2:          -->f1-->f2-->f3

Assuming that there are m modules, each module has n symbols on average,
then the time complexity is reduced from O(m * n) to O(m) + O(n).
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230116101009.23694-2-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

07cc2c93

18 Jan, 2023 3 commits

selftests/bpf: Fix build errors if CONFIG_NF_CONNTRACK=m · 92afc532

Tiezhu Yang authored Jan 18, 2023

If CONFIG_NF_CONNTRACK=m, there are no definitions of NF_NAT_MANIP_SRC
and NF_NAT_MANIP_DST in vmlinux.h, build test_bpf_nf.c failed.

$ make -C tools/testing/selftests/bpf/

  CLNG-BPF [test_maps] test_bpf_nf.bpf.o
progs/test_bpf_nf.c:160:42: error: use of undeclared identifier 'NF_NAT_MANIP_SRC'
                bpf_ct_set_nat_info(ct, &saddr, sport, NF_NAT_MANIP_SRC);
                                                       ^
progs/test_bpf_nf.c:163:42: error: use of undeclared identifier 'NF_NAT_MANIP_DST'
                bpf_ct_set_nat_info(ct, &daddr, dport, NF_NAT_MANIP_DST);
                                                       ^
2 errors generated.

Copy the definitions in include/net/netfilter/nf_nat.h to test_bpf_nf.c,
in order to avoid redefinitions if CONFIG_NF_CONNTRACK=y, rename them with
___local suffix. This is similar with commit 1058b6a7 ("selftests/bpf:
Do not fail build if CONFIG_NF_CONNTRACK=m/n").

Fixes: b06b45e8 ("selftests/bpf: add tests for bpf_ct_set_nat_info kfunc")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1674028604-7113-1-git-send-email-yangtiezhu@loongson.cnSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

92afc532

bpf/selftests: Add verifier tests for loading sleepable programs · c0f264e4

Jiri Olsa authored Jan 17, 2023

Adding verifier tests for loading all types od allowed
sleepable programs plus reject for tp_btf type.
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230117223705.440975-2-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c0f264e4

bpf: Do not allow to load sleepable BPF_TRACE_RAW_TP program · 700e6f85

Jiri Olsa authored Jan 17, 2023

Currently we allow to load any tracing program as sleepable,
but BPF_TRACE_RAW_TP can't sleep. Making the check explicit
for tracing programs attach types, so sleepable BPF_TRACE_RAW_TP
will fail to load.

Updating the verifier error to mention iter programs as well.
Acked-by: Song Liu <song@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230117223705.440975-1-jolsa@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

700e6f85

17 Jan, 2023 1 commit

btf, scripts: Exclude Rust CUs with pahole · c1177979

Martin Rodriguez Reboredo authored Jan 11, 2023

Version 1.24 of pahole has the capability to exclude compilation units (CUs)
of specific languages [1] [2]. Rust, as of writing, is not currently supported
by pahole and if it's used with a build that has BTF debugging enabled it
results in malformed kernel and module binaries [3]. So it's better for pahole
to exclude Rust CUs until support for it arrives.
Co-developed-by: Eric Curtin <ecurtin@redhat.com>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
Signed-off-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Eric Curtin <ecurtin@redhat.com>
Reviewed-by: Neal Gompa <neal@gompa.dev>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=49358dfe2aaae4e90b072332c3e324019826783f [1]
Link: https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=8ee363790b7437283c53090a85a9fec2f0b0fbc4 [2]
Link: https://github.com/Rust-for-Linux/linux/issues/735 [3]
Link: https://lore.kernel.org/bpf/20230111152050.559334-1-yakoyoku@gmail.com

c1177979

15 Jan, 2023 14 commits

Merge branch 'samples/bpf: modernize BPF functionality test programs' · dfff86f8

Alexei Starovoitov authored Jan 15, 2023

"Daniel T. Lee" says:

====================

Currently, there are many programs under samples/bpf to test the
various functionality of BPF that have been developed for a long time.
However, the kernel (BPF) has changed a lot compared to the 2016 when
some of these test programs were first introduced.

Therefore, some of these programs use the deprecated function of BPF,
and some programs no longer work normally due to changes in the API.

To list some of the kernel changes that this patch set is focusing on,
- legacy BPF map declaration syntax support had been dropped [1]
- bpf_trace_printk() always append newline at the end [2]
- deprecated styled BPF section header (bpf_load style) [3]
- urandom_read tracepoint is removed (used for testing overhead) [4]
- ping sends packet with SOCK_DGRAM instead of SOCK_RAW [5]*
- use "vmlinux.h" instead of including individual headers

In addition to this, this patchset tries to modernize the existing
testing scripts a bit. And for network-related testing programs,
a separate header file was created and applied. (To use the
Endianness conversion function from xdp_sample and bunch of constants)

[1]: https://github.com/libbpf/libbpf/issues/282
[2]: commit ac5a72ea ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()")
[3]: commit ceb5dea5 ("samples: bpf: Remove bpf_load loader completely")
[4]: commit 14c17463 ("random: remove unused tracepoints")
[5]: https://lwn.net/Articles/422330/

*: This is quite old, but I'm not sure why the code was initially
   developed to filter only SOCK_RAW.
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

dfff86f8

samples/bpf: change _kern suffix to .bpf with BPF test programs · e04946f5

Daniel T. Lee authored Jan 15, 2023

This commit changes the _kern suffix to .bpf with the BPF test programs.
With this modification, test programs will inherit the benefit of the
new CLANG-BPF compile target.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-11-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

e04946f5

samples/bpf: use vmlinux.h instead of implicit headers in BPF test program · e8acf8f4

Daniel T. Lee authored Jan 15, 2023

This commit applies vmlinux.h to BPF functionality testing program.
Macros that were not defined despite migration to "vmlinux.h" were
defined separately in individual files.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-10-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

e8acf8f4

samples/bpf: replace BPF programs header with net_shared.h · c2f4f559

Daniel T. Lee authored Jan 15, 2023

This commit applies "net_shared.h" to BPF programs to remove existing
network related header dependencies. Also, this commit removes
unnecessary headers before applying "vmlinux.h" to the BPF programs.

Mostly, endianness conversion function has been applied to the source.
In addition, several macros have been defined to fulfill the INET,
TC-related constants.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-9-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c2f4f559

samples/bpf: split common macros to net_shared.h · e69fe845

Daniel T. Lee authored Jan 15, 2023

Currently, many programs under sample/bpf often include individual
macros by directly including the header under "linux/" rather than
using the "vmlinux.h" header.

However, there are some problems with migrating to "vmlinux.h" because
there is no definition for utility functions such as endianness
conversion (ntohs/htons). Fortunately, the xdp_sample program already
has a function that can be replaced to solve this problem.

Therefore, this commit attempts to separate these functions into a file
called net_shared.h to make them universally available. Additionally,
this file includes network-related macros that are not defined in
"vmlinux.h". (inspired by 'selftests' bpf_tracing_net.h)
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-8-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

e69fe845

samples/bpf: replace legacy map with the BTF-defined map · a1f93c8f

Daniel T. Lee authored Jan 15, 2023

With libbpf 1.0 release, support for legacy BPF map declaration syntax
had been dropped. If you run a program using legacy BPF in the latest
libbpf, the following error will be output.

libbpf: map 'lwt_len_hist_map' (legacy): legacy map definitions are deprecated, use BTF-defined maps instead
libbpf: Use of BPF_ANNOTATE_KV_PAIR is deprecated, use BTF-defined maps in .maps section instead

This commit replaces legacy map with the BTF-defined map.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-7-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

a1f93c8f

samples/bpf: replace broken overhead microbenchmark with fib_table_lookup · 58e975d0

Daniel T. Lee authored Jan 15, 2023

The test_overhead bpf program is designed to compare performance
between tracepoint and kprobe. Initially it used task_rename and
urandom_read tracepoint.

However, commit 14c17463 ("random: remove unused tracepoints")
removed urandom_read tracepoint, and for this reason the test_overhead
got broken.

This commit introduces new microbenchmark using fib_table_lookup.
This microbenchmark sends UDP packets to localhost in order to invoke
fib_table_lookup.

In a nutshell:
fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
addr.sin_addr.s_addr = inet_addr(DUMMY_IP);
addr.sin_port = htons(DUMMY_PORT);
for() {
    sendto(fd, buf, strlen(buf), 0,
            (struct sockaddr *)&addr, sizeof(addr));
}

on 4 cpus in parallel:
                                            lookup per sec
base (no tracepoints, no kprobes)               381k
with kprobe at fib_table_lookup()               325k
with tracepoint at fib:fib_table_lookup         330k
with raw_tracepoint at fib:fib_table_lookup     365k

Fixes: 14c17463 ("random: remove unused tracepoints")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-6-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

58e975d0

samples/bpf: fix broken cgroup socket testing · 31b12a41

Daniel T. Lee authored Jan 15, 2023

Currently, executing test_cgrp2_sock2 fails due to wrong section
header. This 'cgroup/sock1' style section is previously used at
'samples/bpf_load' (deprecated) BPF loader. Because this style isn't
supported in libbpf, this commit fixes this problem by correcting the
section header.

    $ sudo ./test_cgrp2_sock2.sh
    libbpf: prog 'bpf_prog1': missing BPF prog type, check ELF section name 'cgroup/sock1'
    libbpf: prog 'bpf_prog1': failed to load: -22
    libbpf: failed to load object './sock_flags_kern.o'
    ERROR: loading BPF object file failed

In addition, this BPF program filters ping packets by comparing whether
the socket type uses SOCK_RAW. However, after the ICMP socket[1] was
developed, ping sends ICMP packets using SOCK_DGRAM. Therefore, in this
commit, the packet filtering is changed to use SOCK_DGRAM instead of
SOCK_RAW.

    $ strace --trace socket ping -6 -c1 -w1 ::1
    socket(AF_INET6, SOCK_DGRAM, IPPROTO_ICMPV6) = 3

[1]: https://lwn.net/Articles/422330/Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-5-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

31b12a41

samples/bpf: fix broken lightweight tunnel testing · dac808c9

Daniel T. Lee authored Jan 15, 2023

The test_lwt_bpf is a script that tests the functionality of BPF through
the output of the ftrace with bpf_trace_printk. Currently, this program
is not operating normally for several reasons.

First of all, this test script can't parse the ftrace results properly.
GNU sed tries to be as greedy as possible when attempting pattern
matching. Due to this, cutting metadata (such as timestamp) from the
log entry of ftrace doesn't work properly, and also desired log isn't
extracted properly. To make sed stripping clearer, 'nocontext-info'
option with the ftrace has been used to remove metadata from the log.
Also, instead of using unclear pattern matching, this commit specifies
an explicit parse pattern.

Also, unlike before when this test was introduced, the way
bpf_trace_printk behaves has changed[1]. The previous bpf_trace_printk
had to always have '\n' in order to print newline, but now that the
bpf_trace_printk call includes newline by default, so '\n' is no longer
needed.

Lastly with the lwt ENCAP_BPF out, the context information with the
sk_buff protocol is preserved. Therefore, this commit changes the
previous test result from 'protocol 0' to 'protocol 8', which means
ETH_P_IP.

[1]: commit ac5a72ea ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-4-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

dac808c9

samples/bpf: refactor BPF functionality testing scripts · f20f064e

Daniel T. Lee authored Jan 15, 2023

Currently, some test scripts are experiencing minor errors related to
executing tests.

    $ sudo ./test_cgrp2_sock.sh
    ./test_cgrp2_sock.sh: 22: test_cgrp2_sock: not found

This problem occurs because the path to the execution target is not
properly specified. Therefore, this commit solves this problem by
specifying a relative path to its executables. This commit also makes
a concise refactoring of hard-coded BPF program names.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-3-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f20f064e

samples/bpf: ensure ipv6 is enabled before running tests · d982a2e3

Daniel T. Lee authored Jan 15, 2023

Currently, a few of BPF tests use ipv6 functionality. The problem here
is that if ipv6 is disabled, these tests will fail, and even if the
test fails, it will not tell you why it failed.

    $ sudo ./test_cgrp2_sock2.sh
    RTNETLINK answers: Permission denied

In order to fix this, this commit ensures ipv6 is enabled prior to
running tests.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-2-danieltimlee@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

d982a2e3

Merge branch 'bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()' · 81bbbb69

Martin KaFai Lau authored Jan 15, 2023

Ziyang Xuan says:

====================

Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
Main use case is for using cls_bpf on ingress hook to decapsulate
IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.

And add ipip6 and ip6ip decap testcases to verify that
bpf_skb_adjust_room() correctly decapsulate ipip6 and ip6ip
tunnel packets.

$./test_tc_tunnel.sh
ipip
encap 192.168.1.1 to 192.168.1.2, type ipip, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ipip6
encap 192.168.1.1 to 192.168.1.2, type ipip6, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
ip6ip6
encap fd::1 to fd::2, type ip6tnl, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
sit
encap fd::1 to fd::2, type sit, mac none len 100
test basic connectivity
0
test bpf encap without decap (expect failure)
Ncat: TIMEOUT.
1
test bpf encap with tunnel device decap
0
test bpf encap with bpf decap
0
OK
...
OK. All tests passed

v3:
  - Fix compilation failure of selftests/bpf.
  - Combine two new branches in bpf_skb_adjust_room().
  - Simplify description for new flags BPF_F_ADJ_ROOM_DECAP_L3_IP*.

v2:
  - Use decap flags to indicate the new IP header.
    Do not rely on skb->encapsulation.
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

81bbbb69

selftests/bpf: add ipip6 and ip6ip decap to test_tc_tunnel · 7105f76f

Ziyang Xuan authored Jan 13, 2023

Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
correctly decapsulate ipip6 and ip6ip tunnel packets.
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/dfd2d8cfdf9111bd129170d4345296f53bee6a67.1673574419.git.william.xuanziyang@huawei.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

7105f76f

bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() · d219df60

Ziyang Xuan authored Jan 13, 2023

Add ipip6 and ip6ip decap support for bpf_skb_adjust_room().
Main use case is for using cls_bpf on ingress hook to decapsulate
IPv4 over IPv6 and IPv6 over IPv4 tunnel packets.

Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the
new IP header version after decapsulating the outer IP header.
Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/b268ec7f0ff9431f4f43b1b40ab856ebb28cb4e1.1673574419.git.william.xuanziyang@huawei.comSigned-off-by: Martin KaFai Lau <martin.lau@kernel.org>

d219df60

13 Jan, 2023 1 commit

selftests/bpf: Fix missing space error · 1c48391b

Roberto Valenzuela authored Jan 13, 2023

Add the missing space after 'dest' variable assignment.
This change will resolve the following checkpatch.pl
script error:

ERROR: spaces required around that '+=' (ctx:VxW)
Signed-off-by: Roberto Valenzuela <valenzuelarober@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230113180257.39769-1-valenzuelarober@gmail.com

1c48391b