Commit 6e62702f authored by Jakub Kicinski

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-13

We've added 119 non-merge commits during the last 14 day(s) which contain
a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).

The main changes are:

1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.

2) Add BPF range computation improvements to the verifier in particular
   around XOR and OR operators, refactoring of checks for range computation
   and relaxing MUL range computation so that src_reg can also be an unknown
   scalar, from Cupertino Miranda.

3) Add support to attach kprobe BPF programs through the kprobe_multi link
   in session mode, meaning a BPF program is attached to both function
   entry and return; the entry program can decide whether the return
   program gets executed and can share a u64 cookie value with the return
   program. Session mode is a common use case for tetragon and bpftrace,
   from Jiri Olsa.
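
   As a rough illustration, a minimal BPF-side sketch of a session program.
   The "kprobe.session" SEC() name and the bpf_session_is_return() /
   bpf_session_cookie() kfuncs are taken from this series; exact signatures
   and the probed function are assumptions for illustration only:

     /* Hedged sketch of a kprobe session program. */
     #include "vmlinux.h"
     #include <bpf/bpf_helpers.h>

     extern bool bpf_session_is_return(void) __ksym;
     extern __u64 *bpf_session_cookie(void) __ksym;

     char _license[] SEC("license") = "GPL";

     SEC("kprobe.session/do_sys_openat2")
     int handle_open(struct pt_regs *ctx)
     {
             __u64 *cookie = bpf_session_cookie();

             if (!bpf_session_is_return()) {
                     *cookie = bpf_ktime_get_ns();  /* stash entry time */
                     return 0;                      /* 0: run the return pass too */
             }

             bpf_printk("openat2 took %llu ns",
                        bpf_ktime_get_ns() - *cookie);
             return 0;
     }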

4) Fix a potential overflow in libbpf's ring__consume_n() and improve
   libbpf's as well as the BPF selftests' struct_ops handling,
   from Andrii Nakryiko.
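
   For reference, a hedged usage sketch of the capped consume API; it
   assumes libbpf's ring_buffer__ring() and ring__consume_n() as found in
   recent libbpf releases, so treat the exact signatures as illustrative:

     #include <bpf/libbpf.h>

     /* Drain at most 64 records from the first ring of a manager. */
     static int drain_some(struct ring_buffer *rb)
     {
             struct ring *r = ring_buffer__ring(rb, 0);

             if (!r)
                     return -1;
             /* Returns the number of records consumed or a negative error. */
             return ring__consume_n(r, 64);
     }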

5) Improvements to BPF selftests in the context of the BPF GCC backend,
   from Jose E. Marchesi & David Faust.

6) Migrate remaining BPF selftest tests from test_sock_addr.c to
   prog_test-style in order to retire the old test, run it in BPF CI and
   additionally expand test coverage, from Jordan Rife.

7) Big batch of BPF selftest refactoring in order to remove duplicate code
   around common network helpers, from Geliang Tang.

8) Another batch of improvements to BPF selftests to retire the obsolete
   bpf_tcp_helpers.h as everything needed is now available in vmlinux.h,
   from Martin KaFai Lau.

9) Fix BPF map tear-down to not walk the map twice on free when both a
   timer and a workqueue are used, from Benjamin Tissoires.

10) Fix the BPF verifier to no longer assume that socket->sk is always
    non-NULL, from Alexei Starovoitov.

11) Change BTF build scripts to use --btf_features for pahole v1.26+,
    from Alan Maguire.

12) Small improvements to BPF by reusing struct_size() and krealloc_array(),
    from Andy Shevchenko.

13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions,
    from Ilya Leoshkevich.

14) Extend the TCP ->cong_control() callback in order to feed in ack and
    flag parameters and to allow write access to tp->snd_cwnd_stamp from
    BPF programs, from Miao Xu.
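
    A hedged struct_ops sketch of a cong_control callback consuming the new
    arguments. Argument order and the writable tp->snd_cwnd_stamp follow the
    description above, but are otherwise assumptions; other mandatory
    tcp_congestion_ops callbacks are omitted for brevity:

      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      #include <bpf/bpf_tracing.h>

      char _license[] SEC("license") = "GPL";

      SEC("struct_ops")
      void BPF_PROG(sketch_cong_control, struct sock *sk, __u32 ack, int flag,
                    const struct rate_sample *rs)
      {
              struct tcp_sock *tp = (struct tcp_sock *)sk;

              if (rs && rs->delivered > 0)
                      /* snd_cwnd_stamp write access per this series */
                      tp->snd_cwnd_stamp = (__u32)bpf_jiffies64();
      }

      SEC(".struct_ops")
      struct tcp_congestion_ops sketch_cc = {
              .cong_control = (void *)sketch_cong_control,
              .name         = "bpf_sketch_cc",
      };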

15) Add support for internal-only per-CPU instructions to inline the
    bpf_get_smp_processor_id() helper call in the arm64 and riscv64 BPF
    JITs, from Puranjay Mohan.
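
    The change is transparent to programs: any program calling the helper,
    like the small example below (standard helper usage, nothing new
    assumed), now gets the CPU number loaded inline on those JITs instead
    of going through a helper call:

      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>

      char _license[] SEC("license") = "GPL";

      /* On arm64/riscv64 with this series, the helper call below is
       * replaced by a couple of native instructions by the JIT. */
      SEC("tracepoint/sched/sched_switch")
      int on_switch(void *ctx)
      {
              bpf_printk("on cpu %u", bpf_get_smp_processor_id());
              return 0;
      }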

16) Follow-up to remove the redundant ethtool.h from tooling infrastructure,
    from Tushar Vyavahare.

17) Extend libbpf to support "module:<function>" syntax for tracing
    programs, from Viktor Malik.
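
    A hedged example of the new attach syntax; the module and function
    names below are placeholders for illustration only:

      #include "vmlinux.h"
      #include <bpf/bpf_helpers.h>
      #include <bpf/bpf_tracing.h>

      char _license[] SEC("license") = "GPL";

      /* Attach to function "nf_nat_manip_pkt" in module "nf_nat". */
      SEC("fentry/nf_nat:nf_nat_manip_pkt")
      int BPF_PROG(trace_module_func)
      {
              bpf_printk("module function hit");
              return 0;
      }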

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
  bpf: make list_for_each_entry portable
  bpf: ignore expected GCC warning in test_global_func10.c
  bpf: disable strict aliasing in test_global_func9.c
  selftests/bpf: Free strdup memory in xdp_hw_metadata
  selftests/bpf: Fix a few tests for GCC related warnings.
  bpf: avoid gcc overflow warning in test_xdp_vlan.c
  tools: remove redundant ethtool.h from tooling infra
  selftests/bpf: Expand ATTACH_REJECT tests
  selftests/bpf: Expand getsockname and getpeername tests
  sefltests/bpf: Expand sockaddr hook deny tests
  selftests/bpf: Expand sockaddr program return value tests
  selftests/bpf: Retire test_sock_addr.(c|sh)
  selftests/bpf: Remove redundant sendmsg test cases
  selftests/bpf: Migrate ATTACH_REJECT test cases
  selftests/bpf: Migrate expected_attach_type tests
  selftests/bpf: Migrate wildcard destination rewrite test
  selftests/bpf: Migrate sendmsg6 v4 mapped address tests
  selftests/bpf: Migrate sendmsg deny test cases
  selftests/bpf: Migrate WILDCARD_IP test
  selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
  ...
====================

Link: https://lore.kernel.org/r/20240513134114.17575-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents afd29f36 ba39486d
@@ -72,6 +72,7 @@ two flavors of JITs, the newer eBPF JIT currently supported on:
 - riscv64
 - riscv32
 - loongarch64
+- arc

 And the older cBPF JIT supported on the following archs:
@@ -513,7 +513,7 @@ JIT compiler
 ------------

 The Linux kernel has a built-in BPF JIT compiler for x86_64, SPARC,
-PowerPC, ARM, ARM64, MIPS, RISC-V and s390 and can be enabled through
+PowerPC, ARM, ARM64, MIPS, RISC-V, s390, and ARC and can be enabled through
 CONFIG_BPF_JIT. The JIT compiler is transparently invoked for each
 attached filter from user space or for internal kernel users if it has
 been previously enabled by root::
@@ -650,7 +650,7 @@ before a conversion to the new layout is being done behind the scenes!

 Currently, the classic BPF format is being used for JITing on most
 32-bit architectures, whereas x86-64, aarch64, s390x, powerpc64,
-sparc64, arm32, riscv64, riscv32, loongarch64 perform JIT compilation
+sparc64, arm32, riscv64, riscv32, loongarch64, arc perform JIT compilation
 from eBPF instruction set.

 Testing
@@ -3712,6 +3712,12 @@ S: Maintained
 F: Documentation/devicetree/bindings/iio/imu/bosch,bmi323.yaml
 F: drivers/iio/imu/bmi323/

+BPF JIT for ARC
+M: Shahab Vahedi <shahab@synopsys.com>
+L: bpf@vger.kernel.org
+S: Maintained
+F: arch/arc/net/
+
 BPF JIT for ARM
 M: Russell King <linux@armlinux.org.uk>
 M: Puranjay Mohan <puranjay@kernel.org>
 # SPDX-License-Identifier: GPL-2.0
 obj-y += kernel/
 obj-y += mm/
+obj-y += net/

 # for cleaning
 subdir- += boot
@@ -51,6 +51,7 @@ config ARC
 	select PCI_SYSCALL if PCI
 	select HAVE_ARCH_JUMP_LABEL if ISA_ARCV2 && !CPU_ENDIAN_BE32
 	select TRACE_IRQFLAGS_SUPPORT
+	select HAVE_EBPF_JIT if ISA_ARCV2

 config LOCKDEP_SUPPORT
 	def_bool y
# SPDX-License-Identifier: GPL-2.0-only
ifeq ($(CONFIG_ISA_ARCV2),y)
obj-$(CONFIG_BPF_JIT) += bpf_jit_core.o
obj-$(CONFIG_BPF_JIT) += bpf_jit_arcv2.o
endif
/* SPDX-License-Identifier: GPL-2.0 */
/*
* The interface that a back-end should provide to bpf_jit_core.c.
*
* Copyright (c) 2024 Synopsys Inc.
* Author: Shahab Vahedi <shahab@synopsys.com>
*/
#ifndef _ARC_BPF_JIT_H
#define _ARC_BPF_JIT_H
#include <linux/bpf.h>
#include <linux/filter.h>
/* Print debug info and assert. */
//#define ARC_BPF_JIT_DEBUG
/* Determine the address type of the target. */
#ifdef CONFIG_ISA_ARCV2
#define ARC_ADDR u32
#endif
/*
* For the translation of some BPF instructions, a temporary register
* might be needed for some interim data.
*/
#define JIT_REG_TMP MAX_BPF_JIT_REG
/*
* Buffer access: If buffer "b" is not NULL, advance by "n" bytes.
*
* This macro must be used in any place that potentially requires a
* "buf + len". This way, we make sure that the "buf" argument for
* the underlying "arc_*(buf, ...)" ends up as NULL instead of something
* like "0+4" or "0+8", etc. Those "arc_*()" functions check their "buf"
* value to decide if instructions should be emitted or not.
*/
#define BUF(b, n) (((b) != NULL) ? ((b) + (n)) : (b))
/************** Functions that the back-end must provide **************/
/* Extension for 32-bit operations. */
inline u8 zext(u8 *buf, u8 rd);
/***** Moves *****/
u8 mov_r32(u8 *buf, u8 rd, u8 rs, u8 sign_ext);
u8 mov_r32_i32(u8 *buf, u8 reg, s32 imm);
u8 mov_r64(u8 *buf, u8 rd, u8 rs, u8 sign_ext);
u8 mov_r64_i32(u8 *buf, u8 reg, s32 imm);
u8 mov_r64_i64(u8 *buf, u8 reg, u32 lo, u32 hi);
/***** Loads and stores *****/
u8 load_r(u8 *buf, u8 rd, u8 rs, s16 off, u8 size, bool sign_ext);
u8 store_r(u8 *buf, u8 rd, u8 rs, s16 off, u8 size);
u8 store_i(u8 *buf, s32 imm, u8 rd, s16 off, u8 size);
/***** Addition *****/
u8 add_r32(u8 *buf, u8 rd, u8 rs);
u8 add_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 add_r64(u8 *buf, u8 rd, u8 rs);
u8 add_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Subtraction *****/
u8 sub_r32(u8 *buf, u8 rd, u8 rs);
u8 sub_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 sub_r64(u8 *buf, u8 rd, u8 rs);
u8 sub_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Multiplication *****/
u8 mul_r32(u8 *buf, u8 rd, u8 rs);
u8 mul_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 mul_r64(u8 *buf, u8 rd, u8 rs);
u8 mul_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Division *****/
u8 div_r32(u8 *buf, u8 rd, u8 rs, bool sign_ext);
u8 div_r32_i32(u8 *buf, u8 rd, s32 imm, bool sign_ext);
/***** Remainder *****/
u8 mod_r32(u8 *buf, u8 rd, u8 rs, bool sign_ext);
u8 mod_r32_i32(u8 *buf, u8 rd, s32 imm, bool sign_ext);
/***** Bitwise AND *****/
u8 and_r32(u8 *buf, u8 rd, u8 rs);
u8 and_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 and_r64(u8 *buf, u8 rd, u8 rs);
u8 and_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Bitwise OR *****/
u8 or_r32(u8 *buf, u8 rd, u8 rs);
u8 or_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 or_r64(u8 *buf, u8 rd, u8 rs);
u8 or_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Bitwise XOR *****/
u8 xor_r32(u8 *buf, u8 rd, u8 rs);
u8 xor_r32_i32(u8 *buf, u8 rd, s32 imm);
u8 xor_r64(u8 *buf, u8 rd, u8 rs);
u8 xor_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Bitwise Negate *****/
u8 neg_r32(u8 *buf, u8 r);
u8 neg_r64(u8 *buf, u8 r);
/***** Bitwise left shift *****/
u8 lsh_r32(u8 *buf, u8 rd, u8 rs);
u8 lsh_r32_i32(u8 *buf, u8 rd, u8 imm);
u8 lsh_r64(u8 *buf, u8 rd, u8 rs);
u8 lsh_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Bitwise right shift (logical) *****/
u8 rsh_r32(u8 *buf, u8 rd, u8 rs);
u8 rsh_r32_i32(u8 *buf, u8 rd, u8 imm);
u8 rsh_r64(u8 *buf, u8 rd, u8 rs);
u8 rsh_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Bitwise right shift (arithmetic) *****/
u8 arsh_r32(u8 *buf, u8 rd, u8 rs);
u8 arsh_r32_i32(u8 *buf, u8 rd, u8 imm);
u8 arsh_r64(u8 *buf, u8 rd, u8 rs);
u8 arsh_r64_i32(u8 *buf, u8 rd, s32 imm);
/***** Frame related *****/
u32 mask_for_used_regs(u8 bpf_reg, bool is_call);
u8 arc_prologue(u8 *buf, u32 usage, u16 frame_size);
u8 arc_epilogue(u8 *buf, u32 usage, u16 frame_size);
/***** Jumps *****/
/*
* Different sorts of conditions (ARC enum as opposed to BPF_*).
*
* Do not change the order of enums here. ARC_CC_SLE+1 is used
* to determine the number of JCCs.
*/
enum ARC_CC {
ARC_CC_UGT = 0, /* unsigned > */
ARC_CC_UGE, /* unsigned >= */
ARC_CC_ULT, /* unsigned < */
ARC_CC_ULE, /* unsigned <= */
ARC_CC_SGT, /* signed > */
ARC_CC_SGE, /* signed >= */
ARC_CC_SLT, /* signed < */
ARC_CC_SLE, /* signed <= */
ARC_CC_AL, /* always */
ARC_CC_EQ, /* == */
ARC_CC_NE, /* != */
ARC_CC_SET, /* test */
ARC_CC_LAST
};
/*
* A few notes:
*
* - check_jmp_*() are prerequisites before calling the gen_jmp_*().
* They return "true" if the jump is possible and "false" otherwise.
*
* - The notion of "*_off" is to emphasize that these parameters are
* merely offsets in the JIT stream and not absolute addresses. One
* can look at them as addresses if the JIT code would start from
* address 0x0000_0000. Nonetheless, since the buffer address for the
* JIT is on a word-aligned address, this works and actually makes
* things simpler (offsets are in the range of u32 which is more than
* enough).
*/
bool check_jmp_32(u32 curr_off, u32 targ_off, u8 cond);
bool check_jmp_64(u32 curr_off, u32 targ_off, u8 cond);
u8 gen_jmp_32(u8 *buf, u8 rd, u8 rs, u8 cond, u32 c_off, u32 t_off);
u8 gen_jmp_64(u8 *buf, u8 rd, u8 rs, u8 cond, u32 c_off, u32 t_off);
/***** Miscellaneous *****/
u8 gen_func_call(u8 *buf, ARC_ADDR func_addr, bool external_func);
u8 arc_to_bpf_return(u8 *buf);
/*
* - Perform byte swaps on "rd" based on the "size".
* - If "force" is set, do it unconditionally. Otherwise, consider the
* desired "endian"ness and the host endianness.
* - For data "size"s up to 32 bits, perform a zero-extension if asked
* by the "do_zext" boolean.
*/
u8 gen_swap(u8 *buf, u8 rd, u8 size, u8 endian, bool force, bool do_zext);
#endif /* _ARC_BPF_JIT_H */
@@ -135,6 +135,12 @@ enum aarch64_insn_special_register {
 	AARCH64_INSN_SPCLREG_SP_EL2	= 0xF210
 };

+enum aarch64_insn_system_register {
+	AARCH64_INSN_SYSREG_TPIDR_EL1	= 0x4684,
+	AARCH64_INSN_SYSREG_TPIDR_EL2	= 0x6682,
+	AARCH64_INSN_SYSREG_SP_EL0	= 0x4208,
+};
+
 enum aarch64_insn_variant {
 	AARCH64_INSN_VARIANT_32BIT,
 	AARCH64_INSN_VARIANT_64BIT
@@ -686,6 +692,8 @@ u32 aarch64_insn_gen_cas(enum aarch64_insn_register result,
 }
 #endif
 u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type);
+u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result,
+			 enum aarch64_insn_system_register sysreg);

 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
@@ -1515,3 +1515,14 @@ u32 aarch64_insn_gen_dmb(enum aarch64_insn_mb_type type)

 	return insn;
 }
+
+u32 aarch64_insn_gen_mrs(enum aarch64_insn_register result,
+			 enum aarch64_insn_system_register sysreg)
+{
+	u32 insn = aarch64_insn_get_mrs_value();
+
+	insn &= ~GENMASK(19, 0);
+	insn |= sysreg << 5;
+	return aarch64_insn_encode_register(AARCH64_INSN_REGTYPE_RT,
+					    insn, result);
+}
@@ -297,4 +297,12 @@
 #define A64_ADR(Rd, offset) \
 	aarch64_insn_gen_adr(0, offset, Rd, AARCH64_INSN_ADR_TYPE_ADR)

+/* MRS */
+#define A64_MRS_TPIDR_EL1(Rt) \
+	aarch64_insn_gen_mrs(Rt, AARCH64_INSN_SYSREG_TPIDR_EL1)
+#define A64_MRS_TPIDR_EL2(Rt) \
+	aarch64_insn_gen_mrs(Rt, AARCH64_INSN_SYSREG_TPIDR_EL2)
+#define A64_MRS_SP_EL0(Rt) \
+	aarch64_insn_gen_mrs(Rt, AARCH64_INSN_SYSREG_SP_EL0)
+
 #endif /* _BPF_JIT_H */
@@ -494,21 +494,27 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 static int emit_lse_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
 {
 	const u8 code = insn->code;
+	const u8 arena_vm_base = bpf2a64[ARENA_VM_START];
 	const u8 dst = bpf2a64[insn->dst_reg];
 	const u8 src = bpf2a64[insn->src_reg];
 	const u8 tmp = bpf2a64[TMP_REG_1];
 	const u8 tmp2 = bpf2a64[TMP_REG_2];
 	const bool isdw = BPF_SIZE(code) == BPF_DW;
+	const bool arena = BPF_MODE(code) == BPF_PROBE_ATOMIC;
 	const s16 off = insn->off;
-	u8 reg;
+	u8 reg = dst;

-	if (!off) {
-		reg = dst;
-	} else {
-		emit_a64_mov_i(1, tmp, off, ctx);
-		emit(A64_ADD(1, tmp, tmp, dst), ctx);
-		reg = tmp;
+	if (off || arena) {
+		if (off) {
+			emit_a64_mov_i(1, tmp, off, ctx);
+			emit(A64_ADD(1, tmp, tmp, dst), ctx);
+			reg = tmp;
+		}
+		if (arena) {
+			emit(A64_ADD(1, tmp, reg, arena_vm_base), ctx);
+			reg = tmp;
+		}
 	}

 	switch (insn->imm) {
 	/* lock *(u32/u64 *)(dst_reg + off) <op>= src_reg */
@@ -576,6 +582,12 @@ static int emit_ll_sc_atomic(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	u8 reg;
 	s32 jmp_offset;

+	if (BPF_MODE(code) == BPF_PROBE_ATOMIC) {
+		/* ll_sc based atomics don't support unsafe pointers yet. */
+		pr_err_once("unknown atomic opcode %02x\n", code);
+		return -EINVAL;
+	}
+
 	if (!off) {
 		reg = dst;
 	} else {
@@ -777,7 +789,8 @@ static int add_exception_handler(const struct bpf_insn *insn,

 	if (BPF_MODE(insn->code) != BPF_PROBE_MEM &&
 	    BPF_MODE(insn->code) != BPF_PROBE_MEMSX &&
-	    BPF_MODE(insn->code) != BPF_PROBE_MEM32)
+	    BPF_MODE(insn->code) != BPF_PROBE_MEM32 &&
+	    BPF_MODE(insn->code) != BPF_PROBE_ATOMIC)
 		return 0;

 	if (!ctx->prog->aux->extable ||
@@ -877,6 +890,15 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 			emit(A64_ORR(1, tmp, dst, tmp), ctx);
 			emit(A64_MOV(1, dst, tmp), ctx);
 			break;
+		} else if (insn_is_mov_percpu_addr(insn)) {
+			if (dst != src)
+				emit(A64_MOV(1, dst, src), ctx);
+			if (cpus_have_cap(ARM64_HAS_VIRT_HOST_EXTN))
+				emit(A64_MRS_TPIDR_EL2(tmp), ctx);
+			else
+				emit(A64_MRS_TPIDR_EL1(tmp), ctx);
+			emit(A64_ADD(1, dst, dst, tmp), ctx);
+			break;
 		}
 		switch (insn->off) {
 		case 0:
@@ -1206,6 +1228,21 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
 		const u8 r0 = bpf2a64[BPF_REG_0];
 		bool func_addr_fixed;
 		u64 func_addr;
+		u32 cpu_offset;
+
+		/* Implement helper call to bpf_get_smp_processor_id() inline */
+		if (insn->src_reg == 0 && insn->imm == BPF_FUNC_get_smp_processor_id) {
+			cpu_offset = offsetof(struct thread_info, cpu);
+
+			emit(A64_MRS_SP_EL0(tmp), ctx);
+			if (is_lsi_offset(cpu_offset, 2)) {
+				emit(A64_LDR32I(r0, tmp, cpu_offset), ctx);
+			} else {
+				emit_a64_mov_i(1, tmp2, cpu_offset, ctx);
+				emit(A64_LDR32(r0, tmp, tmp2), ctx);
+			}
+			break;
+		}

 		ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass,
 					    &func_addr, &func_addr_fixed);
@@ -1474,12 +1511,18 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,

 	case BPF_STX | BPF_ATOMIC | BPF_W:
 	case BPF_STX | BPF_ATOMIC | BPF_DW:
+	case BPF_STX | BPF_PROBE_ATOMIC | BPF_W:
+	case BPF_STX | BPF_PROBE_ATOMIC | BPF_DW:
 		if (cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
 			ret = emit_lse_atomic(insn, ctx);
 		else
 			ret = emit_ll_sc_atomic(insn, ctx);
 		if (ret)
 			return ret;
+
+		ret = add_exception_handler(insn, ctx, dst);
+		if (ret)
+			return ret;
 		break;

 	default:
@@ -2527,6 +2570,34 @@ bool bpf_jit_supports_arena(void)
 	return true;
 }

+bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena)
+{
+	if (!in_arena)
+		return true;
+
+	switch (insn->code) {
+	case BPF_STX | BPF_ATOMIC | BPF_W:
+	case BPF_STX | BPF_ATOMIC | BPF_DW:
+		if (!cpus_have_cap(ARM64_HAS_LSE_ATOMICS))
+			return false;
+	}
+	return true;
+}
+
+bool bpf_jit_supports_percpu_insn(void)
+{
+	return true;
+}
+
+bool bpf_jit_inlines_helper_call(s32 imm)
+{
+	switch (imm) {
+	case BPF_FUNC_get_smp_processor_id:
+		return true;
+	default:
+		return false;
+	}
+}
+
 void bpf_jit_free(struct bpf_prog *prog)
 {
 	if (prog->jited) {
@@ -608,7 +608,7 @@ static inline u32 rv_nop(void)
 	return rv_i_insn(0, 0, 0, 0, 0x13);
 }

-/* RVC instrutions. */
+/* RVC instructions. */

 static inline u16 rvc_addi4spn(u8 rd, u32 imm10)
 {
@@ -737,7 +737,7 @@ static inline u16 rvc_swsp(u32 imm8, u8 rs2)
 	return rv_css_insn(0x6, imm, rs2, 0x2);
 }

-/* RVZBB instrutions. */
+/* RVZBB instructions. */

 static inline u32 rvzbb_sextb(u8 rd, u8 rs1)
 {
 	return rv_i_insn(0x604, rs1, 1, rd, 0x13);
@@ -12,6 +12,7 @@
 #include <linux/stop_machine.h>
 #include <asm/patch.h>
 #include <asm/cfi.h>
+#include <asm/percpu.h>
 #include "bpf_jit.h"

 #define RV_FENTRY_NINSNS 2
@@ -503,33 +504,33 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
 		break;
 	/* src_reg = atomic_fetch_<op>(dst_reg + off16, src_reg) */
 	case BPF_ADD | BPF_FETCH:
-		emit(is64 ? rv_amoadd_d(rs, rs, rd, 0, 0) :
-		     rv_amoadd_w(rs, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_amoadd_d(rs, rs, rd, 1, 1) :
+		     rv_amoadd_w(rs, rs, rd, 1, 1), ctx);
 		if (!is64)
 			emit_zextw(rs, rs, ctx);
 		break;
 	case BPF_AND | BPF_FETCH:
-		emit(is64 ? rv_amoand_d(rs, rs, rd, 0, 0) :
-		     rv_amoand_w(rs, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_amoand_d(rs, rs, rd, 1, 1) :
+		     rv_amoand_w(rs, rs, rd, 1, 1), ctx);
 		if (!is64)
 			emit_zextw(rs, rs, ctx);
 		break;
 	case BPF_OR | BPF_FETCH:
-		emit(is64 ? rv_amoor_d(rs, rs, rd, 0, 0) :
-		     rv_amoor_w(rs, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_amoor_d(rs, rs, rd, 1, 1) :
+		     rv_amoor_w(rs, rs, rd, 1, 1), ctx);
 		if (!is64)
 			emit_zextw(rs, rs, ctx);
 		break;
 	case BPF_XOR | BPF_FETCH:
-		emit(is64 ? rv_amoxor_d(rs, rs, rd, 0, 0) :
-		     rv_amoxor_w(rs, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_amoxor_d(rs, rs, rd, 1, 1) :
+		     rv_amoxor_w(rs, rs, rd, 1, 1), ctx);
 		if (!is64)
 			emit_zextw(rs, rs, ctx);
 		break;
 	/* src_reg = atomic_xchg(dst_reg + off16, src_reg); */
 	case BPF_XCHG:
-		emit(is64 ? rv_amoswap_d(rs, rs, rd, 0, 0) :
-		     rv_amoswap_w(rs, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_amoswap_d(rs, rs, rd, 1, 1) :
+		     rv_amoswap_w(rs, rs, rd, 1, 1), ctx);
 		if (!is64)
 			emit_zextw(rs, rs, ctx);
 		break;
@@ -1089,6 +1090,24 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 			emit_or(RV_REG_T1, rd, RV_REG_T1, ctx);
 			emit_mv(rd, RV_REG_T1, ctx);
 			break;
+		} else if (insn_is_mov_percpu_addr(insn)) {
+			if (rd != rs)
+				emit_mv(rd, rs, ctx);
+#ifdef CONFIG_SMP
+			/* Load current CPU number in T1 */
+			emit_ld(RV_REG_T1, offsetof(struct thread_info, cpu),
+				RV_REG_TP, ctx);
+			/* << 3 because offsets are 8 bytes */
+			emit_slli(RV_REG_T1, RV_REG_T1, 3, ctx);
+			/* Load address of __per_cpu_offset array in T2 */
+			emit_addr(RV_REG_T2, (u64)&__per_cpu_offset, extra_pass, ctx);
+			/* Add offset of current CPU to __per_cpu_offset */
+			emit_add(RV_REG_T1, RV_REG_T2, RV_REG_T1, ctx);
+			/* Load __per_cpu_offset[cpu] in T1 */
+			emit_ld(RV_REG_T1, 0, RV_REG_T1, ctx);
+			/* Add the offset to Rd */
+			emit_add(rd, rd, RV_REG_T1, ctx);
+#endif
 		}
 		if (imm == 1) {
 			/* Special mov32 for zext */
@@ -1474,6 +1493,22 @@ int bpf_jit_emit_insn(const struct bpf_insn *insn, struct rv_jit_context *ctx,
 		bool fixed_addr;
 		u64 addr;

+		/* Inline calls to bpf_get_smp_processor_id()
+		 *
+		 * RV_REG_TP holds the address of the current CPU's task_struct and thread_info is
+		 * at offset 0 in task_struct.
+		 * Load cpu from thread_info:
+		 *     Set R0 to ((struct thread_info *)(RV_REG_TP))->cpu
+		 *
+		 * This replicates the implementation of raw_smp_processor_id() on RISCV
+		 */
+		if (insn->src_reg == 0 && insn->imm == BPF_FUNC_get_smp_processor_id) {
+			/* Load current CPU number in R0 */
+			emit_ld(bpf_to_rv_reg(BPF_REG_0, ctx),
+				offsetof(struct thread_info, cpu),
+				RV_REG_TP, ctx);
+			break;
+		}
+
 		mark_call(ctx);
 		ret = bpf_jit_get_func_addr(ctx->prog, insn, extra_pass,
 					    &addr, &fixed_addr);
@@ -2038,3 +2073,18 @@ bool bpf_jit_supports_arena(void)
 {
 	return true;
 }
+
+bool bpf_jit_supports_percpu_insn(void)
+{
+	return true;
+}
+
+bool bpf_jit_inlines_helper_call(s32 imm)
+{
+	switch (imm) {
+	case BPF_FUNC_get_smp_processor_id:
+		return true;
+	default:
+		return false;
+	}
+}
@@ -1427,8 +1427,12 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp,
 		EMIT6_DISP_LH(0xeb000000, is32 ? (op32) : (op64),	\
 			      (insn->imm & BPF_FETCH) ? src_reg : REG_W0, \
 			      src_reg, dst_reg, off);			\
-		if (is32 && (insn->imm & BPF_FETCH))			\
-			EMIT_ZERO(src_reg);				\
+		if (insn->imm & BPF_FETCH) {				\
+			/* bcr 14,0 - see atomic_fetch_{add,and,or,xor}() */ \
+			_EMIT2(0x07e0);					\
+			if (is32)					\
+				EMIT_ZERO(src_reg);			\
+		}							\
 } while (0)
 	case BPF_ADD:
 	case BPF_ADD | BPF_FETCH:
@@ -3,6 +3,8 @@
 #ifndef _LINUX_BTF_IDS_H
 #define _LINUX_BTF_IDS_H

+#include <linux/types.h> /* for u32 */
+
 struct btf_id_set {
 	u32 cnt;
 	u32 ids[];
@@ -993,6 +993,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
+bool bpf_jit_inlines_helper_call(s32 imm);
 bool bpf_jit_supports_subprog_tailcalls(void);
 bool bpf_jit_supports_percpu_insn(void);
 bool bpf_jit_supports_kfunc_call(void);
@@ -1115,6 +1115,7 @@ enum bpf_attach_type {
 	BPF_CGROUP_UNIX_GETSOCKNAME,
 	BPF_NETKIT_PRIMARY,
 	BPF_NETKIT_PEER,
+	BPF_TRACE_KPROBE_SESSION,

 	__MAX_BPF_ATTACH_TYPE
 };