An error occurred fetching the project authors.
  1. 17 Oct, 2019 3 commits
    • Alexei Starovoitov's avatar
      bpf: Check types of arguments passed into helpers · a7658e1a
      Alexei Starovoitov authored
      Introduce new helper that reuses existing skb perf_event output
      implementation, but can be called from raw_tracepoint programs
      that receive 'struct sk_buff *' as tracepoint argument or
      can walk other kernel data structures to skb pointer.
      
      In order to do that teach verifier to resolve true C types
      of bpf helpers into in-kernel BTF ids.
      The type of kernel pointer passed by raw tracepoint into bpf
      program will be tracked by the verifier all the way until
      it's passed into helper function.
      For example:
      kfree_skb() kernel function calls trace_kfree_skb(skb, loc);
      bpf programs receives that skb pointer and may eventually
      pass it into bpf_skb_output() bpf helper which in-kernel is
      implemented via bpf_skb_event_output() kernel function.
      Its first argument in the kernel is 'struct sk_buff *'.
      The verifier makes sure that types match all the way.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org
      a7658e1a
    • Alexei Starovoitov's avatar
      bpf: Implement accurate raw_tp context access via BTF · 9e15db66
      Alexei Starovoitov authored
      libbpf analyzes bpf C program, searches in-kernel BTF for given type name
      and stores it into expected_attach_type.
      The kernel verifier expects this btf_id to point to something like:
      typedef void (*btf_trace_kfree_skb)(void *, struct sk_buff *skb, void *loc);
      which represents signature of raw_tracepoint "kfree_skb".
      
      Then btf_ctx_access() matches ctx+0 access in bpf program with 'skb'
      and 'ctx+8' access with 'loc' arguments of "kfree_skb" tracepoint.
      In first case it passes btf_id of 'struct sk_buff *' back to the verifier core
      and 'void *' in second case.
      
      Then the verifier tracks PTR_TO_BTF_ID as any other pointer type.
      Like PTR_TO_SOCKET points to 'struct bpf_sock',
      PTR_TO_TCP_SOCK points to 'struct bpf_tcp_sock', and so on.
      PTR_TO_BTF_ID points to in-kernel structs.
      If 1234 is btf_id of 'struct sk_buff' in vmlinux's BTF
      then PTR_TO_BTF_ID#1234 points to one of in kernel skbs.
      
      When PTR_TO_BTF_ID#1234 is dereferenced (like r2 = *(u64 *)r1 + 32)
      the btf_struct_access() checks which field of 'struct sk_buff' is
      at offset 32. Checks that size of access matches type definition
      of the field and continues to track the dereferenced type.
      If that field was a pointer to 'struct net_device' the r2's type
      will be PTR_TO_BTF_ID#456. Where 456 is btf_id of 'struct net_device'
      in vmlinux's BTF.
      
      Such verifier analysis prevents "cheating" in BPF C program.
      The program cannot cast arbitrary pointer to 'struct sk_buff *'
      and access it. C compiler would allow type cast, of course,
      but the verifier will notice type mismatch based on BPF assembly
      and in-kernel BTF.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191016032505.2089704-7-ast@kernel.org
      9e15db66
    • Alexei Starovoitov's avatar
      bpf: Process in-kernel BTF · 8580ac94
      Alexei Starovoitov authored
      If in-kernel BTF exists parse it and prepare 'struct btf *btf_vmlinux'
      for further use by the verifier.
      In-kernel BTF is trusted just like kallsyms and other build artifacts
      embedded into vmlinux.
      Yet run this BTF image through BTF verifier to make sure
      that it is valid and it wasn't mangled during the build.
      If in-kernel BTF is incorrect it means either gcc or pahole or kernel
      are buggy. In such case disallow loading BPF programs.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20191016032505.2089704-4-ast@kernel.org
      8580ac94
  2. 26 Sep, 2019 1 commit
  3. 19 Sep, 2019 1 commit
  4. 20 Aug, 2019 2 commits
  5. 15 Jul, 2019 1 commit
    • Andrii Nakryiko's avatar
      bpf: fix BTF verifier size resolution logic · 1acc5d5c
      Andrii Nakryiko authored
      BTF verifier has a size resolution bug which in some circumstances leads to
      invalid size resolution for, e.g., TYPEDEF modifier.  This happens if we have
      [1] PTR -> [2] TYPEDEF -> [3] ARRAY, in which case due to being in pointer
      context ARRAY size won't be resolved (because for pointer it doesn't matter, so
      it's a sink in pointer context), but it will be permanently remembered as zero
      for TYPEDEF and TYPEDEF will be marked as RESOLVED. Eventually ARRAY size will
      be resolved correctly, but TYPEDEF resolved_size won't be updated anymore.
      This, subsequently, will lead to erroneous map creation failure, if that
      TYPEDEF is specified as either key or value, as key_size/value_size won't
      correspond to resolved size of TYPEDEF (kernel will believe it's zero).
      
      Note, that if BTF was ordered as [1] ARRAY <- [2] TYPEDEF <- [3] PTR, this
      won't be a problem, as by the time we get to TYPEDEF, ARRAY's size is already
      calculated and stored.
      
      This bug manifests itself in rejecting BTF-defined maps that use array
      typedef as a value type:
      
      typedef int array_t[16];
      
      struct {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __type(value, array_t); /* i.e., array_t *value; */
      } test_map SEC(".maps");
      
      The fix consists on not relying on modifier's resolved_size and instead using
      modifier's resolved_id (type ID for "concrete" type to which modifier
      eventually resolves) and doing size determination for that resolved type. This
      allow to preserve existing "early DFS termination" logic for PTR or
      STRUCT_OR_ARRAY contexts, but still do correct size determination for modifier
      types.
      
      Fixes: eb3f595d ("bpf: btf: Validate type reference")
      Cc: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAndrii Nakryiko <andriin@fb.com>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      1acc5d5c
  6. 24 Jun, 2019 1 commit
  7. 10 Apr, 2019 2 commits
    • Daniel Borkmann's avatar
      bpf: allow for key-less BTF in array map · 2824ecb7
      Daniel Borkmann authored
      Given we'll be reusing BPF array maps for global data/bss/rodata
      sections, we need a way to associate BTF DataSec type as its map
      value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR()
      macro hack e.g. via 38d5d3b3 ("bpf: Introduce BPF_ANNOTATE_KV_PAIR")
      to get initial map to type association going. While more use cases
      for it are discouraged, this also won't work for global data since
      the use of array map is a BPF loader detail and therefore unknown
      at compilation time. For array maps with just a single entry we make
      an exception in terms of BTF in that key type is declared optional
      if value type is of DataSec type. The latter LLVM is guaranteed to
      emit and it also aligns with how we regard global data maps as just
      a plain buffer area reusing existing map facilities for allowing
      things like introspection with existing tools.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2824ecb7
    • Daniel Borkmann's avatar
      bpf: kernel side support for BTF Var and DataSec · 1dc92851
      Daniel Borkmann authored
      This work adds kernel-side verification, logging and seq_show dumping
      of BTF Var and DataSec kinds which are emitted with latest LLVM. The
      following constraints apply:
      
      BTF Var must have:
      
      - Its kind_flag is 0
      - Its vlen is 0
      - Must point to a valid type
      - Type must not resolve to a forward type
      - Size of underlying type must be > 0
      - Must have a valid name
      - Can only be a source type, not sink or intermediate one
      - Name may include dots (e.g. in case of static variables
        inside functions)
      - Cannot be a member of a struct/union
      - Linkage so far can either only be static or global/allocated
      
      BTF DataSec must have:
      
      - Its kind_flag is 0
      - Its vlen cannot be 0
      - Its size cannot be 0
      - Must have a valid name
      - Can only be a source type, not sink or intermediate one
      - Name may include dots (e.g. to represent .bss, .data, .rodata etc)
      - Cannot be a member of a struct/union
      - Inner btf_var_secinfo array with {type,offset,size} triple
        must be sorted by offset in ascending order
      - Type must always point to BTF Var
      - BTF resolved size of Var must be <= size provided by triple
      - DataSec size must be >= sum of triple sizes (thus holes
        are allowed)
      
      btf_var_resolve(), btf_ptr_resolve() and btf_modifier_resolve()
      are on a high level quite similar but each come with slight,
      subtle differences. They could potentially be a bit refactored
      in future which hasn't been done here to ease review.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      1dc92851
  8. 01 Feb, 2019 1 commit
    • Alexei Starovoitov's avatar
      bpf: introduce bpf_spin_lock · d83525ca
      Alexei Starovoitov authored
      Introduce 'struct bpf_spin_lock' and bpf_spin_lock/unlock() helpers to let
      bpf program serialize access to other variables.
      
      Example:
      struct hash_elem {
          int cnt;
          struct bpf_spin_lock lock;
      };
      struct hash_elem * val = bpf_map_lookup_elem(&hash_map, &key);
      if (val) {
          bpf_spin_lock(&val->lock);
          val->cnt++;
          bpf_spin_unlock(&val->lock);
      }
      
      Restrictions and safety checks:
      - bpf_spin_lock is only allowed inside HASH and ARRAY maps.
      - BTF description of the map is mandatory for safety analysis.
      - bpf program can take one bpf_spin_lock at a time, since two or more can
        cause dead locks.
      - only one 'struct bpf_spin_lock' is allowed per map element.
        It drastically simplifies implementation yet allows bpf program to use
        any number of bpf_spin_locks.
      - when bpf_spin_lock is taken the calls (either bpf2bpf or helpers) are not allowed.
      - bpf program must bpf_spin_unlock() before return.
      - bpf program can access 'struct bpf_spin_lock' only via
        bpf_spin_lock()/bpf_spin_unlock() helpers.
      - load/store into 'struct bpf_spin_lock lock;' field is not allowed.
      - to use bpf_spin_lock() helper the BTF description of map value must be
        a struct and have 'struct bpf_spin_lock anyname;' field at the top level.
        Nested lock inside another struct is not allowed.
      - syscall map_lookup doesn't copy bpf_spin_lock field to user space.
      - syscall map_update and program map_update do not update bpf_spin_lock field.
      - bpf_spin_lock cannot be on the stack or inside networking packet.
        bpf_spin_lock can only be inside HASH or ARRAY map value.
      - bpf_spin_lock is available to root only and to all program types.
      - bpf_spin_lock is not allowed in inner maps of map-in-map.
      - ld_abs is not allowed inside spin_lock-ed region.
      - tracing progs and socket filter progs cannot use bpf_spin_lock due to
        insufficient preemption checks
      
      Implementation details:
      - cgroup-bpf class of programs can nest with xdp/tc programs.
        Hence bpf_spin_lock is equivalent to spin_lock_irqsave.
        Other solutions to avoid nested bpf_spin_lock are possible.
        Like making sure that all networking progs run with softirq disabled.
        spin_lock_irqsave is the simplest and doesn't add overhead to the
        programs that don't use it.
      - arch_spinlock_t is used when its implemented as queued_spin_lock
      - archs can force their own arch_spinlock_t
      - on architectures where queued_spin_lock is not available and
        sizeof(arch_spinlock_t) != sizeof(__u32) trivial lock is used.
      - presence of bpf_spin_lock inside map value could have been indicated via
        extra flag during map_create, but specifying it via BTF is cleaner.
        It provides introspection for map key/value and reduces user mistakes.
      
      Next steps:
      - allow bpf_spin_lock in other map types (like cgroup local storage)
      - introduce BPF_F_LOCK flag for bpf_map_update() syscall and helper
        to request kernel to grab bpf_spin_lock before rewriting the value.
        That will serialize access to map elements.
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      d83525ca
  9. 30 Jan, 2019 1 commit
    • Yonghong Song's avatar
      bpf: btf: allow typedef func_proto · 81f5c6f5
      Yonghong Song authored
      Current implementation does not allow typedef func_proto.
      But it is actually allowed.
        -bash-4.4$ cat t.c
        typedef int (f) (int);
        f *g;
        -bash-4.4$ clang -O2 -g -c -target bpf t.c -Xclang -target-feature -Xclang +dwarfris
        -bash-4.4$ pahole -JV t.o
        File t.o:
        [1] PTR (anon) type_id=2
        [2] TYPEDEF f type_id=3
        [3] FUNC_PROTO (anon) return=4 args=(4 (anon))
        [4] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
        -bash-4.4$
      
      This patch related btf verifier to allow such (typedef func_proto)
      patterns.
      
      Fixes: 2667a262 ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO")
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      81f5c6f5
  10. 17 Jan, 2019 1 commit
  11. 16 Jan, 2019 1 commit
    • Yonghong Song's avatar
      bpf: btf: support 128 bit integer type · b1e8818c
      Yonghong Song authored
      Currently, btf only supports up to 64-bit integer.
      On the other hand, 128bit support for gcc and clang
      has existed for a long time. For example, both gcc 4.8
      and llvm 3.7 supports types "__int128" and
      "unsigned __int128" for virtually all 64bit architectures
      including bpf.
      
      The requirement for __int128 support comes from two areas:
        . bpf program may use __int128. For example, some bcc tools
          (https://github.com/iovisor/bcc/tree/master/tools),
          mostly tcp v6 related, tcpstates.py, tcpaccept.py, etc.,
          are using __int128 to represent the ipv6 addresses.
        . linux itself is using __int128 types. Hence supporting
          __int128 type in BTF is required for vmlinux BTF,
          which will be used by "compile once and run everywhere"
          and other projects.
      
      For 128bit integer, instead of base-10, hex numbers are pretty
      printed out as large decimal number is hard to decipher, e.g.,
      for ipv6 addresses.
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      b1e8818c
  12. 11 Jan, 2019 1 commit
    • Yonghong Song's avatar
      bpf: fix bpffs bitfield pretty print · 17e3ac81
      Yonghong Song authored
      Commit 9d5f9f70 ("bpf: btf: fix struct/union/fwd types
      with kind_flag") introduced kind_flag and used bitfield_size
      in the btf_member to directly pretty print member values.
      
      The commit contained a bug where the incorrect parameters could be
      passed to function btf_bitfield_seq_show(). The bits_offset
      parameter in the function expects a value less than 8.
      Instead, the member offset in the structure is passed.
      
      The below is btf_bitfield_seq_show() func signature:
        void btf_bitfield_seq_show(void *data, u8 bits_offset,
                                   u8 nr_bits, struct seq_file *m)
      both bits_offset and nr_bits are u8 type. If the bitfield
      member offset is greater than 256, incorrect value will
      be printed.
      
      This patch fixed the issue by calculating correct proper
      data offset and bits_offset similar to non kind_flag case.
      
      Fixes: 9d5f9f70 ("bpf: btf: fix struct/union/fwd types with kind_flag")
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      17e3ac81
  13. 18 Dec, 2018 4 commits
    • Yonghong Song's avatar
      bpf: log struct/union attribute for forward type · 76c43ae8
      Yonghong Song authored
      Current btf internal verbose logger logs fwd type as
        [2] FWD A type_id=0
      where A is the type name.
      
      Commit 9d5f9f70 ("bpf: btf: fix struct/union/fwd types
      with kind_flag") introduced kind_flag which can be used
      to distinguish whether a forward type is a struct or
      union.
      
      Also, "type_id=0" does not carry any meaningful
      information for fwd type as btf_type.type = 0 is simply
      enforced during btf verification and is not used
      anywhere else.
      
      This commit changed the log to
        [2] FWD A struct
      if kind_flag = 0, or
        [2] FWD A union
      if kind_flag = 1.
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      76c43ae8
    • Yonghong Song's avatar
      bpf: enable cgroup local storage map pretty print with kind_flag · ffa0c1cf
      Yonghong Song authored
      Commit 970289fc0a83 ("bpf: add bpffs pretty print for cgroup
      local storage maps") added bpffs pretty print for cgroup
      local storage maps. The commit worked for struct without kind_flag
      set.
      
      This patch refactored and made pretty print also work
      with kind_flag set for the struct.
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      ffa0c1cf
    • Yonghong Song's avatar
      bpf: btf: fix struct/union/fwd types with kind_flag · 9d5f9f70
      Yonghong Song authored
      This patch fixed two issues with BTF. One is related to
      struct/union bitfield encoding and the other is related to
      forward type.
      
      Issue #1 and solution:
      
      ======================
      
      Current btf encoding of bitfield follows what pahole generates.
      For each bitfield, pahole will duplicate the type chain and
      put the bitfield size at the final int or enum type.
      Since the BTF enum type cannot encode bit size,
      pahole workarounds the issue by generating
      an int type whenever the enum bit size is not 32.
      
      For example,
        -bash-4.4$ cat t.c
        typedef int ___int;
        enum A { A1, A2, A3 };
        struct t {
          int a[5];
          ___int b:4;
          volatile enum A c:4;
        } g;
        -bash-4.4$ gcc -c -O2 -g t.c
      The current kernel supports the following BTF encoding:
        $ pahole -JV t.o
        [1] TYPEDEF ___int type_id=2
        [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
        [3] ENUM A size=4 vlen=3
              A1 val=0
              A2 val=1
              A3 val=2
        [4] STRUCT t size=24 vlen=3
              a type_id=5 bits_offset=0
              b type_id=9 bits_offset=160
              c type_id=11 bits_offset=164
        [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
        [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
        [7] VOLATILE (anon) type_id=3
        [8] INT int size=1 bit_offset=0 nr_bits=4 encoding=(none)
        [9] TYPEDEF ___int type_id=8
        [10] INT (anon) size=1 bit_offset=0 nr_bits=4 encoding=SIGNED
        [11] VOLATILE (anon) type_id=10
      
      Two issues are in the above:
        . by changing enum type to int, we lost the original
          type information and this will not be ideal later
          when we try to convert BTF to a header file.
        . the type duplication for bitfields will cause
          BTF bloat. Duplicated types cannot be deduplicated
          later if the bitfield size is different.
      
      To fix this issue, this patch implemented a compatible
      change for BTF struct type encoding:
        . the bit 31 of struct_type->info, previously reserved,
          now is used to indicate whether bitfield_size is
          encoded in btf_member or not.
        . if bit 31 of struct_type->info is set,
          btf_member->offset will encode like:
            bit 0 - 23: bit offset
            bit 24 - 31: bitfield size
          if bit 31 is not set, the old behavior is preserved:
            bit 0 - 31: bit offset
      
      So if the struct contains a bit field, the maximum bit offset
      will be reduced to (2^24 - 1) instead of MAX_UINT. The maximum
      bitfield size will be 256 which is enough for today as maximum
      bitfield in compiler can be 128 where int128 type is supported.
      
      This kernel patch intends to support the new BTF encoding:
        $ pahole -JV t.o
        [1] TYPEDEF ___int type_id=2
        [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
        [3] ENUM A size=4 vlen=3
              A1 val=0
              A2 val=1
              A3 val=2
        [4] STRUCT t kind_flag=1 size=24 vlen=3
              a type_id=5 bitfield_size=0 bits_offset=0
              b type_id=1 bitfield_size=4 bits_offset=160
              c type_id=7 bitfield_size=4 bits_offset=164
        [5] ARRAY (anon) type_id=2 index_type_id=2 nr_elems=5
        [6] INT sizetype size=8 bit_offset=0 nr_bits=64 encoding=(none)
        [7] VOLATILE (anon) type_id=3
      
      Issue #2 and solution:
      ======================
      
      Current forward type in BTF does not specify whether the original
      type is struct or union. This will not work for type pretty print
      and BTF-to-header-file conversion as struct/union must be specified.
        $ cat tt.c
        struct t;
        union u;
        int foo(struct t *t, union u *u) { return 0; }
        $ gcc -c -g -O2 tt.c
        $ pahole -JV tt.o
        [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
        [2] FWD t type_id=0
        [3] PTR (anon) type_id=2
        [4] FWD u type_id=0
        [5] PTR (anon) type_id=4
      
      To fix this issue, similar to issue #1, type->info bit 31
      is used. If the bit is set, it is union type. Otherwise, it is
      a struct type.
      
        $ pahole -JV tt.o
        [1] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
        [2] FWD t kind_flag=0 type_id=0
        [3] PTR (anon) kind_flag=0 type_id=2
        [4] FWD u kind_flag=1 type_id=0
        [5] PTR (anon) kind_flag=0 type_id=4
      
      Pahole/LLVM change:
      ===================
      
      The new kind_flag functionality has been implemented in pahole
      and llvm:
        https://github.com/yonghong-song/pahole/tree/bitfield
        https://github.com/yonghong-song/llvm/tree/bitfield
      
      Note that pahole hasn't implemented func/func_proto kind
      and .BTF.ext. So to print function signature with bpftool,
      the llvm compiler should be used.
      
      Fixes: 69b693f0 ("bpf: btf: Introduce BPF Type Format (BTF)")
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      9d5f9f70
    • Yonghong Song's avatar
      bpf: btf: refactor btf_int_bits_seq_show() · f97be3ab
      Yonghong Song authored
      Refactor function btf_int_bits_seq_show() by creating
      function btf_bitfield_seq_show() which has no dependence
      on btf and btf_type. The function btf_bitfield_seq_show()
      will be in later patch to directly dump bitfield member values.
      Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      f97be3ab
  14. 14 Dec, 2018 1 commit
    • Martin KaFai Lau's avatar
      bpf: Create a new btf_name_by_offset() for non type name use case · 23127b33
      Martin KaFai Lau authored
      The current btf_name_by_offset() is returning "(anon)" type name for
      the offset == 0 case and "(invalid-name-offset)" for the out-of-bound
      offset case.
      
      It fits well for the internal BTF verbose log purpose which
      is focusing on type.  For example,
      offset == 0 => "(anon)" => anonymous type/name.
      Returning non-NULL for the bad offset case is needed
      during the BTF verification process because the BTF verifier may
      complain about another field first before discovering the name_off
      is invalid.
      
      However, it may not be ideal for the newer use case which does not
      necessary mean type name.  For example, when logging line_info
      in the BPF verifier in the next patch, it is better to log an
      empty src line instead of logging "(anon)".
      
      The existing bpf_name_by_offset() is renamed to __bpf_name_by_offset()
      and static to btf.c.
      
      A new bpf_name_by_offset() is added for generic context usage.  It
      returns "\0" for name_off == 0 (note that btf->strings[0] is "\0")
      and NULL for invalid offset.  It allows the caller to decide
      what is the best output in its context.
      
      The new btf_name_by_offset() is overlapped with btf_name_offset_valid().
      Hence, btf_name_offset_valid() is removed from btf.h to keep the btf.h API
      minimal.  The existing btf_name_offset_valid() usage in btf.c could also be
      replaced later.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      23127b33
  15. 12 Dec, 2018 1 commit
  16. 09 Dec, 2018 1 commit
    • Martin KaFai Lau's avatar
      bpf: Add bpf_line_info support · c454a46b
      Martin KaFai Lau authored
      This patch adds bpf_line_info support.
      
      It accepts an array of bpf_line_info objects during BPF_PROG_LOAD.
      The "line_info", "line_info_cnt" and "line_info_rec_size" are added
      to the "union bpf_attr".  The "line_info_rec_size" makes
      bpf_line_info extensible in the future.
      
      The new "check_btf_line()" ensures the userspace line_info is valid
      for the kernel to use.
      
      When the verifier is translating/patching the bpf_prog (through
      "bpf_patch_insn_single()"), the line_infos' insn_off is also
      adjusted by the newly added "bpf_adj_linfo()".
      
      If the bpf_prog is jited, this patch also provides the jited addrs (in
      aux->jited_linfo) for the corresponding line_info.insn_off.
      "bpf_prog_fill_jited_linfo()" is added to fill the aux->jited_linfo.
      It is currently called by the x86 jit.  Other jits can also use
      "bpf_prog_fill_jited_linfo()" and it will be done in the followup patches.
      In the future, if it deemed necessary, a particular jit could also provide
      its own "bpf_prog_fill_jited_linfo()" implementation.
      
      A few "*line_info*" fields are added to the bpf_prog_info such
      that the user can get the xlated line_info back (i.e. the line_info
      with its insn_off reflecting the translated prog).  The jited_line_info
      is available if the prog is jited.  It is an array of __u64.
      If the prog is not jited, jited_line_info_cnt is 0.
      
      The verifier's verbose log with line_info will be done in
      a follow up patch.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      c454a46b
  17. 29 Nov, 2018 2 commits
  18. 26 Nov, 2018 1 commit
  19. 20 Nov, 2018 3 commits
    • Yonghong Song's avatar
      bpf: Introduce bpf_func_info · 838e9690
      Yonghong Song authored
      This patch added interface to load a program with the following
      additional information:
         . prog_btf_fd
         . func_info, func_info_rec_size and func_info_cnt
      where func_info will provide function range and type_id
      corresponding to each function.
      
      The func_info_rec_size is introduced in the UAPI to specify
      struct bpf_func_info size passed from user space. This
      intends to make bpf_func_info structure growable in the future.
      If the kernel gets a different bpf_func_info size from userspace,
      it will try to handle user request with part of bpf_func_info
      it can understand. In this patch, kernel can understand
        struct bpf_func_info {
             __u32   insn_offset;
             __u32   type_id;
        };
      If user passed a bpf func_info record size of 16 bytes, the
      kernel can still handle part of records with the above definition.
      
      If verifier agrees with function range provided by the user,
      the bpf_prog ksym for each function will use the func name
      provided in the type_id, which is supposed to provide better
      encoding as it is not limited by 16 bytes program name
      limitation and this is better for bpf program which contains
      multiple subprograms.
      
      The bpf_prog_info interface is also extended to
      return btf_id, func_info, func_info_rec_size and func_info_cnt
      to userspace, so userspace can print out the function prototype
      for each xlated function. The insn_offset in the returned
      func_info corresponds to the insn offset for xlated functions.
      With other jit related fields in bpf_prog_info, userspace can also
      print out function prototypes for each jited function.
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      838e9690
    • Martin KaFai Lau's avatar
      bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO · 2667a262
      Martin KaFai Lau authored
      This patch adds BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO
      to support the function debug info.
      
      BTF_KIND_FUNC_PROTO must not have a name (i.e. !t->name_off)
      and it is followed by >= 0 'struct bpf_param' objects to
      describe the function arguments.
      
      The BTF_KIND_FUNC must have a valid name and it must
      refer back to a BTF_KIND_FUNC_PROTO.
      
      The above is the conclusion after the discussion between
      Edward Cree, Alexei, Daniel, Yonghong and Martin.
      
      By combining BTF_KIND_FUNC and BTF_LIND_FUNC_PROTO,
      a complete function signature can be obtained.  It will be
      used in the later patches to learn the function signature of
      a running bpf program.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2667a262
    • Martin KaFai Lau's avatar
      bpf: btf: Break up btf_type_is_void() · b47a0bd2
      Martin KaFai Lau authored
      This patch breaks up btf_type_is_void() into
      btf_type_is_void() and btf_type_is_fwd().
      
      It also adds btf_type_nosize() to better describe it is
      testing a type has nosize info.
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      b47a0bd2
  20. 25 Oct, 2018 1 commit
    • Martin Lau's avatar
      bpf, btf: fix a missing check bug in btf_parse · 4a6998af
      Martin Lau authored
      Wenwen Wang reported:
      
        In btf_parse(), the header of the user-space btf data 'btf_data'
        is firstly parsed and verified through btf_parse_hdr().
        In btf_parse_hdr(), the header is copied from user-space 'btf_data'
        to kernel-space 'btf->hdr' and then verified. If no error happens
        during the verification process, the whole data of 'btf_data',
        including the header, is then copied to 'data' in btf_parse(). It
        is obvious that the header is copied twice here. More importantly,
        no check is enforced after the second copy to make sure the headers
        obtained in these two copies are same. Given that 'btf_data' resides
        in the user space, a malicious user can race to modify the header
        between these two copies. By doing so, the user can inject
        inconsistent data, which can cause undefined behavior of the
        kernel and introduce potential security risk.
      
      This issue is similar to the one fixed in commit 8af03d1a ("bpf:
      btf: Fix a missing check bug"). To fix it, this patch copies the user
      'btf_data' *before* parsing / verifying the BTF header.
      
      Fixes: 69b693f0 ("bpf: btf: Introduce BPF Type Format (BTF)")
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Co-developed-by: default avatarWenwen Wang <wang6495@umn.edu>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      4a6998af
  21. 10 Oct, 2018 1 commit
    • Wenwen Wang's avatar
      bpf: btf: Fix a missing check bug · 8af03d1a
      Wenwen Wang authored
      In btf_parse_hdr(), the length of the btf data header is firstly copied
      from the user space to 'hdr_len' and checked to see whether it is larger
      than 'btf_data_size'. If yes, an error code EINVAL is returned. Otherwise,
      the whole header is copied again from the user space to 'btf->hdr'.
      However, after the second copy, there is no check between
      'btf->hdr->hdr_len' and 'hdr_len' to confirm that the two copies get the
      same value. Given that the btf data is in the user space, a malicious user
      can race to change the data between the two copies. By doing so, the user
      can provide malicious data to the kernel and cause undefined behavior.
      
      This patch adds a necessary check after the second copy, to make sure
      'btf->hdr->hdr_len' has the same value as 'hdr_len'. Otherwise, an error
      code EINVAL will be returned.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      8af03d1a
  22. 12 Sep, 2018 1 commit
  23. 23 Jul, 2018 1 commit
  24. 20 Jul, 2018 1 commit
  25. 11 Jul, 2018 1 commit
  26. 12 Jun, 2018 1 commit
    • Kees Cook's avatar
      treewide: kvzalloc() -> kvcalloc() · 778e1cdd
      Kees Cook authored
      The kvzalloc() function has a 2-factor argument form, kvcalloc(). This
      patch replaces cases of:
      
              kvzalloc(a * b, gfp)
      
      with:
              kvcalloc(a * b, gfp)
      
      as well as handling cases of:
      
              kvzalloc(a * b * c, gfp)
      
      with:
      
              kvzalloc(array3_size(a, b, c), gfp)
      
      as it's slightly less ugly than:
      
              kvcalloc(array_size(a, b), c, gfp)
      
      This does, however, attempt to ignore constant size factors like:
      
              kvzalloc(4 * 1024, gfp)
      
      though any constants defined via macros get caught up in the conversion.
      
      Any factors with a sizeof() of "unsigned char", "char", and "u8" were
      dropped, since they're redundant.
      
      The Coccinelle script used for this was:
      
      // Fix redundant parens around sizeof().
      @@
      type TYPE;
      expression THING, E;
      @@
      
      (
        kvzalloc(
      -	(sizeof(TYPE)) * E
      +	sizeof(TYPE) * E
        , ...)
      |
        kvzalloc(
      -	(sizeof(THING)) * E
      +	sizeof(THING) * E
        , ...)
      )
      
      // Drop single-byte sizes and redundant parens.
      @@
      expression COUNT;
      typedef u8;
      typedef __u8;
      @@
      
      (
        kvzalloc(
      -	sizeof(u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(__u8) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(unsigned char) * (COUNT)
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(u8) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(__u8) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(char) * COUNT
      +	COUNT
        , ...)
      |
        kvzalloc(
      -	sizeof(unsigned char) * COUNT
      +	COUNT
        , ...)
      )
      
      // 2-factor product with sizeof(type/expression) and identifier or constant.
      @@
      type TYPE;
      expression THING;
      identifier COUNT_ID;
      constant COUNT_CONST;
      @@
      
      (
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (COUNT_ID)
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * COUNT_ID
      +	COUNT_ID, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * COUNT_CONST
      +	COUNT_CONST, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (COUNT_ID)
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * COUNT_ID
      +	COUNT_ID, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (COUNT_CONST)
      +	COUNT_CONST, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * COUNT_CONST
      +	COUNT_CONST, sizeof(THING)
        , ...)
      )
      
      // 2-factor product, only identifiers.
      @@
      identifier SIZE, COUNT;
      @@
      
      - kvzalloc
      + kvcalloc
        (
      -	SIZE * COUNT
      +	COUNT, SIZE
        , ...)
      
      // 3-factor product with 1 sizeof(type) or sizeof(expression), with
      // redundant parens removed.
      @@
      expression THING;
      identifier STRIDE, COUNT;
      type TYPE;
      @@
      
      (
        kvzalloc(
      -	sizeof(TYPE) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(TYPE))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * (COUNT) * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * (COUNT) * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * COUNT * (STRIDE)
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING) * COUNT * STRIDE
      +	array3_size(COUNT, STRIDE, sizeof(THING))
        , ...)
      )
      
      // 3-factor product with 2 sizeof(variable), with redundant parens removed.
      @@
      expression THING1, THING2;
      identifier COUNT;
      type TYPE1, TYPE2;
      @@
      
      (
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(THING1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * COUNT
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      |
        kvzalloc(
      -	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
      +	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
        , ...)
      )
      
      // 3-factor product, only identifiers, with redundant parens removed.
      @@
      identifier STRIDE, SIZE, COUNT;
      @@
      
      (
        kvzalloc(
      -	(COUNT) * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * (STRIDE) * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * STRIDE * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	(COUNT) * (STRIDE) * (SIZE)
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      |
        kvzalloc(
      -	COUNT * STRIDE * SIZE
      +	array3_size(COUNT, STRIDE, SIZE)
        , ...)
      )
      
      // Any remaining multi-factor products, first at least 3-factor products,
      // when they're not all constants...
      @@
      expression E1, E2, E3;
      constant C1, C2, C3;
      @@
      
      (
        kvzalloc(C1 * C2 * C3, ...)
      |
        kvzalloc(
      -	(E1) * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	(E1) * (E2) * E3
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	(E1) * (E2) * (E3)
      +	array3_size(E1, E2, E3)
        , ...)
      |
        kvzalloc(
      -	E1 * E2 * E3
      +	array3_size(E1, E2, E3)
        , ...)
      )
      
      // And then all remaining 2 factors products when they're not all constants,
      // keeping sizeof() as the second factor argument.
      @@
      expression THING, E1, E2;
      type TYPE;
      constant C1, C2, C3;
      @@
      
      (
        kvzalloc(sizeof(THING) * C2, ...)
      |
        kvzalloc(sizeof(TYPE) * C2, ...)
      |
        kvzalloc(C1 * C2 * C3, ...)
      |
        kvzalloc(C1 * C2, ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * (E2)
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(TYPE) * E2
      +	E2, sizeof(TYPE)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * (E2)
      +	E2, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	sizeof(THING) * E2
      +	E2, sizeof(THING)
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	(E1) * E2
      +	E1, E2
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	(E1) * (E2)
      +	E1, E2
        , ...)
      |
      - kvzalloc
      + kvcalloc
        (
      -	E1 * E2
      +	E1, E2
        , ...)
      )
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      778e1cdd
  27. 02 Jun, 2018 2 commits
  28. 28 May, 2018 1 commit
  29. 24 May, 2018 1 commit