1. 24 May, 2018 9 commits
    • bpf: get JITed image lengths of functions via syscall · 815581c1
      Sandipan Das authored
      This adds two new fields to struct bpf_prog_info. For
      multi-function programs, these fields can be used to pass
      a list of the JITed image lengths of each function for a
      given program to userspace using the bpf system call with
      the BPF_OBJ_GET_INFO_BY_FD command.
      
      This can be used by userspace applications like bpftool
      to split up the contiguous JITed dump, also obtained via
      the system call, into more relatable chunks corresponding
      to each function.
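
      The splitting step described above can be sketched as follows. This is
      only an illustration: split_jited_dump() and its parameter names are
      hypothetical and are not taken from bpftool or the kernel. It assumes
      userspace already holds the list of per-function lengths and the total
      length of the contiguous dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the JITed image length of each function,
 * compute the offset at which each function starts inside the
 * contiguous dump. Returns 0 on success, or -1 if the lengths do not
 * add up to the total length of the dump.
 */
static int split_jited_dump(const uint32_t *func_lens, size_t nr_funcs,
                            size_t total_len, size_t *starts)
{
    size_t off = 0;

    assert(func_lens != NULL && starts != NULL);

    for (size_t i = 0; i < nr_funcs; i++) {
        starts[i] = off;        /* function i begins where i-1 ended */
        off += func_lens[i];
    }

    return off == total_len ? 0 : -1;
}
```

      A caller could then disassemble each chunk separately, starting at
      starts[i] and running for func_lens[i] bytes.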
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: fix multi-function JITed dump obtained via syscall · 4d56a76e
      Sandipan Das authored
      Currently, for multi-function programs, we cannot get the JITed
      instructions using the bpf system call's BPF_OBJ_GET_INFO_BY_FD
      command. Because of this, userspace tools such as bpftool cannot
      tell whether a multi-function program has been JITed.
      
      With the JIT enabled and the test program running, this can be
      verified as follows:
      
        # cat /proc/sys/net/core/bpf_jit_enable
        1
      
      Before applying this patch:
      
        # bpftool prog list
        1: kprobe  name foo  tag b811aab41a39ad3d  gpl
                loaded_at 2018-05-16T11:43:38+0530  uid 0
                xlated 216B  not jited  memlock 65536B
        ...
      
        # bpftool prog dump jited id 1
        no instructions returned
      
      After applying this patch:
      
        # bpftool prog list
        1: kprobe  name foo  tag b811aab41a39ad3d  gpl
                loaded_at 2018-05-16T12:13:01+0530  uid 0
                xlated 216B  jited 308B  memlock 65536B
        ...
      
        # bpftool prog dump jited id 1
           0:   nop
           4:   nop
           8:   mflr    r0
           c:   std     r0,16(r1)
          10:   stdu    r1,-112(r1)
          14:   std     r31,104(r1)
          18:   addi    r31,r1,48
          1c:   li      r3,10
        ...
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • tools: bpftool: resolve calls without using imm field · f84192ee
      Sandipan Das authored
      Currently, we resolve the callee's address for a JITed function
      call by using the imm field of the call instruction as an offset
      from __bpf_call_base. If bpf_jit_kallsyms is enabled, we further
      use this address to get the callee's kernel symbol's name.
      
      For some architectures, such as powerpc64, the imm field is not
      large enough to hold this offset. So, instead of assigning this
      offset to the imm field, the verifier now assigns the subprog
      id. Also, a list of kernel symbol addresses for all the JITed
      functions is provided in the program info. We now use the imm
      field as an index into this list to look up a callee's symbol
      address and resolve its name.
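
      A minimal sketch of this lookup is shown below. The function name,
      parameters and the fallback to the older base-plus-offset scheme when
      no address list is available are assumptions made for illustration,
      not a copy of the bpftool code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: resolve the address of a callee.
 *
 * If imm is a valid index into the list of kernel symbol addresses,
 * use the list. Otherwise (assumed fallback) treat imm as a signed
 * offset from call_base, as in the older scheme.
 */
static uint64_t resolve_callee(const uint64_t *ksyms, uint32_t nr_ksyms,
                               int32_t imm, uint64_t call_base)
{
    assert(ksyms != NULL || nr_ksyms == 0);

    if (imm >= 0 && (uint32_t)imm < nr_ksyms)
        return ksyms[imm];

    /* Sign-extend imm to 64 bits before adding it to the base. */
    return call_base + (uint64_t)(int64_t)imm;
}
```

      The returned address can then be matched against kallsyms to obtain
      the callee's symbol name.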
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • tools: bpf: sync bpf uapi header · dd0c5f07
      Sandipan Das authored
      Syncing the bpf.h uapi header with tools so that struct
      bpf_prog_info has the two new fields for passing on the
      addresses of the kernel symbols corresponding to each
      function in a program.
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: get kernel symbol addresses via syscall · dbecd738
      Sandipan Das authored
      This adds two new fields to struct bpf_prog_info. For
      multi-function programs, these fields can be used to pass
      a list of kernel symbol addresses for all functions in a
      given program to userspace using the bpf system call with
      the BPF_OBJ_GET_INFO_BY_FD command.
      
      When bpf_jit_kallsyms is enabled, we can get the address
      of the corresponding kernel symbol for a callee function
      and resolve the symbol's name. The address is determined
      by adding the value of the call instruction's imm field
      to __bpf_call_base. This offset gets assigned to the imm
      field by the verifier.
      
      For some architectures, such as powerpc64, the imm field
      is not large enough to hold this offset.
      
      We resolve this by:
      
      [1] Assigning the subprog id to the imm field of a call
          instruction in the verifier instead of the offset of
          the callee's symbol's address from __bpf_call_base.
      
      [2] Determining the address of a callee's corresponding
          symbol by using the imm field as an index for the
          list of kernel symbol addresses now available from
          the program info.
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: powerpc64: add JIT support for multi-function programs · 8484ce83
      Sandipan Das authored
      This adds support for bpf-to-bpf function calls in the powerpc64
      JIT compiler. The JIT compiler converts the bpf call instructions
      to native branch instructions. After a round of the usual passes,
      the start addresses of the JITed images for the callee functions
      are known. Finally, to fix up the branch target addresses, we need
      to perform an extra pass.
      
      Because of the address range in which JITed images are allocated
      on powerpc64, the offsets of the start addresses of these images
      from __bpf_call_base are as large as 64 bits. So, for a function
      call, we cannot use the imm field of the instruction to determine
      the callee's address. Instead, we use the alternative method of
      getting it from the list of function addresses in the auxiliary
      data of the caller by using the off field as an index.
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: powerpc64: pad function address loads with NOPs · 4ea69b2f
      Sandipan Das authored
      For multi-function programs, loading the address of a callee
      function to a register requires emitting instructions whose
      count varies from one to five depending on the nature of the
      address.
      
      Since the callee's address becomes known only just before the
      extra pass, the number of instructions required to load this
      address may differ from what was previously generated. This can
      make the JITed image grow or shrink.
      
      To avoid this, we should always generate a fixed five-instruction
      sequence when loading function addresses, by padding the optimized
      load sequence with NOPs.
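
      The padding idea can be sketched as below. The helper name and the
      buffer layout are hypothetical; the only facts relied on are the
      five-instruction length stated above and the standard powerpc NOP
      encoding (ori r0,r0,0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPC_NOP         0x60000000u /* ori r0,r0,0 */
#define FUNC_ADDR_INSNS 5           /* fixed sequence length from the text */

/*
 * Hypothetical helper: 'image' points at the start of a function
 * address load, of which the first 'emitted' instructions (1 to 5)
 * were produced by the optimized load. Fill the remaining slots with
 * NOPs so that the sequence always occupies five instructions.
 * Returns the fixed sequence length.
 */
static size_t pad_func_addr_load(uint32_t *image, size_t emitted)
{
    assert(image != NULL);
    assert(emitted >= 1 && emitted <= FUNC_ADDR_INSNS);

    for (size_t i = emitted; i < FUNC_ADDR_INSNS; i++)
        image[i] = PPC_NOP;

    return FUNC_ADDR_INSNS;
}
```

      Because the length no longer depends on the address value, the extra
      pass can rewrite the load in place without moving any later code.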
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: support 64-bit offsets for bpf function calls · 2162fed4
      Sandipan Das authored
      The imm field of a bpf instruction is a signed 32-bit integer.
      For JITed bpf-to-bpf function calls, it holds the offset of the
      start address of the callee's JITed image from __bpf_call_base.
      
      For some architectures, such as powerpc64, this offset may be
      as large as 64 bits and cannot be accommodated in the imm field
      without truncation.
      
      We resolve this by:
      
      [1] Additionally using the auxiliary data of each function to
          keep a list of start addresses of the JITed images for all
          functions determined by the verifier.
      
      [2] Retaining the subprog id inside the off field of the call
          instructions and using it to index into the list mentioned
          above and look up the callee's address.
      
      To make sure that the existing JIT compilers continue to work
      without requiring changes, we keep the imm field as it is.
      Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Avoid variable length array · a2889a4c
      Martin KaFai Lau authored
      Sparse warning:
      kernel/bpf/btf.c:1985:34: warning: Variable length array is used.
      
      This patch directly uses ARRAY_SIZE().
      
      Fixes: f80442a4 ("bpf: btf: Change how section is supported in btf_header")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  2. 23 May, 2018 10 commits
    • tools/lib/libbpf.c: fix string format to allow build on arm32 · a1c81810
      Sirio Balmelli authored
      On arm32, 'cd tools/testing/selftests/bpf && make' fails with:
      
      libbpf.c:80:10: error: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t {aka long long int}’ [-Werror=format=]
         (func)("libbpf: " fmt, ##__VA_ARGS__); \
                ^
      libbpf.c:83:30: note: in expansion of macro ‘__pr’
       #define pr_warning(fmt, ...) __pr(__pr_warning, fmt, ##__VA_ARGS__)
                                    ^~~~
      libbpf.c:1072:3: note: in expansion of macro ‘pr_warning’
         pr_warning("map:%s value_type:%s has BTF type_size:%ld != value_size:%u\n",
      
      To fix, typecast 'key_size' and amend format string.
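
      A generic sketch of this kind of fix follows: cast the 64-bit value
      to long long and print it with %lld, which is correct on both 32-bit
      and 64-bit targets. The function name and message text are
      hypothetical and are not the actual libbpf change.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format an int64_t portably. On arm32, int64_t
 * is 'long long', so "%ld" is wrong there; casting to long long and
 * using "%lld" works on every target.
 */
static int format_size_mismatch(char *buf, size_t len, int64_t type_size,
                                unsigned int value_size)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len, "type_size:%lld != value_size:%u",
                    (long long)type_size, value_size);
}
```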
      Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: Makefile fix "missing" headers on build with -idirafter · 167381f3
      Sirio Balmelli authored
      Selftests fail to build on several distros/architectures because of
      missing header files.
      
      On Ubuntu/x86_64, some of the missing headers are:
      	asm/byteorder.h, asm/socket.h, asm/sockios.h
      
      On Debian/arm32, the build already fails at sys/cdefs.h.
      
      In both cases, these already exist in /usr/include/<arch-specific-dir>,
      but Clang does not include these when using '-target bpf' flag,
      since it is no longer compiling against the host architecture.
      
      The solution is to:
      
      - run Clang without '-target bpf' and extract the include chain for the
      current system
      
      - add these to the bpf build with '-idirafter'
      
      The choice of -idirafter is to catch this error without injecting
      unexpected include behavior: if an arch-specific tree is built
      for bpf in the future, this will be correctly found by Clang.
      Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'btf-uapi-cleanups' · ff4fb475
      Daniel Borkmann authored
      Martin KaFai Lau says:
      
      ====================
      This patch set makes some changes to clean up the unused
      bits in the BTF uapi.  It also makes the btf_header extensible.
      
      Please see individual patches for details.
      
      v2:
      - Remove NR_SECS from patch 2
      - Remove "unsigned" check on array->index_type from patch 3
      - Remove BTF_INT_VARARGS and further limit BTF_INT_ENCODING
        from 8 bits to 4 bits in patch 4
      - Adjustments in test_btf.c to reflect changes in v2
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Add tests for the btf uapi changes · 61746dbe
      Martin KaFai Lau authored
      This patch does the following:
      1. Modify libbpf and test_btf to reflect the uapi changes in btf
      2. Add test for the btf_header changes
      3. Add tests for array->index_type
      4. Add err_str check to the tests
      5. Fix a 4 bytes hole in "struct test #1" by swapping "m" and "n"
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Sync bpf.h and btf.h to tools · f03b15d3
      Martin KaFai Lau authored
      This patch syncs the uapi bpf.h and btf.h to tools.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Rename btf_key_id and btf_value_id in bpf_map_info · 9b2cf328
      Martin KaFai Lau authored
      In "struct bpf_map_info", the names "btf_id", "btf_key_id" and "btf_value_id"
      could cause confusion because the "id" in "btf_id" means the BPF obj id
      given to the BTF object, while the "id" in "btf_key_id" and
      "btf_value_id" means the BTF type id within that BTF object.
      
      To make it clear, btf_key_id and btf_value_id are
      renamed to btf_key_type_id and btf_value_type_id.
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Remove unused bits from uapi/linux/btf.h · aea2f7b8
      Martin KaFai Lau authored
      This patch does the following:
      1. Limit BTF_MAX_TYPES and BTF_MAX_NAME_OFFSET to 64k.  We can
         raise it later.
      
      2. Remove the BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID.  They are
         currently encoded at the highest bit of a u32.
         They are removed because the current use case does not require
         supporting a parent type (i.e. a type_id referring to a type in
         another BTF file).  It also does not support referring to a
         string in ELF.
      
         The BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID checks are replaced
         by BTF_TYPE_ID_CHECK and BTF_STR_OFFSET_CHECK which are
         defined in btf.c instead of uapi/linux/btf.h.
      
      3. Limit the BTF_INFO_KIND from 5 bits to 4 bits, which is enough.
         There are unused bits left as headroom if we ever need them later.
      
      4. The root bit in BTF_INFO is also removed because it is not
         used in the current use case.
      
      5. Remove BTF_INT_VARARGS since func type is not supported now.
         The BTF_INT_ENCODING is limited to 4 bits instead of 8 bits.
      
      The above can be added back later because the verifier
      ensures the unused bits are zeros.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Check array->index_type · 4ef5f574
      Martin KaFai Lau authored
      Instead of ignoring the array->index_type field, enforce that
      it must be a BTF_KIND_INT of size 1/2/4/8 bytes.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: btf: Change how section is supported in btf_header · f80442a4
      Martin KaFai Lau authored
      There are currently unused section descriptions in the btf_header.  Those
      sections are here to support future BTF use cases.  For example, the
      func section (func_off) is to support function signature (e.g. the BPF
      prog function signature).
      
      Instead of spelling out all potential sections up-front in the btf_header,
      this patch makes changes to btf_header such that extending it (e.g. adding
      a section) is possible later.  The unused ones can be removed for now and
      they can be added back later.
      
      This patch:
      1. adds a hdr_len to the btf_header.  It will allow adding
      sections (and other info like parent_label and parent_name)
      later.  The check is similar to the existing bpf_attr.
      If a user passes in a longer hdr_len, the kernel
      ensures the extra trailing bytes are 0.
      
      2. allows the section order in the BTF object to be
      different from its sec_off order in btf_header.
      
      3. each sec_off is followed by a sec_len.  There must be no gaps or
      overlaps among sections.
      
      The string section is ensured to be at the end due to the 4 bytes
      alignment requirement of the type section.
      
      The above changes will allow enough flexibility to
      add new sections (and other info) to the btf_header later.
      
      This patch also removes an unnecessary !err check
      at the end of btf_parse().
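
      The hdr_len rule from item 1 above can be sketched as follows. The
      helper name and signature are hypothetical and this is not the
      kernel's implementation; it only shows the idea that bytes beyond
      the header size the kernel knows about must all be zero.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: 'hdr' holds hdr_len bytes supplied by the
 * user, of which the first known_len bytes are understood by this
 * kernel. Accept a longer header only if every extra trailing byte
 * is zero. Returns 0 if acceptable, -1 otherwise. A header that is
 * not longer than known_len has no tail to check.
 */
static int check_hdr_tail_zero(const unsigned char *hdr, size_t known_len,
                               size_t hdr_len)
{
    assert(hdr != NULL);

    for (size_t i = known_len; i < hdr_len; i++) {
        if (hdr[i] != 0)
            return -1;  /* user set a field this kernel cannot honor */
    }

    return 0;
}
```

      This lets a newer userspace pass a larger header to an older kernel,
      as long as it does not rely on any field the kernel does not know.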
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: Expose check_uarg_tail_zero() · dcab51f1
      Martin KaFai Lau authored
      This patch exposes check_uarg_tail_zero() which will
      be reused by a later BTF patch.  Its name is changed to
      bpf_check_uarg_tail_zero().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  3. 22 May, 2018 13 commits
  4. 18 May, 2018 8 commits
    • Merge branch 'bpf-sk-msg-fields' · d849f9f9
      Daniel Borkmann authored
      John Fastabend says:
      
      ====================
      In this series we add the ability for sk msg programs to read basic
      sock information about the sock they are attached to. The second
      patch adds the tests to the selftest test_verifier.
      
      One observation that I had from writing this series is that lots of the
      ./net/core/filter.c code is almost duplicated across program types.
      I thought about building a template/macro that we could use as a
      single block of code to read sock data out for multiple programs,
      but I wasn't convinced it was worth it yet. The result was using a
      macro saved a couple lines of code per block but made the code
      a bit harder to read IMO. We can probably revisit the idea later
      if we get more duplication.
      
      v2: add errstr field to negative test_verifier test cases to ensure
          we get the expected err string back from the verifier.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: add sk_msg prog sk access tests to test_verifier · 4da0dcab
      John Fastabend authored
      Add tests for BPF_PROG_TYPE_SK_MSG to test_verifier for read access
      to new sk fields.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: allow sk_msg programs to read sock fields · 303def35
      John Fastabend authored
      Currently sk_msg programs only have access to the raw data. However,
      it is often useful when building policies to have the policies specific
      to the socket endpoint. This allows using the socket tuple as input
      into filters, etc.
      
      This patch adds ctx access to the sock fields.
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'bpf-nfp-shift-insns' · 1cb61381
      Daniel Borkmann authored
      Jiong Wang says:
      
      ====================
      The NFP eBPF JIT is missing logical indirect shifts (both left and right)
      and arithmetic right shifts (both indirect shift and shift by constant).
      
      This patch adds support for them.
      
      For indirect shifts, the shift amount is not specified as a constant. NFP
      needs to get the shift amount through the low 5 bits of the source A operand
      in PREV_ALU, so extra instructions are needed compared with shifts by
      constants.
      
      Because NFP is 32-bit, we use a register pair for 64-bit shifts and
      therefore need different instruction sequences depending on whether the
      shift amount is less than 32 or not.
      
      An NFP branch-on-bit-test instruction emitter is added by this patch set and
      is used for an efficient runtime check on the shift amount. We treat the
      shift amount as less than 32 if bit 5 is clear, and as greater than or equal
      to 32 otherwise. A shift amount greater than or equal to 64 results in
      undefined behavior.
      
      This patch also uses range info to avoid generating unnecessary runtime code
      when we are certain whether the shift amount is less than 32 or not.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • nfp: bpf: support arithmetic indirect right shift (BPF_ARSH | BPF_X) · c217abcc
      Jiong Wang authored
      The code logic is similar to arithmetic right shift by constant, and NFP
      gets the indirect shift amount through the source A operand of PREV_ALU.
      
      It is possible to fall back to logical right shift if the MSB is known to be
      zero from range info; however, there is no benefit in doing so, given that
      logical indirect right shift uses an instruction sequence of the same length
      and cycle count.
      
      Suppose the MSB of regX is the bit we want to replicate to fill in all the
      vacant positions, and regY contains the shift amount; then we can use a
      single instruction to set up both.
      
        [alu, --, regY, OR, regX]
      
        --
        NOTE: the PREV_ALU result doesn't need to write to any destination
              register.
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • nfp: bpf: support arithmetic right shift by constant (BPF_ARSH | BPF_K) · f43d0f17
      Jiong Wang authored
      The code logic is similar to logical right shift, except that we also need
      to set the PREV_ALU result properly, the MSB of which is the bit that will
      be replicated to fill in all the vacant positions.
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X) · 991f5b36
      Jiong Wang authored
      For indirect shifts, the shift amount is not specified as a constant. NFP
      needs to get the shift amount through the low 5 bits of the source A operand
      in PREV_ALU, so extra instructions are needed compared with shifts by
      constants.
      
      Because NFP is 32-bit, we use a register pair for 64-bit shifts and
      therefore need different instruction sequences depending on whether the
      shift amount is less than 32 or not.
      
      An NFP branch-on-bit-test instruction emitter is added by this patch and is
      used for an efficient runtime check on the shift amount. We treat the shift
      amount as less than 32 if bit 5 is clear, and as greater than or equal to 32
      otherwise. A shift amount greater than or equal to 64 results in
      undefined behavior.
      
      This patch also uses range info to avoid generating unnecessary runtime code
      when we are certain whether the shift amount is less than 32 or not.
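
      The register-pair scheme can be modelled in plain C as below. This
      is only a sketch of the arithmetic: the type and function names are
      hypothetical, and it does not represent NFP instructions or the JIT
      code. It mirrors the test on bit 5 of the shift amount to choose
      between the two sequences for a 64-bit left shift.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held in two 32-bit halves, as on a 32-bit machine. */
struct reg_pair {
    uint32_t lo;
    uint32_t hi;
};

/*
 * Hypothetical model of a 64-bit left shift on a register pair.
 * Amounts of 64 or more are outside the supported range, matching
 * the undefined behavior noted in the text.
 */
static struct reg_pair shl64_pair(struct reg_pair v, unsigned int amt)
{
    struct reg_pair r;

    assert(amt < 64);

    if (amt & 32) {
        /* Bit 5 set: amount is 32..63, low half moves into high. */
        r.hi = v.lo << (amt & 31);
        r.lo = 0;
    } else if (amt == 0) {
        /* Avoid a 32-bit shift by 32 below, which C leaves undefined. */
        r = v;
    } else {
        /* Bit 5 clear: amount is 1..31, bits carry from low to high. */
        r.hi = (v.hi << amt) | (v.lo >> (32 - amt));
        r.lo = v.lo << amt;
    }

    return r;
}
```

      When range info proves which branch applies, only one of the two
      sequences needs to be generated and the runtime test can be dropped.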
      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'bpf-af-xdp-cleanups' · 82f9e2d5
      Björn Töpel says:
      
      ====================
      This series contains "cosmetics only" follow-up patches for AF_XDP.
      
      Thanks to Daniel for suggesting them!
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>