1. 13 Aug, 2019 6 commits
    • Merge branch 'bpf-libbpf-read-sysfs-btf' · 72ef80b5
      Daniel Borkmann authored
      Andrii Nakryiko says:
      
      ====================
      Now that kernel's BTF is exposed through sysfs at well-known location, attempt
      to load it first as a target BTF for the purpose of BPF CO-RE relocations.
      
      Patch #1 is a follow-up patch to rename /sys/kernel/btf/kernel into
      /sys/kernel/btf/vmlinux.
      
      Patch #2 adds ability to load raw BTF contents from sysfs and expands the list
      of locations libbpf attempts to load vmlinux BTF from.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: attempt to load kernel BTF from sysfs first · a1916a15
      Andrii Nakryiko authored
      Add support for loading kernel BTF from sysfs (/sys/kernel/btf/vmlinux)
      as a target BTF. Also extend the list of on-disk search paths for the
      vmlinux ELF image with the locations that perf searches.
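
      As a rough illustration of reading raw BTF from a file such as
      /sys/kernel/btf/vmlinux, the sketch below only checks that a file starts
      with the BTF magic number (0xeB9F, as defined in include/uapi/linux/btf.h).
      The function names are hypothetical and this is not libbpf's actual
      loading code, which parses the full header and type data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same value as BTF_MAGIC in include/uapi/linux/btf.h; local name is ours. */
#define SKETCH_BTF_MAGIC 0xeB9F

/*
 * Returns 1 if buf starts with the BTF magic, else 0.
 * Assumption: the data is in host byte order, which holds when reading
 * the running kernel's own BTF.
 */
static int has_btf_magic(const void *buf, size_t len)
{
	uint16_t magic;

	assert(buf != NULL);
	if (len < sizeof(magic))
		return 0;
	memcpy(&magic, buf, sizeof(magic));
	return magic == SKETCH_BTF_MAGIC;
}

/* Returns 1 if path can be opened and begins with the BTF magic, else 0. */
static int file_has_raw_btf(const char *path)
{
	unsigned char hdr[2];
	size_t n;
	FILE *f = fopen(path, "rb");

	if (!f)
		return 0;
	n = fread(hdr, 1, sizeof(hdr), f);
	fclose(f);
	return has_btf_magic(hdr, n);
}
```

      A caller could try file_has_raw_btf("/sys/kernel/btf/vmlinux") first and
      fall back to searching for a vmlinux ELF image only when it returns 0.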
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux · 7fd78568
      Andrii Nakryiko authored
      Expose kernel's BTF under the name vmlinux to be more uniform with using
      kernel module names as file names in the future.
      
      Fixes: 341dfcf8 ("btf: expose BTF info through sysfs")
      Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: fix race in flow dissector tests · 9840a4ff
      Petar Penkov authored
      Since the "last_dissection" map holds only the flow keys for the most
      recent packet, there is a small race in the skb-less flow dissector
      tests if a new packet arrives between transmitting the test packet and
      reading its keys from the map. If this happens, the test packet keys
      will be overwritten and the test will fail.
      
      Changing the "last_dissection" map to a hash map keyed on the
      source/dest port pair resolves this issue. Additionally, let's clear the
      last test results from the map between tests to prevent previous test
      cases from interfering with the following test cases.
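
      One way to build a hash-map key from a source/dest port pair is to pack
      both 16-bit ports into a single 32-bit value, as sketched below. The
      function name is hypothetical and the packing order is an assumption for
      illustration; the ports are treated as opaque 16-bit values, so byte
      order does not matter as long as writer and reader agree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Packs a source/destination port pair into one 32-bit key.
 * Distinct (sport, dport) pairs always yield distinct keys, so an
 * unrelated packet no longer overwrites the test packet's entry.
 */
static uint32_t port_pair_key(uint16_t sport, uint16_t dport)
{
	uint32_t key = ((uint32_t)sport << 16) | dport;

	/* Sanity check: both halves must be recoverable from the key. */
	assert((uint16_t)(key >> 16) == sport && (uint16_t)key == dport);
	return key;
}
```

      With such a key, the test looks up only the entry for the ports it used,
      and deletes that entry before the next test case runs.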
      
      Fixes: 0905beec ("selftests/bpf: run flow dissector tests in skb-less mode")
      Signed-off-by: Petar Penkov <ppenkov@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • tools: bpftool: add feature check for zlib · d66fa3c7
      Peter Wu authored
      bpftool requires libelf, as well as zlib for decompressing /proc/config.gz.
      zlib is a transitive dependency via libelf and has been mandatory since
      elfutils 0.165 (Jan 2016). The feature check of libelf is already done
      in the elfdep target of tools/lib/bpf/Makefile, pulled in by bpftool via
      a dependency on libbpf.a. Add a similar feature check for zlib.
      Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Peter Wu <peter@lekensteyn.nl>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • btf: expose BTF info through sysfs · 341dfcf8
      Andrii Nakryiko authored
      Make .BTF section allocated and expose its contents through sysfs.
      
      /sys/kernel/btf directory is created to contain all the BTFs present
      inside kernel. Currently there is only kernel's main BTF, represented as
      /sys/kernel/btf/kernel file. Once kernel modules' BTFs are supported,
      each module will expose its BTF as /sys/kernel/btf/<module-name> file.
      
      Current approach relies on a few pieces coming together:
      1. pahole is used to take almost final vmlinux image (modulo .BTF and
         kallsyms) and generate .BTF section by converting DWARF info into
         BTF. This section is not allocated and not mapped to any segment,
         though, so is not yet accessible from inside kernel at runtime.
      2. objcopy dumps .BTF contents into a binary file and subsequently
         converts the binary file into a linkable object file with automatically
         generated symbols _binary__btf_kernel_bin_start and
         _binary__btf_kernel_bin_end, pointing to start and end, respectively,
         of BTF raw data.
      3. final vmlinux image is generated by linking this object file (and
         kallsyms, if necessary). sysfs_btf.c then creates
         /sys/kernel/btf/kernel file and exposes embedded BTF contents through
         it. This allows, e.g., libbpf and bpftool to access BTF info at
         well-known location, without resorting to searching for vmlinux image
         on disk (location of which is not standardized and vmlinux image
         might not be even available in some scenarios, e.g., inside qemu
         during testing).
      
      An alternative approach using the .incbin assembler directive to embed BTF
      contents directly was attempted but didn't work, because sysfs_btf.o is
      not re-compiled during the link-vmlinux.sh stage. Re-compilation is
      required, though, to update the embedded BTF data (initially empty data is
      embedded, then pahole generates BTF info and we need to regenerate
      sysfs_btf.o with updated contents, but it's too late at that point).
      
      If BTF couldn't be generated due to missing or too old pahole,
      sysfs_btf.c handles that gracefully by detecting that
      _binary__btf_kernel_bin_start (weak symbol) is 0 and not creating
      /sys/kernel/btf at all.
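
      The weak-symbol check can be illustrated in userspace, assuming a
      GCC/Clang toolchain on Linux where an undefined weak symbol resolves to
      address 0. The symbol and function names below are hypothetical stand-ins,
      not the kernel's; this is a sketch of the technique, not sysfs_btf.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical blob boundary symbols. Because they are declared weak,
 * linking succeeds even if no object file defines them; in that case
 * their addresses are 0.
 */
extern char sketch_blob_start[] __attribute__((weak));
extern char sketch_blob_end[] __attribute__((weak));

/* Returns the embedded blob size, or 0 when the blob was never linked in. */
static size_t sketch_blob_size(void)
{
	if (!sketch_blob_start)
		return 0;	/* nothing embedded: skip exposing it */

	assert(sketch_blob_end >= sketch_blob_start);
	return (size_t)(sketch_blob_end - sketch_blob_start);
}
```

      When no object defines the two symbols, sketch_blob_size() returns 0,
      which mirrors the case where the sysfs directory is not created.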
      
      v2->v3:
      - added Documentation/ABI/testing/sysfs-kernel-btf (Greg K-H);
      - created proper kobject (btf_kobj) for btf directory (Greg K-H);
      - undo v2 change of reusing vmlinux, as it causes an extra kallsyms pass
        due to initially missing _binary__btf_kernel_bin_{start/end} symbols;
      
      v1->v2:
      - allow kallsyms stage to re-use vmlinux generated by gen_btf();
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  2. 12 Aug, 2019 1 commit
  3. 09 Aug, 2019 5 commits
  4. 08 Aug, 2019 1 commit
    • tools/bpf: fix core_reloc.c compilation error · b7076592
      Yonghong Song authored
      On my local machine, I have the following compilation errors:
      =====
        In file included from prog_tests/core_reloc.c:3:0:
        ./progs/core_reloc_types.h:517:46: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘fancy_char_ptr_t’
       typedef const char * const volatile restrict fancy_char_ptr_t;
                                                    ^
        ./progs/core_reloc_types.h:527:2: error: unknown type name ‘fancy_char_ptr_t’
          fancy_char_ptr_t d;
          ^
      =====
      
      I am using gcc 4.8.5. Later compilers may no longer emit this error.
      Nevertheless, let us fix the issue. "restrict" can be tested
      without typedef.
      
      Fixes: 9654e2ae ("selftests/bpf: add CO-RE relocs modifiers/typedef tests")
      Cc: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  5. 07 Aug, 2019 19 commits
  6. 06 Aug, 2019 2 commits
  7. 01 Aug, 2019 3 commits
  8. 31 Jul, 2019 3 commits
    • tools: bpftool: add support for reporting the effective cgroup progs · a98bf573
      Jakub Kicinski authored
      Takshak said in the original submission:
      
      With different bpf attach_flags available to attach bpf programs, especially
      BPF_F_ALLOW_OVERRIDE and BPF_F_ALLOW_MULTI, the list of effective
      bpf-programs available to any sub-cgroups really needs to be available for
      easy debugging.
      
      Using the BPF_F_QUERY_EFFECTIVE flag, one can list not only the bpf-programs
      attached to a cgroup but also the ones inherited from its parent cgroup.
      
      So a new option is introduced to use BPF_F_QUERY_EFFECTIVE query flag here
      to list all the effective bpf-programs available for execution at a specified
      cgroup.
      
      Reused modified test program test_cgroup_attach from tools/testing/selftests/bpf:
        # ./test_cgroup_attach
      
      With old bpftool:
      
       # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/
        ID       AttachType      AttachFlags     Name
        271      egress          multi           pkt_cntr_1
        272      egress          multi           pkt_cntr_2
      
      Attaching a new program pkt_cntr_4 in cg2 gives the following:
      
       # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/cg2
        ID       AttachType      AttachFlags     Name
        273      egress          override        pkt_cntr_4
      
      And with new "effective" option it shows all effective programs for cg2:
      
       # bpftool cgroup show /sys/fs/cgroup/cgroup-test-work-dir/cg1/cg2 effective
        ID       AttachType      AttachFlags     Name
        273      egress          override        pkt_cntr_4
        271      egress          override        pkt_cntr_1
        272      egress          override        pkt_cntr_2
      
      Compared to the original submission, use a local flag instead of a global
      option.
      
      We need to clear query_flags on every command, in case batch mode
      wants to use varying settings.
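
      The per-command reset and the duplicate-flag check can be sketched as
      follows. The function and macro names are hypothetical; the macro merely
      stands in for BPF_F_QUERY_EFFECTIVE from the UAPI header, and this is not
      bpftool's actual argument parser.

```c
#include <assert.h>
#include <string.h>

/* Local stand-in for BPF_F_QUERY_EFFECTIVE (an assumption for this sketch). */
#define SKETCH_QUERY_EFFECTIVE (1U << 0)

/*
 * Parses the trailing keywords of one "cgroup show" command.
 * Returns 0 on success, -1 on an unknown or duplicated keyword.
 */
static int parse_show_flags(int argc, const char *const *argv,
			    unsigned int *query_flags)
{
	int i;

	assert(query_flags != NULL);

	/* Reset for every command so batch-mode lines do not leak state. */
	*query_flags = 0;

	for (i = 0; i < argc; i++) {
		if (strcmp(argv[i], "effective") != 0)
			return -1;	/* unknown keyword */
		if (*query_flags & SKETCH_QUERY_EFFECTIVE)
			return -1;	/* duplicated flag is forbidden */
		*query_flags |= SKETCH_QUERY_EFFECTIVE;
	}
	return 0;
}
```

      Keeping the flag local to each parsed command means a batch file can mix
      plain and "effective" queries without one affecting the other.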
      
      v2: (Takshak)
       - forbid duplicated flags;
       - fix cgroup path freeing.
      Signed-off-by: Takshak Chahande <ctakshak@fb.com>
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Reviewed-by: Takshak Chahande <ctakshak@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: fix clearing buffered output between tests/subtests · bf8ff0f8
      Andrii Nakryiko authored
      Clear buffered output once a test or subtest finishes, even if the test was
      successful. Not doing this leads to accumulation of output from previous
      tests, and on the first failed test lots of irrelevant output will be
      dumped, greatly confusing things.
      
      v1->v2: fix Fixes tag, add more context to patch
      
      Fixes: 3a516a0a ("selftests/bpf: add sub-tests support for test_progs")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge branch 'gen-syn-cookie' · 116e7dbe
      Alexei Starovoitov authored
      Petar Penkov says:
      
      ====================
      This patch series introduces a BPF helper function that allows generating SYN
      cookies from BPF. Currently, this helper is enabled at both the TC hook and the
      XDP hook.
      
      The first two patches in the series add/modify several TCP helper functions to
      allow for SKB-less operation, as is the case at the XDP hook.
      
      The third patch introduces the bpf_tcp_gen_syncookie helper function which
      generates a SYN cookie for either XDP or TC programs. The return value of
      this function contains both the MSS value, encoded in the cookie, and the
      cookie itself.
      
      The last three patches sync tools/ and add a test.
      
      Performance evaluation:
      I sent 10Mpps to a fixed port on a host with 2 10G bonded Mellanox 4 NICs from
      random IPv6 source addresses. Without XDP I observed 7.2Mpps (syn-acks) being
      sent out if the IPv6 packets carry 20 bytes of TCP options or 7.6Mpps if they
      carry no options. If I attached a simple program that checks if a packet is
      IPv6/TCP/SYN, looks up the socket, issues a cookie, and sends it back out after
      swapping src/dest, recomputing the checksum, and setting the ACK flag, I
      observed 10Mpps being sent back out.
      
      Changes since v1:
      1/ Added performance numbers to the cover letter
      2/ Patch 2: Refactored a bit to fix compilation issues
      3/ Patch 3: Changed ENOTSUPP to EOPNOTSUPP at Toke's suggestion
      
      Changes since RFC:
      1/ Cookie is returned in host order at Alexei's suggestion
      2/ If cookies are not enabled via a sysctl, the helper function returns
         -ENOENT instead of -EINVAL at Lorenz's suggestion
      3/ Fixed documentation to properly reflect that MSS is 16 bits at
         Lorenz's suggestion
      4/ BPF helper requires TCP length to match ->doff field, rather than to simply
         be no more than 20 bytes at Eric and Alexei's suggestion
      5/ Packet type is looked up from the packet version field, rather than from the
         socket. v4 packets are rejected on v6-only sockets but should work with
         dual stack listeners at Eric's suggestion
      6/ Removed unnecessary `net` argument from helper function in patch 2 at
         Lorenz's suggestion
      7/ Changed test to only pass MSS option so we can convince the verifier that the
         memory access is not out of bounds
      
      Note that 7/ above illustrates that the verifier might need to be extended to
      allow passing a variable tcph->doff to the helper function like below:
      
      __u32 thlen = tcph->doff * 4;
      if (thlen < sizeof(*tcph))
      	return;
      __s64 cookie = bpf_tcp_gen_syncookie(sk, ipv4h, 20, tcph, thlen);
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>