1. 25 Oct, 2022 14 commits
  2. 22 Oct, 2022 4 commits
  3. 21 Oct, 2022 15 commits
  4. 19 Oct, 2022 7 commits
    • Merge branch 'bpf,x64: Use BMI2 for shifts' · 04a8f9d7
      Alexei Starovoitov authored
      Jie Meng says:
      
      ====================
      
      With baseline x64 instruction set, shift count can only be an immediate
      or in %cl. The implicit dependency on %cl makes it necessary to shuffle
      registers around and/or add push/pop operations.
      
      BMI2 provides shift instructions that can use any general register as
      the shift count, saving us instructions and a few bytes in most cases.
      
      Suboptimal codegen when %ecx is source and/or destination is also
      addressed and unnecessary instructions are removed.
      
      test_progs: Summary: 267/1340 PASSED, 25 SKIPPED, 0 FAILED
      test_progs-no_alu32: Summary: 267/1333 PASSED, 26 SKIPPED, 0 FAILED
      test_verifier: Summary: 1367 PASSED, 636 SKIPPED, 0 FAILED (same result
       with or without BMI2)
      test_maps: OK, 0 SKIPPED
      lib/test_bpf:
        test_bpf: Summary: 1026 PASSED, 0 FAILED, [1014/1014 JIT'ed]
        test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
        test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED
      ---
      v4 -> v5:
      - More comments regarding instruction encoding
      v3 -> v4:
      - Fixed a regression when BMI2 isn't available
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: add selftests for lsh, rsh, arsh with reg operand · 8662de23
      Jie Meng authored
      Current tests cover only shifts whose source operand (the shift count)
      is an immediate; add a new test case that covers a register operand.
      Signed-off-by: Jie Meng <jmeng@fb.com>
      Link: https://lore.kernel.org/r/20221007202348.1118830-4-jmeng@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf,x64: use shrx/sarx/shlx when available · 77d8f5d4
      Jie Meng authored
      BMI2 provides 3 shift instructions (shrx, sarx and shlx) that use VEX
      encoding but target general-purpose registers [1]. They accept the shift
      count in any general-purpose register and perform the same as the
      non-BMI2 shift instructions [2].
      
      Instead of shr/sar/shl, which implicitly use %cl (the lowest 8 bits of
      %rcx), emit the more flexible BMI2 alternatives when advantageous; keep
      using the non-BMI2 instructions when the shift count is already in
      BPF_REG_4/%rcx, as the non-BMI2 instructions are shorter.
      
      To summarize, when BMI2 is available:
      -------------------------------------------------
                  |   arbitrary dst
      =================================================
      src == ecx  |   shl dst, cl
      -------------------------------------------------
      src != ecx  |   shlx dst, dst, src
      -------------------------------------------------
      
      And no additional register shuffling is needed.
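      
      The selection rule in the table above can be sketched in C. This is an
      illustration only, not the kernel JIT code: the enum and the function
      name pick_shift_form are hypothetical, and has_bmi2 stands in for
      whatever CPU feature check the JIT performs.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum shift_form {
    SHIFT_LEGACY_CL, /* shl/shr/sar with the count in %cl */
    SHIFT_BMI2       /* shlx/shrx/sarx with the count in any register */
};

/*
 * Prefer the legacy form when the count already lives in %rcx, because
 * its encoding is shorter; otherwise use BMI2 if the CPU has it.
 * Without BMI2 the legacy form is the only option, and the caller still
 * has to move the count into %rcx first.
 */
enum shift_form pick_shift_form(bool has_bmi2, bool src_is_rcx)
{
    if (src_is_rcx)
        return SHIFT_LEGACY_CL;
    return has_bmi2 ? SHIFT_BMI2 : SHIFT_LEGACY_CL;
}
```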
      
      A concrete comparison of non-BMI2 and BMI2 codegen, shifting %rsi by
      %rdi:
      
      Without BMI2:
      
       ef3:   push   %rcx
              51
       ef4:   mov    %rdi,%rcx
              48 89 f9
       ef7:   shl    %cl,%rsi
              48 d3 e6
       efa:   pop    %rcx
              59
      
      With BMI2:
      
       f0b:   shlx   %rdi,%rsi,%rsi
              c4 e2 c1 f7 f6
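      
      The five bytes above follow the three-byte VEX layout. A minimal sketch
      that rebuilds them, assuming register-to-register operands only;
      encode_bmi2_shift and the enum are hypothetical names, and register
      numbers are the usual x86-64 encodings (rsi = 6, rdi = 7).

```c
#include <stddef.h>
#include <stdint.h>

/* Values equal the VEX "pp" field: 1 = 66 prefix, 2 = F3, 3 = F2. */
enum bmi2_shift { BMI2_SHLX = 1, BMI2_SARX = 2, BMI2_SHRX = 3 };

/*
 * Encode "op dst, src, count" (Intel operand order) into buf, which must
 * hold at least 5 bytes. Returns the number of bytes written.
 *   dst   -> ModRM.reg (extended by VEX.R)
 *   src   -> ModRM.rm  (extended by VEX.B)
 *   count -> VEX.vvvv  (stored inverted)
 */
size_t encode_bmi2_shift(uint8_t *buf, enum bmi2_shift op,
                         unsigned dst, unsigned src, unsigned count,
                         int is64)
{
    buf[0] = 0xC4; /* three-byte VEX prefix */
    /* inverted R, inverted X (unused, so 1), inverted B, map 0F38 */
    buf[1] = (uint8_t)((((~dst >> 3) & 1u) << 7) | (1u << 6) |
                       (((~src >> 3) & 1u) << 5) | 0x02u);
    /* W, inverted vvvv, L = 0, pp */
    buf[2] = (uint8_t)((is64 ? 0x80u : 0u) |
                       ((~count & 0xFu) << 3) | (unsigned)op);
    buf[3] = 0xF7; /* opcode shared by shlx, sarx and shrx */
    /* mod = 11 (register direct), reg = dst, rm = src */
    buf[4] = (uint8_t)(0xC0u | ((dst & 7u) << 3) | (src & 7u));
    return 5;
}
```

      Calling encode_bmi2_shift(buf, BMI2_SHLX, 6, 6, 7, 1) corresponds to
      shlx %rdi,%rsi,%rsi in the AT&T syntax used in the listing.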
      
      [1] https://en.wikipedia.org/wiki/X86_Bit_manipulation_instruction_set
      [2] https://www.agner.org/optimize/instruction_tables.pdf
      
      Signed-off-by: Jie Meng <jmeng@fb.com>
      Link: https://lore.kernel.org/r/20221007202348.1118830-3-jmeng@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf,x64: avoid unnecessary instructions when shift dest is ecx · 81b35e7c
      Jie Meng authored
      x64 JIT produces redundant instructions when a shift operation's
      destination register is BPF_REG_4/ecx and this patch removes them.
      
      Specifically, when dest reg is BPF_REG_4 but the src isn't, we
      needn't push and pop ecx around shift only to get it overwritten
      by r11 immediately afterwards.
      
      In the rare case when both dest and src registers are BPF_REG_4,
      a single shift instruction is sufficient and we don't need the
      two MOV instructions around the shift.
      
      To summarize using shift left as an example, without patch:
      -------------------------------------------------
                  |   dst == ecx     |    dst != ecx
      =================================================
      src == ecx  |   mov r11, ecx   |    shl dst, cl
                  |   shl r11, cl    |
                  |   mov ecx, r11   |
      -------------------------------------------------
      src != ecx  |   mov r11, ecx   |    push ecx
                  |   push ecx       |    mov ecx, src
                  |   mov ecx, src   |    shl dst, cl
                  |   shl r11, cl    |    pop ecx
                  |   pop ecx        |
                  |   mov ecx, r11   |
      -------------------------------------------------
      
      With patch:
      -------------------------------------------------
                  |   dst == ecx     |    dst != ecx
      =================================================
      src == ecx  |   shl ecx, cl    |    shl dst, cl
      -------------------------------------------------
      src != ecx  |   mov r11, ecx   |    push ecx
                  |   mov ecx, src   |    mov ecx, src
                  |   shl r11, cl    |    shl dst, cl
                  |   mov ecx, r11   |    pop ecx
      -------------------------------------------------
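      
      To see why the shortened sequences in the "with patch" table stay
      correct, the src != ecx row can be modeled on a toy register file. This
      is a simulation sketch, not JIT code: struct regs and both function
      names are hypothetical, and a 64-bit shift is assumed, so the count is
      masked to 6 bits as the hardware does.

```c
#include <stdint.h>

/* Toy register file; dst and src stand for any registers other than rcx. */
struct regs {
    uint64_t rcx, r11, dst, src;
};

/* dst == rcx, src != rcx: no push/pop needed, r11 is used as scratch. */
void shl_dst_is_rcx(struct regs *r)
{
    r->r11 = r->rcx;          /* mov r11, rcx */
    r->rcx = r->src;          /* mov rcx, src */
    r->r11 <<= r->rcx & 63;   /* shl r11, cl  */
    r->rcx = r->r11;          /* mov rcx, r11 */
}

/* dst != rcx, src != rcx: rcx must survive, so it is saved and restored. */
void shl_dst_not_rcx(struct regs *r)
{
    uint64_t saved = r->rcx;  /* push rcx     */
    r->rcx = r->src;          /* mov rcx, src */
    r->dst <<= r->rcx & 63;   /* shl dst, cl  */
    r->rcx = saved;           /* pop rcx      */
}
```

      In the first case the old value of rcx is the shift operand, so saving
      and restoring it around the shift would only be undone by the final
      mov; that is the push/pop pair the patch drops.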
      Signed-off-by: Jie Meng <jmeng@fb.com>
      Link: https://lore.kernel.org/r/20221007202348.1118830-2-jmeng@fb.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Merge branch 'libbpf: support non-mmap()'able data sections' · 7d8d5355
      Alexei Starovoitov authored
      Andrii Nakryiko says:
      
      ====================
      
      Make libbpf more conservative in using BPF_F_MMAPABLE flag with internal BPF
      array maps that are backing global data sections. See patch #2 for full
      description and justification.
      
      Changes in this patch set support having bpf_spinlock, kptr, rb_tree nodes and
      other "special" variables as global variables. Combining this with libbpf's
      existing support for multiple custom .data.* sections allows BPF programs to
      utilize multiple spinlock/rbtree_node/kptr variables in a pretty natural way
      by just putting all such variables into separate data sections (and thus ARRAY
      maps).
      
      v1->v2:
        - address Stanislav's feedback, adds acks.
      ====================
      Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • libbpf: add non-mmapable data section selftest · 2f968e9f
      Andrii Nakryiko authored
      Add non-mmapable data section to test_skeleton selftest and make sure it
      really isn't mmapable by trying to mmap() it anyway.
      
      Also make sure that libbpf doesn't report BPF_F_MMAPABLE flag to users.
      
      Additionally, some manual testing was performed to confirm that this
      feature works as intended.
      
      Looking at the created map through bpftool shows that the flags passed to
      the kernel are indeed zero:
      
        $ bpftool map show
        ...
        1782: array  name .data.non_mmapa  flags 0x0
                key 4B  value 16B  max_entries 1  memlock 4096B
                btf_id 1169
                pids test_progs(8311)
        ...
      
      Checking BTF uploaded to kernel for this map shows that zero_key and
      zero_value are indeed marked as static, even though zero_key was
      originally a global (but STV_HIDDEN) variable:
      
        $ bpftool btf dump id 1169
        ...
        [51] VAR 'zero_key' type_id=2, linkage=static
        [52] VAR 'zero_value' type_id=7, linkage=static
        ...
        [62] DATASEC '.data.non_mmapable' size=16 vlen=2
                type_id=51 offset=0 size=4 (VAR 'zero_key')
                type_id=52 offset=4 size=12 (VAR 'zero_value')
        ...
      
      And original BTF does have zero_key marked as linkage=global:
      
        $ bpftool btf dump file test_skeleton.bpf.linked3.o
        ...
        [51] VAR 'zero_key' type_id=2, linkage=global
        [52] VAR 'zero_value' type_id=7, linkage=static
        ...
        [62] DATASEC '.data.non_mmapable' size=16 vlen=2
                type_id=51 offset=0 size=4 (VAR 'zero_key')
                type_id=52 offset=4 size=12 (VAR 'zero_value')
      
      Bpftool didn't require any changes at all because it already checks
      whether an internal map is mmapable, but just to double-check the
      generated skeleton, we see that .data.non_mmapable neither sets the
      mmaped pointer nor has a corresponding field in the skeleton:
      
        $ grep non_mmapable test_skeleton.skel.h
                        struct bpf_map *data_non_mmapable;
                s->maps[7].name = ".data.non_mmapable";
                s->maps[7].map = &obj->maps.data_non_mmapable;
      
      But .data.read_mostly has all of those things:
      
        $ grep read_mostly test_skeleton.skel.h
                        struct bpf_map *data_read_mostly;
                struct test_skeleton__data_read_mostly {
                        int read_mostly_var;
                } *data_read_mostly;
                s->maps[6].name = ".data.read_mostly";
                s->maps[6].map = &obj->maps.data_read_mostly;
                s->maps[6].mmaped = (void **)&obj->data_read_mostly;
                _Static_assert(sizeof(s->data_read_mostly->read_mostly_var) == 4, "unexpected size of 'read_mostly_var'");
      Acked-by: Stanislav Fomichev <sdf@google.com>
      Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221019002816.359650-4-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • libbpf: only add BPF_F_MMAPABLE flag for data maps with global vars · 4fcac46c
      Andrii Nakryiko authored
      Teach libbpf to not add BPF_F_MMAPABLE flag unnecessarily for ARRAY maps
      that are backing data sections, if such data sections don't expose any
      variables to user-space. Exposed variables are those that have
      STB_GLOBAL or STB_WEAK ELF binding and correspond to BTF VAR's
      BTF_VAR_GLOBAL_ALLOCATED linkage.
      
      The overall idea is that if some data section doesn't have any variable that
      is exposed through BPF skeleton, then there is no reason to make such
      BPF array mmapable. Making BPF array mmapable is not a free no-op
      action, because BPF verifier doesn't allow users to put special objects
      (such as BPF spin locks, RB tree nodes, linked list nodes, kptrs, etc;
      anything that has a sensitive internal state that should not be modified
      arbitrarily from user space) into mmapable arrays, as there is no way to
      prevent user space from corrupting such sensitive state through direct
      memory access through memory-mapped region.
      
      By making sure that libbpf doesn't add BPF_F_MMAPABLE flag to BPF array
      maps corresponding to data sections that only have static variables
      (which are not supposed to be visible to user space according to libbpf
      and BPF skeleton rules), users now can have spinlocks, kptrs, etc in
      either default .bss/.data sections or custom .data.* sections (assuming
      there are no global variables in such sections).
      
      The only possible hiccup with this approach is the need to use global
      variables during BPF static linking, even if they are not intended to be
      shared with user space through BPF skeleton. To allow such scenarios,
      extend libbpf's STV_HIDDEN ELF visibility attribute handling to
      variables. Libbpf already treats global hidden BPF subprograms as
      static subprograms and adjusts BTF accordingly, so that the BPF verifier
      verifies such subprograms as static ones, preserving the entire BPF
      verifier state between subprog calls. This patch teaches libbpf to treat
      global hidden variables as static ones and adjust BTF information
      accordingly as well. This allows sharing variables between multiple
      object files during static linking while still keeping them internal to
      the BPF program and not exposed through BPF skeleton.
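      
      The resulting rule can be sketched as a small predicate. This is not
      libbpf's implementation: the types and names below are hypothetical, and
      the enum values merely mirror the ELF STB_* and STV_* constants.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirrors of ELF symbol binding and visibility values. */
enum sym_bind { BIND_LOCAL = 0, BIND_GLOBAL = 1, BIND_WEAK = 2 };
enum sym_vis  { VIS_DEFAULT = 0, VIS_HIDDEN = 2 };

struct data_var {
    const char *name;
    enum sym_bind bind;
    enum sym_vis vis;
};

/* Exposed to user space: global or weak binding, and not hidden. */
static bool var_is_exposed(const struct data_var *v)
{
    if (v->vis == VIS_HIDDEN)
        return false; /* hidden globals are treated as static */
    return v->bind == BIND_GLOBAL || v->bind == BIND_WEAK;
}

/* A data section would get BPF_F_MMAPABLE only if some variable is exposed. */
bool section_wants_mmapable(const struct data_var *vars, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (var_is_exposed(&vars[i]))
            return true;
    return false;
}
```

      Under this sketch, a section holding only a hidden global and a static
      variable is reported as not needing BPF_F_MMAPABLE, while a section with
      one ordinary global variable is.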
      
      Note that if the user has some advanced scenario where they absolutely
      need the BPF_F_MMAPABLE flag on a .data/.bss/.rodata BPF array map despite
      only having static variables, they can still achieve this by forcing it
      through the explicit bpf_map__set_map_flags() API.
      Acked-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20221019002816.359650-3-andrii@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>