1. 05 Jan, 2023 4 commits
    • Siddharth Vadapalli's avatar
      net: ethernet: ti: am65-cpsw: Add support for SERDES configuration · dab2b265
      Siddharth Vadapalli authored
      Use PHY framework APIs to initialize the SERDES PHY connected to CPSW MAC.
      
      Define the functions am65_cpsw_disable_phy(), am65_cpsw_enable_phy(),
      am65_cpsw_disable_serdes_phy() and am65_cpsw_enable_serdes_phy().
      
      Add new member "serdes_phy" to struct "am65_cpsw_slave_data" to store the
      SERDES PHY for each port, if it exists. Use it later while disabling the
      SERDES PHY for each port.
      
      Power on and initialize the SerDes PHY in am65_cpsw_nuss_init_slave_ports()
      by invoking am65_cpsw_enable_serdes_phy().
      
      Power off the SerDes PHY in am65_cpsw_nuss_remove() by invoking
      am65_cpsw_disable_serdes_phy().
      Signed-off-by: default avatarSiddharth Vadapalli <s-vadapalli@ti.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      dab2b265
    • Siddharth Vadapalli's avatar
      net: ethernet: ti: am65-cpsw: Enable QSGMII mode for J721e CPSW9G · 944131fa
      Siddharth Vadapalli authored
      CPSW9G in J721e supports additional modes like QSGMII.
      Add new compatible for J721e in am65-cpsw driver.
      Signed-off-by: default avatarSiddharth Vadapalli <s-vadapalli@ti.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      944131fa
    • Siddharth Vadapalli's avatar
      dt-bindings: net: ti: k3-am654-cpsw-nuss: Add J721e CPSW9G support · c85b53e3
      Siddharth Vadapalli authored
      Update bindings for TI K3 J721e SoC which contains 9 ports (8 external
      ports) CPSW9G module and add compatible for it.
      
      Changes made:
          - Add new compatible ti,j721e-cpswxg-nuss for CPSW9G.
          - Extend pattern properties for new compatible.
          - Change maximum number of CPSW ports to 8 for new compatible.
      Signed-off-by: default avatarSiddharth Vadapalli <s-vadapalli@ti.com>
      Reviewed-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      c85b53e3
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · d75858ef
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      bpf-next 2023-01-04
      
      We've added 45 non-merge commits during the last 21 day(s) which contain
      a total of 50 files changed, 1454 insertions(+), 375 deletions(-).
      
      The main changes are:
      
      1) Fixes, improvements and refactoring of parts of BPF verifier's
         state equivalence checks, from Andrii Nakryiko.
      
      2) Fix a few corner cases in libbpf's BTF-to-C converter in particular
         around padding handling and enums, also from Andrii Nakryiko.
      
      3) Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better
        support decap on GRE tunnel devices not operating in collect metadata,
        from Christian Ehrig.
      
      4) Improve x86 JIT's codegen for PROBE_MEM runtime error checks,
         from Dave Marchevsky.
      
      5) Remove the need for trace_printk_lock for bpf_trace_printk
         and bpf_trace_vprintk helpers, from Jiri Olsa.
      
      6) Add proper documentation for BPF_MAP_TYPE_SOCK{MAP,HASH} maps,
         from Maryam Tahhan.
      
      7) Improvements in libbpf's btf_parse_elf error handling, from Changbin Du.
      
      8) Bigger batch of improvements to BPF tracing code samples,
         from Daniel T. Lee.
      
      9) Add LoongArch support to libbpf's bpf_tracing helper header,
         from Hengqi Chen.
      
      10) Fix a libbpf compiler warning in perf_event_open_probe on arm32,
          from Khem Raj.
      
      11) Optimize bpf_local_storage_elem by removing 56 bytes of padding,
          from Martin KaFai Lau.
      
      12) Use pkg-config to locate libelf for resolve_btfids build,
          from Shen Jiamin.
      
      13) Various libbpf improvements around API documentation and errno
          handling, from Xin Liu.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
        libbpf: Return -ENODATA for missing btf section
        libbpf: Add LoongArch support to bpf_tracing.h
        libbpf: Restore errno after pr_warn.
        libbpf: Added the description of some API functions
        libbpf: Fix invalid return address register in s390
        samples/bpf: Use BPF_KSYSCALL macro in syscall tracing programs
        samples/bpf: Fix tracex2 by using BPF_KSYSCALL macro
        samples/bpf: Change _kern suffix to .bpf with syscall tracing program
        samples/bpf: Use vmlinux.h instead of implicit headers in syscall tracing program
        samples/bpf: Use kyscall instead of kprobe in syscall tracing program
        bpf: rename list_head -> graph_root in field info types
        libbpf: fix errno is overwritten after being closed.
        bpf: fix regs_exact() logic in regsafe() to remap IDs correctly
        bpf: perform byte-by-byte comparison only when necessary in regsafe()
        bpf: reject non-exact register type matches in regsafe()
        bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule
        bpf: reorganize struct bpf_reg_state fields
        bpf: teach refsafe() to take into account ID remapping
        bpf: Remove unused field initialization in bpf's ctl_table
        selftests/bpf: Add jit probe_mem corner case tests to s390x denylist
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20230105000926.31350-1-daniel@iogearbox.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d75858ef
  2. 04 Jan, 2023 1 commit
  3. 03 Jan, 2023 5 commits
  4. 30 Dec, 2022 1 commit
  5. 29 Dec, 2022 9 commits
  6. 28 Dec, 2022 7 commits
    • Xin Liu's avatar
      libbpf: fix errno is overwritten after being closed. · 07453245
      Xin Liu authored
      In the ensure_good_fd function, if the fcntl function succeeds but
      the close function fails, ensure_good_fd returns a normal fd and
      sets errno, which may cause users to misunderstand. The close
      failure is not a serious problem, and the correct FD has been
      handed over to the upper-layer application. Let's restore errno here.
      Signed-off-by: default avatarXin Liu <liuxin350@huawei.com>
      Link: https://lore.kernel.org/r/20221223133618.10323-1-liuxin350@huawei.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      07453245
    • Andrii Nakryiko's avatar
      bpf: fix regs_exact() logic in regsafe() to remap IDs correctly · 4633a006
      Andrii Nakryiko authored
      Comparing IDs exactly between two separate states is not just
      suboptimal, but also incorrect in some cases. So update regs_exact()
      check to do byte-by-byte memcmp() only up to id/ref_obj_id. For id and
      ref_obj_id perform proper check_ids() checks, taking into account idmap.
      
      This change makes more states equivalent improving insns and states
      stats across a bunch of selftest BPF programs:
      
      File                                         Program                           Insns (A)  Insns (B)  Insns   (DIFF)  States (A)  States (B)  States (DIFF)
      -------------------------------------------  --------------------------------  ---------  ---------  --------------  ----------  ----------  -------------
      cgrp_kfunc_success.bpf.linked1.o             test_cgrp_get_release                   141        137     -4 (-2.84%)          13          13    +0 (+0.00%)
      cgrp_kfunc_success.bpf.linked1.o             test_cgrp_xchg_release                  142        139     -3 (-2.11%)          14          13    -1 (-7.14%)
      connect6_prog.bpf.linked1.o                  connect_v6_prog                         139        102   -37 (-26.62%)           9           6   -3 (-33.33%)
      ima.bpf.linked1.o                            bprm_creds_for_exec                      68         61    -7 (-10.29%)           6           5   -1 (-16.67%)
      linked_list.bpf.linked1.o                    global_list_in_list                     569        499   -70 (-12.30%)          60          52   -8 (-13.33%)
      linked_list.bpf.linked1.o                    global_list_push_pop                    167        150   -17 (-10.18%)          18          16   -2 (-11.11%)
      linked_list.bpf.linked1.o                    global_list_push_pop_multiple           881        815    -66 (-7.49%)          74          63  -11 (-14.86%)
      linked_list.bpf.linked1.o                    inner_map_list_in_list                  579        534    -45 (-7.77%)          61          55    -6 (-9.84%)
      linked_list.bpf.linked1.o                    inner_map_list_push_pop                 190        181     -9 (-4.74%)          19          18    -1 (-5.26%)
      linked_list.bpf.linked1.o                    inner_map_list_push_pop_multiple        916        850    -66 (-7.21%)          75          64  -11 (-14.67%)
      linked_list.bpf.linked1.o                    map_list_in_list                        588        525   -63 (-10.71%)          62          55   -7 (-11.29%)
      linked_list.bpf.linked1.o                    map_list_push_pop                       183        174     -9 (-4.92%)          18          17    -1 (-5.56%)
      linked_list.bpf.linked1.o                    map_list_push_pop_multiple              909        843    -66 (-7.26%)          75          64  -11 (-14.67%)
      map_kptr.bpf.linked1.o                       test_map_kptr                           264        256     -8 (-3.03%)          26          26    +0 (+0.00%)
      map_kptr.bpf.linked1.o                       test_map_kptr_ref                        95         91     -4 (-4.21%)           9           8   -1 (-11.11%)
      task_kfunc_success.bpf.linked1.o             test_task_xchg_release                  139        136     -3 (-2.16%)          14          13    -1 (-7.14%)
      test_bpf_nf.bpf.linked1.o                    nf_skb_ct_test                          815        509  -306 (-37.55%)          57          30  -27 (-47.37%)
      test_bpf_nf.bpf.linked1.o                    nf_xdp_ct_test                          815        509  -306 (-37.55%)          57          30  -27 (-47.37%)
      test_cls_redirect.bpf.linked1.o              cls_redirect                          78925      78390   -535 (-0.68%)        4782        4704   -78 (-1.63%)
      test_cls_redirect_subprogs.bpf.linked1.o     cls_redirect                          64901      63897  -1004 (-1.55%)        4612        4470  -142 (-3.08%)
      test_sk_lookup.bpf.linked1.o                 access_ctx_sk                           181         95   -86 (-47.51%)          19          10   -9 (-47.37%)
      test_sk_lookup.bpf.linked1.o                 ctx_narrow_access                       447        437    -10 (-2.24%)          38          37    -1 (-2.63%)
      test_sk_lookup_kern.bpf.linked1.o            sk_lookup_success                       148        133   -15 (-10.14%)          14          12   -2 (-14.29%)
      test_tcp_check_syncookie_kern.bpf.linked1.o  check_syncookie_clsact                  304        300     -4 (-1.32%)          23          22    -1 (-4.35%)
      test_tcp_check_syncookie_kern.bpf.linked1.o  check_syncookie_xdp                     304        300     -4 (-1.32%)          23          22    -1 (-4.35%)
      test_verify_pkcs7_sig.bpf.linked1.o          bpf                                      87         76   -11 (-12.64%)           7           6   -1 (-14.29%)
      -------------------------------------------  --------------------------------  ---------  ---------  --------------  ----------  ----------  -------------
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-7-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4633a006
    • Andrii Nakryiko's avatar
      bpf: perform byte-by-byte comparison only when necessary in regsafe() · 4a95c85c
      Andrii Nakryiko authored
      Extract byte-by-byte comparison of bpf_reg_state in regsafe() into
      a helper function, which makes it more convenient to use it "on demand"
      only for registers that benefit from such checks, instead of doing it
      all the time, even if result of such comparison is ignored.
      
      Also, remove WARN_ON_ONCE(1)+return false dead code. There is no risk of
      missing some case as compiler will warn about non-void function not
      returning value in some branches (and that under assumption that default
      case is removed in the future).
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-6-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4a95c85c
    • Andrii Nakryiko's avatar
      bpf: reject non-exact register type matches in regsafe() · 910f6999
      Andrii Nakryiko authored
      Generalize the (somewhat implicit) rule of regsafe(), which states that
      if register types in old and current states do not match *exactly*, they
      can't be safely considered equivalent.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-5-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      910f6999
    • Andrii Nakryiko's avatar
      bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule · 7f4ce97c
      Andrii Nakryiko authored
      Make generic check to prevent XXX_OR_NULL and XXX register types to be
      intermixed. While technically in some situations it could be safe, it's
      impossible to enforce due to the loss of an ID when converting
      XXX_OR_NULL to its non-NULL variant. So prevent this in general, not
      just for PTR_TO_MAP_KEY and PTR_TO_MAP_VALUE.
      
      PTR_TO_MAP_KEY_OR_NULL and PTR_TO_MAP_VALUE_OR_NULL checks, which were
      previously special-cased, are simplified to generic check that takes
      into account range_within() and tnum_in(). This is correct as BPF
      verifier doesn't allow arithmetic on XXX_OR_NULL register types, so
      var_off and ranges should stay zero. But even if in the future this
      restriction is lifted, it's even more important to enforce that var_off
      and ranges are compatible, otherwise it's possible to construct case
      where this can be exploited to bypass verifier's memory range safety
      checks.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-4-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      7f4ce97c
    • Andrii Nakryiko's avatar
      bpf: reorganize struct bpf_reg_state fields · a73bf9f2
      Andrii Nakryiko authored
      Move id and ref_obj_id fields after scalar data section (var_off and
      ranges). This is necessary to simplify next patch which will change
      regsafe()'s logic to be safer, as it makes the contents that has to be
      an exact match (type-specific parts, off, type, and var_off+ranges)
      a single sequential block of memory, while id and ref_obj_id should
      always be remapped and thus can't be memcp()'ed.
      
      There are few places that assume that var_off is after id/ref_obj_id to
      clear out id/ref_obj_id with the single memset(0). These are changed to
      explicitly zero-out id/ref_obj_id fields. Other places are adjusted to
      preserve exact byte-by-byte comparison behavior.
      
      No functional changes.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-3-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      a73bf9f2
    • Andrii Nakryiko's avatar
      bpf: teach refsafe() to take into account ID remapping · e8f55fcf
      Andrii Nakryiko authored
      states_equal() check performs ID mapping between old and new states to
      establish a 1-to-1 correspondence between IDs, even if their absolute
      numberic values across two equivalent states differ. This is important
      both for correctness and to avoid unnecessary work when two states are
      equivalent.
      
      With recent changes we partially fixed this logic by maintaining ID map
      across all function frames. This patch also makes refsafe() check take
      into account (and maintain) ID map, making states_equal() behavior more
      optimal and correct.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221223054921.958283-2-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      e8f55fcf
  7. 22 Dec, 2022 2 commits
  8. 21 Dec, 2022 11 commits
    • Dave Marchevsky's avatar
      selftests/bpf: Add verifier test exercising jit PROBE_MEM logic · 59fe41b5
      Dave Marchevsky authored
      This patch adds a test exercising logic that was fixed / improved in
      the previous patch in the series, as well as general sanity checking for
      jit's PROBE_MEM logic which should've been unaffected by the previous
      patch.
      
      The added verifier test does the following:
      
        * Acquire a referenced kptr to struct prog_test_ref_kfunc using
          existing net/bpf/test_run.c kfunc
          * Helper returns ptr to a specific prog_test_ref_kfunc whose first
            two fields - both ints - have been prepopulated w/ vals 42 and
            108, respectively
        * kptr_xchg the acquired ptr into an arraymap
        * Do a direct map_value load of the just-added ptr
          * Goal of all this setup is to get an unreferenced kptr pointing to
            struct with ints of known value, which is the result of this step
        * Using unreferenced kptr obtained in previous step, do loads of
          prog_test_ref_kfunc.a (offset 0) and .b (offset 4)
        * Then incr the kptr by 8 and load prog_test_ref_kfunc.a again (this
          time at offset -8)
        * Add all the loaded ints together and return
      
      Before the PROBE_MEM fixes in previous patch, the loads at offset 0 and
      4 would succeed, while the load at offset -8 would incorrectly fail
      runtime check emitted by the JIT and 0 out dst reg as a result. This
      confirmed by retval of 150 for this test before previous patch - since
      second .a read is 0'd out - and a retval of 192 with the fixed logic.
      
      The test exercises the two optimizations to fixed logic added in last
      patch as well:
      
        * First load, with insn "r8 = *(u32 *)(r9 + 0)" exercises "insn->off
          is 0, no need to add / sub from src_reg" optimization
        * Third load, with insn "r9 = *(u32 *)(r9 - 8)" exercises "src_reg ==
          dst_reg, no need to restore src_reg after load" optimization
      Signed-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20221216214319.3408356-2-davemarchevsky@fb.com
      59fe41b5
    • Dave Marchevsky's avatar
      bpf, x86: Improve PROBE_MEM runtime load check · 90156f4b
      Dave Marchevsky authored
      This patch rewrites the runtime PROBE_MEM check insns emitted by the BPF
      JIT in order to ensure load safety. The changes in the patch fix two
      issues with the previous logic and more generally improve size of
      emitted code. Paragraphs between this one and "FIX 1" below explain the
      purpose of the runtime check and examine the current implementation.
      
      When a load is marked PROBE_MEM - e.g. due to PTR_UNTRUSTED access - the
      address being loaded from is not necessarily valid. The BPF jit sets up
      exception handlers for each such load which catch page faults and 0 out
      the destination register.
      
      Arbitrary register-relative loads can escape this exception handling
      mechanism. Specifically, a load like dst_reg = *(src_reg + off) will not
      trigger BPF exception handling if (src_reg + off) is outside of kernel
      address space, resulting in an uncaught page fault. A concrete example
      of such behavior is a program like:
      
        struct result {
          char space[40];
          long a;
        };
      
        /* if err, returns ERR_PTR(-EINVAL) */
        struct result *ptr = get_ptr_maybe_err();
        long x = ptr->a;
      
      If get_ptr_maybe_err returns ERR_PTR(-EINVAL) and the result isn't
      checked for err, 'result' will be (u64)-EINVAL, a number close to
      U64_MAX. The ptr->a load will be > U64_MAX and will wrap over to a small
      positive u64, which will be in userspace and thus not covered by BPF
      exception handling mechanism.
      
      In order to prevent such loads from occurring, the BPF jit emits some
      instructions which do runtime checking of (src_reg + off) and skip the
      actual load if it's out of range. As an example, here are instructions
      emitted for a %rdi = *(%rdi + 0x10) PROBE_MEM load:
      
        72:   movabs $0x800000000010,%r11 --|
        7c:   cmp    %r11,%rdi              |- 72 - 7f: Check 1
        7f:    jb    0x000000000000008d   --|
        81:   mov    %rdi,%r11             -----|
        84:   add    $0x0000000000000010,%r11   |- 81-8b: Check 2
        8b:   jnc    0x0000000000000091    -----|
        8d:   xor    %edi,%edi             ---- 0 out dest
        8f:   jmp    0x0000000000000095
        91:   mov    0x10(%rdi),%rdi       ---- Actual load
        95:
      
      The JIT considers kernel address space to start at MAX_TASK_SIZE +
      PAGE_SIZE. Determining whether a load will be outside of kernel address
      space should be a simple check:
      
        (src_reg + off) >= MAX_TASK_SIZE + PAGE_SIZE
      
      But because there is only one spare register when the checking logic is
      emitted, this logic is split into two checks:
      
        Check 1: src_reg >= (MAX_TASK_SIZE + PAGE_SIZE - off)
        Check 2: src_reg + off doesn't wrap over U64_MAX and result in small pos u64
      
      Emitted insns implementing Checks 1 and 2 are annotated in the above
      example. Check 1 can be done with a single spare register since the
      source reg by definition is the left-hand-side of the inequality.
      Since adding 'off' to both sides of Check 1's inequality results in the
      original inequality we want, it's equivalent to testing that inequality.
      Except in the case where src_reg + off wraps past U64_MAX, which is why
      Check 2 needs to actually add src_reg + off if Check 1 passes - again
      using the single spare reg.
      
      FIX 1: The Check 1 inequality listed above is not what current code is
      doing. Current code is a bit more pessimistic, instead checking:
      
        src_reg >= (MAX_TASK_SIZE + PAGE_SIZE + abs(off))
      
      The 0x800000000010 in above example is from this current check. If Check
      1 was corrected to use the correct right-hand-side, the value would be
      0x7ffffffffff0. This patch changes the checking logic more broadly (FIX
      2 below will elaborate), fixing this issue as a side-effect of the
      rewrite. Regardless, it's important to understand why Check 1 should've
      been doing MAX_TASK_SIZE + PAGE_SIZE - off before proceeding.
      
      FIX 2: Current code relies on a 'jnc' to determine whether src_reg + off
      addition wrapped over. For negative offsets this logic is incorrect.
      Consider Check 2 insns emitted when off = -0x10:
      
        81:   mov    %rdi,%r11
        84:   add    0xfffffffffffffff0,%r11
        8b:   jnc    0x0000000000000091
      
      2's complement representation of -0x10 is a large positive u64. Any
      value of src_reg that passes Check 1 will result in carry flag being set
      after (src_reg + off) addition. So a load with any negative offset will
      always fail Check 2 at runtime and never do the actual load. This patch
      fixes the negative offset issue by rewriting both checks in order to not
      rely on carry flag.
      
      The rewrite takes advantage of the fact that, while we only have one
      scratch reg to hold arbitrary values, we know the offset at JIT time.
      This we can use src_reg as a temporary scratch reg to hold src_reg +
      offset since we can return it to its original value by later subtracting
      offset. As a result we can directly check the original inequality we
      care about:
      
        (src_reg + off) >= MAX_TASK_SIZE + PAGE_SIZE
      
      For a load like %rdi = *(%rsi + -0x10), this results in emitted code:
      
        43:   movabs $0x800000000000,%r11
        4d:   add    $0xfffffffffffffff0,%rsi --- src_reg += off
        54:   cmp    %r11,%rsi                --- Check original inequality
        57:   jae    0x000000000000005d
        59:   xor    %edi,%edi
        5b:   jmp    0x0000000000000061
        5d:   mov    0x0(%rdi),%rsi           --- Actual Load
        61:   sub    $0xfffffffffffffff0,%rsi --- src_reg -= off
      
      Note that the actual load is always done with offset 0, since previous
      insns have already done src_reg += off. Regardless of whether the new
      check succeeds or fails, insn 61 is always executed, returning src_reg
      to its original value.
      
      Because the goal of these checks is to ensure that loaded-from address
      will be protected by BPF exception handler, the new check can safely
      ignore any wrapover from insn 4d. If such wrapped-over address passes
      insn 54 + 57's cmp-and-jmp it will have such protection so the load can
      proceed.
      
      IMPROVEMENTS: The above improved logic is 8 insns vs original logic's 9,
      and has 1 fewer jmp. The number of checking insns can be further
      improved in common scenarios:
      
      If src_reg == dst_reg, the actual load insn will clobber src_reg, so
      there's no original src_reg state for the sub insn immediately following
      the load to restore, so it can be omitted. In fact, it must be omitted
      since it would incorrectly subtract from the result of the load if it
      wasn't. So for src_reg == dst_reg, JIT emits these insns:
      
        3c:   movabs $0x800000000000,%r11
        46:   add    $0xfffffffffffffff0,%rdi
        4d:   cmp    %r11,%rdi
        50:   jae    0x0000000000000056
        52:   xor    %edi,%edi
        54:   jmp    0x000000000000005a
        56:   mov    0x0(%rdi),%rdi
        5a:
      
      The only difference from larger example being the omitted sub, which
      would've been insn 5a in this example.
      
      If offset == 0, we can similarly omit the sub as in previous case, since
      there's nothing added to subtract. For the same reason we can omit the
      addition as well, resulting in JIT emitting these insns:
      
        46:   movabs $0x800000000000,%r11
        4d:   cmp    %r11,%rdi
        50:   jae    0x0000000000000056
        52:   xor    %edi,%edi
        54:   jmp    0x000000000000005a
        56:   mov    0x0(%rdi),%rdi
        5a:
      
      Although the above example also has src_reg == dst_reg, the same
      offset == 0 optimization is valid to apply if src_reg != dst_reg.
      
      To summarize the improvements in emitted insn count for the
      check-and-load:
      
      BEFORE:                8 check insns, 3 jmps
      AFTER (general case):  7 check insns, 2 jmps (12.5% fewer insn, 33% jmp)
      AFTER (src == dst):    6 check insns, 2 jmps (25% fewer insn)
      AFTER (offset == 0):   5 check insns, 2 jmps (37.5% fewer insn)
      
      (Above counts don't include the 1 load insn, just checking around it)
      
      Based on BPF bytecode + JITted x86 insn I saw while experimenting with
      these improvements, I expect the src_reg == dst_reg case to occur most
      often, followed by offset == 0, then the general case.
      Signed-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20221216214319.3408356-1-davemarchevsky@fb.com
      90156f4b
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c183e6c3
      Jakub Kicinski authored
      No conflicts.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c183e6c3
    • Andrii Nakryiko's avatar
      libbpf: start v1.2 development cycle · 4ec38eda
      Andrii Nakryiko authored
      Bump current version for new development cycle to v1.2.
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarStanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/20221221180049.853365-1-andrii@kernel.orgSigned-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      4ec38eda
    • Linus Torvalds's avatar
      Merge tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 609d3bc6
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bpf, netfilter and can.
      
        Current release - regressions:
      
         - bpf: synchronize dispatcher update with bpf_dispatcher_xdp_func
      
         - rxrpc:
            - fix security setting propagation
            - fix null-deref in rxrpc_unuse_local()
            - fix switched parameters in peer tracing
      
        Current release - new code bugs:
      
         - rxrpc:
            - fix I/O thread startup getting skipped
            - fix locking issues in rxrpc_put_peer_locked()
            - fix I/O thread stop
            - fix uninitialised variable in rxperf server
            - fix the return value of rxrpc_new_incoming_call()
      
         - microchip: vcap: fix initialization of value and mask
      
         - nfp: fix unaligned io read of capabilities word
      
        Previous releases - regressions:
      
         - stop in-kernel socket users from corrupting socket's task_frag
      
         - stream: purge sk_error_queue in sk_stream_kill_queues()
      
         - openvswitch: fix flow lookup to use unmasked key
      
         - dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
      
         - devlink:
            - hold region lock when flushing snapshots
            - protect devlink dump by the instance lock
      
        Previous releases - always broken:
      
         - bpf:
            - prevent leak of lsm program after failed attach
            - resolve fext program type when checking map compatibility
      
         - skbuff: account for tail adjustment during pull operations
      
         - macsec: fix net device access prior to holding a lock
      
         - bonding: switch back when high prio link up
      
         - netfilter: flowtable: really fix NAT IPv6 offload
      
         - enetc: avoid buffer leaks on xdp_do_redirect() failure
      
         - unix: fix race in SOCK_SEQPACKET's unix_dgram_sendmsg()
      
         - dsa: microchip: remove IRQF_TRIGGER_FALLING in
           request_threaded_irq"
      
      * tag 'net-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
        net: fec: check the return value of build_skb()
        net: simplify sk_page_frag
        Treewide: Stop corrupting socket's task_frag
        net: Introduce sk_use_task_frag in struct sock.
        mctp: Remove device type check at unregister
        net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq
        can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
        can: flexcan: avoid unbalanced pm_runtime_enable warning
        Documentation: devlink: add missing toc entry for etas_es58x devlink doc
        mctp: serial: Fix starting value for frame check sequence
        nfp: fix unaligned io read of capabilities word
        net: stream: purge sk_error_queue in sk_stream_kill_queues()
        myri10ge: Fix an error handling path in myri10ge_probe()
        net: microchip: vcap: Fix initialization of value and mask
        rxrpc: Fix the return value of rxrpc_new_incoming_call()
        rxrpc: rxperf: Fix uninitialised variable
        rxrpc: Fix I/O thread stop
        rxrpc: Fix switched parameters in peer tracing
        rxrpc: Fix locking issues in rxrpc_put_peer_locked()
        rxrpc: Fix I/O thread startup getting skipped
        ...
      609d3bc6
    • Linus Torvalds's avatar
      Merge tag 'fs.vfsuid.ima.v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping · 878cf96f
      Linus Torvalds authored
      Pull vfsuid cleanup from Christian Brauner:
       "This moves the ima specific vfs{g,u}id_t comparison helpers out of the
        header and into the one file in ima where they are used.
      
        We shouldn't incentivize people to use them by placing them into the
        header. As discussed and suggested by Linus in [1] let's just define
        them locally in the one file in ima where they are used"
      
      Link: https://lore.kernel.org/lkml/CAHk-=wj4BpEwUd=OkTv1F9uykvSrsBNZJVHMp+p_+e2kiV71_A@mail.gmail.com [1]
      
      * tag 'fs.vfsuid.ima.v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
        mnt_idmapping: move ima-only helpers to ima
      878cf96f
    • Linus Torvalds's avatar
      Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random · 222882c2
      Linus Torvalds authored
      Pull more random number generator updates from Jason Donenfeld:
       "Two remaining changes that are now possible after you merged a few
        other trees:
      
         - #include <asm/archrandom.h> can be removed from random.h now,
           making the direct use of the arch_random_* API more of a private
           implementation detail between the archs and random.c, rather than
           something for general consumers.
      
         - Two additional uses of prandom_u32_max() snuck in during the
           initial phase of pulls, so these have been converted to
           get_random_u32_below(), and now the deprecated prandom_u32_max()
           alias -- which was just a wrapper around get_random_u32_below() --
           can be removed.
      
        In addition, there is one fix:
      
         - Check efi_rt_services_supported() before attempting to use an EFI
           runtime function.
      
           This affected EFI systems that disable runtime services yet still
           boot via EFI (e.g. the reporter's Lenovo Thinkpad X13s laptop), as
           well systems where EFI runtime services have been forcibly
           disabled, such as on PREEMPT_RT.
      
           On those machines, a very early and hard to diagnose crash would
           happen, preventing boot"
      
      * tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
        prandom: remove prandom_u32_max()
        efi: random: fix NULL-deref when refreshing seed
        random: do not include <asm/archrandom.h> from random.h
      222882c2
    • Linus Torvalds's avatar
      Merge tag 'rcu-urgent.2022.12.17a' of... · 19822e3e
      Linus Torvalds authored
      Merge tag 'rcu-urgent.2022.12.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull RCU fix from Paul McKenney:
       "This fixes a lockdep false positive in synchronize_rcu() that can
        otherwise occur during early boot.
      
        The fix simply avoids invoking lockdep if the scheduler has not yet
        been initialized, that is, during that portion of boot when interrupts
        are disabled"
      
      * tag 'rcu-urgent.2022.12.17a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        rcu: Don't assert interrupts enabled too early in boot
      19822e3e
    • Martin KaFai Lau's avatar
      bpf: Reduce smap->elem_size · 552d42a3
      Martin KaFai Lau authored
      'struct bpf_local_storage_elem' has an unused 56 byte padding at the
      end due to struct's cache-line alignment requirement. This padding
      space is overlapped by storage value contents, so if we use sizeof()
      to calculate the total size, we overinflate it by 56 bytes. Use
      offsetof() instead to calculate more exact memory use.
      Signed-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20221221013036.3427431-1-martin.lau@linux.dev
      552d42a3
    • Andrii Nakryiko's avatar
      Merge branch 'bpftool: improve error handing for missing .BTF section' · 7b43df6c
      Andrii Nakryiko authored
      Changbin Du says:
      
      ====================
      Display error message for missing ".BTF" section and clean up empty
      vmlinux.h file.
      
      v3:
       - fix typo and make error message consistent. (Andrii Nakryiko)
       - split out perf change.
      v2:
       - remove vmlinux specific error info.
       - use builtin target .DELETE_ON_ERROR: to delete empty vmlinux.h
      ====================
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      7b43df6c
    • Changbin Du's avatar
      bpf: makefiles: Do not generate empty vmlinux.h · e7f0d5cd
      Changbin Du authored
      Remove the empty vmlinux.h if bpftool failed to dump btf info.
      The empty vmlinux.h can hide real error when reading output
      of make.
      
      This is done by adding .DELETE_ON_ERROR special target in related
      makefiles.
      Signed-off-by: default avatarChangbin Du <changbin.du@gmail.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarQuentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20221217223509.88254-3-changbin.du@gmail.com
      e7f0d5cd