1. 25 Jan, 2024 30 commits
  2. 24 Jan, 2024 10 commits
    • selftests/bpf: Wait for the netstamp_needed_key static key to be turned on · ce6f6cff
      Martin KaFai Lau authored
      After the previous patch that sped up the test (by avoiding neigh
      discovery in IPv6), the BPF CI occasionally hits this error:
      
      rcv tstamp unexpected pkt rcv tstamp: actual 0 == expected 0
      
      The test complains that the cmsg returned from recvmsg() does not
      have the rcv timestamp. Whether skb->tstamp is set is
      controlled by a kernel static key "netstamp_needed_key". The static
      key is enabled whenever there is at least one sk with SOCK_TIMESTAMP
      set.
      
      The test_redirect_dtime test does use setsockopt() to turn on
      SOCK_TIMESTAMP for the reading sk. In the kernel,
      net_enable_timestamp() has a delay to enable the "netstamp_needed_key"
      when CONFIG_JUMP_LABEL is set. This potential delay is the likely reason
      for packets occasionally missing the rcv timestamp.
      
      This patch creates UDP sockets with SOCK_TIMESTAMP set.
      It sends and receives packets until a received packet
      has a rcv timestamp. It currently retries at most 5 times with 1s
      in between. This should be enough to wait for the "netstamp_needed_key".
      It then holds on to the socket and only closes it at the end of the test.
      This guarantees that the test has the "netstamp_needed_key" key turned
      on from the beginning.
      
      To simplify the UDP socket setup, the sockets send and receive packets
      in the same netns (ns_dst is used) and communicate over the "lo" dev.
      Hence, the patch enables the "lo" dev in the ns_dst.
      
      Fixes: c803475f ("bpf: selftests: test skb->tstamp in redirect_neigh")
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240120060518.3604920-2-martin.lau@linux.dev
    • selftests/bpf: Fix the flaky tc_redirect_dtime test · 177f1d08
      Martin KaFai Lau authored
      BPF CI has been reporting the tc_redirect_dtime test failing
      from time to time:
      
      test_inet_dtime:PASS:setns src 0 nsec
      (network_helpers.c:253: errno: No route to host) Failed to connect to server
      close_netns:PASS:setns 0 nsec
      test_inet_dtime:FAIL:connect_to_fd unexpected connect_to_fd: actual -1 < expected 0
      test_tcp_clear_dtime:PASS:tcp ip6 clear dtime ingress_fwdns_p100 0 nsec
      
      The connect_to_fd failure (EHOSTUNREACH) is from the
      test_tcp_clear_dtime() test and it is the very first IPv6 traffic
      after setting up all the links, addresses, and routes.
      
      The symptom is that this first connect() is always slow. In my setup, it
      could take ~3s.
      
      Tracing and tcpdump show that most of the time is spent in
      neighbor solicitation in the "ns_fwd" namespace, while
      the "ns_src" and "ns_dst" are fine.
      
      I forced the kernel to drop the neighbor solicitation messages
      and could then reproduce EHOSTUNREACH. What actually happens could be:
      - the neighbor advertisement comes back a little slowly.
      - the "ns_fwd" namespace concludes that neighbor discovery failed
        and triggers ndisc_error_report() => ip6_link_failure() =>
        icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0)
      - the client's connect() reports EHOSTUNREACH after receiving
        the ICMPV6_DEST_UNREACH message.
      
      The neigh tables of both the "ns_src" and "ns_dst" namespaces have already
      been manually populated, but not that of the "ns_fwd" namespace. This patch
      fixes it by manually populating the neigh table in the "ns_fwd"
      namespace as well.
      
      Although the namespace configuration part existed before
      the tc_redirect_dtime test, the Fixes tag points at the commit that
      added the tc_redirect_dtime test, since it is the only test
      hitting the problem so far.
      
      Fixes: c803475f ("bpf: selftests: test skb->tstamp in redirect_neigh")
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240120060518.3604920-1-martin.lau@linux.dev
    • libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo struct · d47b9f68
      Dima Tisnek authored
      A past commit ([0]) removed the last vestiges of struct bpf_field_reloc;
      it is called struct bpf_core_relo now.
      
        [0] 28b93c64 ("libbpf: Clean up and improve CO-RE reloc logging")
      Signed-off-by: Dima Tisnek <dimaqq@gmail.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/bpf/20240121060126.15650-1-dimaqq@gmail.com
    • Merge branch 'skip-callback-tests-if-jit-is-disabled-in-test_verifier' · 32749605
      Andrii Nakryiko authored
      Tiezhu Yang says:
      
      ====================
      Skip callback tests if jit is disabled in test_verifier
      
      Thanks very much for the feedback from Eduard, John, Jiri, Daniel,
      Hou Tao, Song Liu and Andrii.
      
      v7:
        -- Add an explicit flag F_NEEDS_JIT_ENABLED for checking,
           thanks Andrii.
      
      v6:
        -- Copy insn_is_pseudo_func() into testing_helpers,
           thanks Andrii.
      
      v5:
        -- Reuse is_ldimm64_insn() and insn_is_pseudo_func(),
           thanks Song Liu.
      
      v4:
        -- Move the not-allowed-checking into "if (expected_ret ...)"
           block, thanks Hou Tao.
        -- Make some small changes to avoid the checkpatch warning
           about "line length exceeds 100 columns".
      
      v3:
        -- Rebase on the latest bpf-next tree.
        -- Address the review comments by Hou Tao,
           remove the second argument "0" of open(),
           check only once whether jit is disabled,
           check fd_prog, saved_errno and jit_disabled to skip.
      ====================
      
      Link: https://lore.kernel.org/r/20240123090351.2207-1-yangtiezhu@loongson.cn
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    • selftests/bpf: Skip callback tests if jit is disabled in test_verifier · 0b50478f
      Tiezhu Yang authored
      If CONFIG_BPF_JIT_ALWAYS_ON is not set and bpf_jit_enable is 0,
      6 tests fail.
      
        [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
        [root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
        [root@linux bpf]# ./test_verifier | grep FAIL
        #106/p inline simple bpf_loop call FAIL
        #107/p don't inline bpf_loop call, flags non-zero FAIL
        #108/p don't inline bpf_loop call, callback non-constant FAIL
        #109/p bpf_loop_inline and a dead func FAIL
        #110/p bpf_loop_inline stack locations for loop vars FAIL
        #111/p inline bpf_loop call in a big program FAIL
        Summary: 768 PASSED, 15 SKIPPED, 6 FAILED
      
      The test log shows that callbacks are not allowed in non-JITed programs
      because the interpreter doesn't support them yet, so these tests should
      be skipped if jit is disabled.
      
      Add an explicit flag F_NEEDS_JIT_ENABLED to those tests in
      bpf_loop_inline.c to mark that they require JIT to be enabled, and check
      the flag and jit_disabled at the beginning of do_test_single() to handle
      this case.
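
      The flag check can be pictured with the standalone sketch below.
      The struct, the enum, the function name and the flag's bit value are
      hypothetical stand-ins, not the definitions from test_verifier.c;
      only the name F_NEEDS_JIT_ENABLED comes from this patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit value chosen for illustration only. */
#define F_NEEDS_JIT_ENABLED (1u << 2)

/* Hypothetical, trimmed-down stand-in for a verifier test description. */
struct demo_test {
	const char *descr;
	uint8_t flags;
};

enum demo_result { DEMO_RAN, DEMO_SKIPPED };

/* Decide up front whether a test can run: a test marked as needing the
 * JIT is skipped, not failed, when the JIT is disabled.
 */
static enum demo_result demo_test_single(const struct demo_test *t,
					 bool jit_disabled)
{
	if (jit_disabled && (t->flags & F_NEEDS_JIT_ENABLED)) {
		printf("SKIP (requires BPF JIT): %s\n", t->descr);
		return DEMO_SKIPPED;
	}
	/* ... the program would be loaded and run here ... */
	return DEMO_RAN;
}
```

      Doing the check before anything is loaded means a skipped test never
      reaches the verifier, so it cannot be counted as a failure.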
      
      With this patch:
      
        [root@linux bpf]# echo 0 > /proc/sys/net/core/bpf_jit_enable
        [root@linux bpf]# echo 0 > /proc/sys/kernel/unprivileged_bpf_disabled
        [root@linux bpf]# ./test_verifier | grep FAIL
        Summary: 768 PASSED, 21 SKIPPED, 0 FAILED
      Suggested-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20240123090351.2207-3-yangtiezhu@loongson.cn
    • selftests/bpf: Move is_jit_enabled() into testing_helpers · 15b4f88d
      Tiezhu Yang authored
      Currently, is_jit_enabled() is only used in test_progs. Move it into
      testing_helpers so that it can be used in test_verifier. While at it,
      remove the second argument "0" of open() as Hou Tao suggested.
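
      A helper of this kind might look like the sketch below. It is an
      approximation, not the selftests code: jit_enabled_from() is a
      hypothetical split added here so the parsing can be exercised on
      any file, and the sysctl path is the one shown in this series.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: report whether the first character of the file at
 * path is something other than '0'. An unreadable file counts as disabled.
 */
static bool jit_enabled_from(const char *path)
{
	bool enabled = false;
	char c;
	int fd;

	/* O_RDONLY without O_CREAT takes no mode argument. */
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return false;
	if (read(fd, &c, 1) == 1)
		enabled = (c != '0');
	close(fd);
	return enabled;
}

static bool is_jit_enabled(void)
{
	return jit_enabled_from("/proc/sys/net/core/bpf_jit_enable");
}
```

      Treating an unreadable sysctl as "disabled" is a choice made for this
      sketch; it errs on the side of skipping JIT-only tests.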
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Acked-by: Hou Tao <houtao1@huawei.com>
      Acked-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/bpf/20240123090351.2207-2-yangtiezhu@loongson.cn
    • Merge branch 'Registrating struct_ops types from modules' · 8b593021
      Martin KaFai Lau authored
      Kui-Feng Lee says:
      
      ====================
      Given the constraints of the current implementation,
      struct_ops cannot be registered dynamically. This presents a
      significant limitation for modules like the upcoming fuse-bpf, which
      seeks to implement a new struct_ops type. To address this issue, a new API
      is introduced that allows the registration of new struct_ops types
      from modules.
      
      Previously, struct_ops types were defined in bpf_struct_ops_types.h
      and collected as a static array. The new API lets callers add new
      struct_ops types dynamically. The static array has been removed and
      replaced by the per-btf struct_ops_tab.
      
      The struct_ops subsystem relies on BTF to determine the layout of
      values in a struct_ops map and identify the subsystem that the
      struct_ops map registers to. However, the kernel BTF does not include
      the type information of struct_ops types defined by a module. The
      struct_ops subsystem requires knowledge of the corresponding module
      for a given struct_ops map and the utilization of BTF information from
      that module. We empower libbpf to determine the correct module for
      accessing the BTF information and pass an identity (FD) of the module
      btf to the kernel. The kernel looks up type information and registered
      struct_ops types directly from the given btf.
      
      If a module exits while one or more struct_ops maps still refer to a
      struct_ops type defined by the module, it can lead to unforeseen
      complications. Therefore, it is crucial to ensure that a module
      remains intact as long as any struct_ops map is still linked to a
      struct_ops type defined by the module. To achieve this, every
      struct_ops map holds a reference to the module while being registered.
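
      The lifetime rule above can be modelled with the small userspace
      sketch below. It is a toy, not kernel code: every toy_* name is
      hypothetical, and a plain counter stands in for the module refcount.
      It only shows a growable table of registered types and a map pinning
      the owner of its type.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a kernel module with a reference count. */
struct toy_module {
	const char *name;
	int refcnt;
};

/* One registered struct_ops type and the module that defines it. */
struct toy_ops_desc {
	const char *type_name;
	struct toy_module *owner;
};

/* Growable table of registered types, replacing a fixed static array. */
struct toy_ops_tab {
	struct toy_ops_desc *descs;
	size_t cnt;
};

/* Register a new type at run time by appending it to the table. */
static int toy_register(struct toy_ops_tab *tab, const char *type_name,
			struct toy_module *owner)
{
	struct toy_ops_desc *d;

	d = realloc(tab->descs, (tab->cnt + 1) * sizeof(*d));
	if (!d)
		return -1;
	d[tab->cnt].type_name = type_name;
	d[tab->cnt].owner = owner;
	tab->descs = d;
	tab->cnt++;
	return 0;
}

/* Creating a map of a given type takes a reference on the owner module. */
static const struct toy_ops_desc *toy_map_create(struct toy_ops_tab *tab,
						 const char *type_name)
{
	size_t i;

	for (i = 0; i < tab->cnt; i++) {
		if (!strcmp(tab->descs[i].type_name, type_name)) {
			tab->descs[i].owner->refcnt++;
			return &tab->descs[i];
		}
	}
	return NULL;
}

/* Freeing the map drops the reference again. */
static void toy_map_free(const struct toy_ops_desc *d)
{
	d->owner->refcnt--;
}

/* The module may only go away once no map refers to its types. */
static int toy_module_can_unload(const struct toy_module *m)
{
	return m->refcnt == 0;
}
```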
      
      Changes from v16:
      
       - Fix unnecessary bpf_struct_ops_link_create() removing/adding.
      
       - Rename REGISTER_BPF_STRUCT_OPS() to register_bpf_struct_ops().
      
       - Implement bpf_map_struct_ops_info_fill() for !CONFIG_BPF_JIT.
      
      Changes from v15:
      
       - Fix the misleading commit message of part 4.
      
       - Introduce BPF_F_VTYPE_BTF_OBJ_FD flag to struct bpf_attr to tell
         if value_type_btf_obj_fd is set or not.
      
       - Introduce links_cnt to struct bpf_struct_ops_map to avoid accessing
         struct bpf_struct_ops_desc in bpf_struct_ops_map_put_progs() after
         calling module_put() against the owner module of the struct_ops
         type. (Part 9)
      
      Changes from v14:
      
       - Rebase. Add cfi_stubs required by
         the commit 2cd3e377 ("x86/cfi,bpf: Fix bpf_struct_ops CFI")
      
       - Remove creating struct_ops map without bpf_testmod.ko from the
         test.
      
       - Check the name of btf returned by bpf_map_info by getting the name
         with bpf_btf_get_info_by_fd().
      
       - Change value_type_btf_obj_fd to a signed type to allow the 0 fd.
      
      Changes from v13:
      
       - Change the test case to use bpf_map_create() to create a struct_ops
         map while testmod.ko is unloaded.
      
       - Move bpf_struct_ops_find*() to btf.c.
      
       - Use btf_is_module() to replace btf != btf_vmlinux.
      
      Changes from v12:
      
       - Rebase to for-next to fix conflicts.
      
      Changes from v11:
      
       - bpf_struct_ops_maps hold only the refcnt to the module, but not
         btf. (patch 1)
      
       - Fix warning messages. (patch 1, 9 and 10)
      
       - Remove unnecessary conditional compiling of CONFIG_BPF_JIT.
         (patch 4, 9 and 10)
      
       - Fix the commit log of the patch 7 to explain how a btf is passed from
         the user space and how the kernel handles it.

       - bpf_struct_ops_maps hold the module defining its type, but not
         btf. A map will hold the module through its life-span from
         allocation to being freed. (patch 8)
      
       - Change selftests and tracing __bpf_struct_ops_map_free() to wait
         for the release of the bpf_testmod module.
      
       - Include btf_obj_id in bpf_map_info. (patch 14)
      
      Changes from v10:
      
       - Guard btf.c from CONFIG_BPF_JIT=n. This patchset has introduced
         symbols from bpf_struct_ops.c which is only built when
         CONFIG_BPF_JIT=y.
      
       - Fix the warning of unused errout_free label by moving code that is
         leaked to patch 8 to patch 7.
      
      Changes from v9:
      
       - Remove the call_rcu_tasks_trace() changes from kern_sync_rcu().
      
       - Trace btf_put() in the test case to ensure the release of kmod's
         btf, or the subsequent tests may fail for using kmod's unloaded old
         btf instead of the new one created after loading again. The kmod's btf
         may live for a while after unloading the kmod, because a map being
         freed asynchronously is still holding the btf.

       - Split "add struct_ops_tab to btf" into two patches by adding
         "make struct_ops_map support btfs other than btf_vmlinux".
      
       - Flip the order of "pass attached BTF to the bpf_struct_ops
         subsystem" and "hold module for bpf_struct_ops_map" to make it more
         reasonable.
      
       - Fix the compile errors of a missing header file.
      
      Changes from v8:
      
       - Rename bpf_struct_ops_init_one() to bpf_struct_ops_desc_init().
      
       - Move code that uses BTF_ID_LIST to the newly added patch 2.

       - Move code that looks up struct_ops types from a given module to the
         newly added patch 5.

       - Store the pointers of btf at st_maps.

       - Add test cases for the cases of modules being unloaded.
      
       - Call bpf_struct_ops_init() in btf_add_struct_ops() to fix an
         inconsistent issue.
      
      Changes from v7:
      
       - Fix check_struct_ops_btf_id() to use the attach btf, if there is one,
         instead of btf_vmlinux.
      
      Changes from v6:
      
       - Change returned error code to -EINVAL for the case of
         bpf_try_get_module().
      
       - Return an error code from bpf_struct_ops_init().
      
       - Fix the dependency issue of testing_helpers.c and
         rcu_tasks_trace_gp.skel.h.
      
      Changes from v5:
      
       - As the 2nd patch, we introduce "bpf_struct_ops_desc". This change
         involves moving certain members of "bpf_struct_ops" to
         "bpf_struct_ops_desc", which becomes a part of
         "btf_struct_ops_tab". This ensures that these members remain
         accessible even when the owner module of a "bpf_struct_ops" is
         unloaded.
      
       - Correct the order of arguments when calling
          in the 3rd patch.
      
       - Remove the owner argument from bpf_struct_ops_init_one(). Instead,
         callers should fill in st_ops->owner.
      
       - Make sure to hold the owner module when calling
         bpf_struct_ops_find() and bpf_struct_ops_find_value() in the 6th
         patch.
      
       - Merge the functions register_bpf_struct_ops_btf() and
         register_bpf_struct_ops() into a single function and relocate it to
         btf.c for better organization and clarity.
      
       - Undo the name modifications made to find_kernel_btf_id() and
         find_ksym_btf_id() in the 8th patch.
      
      Changes from v4:
      
       - Fix the dependency between testing_helpers.o and
         rcu_tasks_trace_gp.skel.h.
      
      Changes from v3:
      
       - Fix according to the feedback for v3.
      
         - Change of the order of arguments to make btf as the first
           argument.
      
         - Use btf_try_get_module() instead of try_get_module() since the
           module pointed to by st_ops->owner can be gone while someone is
           still holding its btf.

         - Move variables defined by BPF_STRUCT_OPS_COMMON_VALUE to struct
           bpf_struct_ops_common_value to make validation easier.
      
         - Register the struct_ops type defined by bpf_testmod in its init
           function.
      
         - Rename field name to 'value_type_btf_obj_fd' to make it explicit.
      
         - Fix leaking of btf objects on error.
      
         - st_maps hold their modules to keep modules alive and prevent them
           from unloading.
      
         - bpf_map of libbpf keeps mod_btf_fd instead of a pointer to module_btf.
      
         - Do call_rcu_tasks_trace() in kern_sync_rcu() to ensure the
           bpf_testmod is unloaded properly. It uses rcu_tasks_trace_gp to
           trigger call_rcu_tasks_trace() in the kernel.
      
       - Merge and reorder patches in a reasonable order.
      
      Changes from v2:
      
       - Remove struct_ops array, and add a per-btf (module) struct_ops_tab
         to collect registered struct_ops types.
      
       - Validate value_type by checking member names and types.
      ---
      v16: https://lore.kernel.org/all/20240118014930.1992551-1-thinker.li@gmail.com/
      v15: https://lore.kernel.org/all/20231220222654.1435895-1-thinker.li@gmail.com/
      v14: https://lore.kernel.org/all/20231217081132.1025020-1-thinker.li@gmail.com/
      v13: https://lore.kernel.org/all/20231209002709.535966-1-thinker.li@gmail.com/
      v12: https://lore.kernel.org/all/20231207013950.1689269-1-thinker.li@gmail.com/
      v11: https://lore.kernel.org/all/20231106201252.1568931-1-thinker.li@gmail.com/
      v10: https://lore.kernel.org/all/20231103232202.3664407-1-thinker.li@gmail.com/
      v9: https://lore.kernel.org/all/20231101204519.677870-1-thinker.li@gmail.com/
      v8: https://lore.kernel.org/all/20231030192810.382942-1-thinker.li@gmail.com/
      v7: https://lore.kernel.org/all/20231027211702.1374597-1-thinker.li@gmail.com/
      v6: https://lore.kernel.org/all/20231022050335.2579051-11-thinker.li@gmail.com/
      v5: https://lore.kernel.org/all/20231017162306.176586-1-thinker.li@gmail.com/
      v4: https://lore.kernel.org/all/20231013224304.187218-1-thinker.li@gmail.com/
      v3: https://lore.kernel.org/all/20230920155923.151136-1-thinker.li@gmail.com/
      v2: https://lore.kernel.org/all/20230913061449.1918219-1-thinker.li@gmail.com/
      ====================
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: test case for register_bpf_struct_ops(). · 0253e059
      Kui-Feng Lee authored
      Create a new struct_ops type called bpf_testmod_ops within the bpf_testmod
      module. When a struct_ops object is registered, the bpf_testmod module will
      invoke test_2 from the module.
      Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
      Link: https://lore.kernel.org/r/20240119225005.668602-15-thinker.li@gmail.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: export btf_ctx_access to modules. · 7c81c249
      Kui-Feng Lee authored
      A module needs btf_ctx_access() to be exported in order to invoke
      bpf_tracing_btf_ctx_access(). This function is valuable for
      implementing validation functions that ensure proper access to ctx.
      Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
      Link: https://lore.kernel.org/r/20240119225005.668602-14-thinker.li@gmail.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • libbpf: Find correct module BTFs for struct_ops maps and progs. · 9e926acd
      Kui-Feng Lee authored
      Locate the module BTFs for struct_ops maps and progs and pass them to the
      kernel. This ensures that the kernel correctly resolves type IDs from the
      appropriate module BTFs.
      
      For the map of a struct_ops object, the FD of the module BTF is set to
      bpf_map to keep a reference to the module BTF. The FD is passed to the
      kernel as value_type_btf_obj_fd when the struct_ops object is loaded.
      
      For a bpf_struct_ops prog, attach_btf_obj_fd of bpf_prog is the FD of a
      module BTF in the kernel.
      Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20240119225005.668602-13-thinker.li@gmail.com
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>