1. 05 Jun, 2023 3 commits
    • Dave Marchevsky's avatar
      bpf: Make bpf_refcount_acquire fallible for non-owning refs · 7793fc3b
      Dave Marchevsky authored
      This patch fixes an incorrect assumption made in the original
      bpf_refcount series [0], specifically that the BPF program calling
      bpf_refcount_acquire on some node can always guarantee that the node is
      alive. In that series, the patch adding failure behavior to rbtree_add
      and list_push_{front, back} breaks this assumption for non-owning
      references.
      
      Consider the following program:
      
        n = bpf_kptr_xchg(&mapval, NULL);
        /* skip error checking */
      
        bpf_spin_lock(&l);
        if(bpf_rbtree_add(&t, &n->rb, less)) {
          bpf_refcount_acquire(n);
          /* Failed to add, do something else with the node */
        }
        bpf_spin_unlock(&l);
      
      It's incorrect to assume that bpf_refcount_acquire will always succeed in this
      scenario. bpf_refcount_acquire is being called in a critical section
      here, but the lock being held is associated with rbtree t, which isn't
      necessarily the lock associated with the tree that the node is already
      in. So after bpf_rbtree_add fails to add the node and calls bpf_obj_drop
      in it, the program has no ownership of the node's lifetime. Therefore
      the node's refcount can be decr'd to 0 at any time after the failing
      rbtree_add. If this happens before the refcount_acquire above, the node
      might be free'd, and regardless refcount_acquire will be incrementing a
      0 refcount.
      
      Later patches in the series exercise this scenario, resulting in the
      expected complaint from the kernel (without this patch's changes):
      
        refcount_t: addition on 0; use-after-free.
        WARNING: CPU: 1 PID: 207 at lib/refcount.c:25 refcount_warn_saturate+0xbc/0x110
        Modules linked in: bpf_testmod(O)
        CPU: 1 PID: 207 Comm: test_progs Tainted: G           O       6.3.0-rc7-02231-g723de1a718a2-dirty #371
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
        RIP: 0010:refcount_warn_saturate+0xbc/0x110
        Code: 6f 64 f6 02 01 e8 84 a3 5c ff 0f 0b eb 9d 80 3d 5e 64 f6 02 00 75 94 48 c7 c7 e0 13 d2 82 c6 05 4e 64 f6 02 01 e8 64 a3 5c ff <0f> 0b e9 7a ff ff ff 80 3d 38 64 f6 02 00 0f 85 6d ff ff ff 48 c7
        RSP: 0018:ffff88810b9179b0 EFLAGS: 00010082
        RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
        RDX: 0000000000000202 RSI: 0000000000000008 RDI: ffffffff857c3680
        RBP: ffff88810027d3c0 R08: ffffffff8125f2a4 R09: ffff88810b9176e7
        R10: ffffed1021722edc R11: 746e756f63666572 R12: ffff88810027d388
        R13: ffff88810027d3c0 R14: ffffc900005fe030 R15: ffffc900005fe048
        FS:  00007fee0584a700(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00005634a96f6c58 CR3: 0000000108ce9002 CR4: 0000000000770ee0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        PKRU: 55555554
        Call Trace:
         <TASK>
         bpf_refcount_acquire_impl+0xb5/0xc0
      
        (rest of output snipped)
      
      The patch addresses this by changing bpf_refcount_acquire_impl to use
      refcount_inc_not_zero instead of refcount_inc and marking
      bpf_refcount_acquire KF_RET_NULL.
      
      For owning references, though, we know the above scenario is not possible
      and thus that bpf_refcount_acquire will always succeed. Some verifier
      bookkeeping is added to track "is input owning ref?" for bpf_refcount_acquire
      calls and return false from is_kfunc_ret_null for bpf_refcount_acquire on
      owning refs despite it being marked KF_RET_NULL.
      
      Existing selftests using bpf_refcount_acquire are modified where
      necessary to NULL-check its return value.
      
        [0]: https://lore.kernel.org/bpf/20230415201811.343116-1-davemarchevsky@fb.com/
      
      Fixes: d2dcc67d ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail")
      Reported-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20230602022647.1571784-5-davemarchevsky@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      7793fc3b
    • Dave Marchevsky's avatar
      bpf: Fix __bpf_{list,rbtree}_add's beginning-of-node calculation · cc0d76ca
      Dave Marchevsky authored
      Given the pointer to struct bpf_{rb,list}_node within a local kptr and
      the byte offset of that field within the kptr struct, the calculation changed
      by this patch is meant to find the beginning of the kptr so that it can
      be passed to bpf_obj_drop.
      
      Unfortunately instead of doing
      
        ptr_to_kptr = ptr_to_node_field - offset_bytes
      
      the calculation is erroneously doing
      
        ptr_to_ktpr = ptr_to_node_field - (offset_bytes * sizeof(struct bpf_rb_node))
      
      or the bpf_list_node equivalent.
      
      This patch fixes the calculation.
      
      Fixes: d2dcc67d ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail")
      Signed-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20230602022647.1571784-4-davemarchevsky@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      cc0d76ca
    • Dave Marchevsky's avatar
      bpf: Set kptr_struct_meta for node param to list and rbtree insert funcs · 2140a6e3
      Dave Marchevsky authored
      In verifier.c, fixup_kfunc_call uses struct bpf_insn_aux_data's
      kptr_struct_meta field to pass information about local kptr types to
      various helpers and kfuncs at runtime. The recent bpf_refcount series
      added a few functions to the set that need this information:
      
        * bpf_refcount_acquire
          * Needs to know where the refcount field is in order to increment
        * Graph collection insert kfuncs: bpf_rbtree_add, bpf_list_push_{front,back}
          * Were migrated to possibly fail by the bpf_refcount series. If
            insert fails, the input node is bpf_obj_drop'd. bpf_obj_drop needs
            the kptr_struct_meta in order to decr refcount and properly free
            special fields.
      
      Unfortunately the verifier handling of collection insert kfuncs was not
      modified to actually populate kptr_struct_meta. Accordingly, when the
      node input to those kfuncs is passed to bpf_obj_drop, it is done so
      without the information necessary to decr refcount.
      
      This patch fixes the issue by populating kptr_struct_meta for those
      kfuncs.
      
      Fixes: d2dcc67d ("bpf: Migrate bpf_rbtree_add and bpf_list_push_{front,back} to possibly fail")
      Signed-off-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Link: https://lore.kernel.org/r/20230602022647.1571784-3-davemarchevsky@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2140a6e3
  2. 01 Jun, 2023 2 commits
    • Louis DeLosSantos's avatar
      selftests/bpf: Test table ID fib lookup BPF helper · d4ae3e58
      Louis DeLosSantos authored
      Add additional test cases to `fib_lookup.c` prog_test.
      
      These test cases add a new /24 network to the previously unused veth2
      device, removes the directly connected route from the main routing table
      and moves it to table 100.
      
      The first test case then confirms a fib lookup for a remote address in
      this directly connected network, using the main routing table fails.
      
      The second test case ensures the same fib lookup using table 100 succeeds.
      
      An additional pair of tests which function in the same manner are added
      for IPv6.
      Signed-off-by: default avatarLouis DeLosSantos <louis.delos.devel@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-2-0a31c22c748c@gmail.com
      d4ae3e58
    • Louis DeLosSantos's avatar
      bpf: Add table ID to bpf_fib_lookup BPF helper · 8ad77e72
      Louis DeLosSantos authored
      Add ability to specify routing table ID to the `bpf_fib_lookup` BPF
      helper.
      
      A new field `tbid` is added to `struct bpf_fib_lookup` used as
      parameters to the `bpf_fib_lookup` BPF helper.
      
      When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` and
      `BPF_FIB_LOOKUP_TBID` flags the `tbid` field in `struct bpf_fib_lookup`
      will be used as the table ID for the fib lookup.
      
      If the `tbid` does not exist the fib lookup will fail with
      `BPF_FIB_LKUP_RET_NOT_FWDED`.
      
      The `tbid` field becomes a union over the vlan related output fields
      in `struct bpf_fib_lookup` and will be zeroed immediately after usage.
      
      This functionality is useful in containerized environments.
      
      For instance, if a CNI wants to dictate the next-hop for traffic leaving
      a container it can create a container-specific routing table and perform
      a fib lookup against this table in a "host-net-namespace-side" TC program.
      
      This functionality also allows `ip rule` like functionality at the TC
      layer, allowing an eBPF program to pick a routing table based on some
      aspect of the sk_buff.
      
      As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN
      datapath.
      
      When egress traffic leaves a Pod an eBPF program attached by Cilium will
      determine which VRF the egress traffic should target, and then perform a
      FIB lookup in a specific table representing this VRF's FIB.
      Signed-off-by: default avatarLouis DeLosSantos <louis.delos.devel@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-1-0a31c22c748c@gmail.com
      8ad77e72
  3. 31 May, 2023 4 commits
  4. 30 May, 2023 31 commits