1. 21 Apr, 2023 19 commits
  2. 20 Apr, 2023 9 commits
  3. 19 Apr, 2023 1 commit
  4. 18 Apr, 2023 7 commits
  5. 17 Apr, 2023 4 commits
    • Yonghong Song's avatar
      selftests/bpf: Add a selftest for checking subreg equality · 49859de9
      Yonghong Song authored
      Add a selftest to ensure subreg equality if source register
      upper 32bit is 0. Without previous patch, the test will
      fail verification.
      Acked-by: default avatarEduard Zingerman <eddyz87@gmail.com>
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20230417222139.360607-1-yhs@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      49859de9
    • Yonghong Song's avatar
      bpf: Improve verifier u32 scalar equality checking · 3be49f79
      Yonghong Song authored
      In [1], I tried to remove bpf-specific codes to prevent certain
      llvm optimizations, and add llvm TTI (target transform info) hooks
      to prevent those optimizations. During this process, I found
      if I enable llvm SimplifyCFG:shouldFoldTwoEntryPHINode
      transformation, I will hit the following verification failure with selftests:
      
        ...
        8: (18) r1 = 0xffffc900001b2230       ; R1_w=map_value(off=560,ks=4,vs=564,imm=0)
        10: (61) r1 = *(u32 *)(r1 +0)         ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
        ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC)
        11: (79) r2 = *(u64 *)(r6 +152)       ; R2_w=scalar() R6=ctx(off=0,imm=0)
        ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC)
        12: (55) if r2 != 0xb9fbeef goto pc+10        ; R2_w=195018479
        13: (bc) w2 = w1                      ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
        ; if (test < __NR_TESTS)
        14: (a6) if w1 < 0x9 goto pc+1 16: R0=2 R1_w=scalar(umax=8,var_off=(0x0; 0xf)) R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6=ctx(off=0,imm=0) R10=fp0
        ;
        16: (27) r2 *= 28                     ; R2_w=scalar(umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4)
        17: (18) r3 = 0xffffc900001b2118      ; R3_w=map_value(off=280,ks=4,vs=564,imm=0)
        19: (0f) r3 += r2                     ; R2_w=scalar(umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4) R3_w=map_value(off=280,ks=4,vs=564,umax=120259084260,var_off=(0x0; 0x1ffffffffc),s32_max=2147483644,u32_max=-4)
        20: (61) r2 = *(u32 *)(r3 +0)
        R3 unbounded memory access, make sure to bounds check any such access
        processed 97 insns (limit 1000000) max_states_per_insn 1 total_states 10 peak_states 10 mark_read 6
        -- END PROG LOAD LOG --
        libbpf: prog 'ingress_fwdns_prio100': failed to load: -13
        libbpf: failed to load object 'test_tc_dtime'
        libbpf: failed to load BPF skeleton 'test_tc_dtime': -13
        ...
      
      At insn 14, with condition 'w1 < 9', register r1 is changed from an arbitrary
      u32 value to `scalar(umax=8,var_off=(0x0; 0xf))`. Register r2, however, remains
      as an arbitrary u32 value. Current verifier won't claim r1/r2 equality if
      the previous mov is alu32 ('w2 = w1').
      
      If r1 upper 32bit value is not 0, we indeed cannot clamin r1/r2 equality
      after 'w2 = w1'. But in this particular case, we know r1 upper 32bit value
      is 0, so it is safe to claim r1/r2 equality. This patch exactly did this.
      For a 32bit subreg mov, if the src register upper 32bit is 0,
      it is okay to claim equality between src and dst registers.
      
      With this patch, the above verification sequence becomes
      
        ...
        8: (18) r1 = 0xffffc9000048e230       ; R1_w=map_value(off=560,ks=4,vs=564,imm=0)
        10: (61) r1 = *(u32 *)(r1 +0)         ; R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
        ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC)
        11: (79) r2 = *(u64 *)(r6 +152)       ; R2_w=scalar() R6=ctx(off=0,imm=0)
        ; if (skb->tstamp == EGRESS_ENDHOST_MAGIC)
        12: (55) if r2 != 0xb9fbeef goto pc+10        ; R2_w=195018479
        13: (bc) w2 = w1                      ; R1_w=scalar(id=6,umax=4294967295,var_off=(0x0; 0xffffffff)) R2_w=scalar(id=6,umax=4294967295,var_off=(0x0; 0xffffffff))
        ; if (test < __NR_TESTS)
        14: (a6) if w1 < 0x9 goto pc+1        ; R1_w=scalar(id=6,umin=9,umax=4294967295,var_off=(0x0; 0xffffffff))
        ...
        from 14 to 16: R0=2 R1_w=scalar(id=6,umax=8,var_off=(0x0; 0xf)) R2_w=scalar(id=6,umax=8,var_off=(0x0; 0xf)) R6=ctx(off=0,imm=0) R10=fp0
        16: (27) r2 *= 28                     ; R2_w=scalar(umax=224,var_off=(0x0; 0xfc))
        17: (18) r3 = 0xffffc9000048e118      ; R3_w=map_value(off=280,ks=4,vs=564,imm=0)
        19: (0f) r3 += r2
        20: (61) r2 = *(u32 *)(r3 +0)         ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R3_w=map_value(off=280,ks=4,vs=564,umax=224,var_off=(0x0; 0xfc),s32_max=252,u32_max=252)
        ...
      
      and eventually the bpf program can be verified successfully.
      
        [1] https://reviews.llvm.org/D147968Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20230417222134.359714-1-yhs@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      3be49f79
    • Sean Young's avatar
      bpf: lirc program type should not require SYS_CAP_ADMIN · 69a8c792
      Sean Young authored
      Make it possible to load lirc program type with just CAP_BPF. There is
      nothing exceptional about lirc programs that means they require
      SYS_CAP_ADMIN.
      
      In order to attach or detach a lirc program type you need permission to
      open /dev/lirc0; if you have permission to do that, you can alter all
      sorts of lirc receiving options. Changing the IR protocol decoder is no
      different.
      
      Right now on a typical distribution /dev/lirc devices are only
      read/write by root. Ideally we would make them group read/write like
      other devices so that local users can use them without becoming root.
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Link: https://lore.kernel.org/r/ZD0ArKpwnDBJZsrE@gofer.mess.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      69a8c792
    • Daniel Borkmann's avatar
      bpf: Set skb redirect and from_ingress info in __bpf_tx_skb · 59e498a3
      Daniel Borkmann authored
      There are some use-cases where it is desirable to use bpf_redirect()
      in combination with ifb device, which currently is not supported, for
      example, around filtering inbound traffic with BPF to then push it to
      ifb which holds the qdisc for shaping in contrast to doing that on the
      egress device.
      
      Toke mentions the following case related to OpenWrt:
      
         Because there's not always a single egress on the other side. These are
         mainly home routers, which tend to have one or more WiFi devices bridged
         to one or more ethernet ports on the LAN side, and a single upstream WAN
         port. And the objective is to control the total amount of traffic going
         over the WAN link (in both directions), to deal with bufferbloat in the
         ISP network (which is sadly still all too prevalent).
      
         In this setup, the traffic can be split arbitrarily between the links
         on the LAN side, and the only "single bottleneck" is the WAN link. So we
         install both egress and ingress shapers on this, configured to something
         like 95-98% of the true link bandwidth, thus moving the queues into the
         qdisc layer in the router. It's usually necessary to set the ingress
         bandwidth shaper a bit lower than the egress due to being "downstream"
         of the bottleneck link, but it does work surprisingly well.
      
         We usually use something like a matchall filter to put all ingress
         traffic on the ifb, so doing the redirect from BPF has not been an
         immediate requirement thus far. However, it does seem a bit odd that
         this is not possible, and we do have a BPF-based filter that layers on
         top of this kind of setup, which currently uses u32 as the ingress
         filter and so it could presumably be improved to use BPF instead if
         that was available.
      Reported-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Reported-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Reported-by: default avatarTonghao Zhang <xiangxia.m.yue@gmail.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYafang Shao <laoar.shao@gmail.com>
      Acked-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
      Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
      Link: https://lore.kernel.org/r/8cebc8b2b6e967e10cbafe2ffd6795050e74accd.1681739137.git.daniel@iogearbox.netSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      59e498a3