1. 13 Jun, 2024 4 commits
    • bpf: Reduce stack consumption in check_stack_write_fixed_off · e73cd1cf
      Daniel Borkmann authored
      The fake_reg was moved into env->fake_reg because it consumes a lot of
      stack space (120 bytes). Migrate the fake_reg in
      check_stack_write_fixed_off() as well now that env->fake_reg is available.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20240613115310.25383-2-daniel@iogearbox.net
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Fix reg_set_min_max corruption of fake_reg · 92424801
      Daniel Borkmann authored
      Juan reported that after doing some changes to buzzer [0] and implementing
      a new fuzzing strategy guided by coverage, they noticed the following in
      one of the probes:
      
        [...]
        13: (79) r6 = *(u64 *)(r0 +0)         ; R0=map_value(ks=4,vs=8) R6_w=scalar()
        14: (b7) r0 = 0                       ; R0_w=0
        15: (b4) w0 = -1                      ; R0_w=0xffffffff
        16: (74) w0 >>= 1                     ; R0_w=0x7fffffff
        17: (5c) w6 &= w0                     ; R0_w=0x7fffffff R6_w=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff))
        18: (44) w6 |= 2                      ; R6_w=scalar(smin=umin=smin32=umin32=2,smax=umax=umax32=0x7fffffff,var_off=(0x2; 0x7ffffffd))
        19: (56) if w6 != 0x7ffffffd goto pc+1
        REG INVARIANTS VIOLATION (true_reg2): range bounds violation u64=[0x7fffffff, 0x7ffffffd] s64=[0x7fffffff, 0x7ffffffd] u32=[0x7fffffff, 0x7ffffffd] s32=[0x7fffffff, 0x7ffffffd] var_off=(0x7fffffff, 0x0)
        REG INVARIANTS VIOLATION (false_reg1): range bounds violation u64=[0x7fffffff, 0x7ffffffd] s64=[0x7fffffff, 0x7ffffffd] u32=[0x7fffffff, 0x7ffffffd] s32=[0x7fffffff, 0x7ffffffd] var_off=(0x7fffffff, 0x0)
        REG INVARIANTS VIOLATION (false_reg2): const tnum out of sync with range bounds u64=[0x0, 0xffffffffffffffff] s64=[0x8000000000000000, 0x7fffffffffffffff] u32=[0x0, 0xffffffff] s32=[0x80000000, 0x7fffffff] var_off=(0x7fffffff, 0x0)
        19: R6_w=0x7fffffff
        20: (95) exit
      
        from 19 to 21: R0=0x7fffffff R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=0x7ffffffe,var_off=(0x2; 0x7ffffffd)) R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
        21: R0=0x7fffffff R6=scalar(smin=umin=smin32=umin32=2,smax=umax=smax32=umax32=0x7ffffffe,var_off=(0x2; 0x7ffffffd)) R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
        21: (14) w6 -= 2147483632             ; R6_w=scalar(smin=umin=umin32=2,smax=umax=0xffffffff,smin32=0x80000012,smax32=14,var_off=(0x2; 0xfffffffd))
        22: (76) if w6 s>= 0xe goto pc+1      ; R6_w=scalar(smin=umin=umin32=2,smax=umax=0xffffffff,smin32=0x80000012,smax32=13,var_off=(0x2; 0xfffffffd))
        23: (95) exit
      
        from 22 to 24: R0=0x7fffffff R6_w=14 R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
        24: R0=0x7fffffff R6_w=14 R7=map_ptr(ks=4,vs=8) R9=ctx() R10=fp0 fp-24=map_ptr(ks=4,vs=8) fp-40=mmmmmmmm
        24: (14) w6 -= 14                     ; R6_w=0
        [...]
      
      What can be seen here is a register invariant violation on line 19. After
      the binary-or in line 18, the verifier knows that the second bit (0x2) is
      set but knows nothing about the rest of the content which was loaded from
      a map value, meaning, range is [2,0x7fffffff] with
      var_off=(0x2; 0x7ffffffd). When in
      line 19 the verifier analyzes the branch, it splits the register states
      in reg_set_min_max() into the registers of the true branch (true_reg1,
      true_reg2) and the registers of the false branch (false_reg1, false_reg2).
      
      Since the test is w6 != 0x7ffffffd, the src_reg is a known constant.
      Internally, the verifier creates a "fake" register initialized as scalar
      to the value of 0x7ffffffd, and then passes it onto reg_set_min_max(). Now,
      for line 19, it is mathematically impossible to take the false branch of
      this program, yet the verifier analyzes it. It is impossible because the
      second bit of r6 will be set due to the prior binary-or operation and the
      constant in the condition has that bit unset (hex(fd) == binary(1111 1101)).
      
      When the verifier first analyzes the false / fall-through branch, it will
      compute an intersection between the var_off of r6 and of the constant. This
      is because the verifier creates a "fake" register initialized to the value
      of the constant. The intersection result later refines both registers in
      regs_refine_cond_op():
      
        [...]
        t = tnum_intersect(tnum_subreg(reg1->var_off), tnum_subreg(reg2->var_off));
        reg1->var_off = tnum_with_subreg(reg1->var_off, t);
        reg2->var_off = tnum_with_subreg(reg2->var_off, t);
        [...]
      
      Since the verifier is analyzing the false branch of the conditional jump,
      reg1 is equal to false_reg1 and reg2 is equal to false_reg2, i.e. the reg2
      is the "fake" register that was meant to hold a constant value. The resulting
      var_off of the intersection says that both registers now hold a known value
      of var_off=(0x7fffffff, 0x0) or in other words: this operation manages to
      make the verifier think that the "constant" value that was passed in the
      jump operation now holds a different value.
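
      The intersection result quoted above can be reproduced by hand. The
      following is a minimal sketch, not the kernel code itself: struct tnum
      and tnum_intersect() are simplified stand-ins modeled on
      kernel/bpf/tnum.c, and intersect_line19() is a hypothetical helper
      added only for illustration. The tnum_subreg()/tnum_with_subreg()
      steps are left out on the assumption that both values fit in the
      lower 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct tnum: 'value' holds the
 * bits known to be 1, 'mask' marks the bits whose state is unknown. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* Modeled on tnum_intersect(): a bit is known in the result if it is
 * known in either input; it stays unknown only if unknown in both. */
static struct tnum tnum_intersect(struct tnum a, struct tnum b)
{
	uint64_t v = a.value | b.value;
	uint64_t mu = a.mask & b.mask;

	return (struct tnum){ .value = v & ~mu, .mask = mu };
}

/* Hypothetical helper: var_off of w6 after line 18, intersected with
 * the constant 0x7ffffffd from the comparison in line 19. */
static struct tnum intersect_line19(void)
{
	struct tnum r6 = { .value = 0x2, .mask = 0x7ffffffd };
	struct tnum k = { .value = 0x7ffffffd, .mask = 0x0 };

	return tnum_intersect(r6, k);
}
```

      Under these assumptions the result is value 0x7fffffff with mask 0x0:
      the known bit 0x2 of r6 is OR-ed into the constant, which matches the
      var_off=(0x7fffffff, 0x0) shown in the verifier log.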
      
      Normally this would not be an issue since it should not influence the true
      branch, however, false_reg2 and true_reg2 are pointers to the same "fake"
      register. Meaning, the false branch can influence the results of the true
      branch. In line 24, the verifier assumes R6_w=0, but the actual runtime
      value in this case is 1. The fix is simply not passing in the same "fake"
      register location as inputs to reg_set_min_max(), but instead making a
      copy. Moving the fake_reg into the env also reduces stack consumption by
      120 bytes. With this, the verifier successfully rejects invalid accesses
      from the test program.
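
      The aliasing problem and the copy-based fix can be illustrated in
      isolation. This is a reduced sketch under stated assumptions, not the
      verifier's implementation: struct reg_state, refine_false_branch() and
      the two true_branch_const_*() helpers are hypothetical names, and the
      register state is collapsed to a single known value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced register state: one known value. */
struct reg_state {
	uint64_t value;
};

/* Stand-in for the false-branch refinement, which overwrites reg2
 * with the intersection result from the example in the log. */
static void refine_false_branch(struct reg_state *reg1, struct reg_state *reg2)
{
	(void)reg1;
	reg2->value = 0x7fffffff;
}

/* Buggy pattern: one fake register backs both branches, so refining
 * the false branch changes the constant seen by the true branch. */
static uint64_t true_branch_const_shared(void)
{
	struct reg_state r6 = { 0 };
	struct reg_state fake = { 0x7ffffffd };
	struct reg_state *true_reg2 = &fake;
	struct reg_state *false_reg2 = &fake;

	refine_false_branch(&r6, false_reg2);
	return true_reg2->value;
}

/* Fixed pattern: the false branch works on its own copy, so the
 * constant used for the true branch stays intact. */
static uint64_t true_branch_const_copied(void)
{
	struct reg_state r6 = { 0 };
	struct reg_state fake = { 0x7ffffffd };
	struct reg_state fake_copy = fake;

	refine_false_branch(&r6, &fake_copy);
	return fake.value;
}
```

      With the shared register the true branch sees 0x7fffffff instead of
      0x7ffffffd; with the copy it still sees the original constant.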
      
        [0] https://github.com/google/buzzer
      
      Fixes: 67420501 ("bpf: generalize reg_set_min_max() to handle non-const register comparisons")
      Reported-by: Juan José López Jaimez <jjlopezjaimez@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/r/20240613115310.25383-1-daniel@iogearbox.net
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • Stanislav Fomichev's avatar
      26ba7c3f
    • net/ipv6: Fix the RT cache flush via sysctl using a previous delay · 14a20e5b
      Petr Pavlu authored
      The net.ipv6.route.flush system parameter takes a value which specifies
      a delay used during the flush operation for aging exception routes. The
      written value is however not used in the currently requested flush and
      instead utilized only in the next one.
      
      A problem is that ipv6_sysctl_rtcache_flush() first reads the old value
      of net->ipv6.sysctl.flush_delay into a local delay variable and then
      calls proc_dointvec() which actually updates the sysctl based on the
      provided input.
      
      Fix the problem by switching the order of the two operations.
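
      The ordering issue can be sketched outside the kernel. This is only an
      illustration under stated assumptions: flush_delay stands in for
      net->ipv6.sysctl.flush_delay, write_sysctl() stands in for the update
      done by proc_dointvec(), and both flush_*_order() helpers are
      hypothetical names that return the delay the flush would use.

```c
#include <assert.h>

/* Stand-in for net->ipv6.sysctl.flush_delay. */
static int flush_delay;

/* Stand-in for the sysctl update performed by proc_dointvec(). */
static void write_sysctl(int new_value)
{
	flush_delay = new_value;
}

/* Old order: the delay is sampled before the write is applied, so the
 * current flush uses the value from the previous write. */
static int flush_old_order(int written)
{
	int delay = flush_delay;

	write_sysctl(written);
	return delay;
}

/* New order: apply the write first, then read the delay, so the
 * current flush uses the value that was just written. */
static int flush_new_order(int written)
{
	write_sysctl(written);
	return flush_delay;
}
```

      Starting from a stored delay of 2 and writing 5, the old order flushes
      with 2 while the new order flushes with 5.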
      
      Fixes: 4990509f ("[NETNS][IPV6]: Make sysctls route per namespace.")
      Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 12 Jun, 2024 5 commits
  3. 11 Jun, 2024 10 commits
  4. 10 Jun, 2024 9 commits
  5. 09 Jun, 2024 1 commit
  6. 07 Jun, 2024 6 commits
  7. 06 Jun, 2024 5 commits
    • Merge tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d30d0e49
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from BPF and a big collection of fixes for WiFi core
        and drivers.
      
        Current release - regressions:
      
         - vxlan: fix regression when dropping packets due to invalid src
           addresses
      
         - bpf: fix a potential use-after-free in bpf_link_free()
      
         - xdp: revert support for redirect to any xsk socket bound to the
           same UMEM as it can result in a corruption
      
         - virtio_net:
            - add missing lock protection when reading return code from
              control_buf
            - fix false-positive lockdep splat in DIM
            - Revert "wifi: wilc1000: convert list management to RCU"
      
         - wifi: ath11k: fix error path in ath11k_pcic_ext_irq_config
      
        Previous releases - regressions:
      
         - rtnetlink: make the "split" NLM_DONE handling generic, restore the
           old behavior for two cases where we started coalescing those
           messages with normal messages, breaking sloppily-coded userspace
      
         - wifi:
            - cfg80211: validate HE operation element parsing
            - cfg80211: fix 6 GHz scan request building
            - mt76: mt7615: add missing chanctx ops
            - ath11k: move power type check to ASSOC stage, fix connecting to
              6 GHz AP
            - ath11k: fix WCN6750 firmware crash caused by 17 num_vdevs
            - rtlwifi: ignore IEEE80211_CONF_CHANGE_RETRY_LIMITS
            - iwlwifi: mvm: fix a crash on 7265
      
        Previous releases - always broken:
      
         - ncsi: prevent multi-threaded channel probing, a spec violation
      
         - vmxnet3: disable rx data ring on dma allocation failure
      
         - ethtool: init tsinfo stats if requested, prevent unintentionally
           reporting all-zero stats on devices which don't implement any
      
         - dst_cache: fix possible races in less common IPv6 features
      
         - tcp: auth: don't consider TCP_CLOSE to be in TCP_AO_ESTABLISHED
      
         - ax25: fix two refcounting bugs
      
         - eth: ionic: fix kernel panic in XDP_TX action
      
        Misc:
      
         - tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTAB"
      
      * tag 'net-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
        selftests: net: lib: set 'i' as local
        selftests: net: lib: avoid error removing empty netns name
        selftests: net: lib: support errexit with busywait
        net: ethtool: fix the error condition in ethtool_get_phy_stats_ethtool()
        ipv6: fix possible race in __fib6_drop_pcpu_from()
        af_unix: Annotate data-race of sk->sk_shutdown in sk_diag_fill().
        af_unix: Use skb_queue_len_lockless() in sk_diag_show_rqlen().
        af_unix: Use skb_queue_empty_lockless() in unix_release_sock().
        af_unix: Use unix_recvq_full_lockless() in unix_stream_connect().
        af_unix: Annotate data-race of net->unx.sysctl_max_dgram_qlen.
        af_unix: Annotate data-races around sk->sk_sndbuf.
        af_unix: Annotate data-races around sk->sk_state in UNIX_DIAG.
        af_unix: Annotate data-race of sk->sk_state in unix_stream_read_skb().
        af_unix: Annotate data-races around sk->sk_state in sendmsg() and recvmsg().
        af_unix: Annotate data-race of sk->sk_state in unix_accept().
        af_unix: Annotate data-race of sk->sk_state in unix_stream_connect().
        af_unix: Annotate data-races around sk->sk_state in unix_write_space() and poll().
        af_unix: Annotate data-race of sk->sk_state in unix_inq_len().
        af_unix: Annodate data-races around sk->sk_state for writers.
        af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.
        ...
    • Merge tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo · 2faf6332
      Linus Torvalds authored
      Pull tomoyo fixlet from Tetsuo Handa:
       "Single patch to update project links, no behavior changes"
      
      * tag 'tomoyo-pr-20240606' of git://git.code.sf.net/p/tomoyo/tomoyo:
        tomoyo: update project links
    • Merge tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · a34adf60
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - Ensure that .discard sections are really discarded in the EFI zboot
         image build
      
       - Return proper error numbers from efi-pstore
      
       - Add __nocfi annotations to EFI runtime wrappers
      
      * tag 'efi-fixes-for-v6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: Add missing __nocfi annotations to runtime wrappers
        efi: pstore: Return proper errors on UEFI failures
        efi/libstub: zboot.lds: Discard .discard sections
    • Merge branch 'selftests-net-lib-small-fixes' · 27bc8654
      Jakub Kicinski authored
      Matthieu Baerts says:
      
      ====================
      selftests: net: lib: small fixes
      
      While looking at using 'lib.sh' for the MPTCP selftests [1], we found
      some small issues with 'lib.sh'. Here they are:
      
      - Patch 1: fix 'errexit' (set -e) support with busywait. 'errexit' is
        supported in some functions, not all. A fix for v6.8+.
      
      - Patch 2: avoid confusing error messages linked to the cleaning part
        when the netns setup fails. A fix for v6.8+.
      
      - Patch 3: set a variable as local to avoid accidentally changing the
        value of another one with the same name on the caller side. A fix
        for v6.10-rc1+.
      
      Link: https://lore.kernel.org/mptcp/5f4615c3-0621-43c5-ad25-55747a4350ce@kernel.org/T/ [1]
      ====================
      
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-0-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: net: lib: set 'i' as local · 84a8bc3e
      Matthieu Baerts (NGI0) authored
      Without this, the 'i' variable declared before could be overridden by
      accident, e.g.
      
        for i in "${@}"; do
            __ksft_status_merge "${i}"  ## 'i' has been modified
            foo "${i}"                  ## using 'i' with an unexpected value
        done
      
      After a quick look, it looks like 'i' is currently not used after having
      been modified in __ksft_status_merge(), but still, better be safe than
      sorry. I saw this while modifying the same file, not because I suspected
      an issue somewhere.
      
      Fixes: 596c8819 ("selftests: forwarding: Have RET track kselftest framework constants")
      Acked-by: Geliang Tang <geliang@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20240605-upstream-net-20240605-selftests-net-lib-fixes-v1-3-b3afadd368c9@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>