1. 28 Sep, 2022 1 commit
    • xfrm: Reinject transport-mode packets through workqueue · 4f492066
      Liu Jian authored
      The following warning is displayed when running the tcp6-multi-diffip11
      stress test case of the LTP test suite:
      
      watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [ns-tcpserver:48198]
      CPU: 0 PID: 48198 Comm: ns-tcpserver Kdump: loaded Not tainted 6.0.0-rc6+ #39
      Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : des3_ede_encrypt+0x27c/0x460 [libdes]
      lr : 0x3f
      sp : ffff80000ceaa1b0
      x29: ffff80000ceaa1b0 x28: ffff0000df056100 x27: ffff0000e51e5280
      x26: ffff80004df75030 x25: ffff0000e51e4600 x24: 000000000000003b
      x23: 0000000000802080 x22: 000000000000003d x21: 0000000000000038
      x20: 0000000080000020 x19: 000000000000000a x18: 0000000000000033
      x17: ffff0000e51e4780 x16: ffff80004e2d1448 x15: ffff80004e2d1248
      x14: ffff0000e51e4680 x13: ffff80004e2d1348 x12: ffff80004e2d1548
      x11: ffff80004e2d1848 x10: ffff80004e2d1648 x9 : ffff80004e2d1748
      x8 : ffff80004e2d1948 x7 : 000000000bcaf83d x6 : 000000000000001b
      x5 : ffff80004e2d1048 x4 : 00000000761bf3bf x3 : 000000007f1dd0a3
      x2 : ffff0000e51e4780 x1 : ffff0000e3b9a2f8 x0 : 00000000db44e872
      Call trace:
       des3_ede_encrypt+0x27c/0x460 [libdes]
       crypto_des3_ede_encrypt+0x1c/0x30 [des_generic]
       crypto_cbc_encrypt+0x148/0x190
       crypto_skcipher_encrypt+0x2c/0x40
       crypto_authenc_encrypt+0xc8/0xfc [authenc]
       crypto_aead_encrypt+0x2c/0x40
       echainiv_encrypt+0x144/0x1a0 [echainiv]
       crypto_aead_encrypt+0x2c/0x40
       esp6_output_tail+0x1c8/0x5d0 [esp6]
       esp6_output+0x120/0x278 [esp6]
       xfrm_output_one+0x458/0x4ec
       xfrm_output_resume+0x6c/0x1f0
       xfrm_output+0xac/0x4ac
       __xfrm6_output+0x130/0x270
       xfrm6_output+0x60/0xec
       ip6_xmit+0x2ec/0x5bc
       inet6_csk_xmit+0xbc/0x10c
       __tcp_transmit_skb+0x460/0x8c0
       tcp_write_xmit+0x348/0x890
       __tcp_push_pending_frames+0x44/0x110
       tcp_rcv_established+0x3c8/0x720
       tcp_v6_do_rcv+0xdc/0x4a0
       tcp_v6_rcv+0xc24/0xcb0
       ip6_protocol_deliver_rcu+0xf0/0x574
       ip6_input_finish+0x48/0x7c
       ip6_input+0x48/0xc0
       ip6_rcv_finish+0x80/0x9c
       xfrm_trans_reinject+0xb0/0xf4
       tasklet_action_common.constprop.0+0xf8/0x134
       tasklet_action+0x30/0x3c
       __do_softirq+0x128/0x368
       do_softirq+0xb4/0xc0
       __local_bh_enable_ip+0xb0/0xb4
       put_cpu_fpsimd_context+0x40/0x70
       kernel_neon_end+0x20/0x40
       sha1_base_do_update.constprop.0.isra.0+0x11c/0x140 [sha1_ce]
       sha1_ce_finup+0x94/0x110 [sha1_ce]
       crypto_shash_finup+0x34/0xc0
       hmac_finup+0x48/0xe0
       crypto_shash_finup+0x34/0xc0
       shash_digest_unaligned+0x74/0x90
       crypto_shash_digest+0x4c/0x9c
       shash_ahash_digest+0xc8/0xf0
       shash_async_digest+0x28/0x34
       crypto_ahash_digest+0x48/0xcc
       crypto_authenc_genicv+0x88/0xcc [authenc]
       crypto_authenc_encrypt+0xd8/0xfc [authenc]
       crypto_aead_encrypt+0x2c/0x40
       echainiv_encrypt+0x144/0x1a0 [echainiv]
       crypto_aead_encrypt+0x2c/0x40
       esp6_output_tail+0x1c8/0x5d0 [esp6]
       esp6_output+0x120/0x278 [esp6]
       xfrm_output_one+0x458/0x4ec
       xfrm_output_resume+0x6c/0x1f0
       xfrm_output+0xac/0x4ac
       __xfrm6_output+0x130/0x270
       xfrm6_output+0x60/0xec
       ip6_xmit+0x2ec/0x5bc
       inet6_csk_xmit+0xbc/0x10c
       __tcp_transmit_skb+0x460/0x8c0
       tcp_write_xmit+0x348/0x890
       __tcp_push_pending_frames+0x44/0x110
       tcp_push+0xb4/0x14c
       tcp_sendmsg_locked+0x71c/0xb64
       tcp_sendmsg+0x40/0x6c
       inet6_sendmsg+0x4c/0x80
       sock_sendmsg+0x5c/0x6c
       __sys_sendto+0x128/0x15c
       __arm64_sys_sendto+0x30/0x40
       invoke_syscall+0x50/0x120
       el0_svc_common.constprop.0+0x170/0x194
       do_el0_svc+0x38/0x4c
       el0_svc+0x28/0xe0
       el0t_64_sync_handler+0xbc/0x13c
       el0t_64_sync+0x180/0x184
      
      Softirq time collected with the bcc softirqs tool:
      ./softirqs -NT 10
      Tracing soft irq event time... Hit Ctrl-C to end.
      
      15:34:34
      SOFTIRQ          TOTAL_nsecs
      block                 158990
      timer               20030920
      sched               46577080
      net_rx             676746820
      tasklet           9906067650
      
      15:34:45
      SOFTIRQ          TOTAL_nsecs
      block                  86100
      sched               38849790
      net_rx             676532470
      timer             1163848790
      tasklet           9409019620
      
      15:34:55
      SOFTIRQ          TOTAL_nsecs
      sched               58078450
      net_rx             475156720
      timer              533832410
      tasklet           9431333300
      
      The tasklet softirq consumes too much CPU time. Therefore, the
      xfrm_trans_reinject executor is changed from a tasklet to a workqueue,
      and a spin lock is added to protect the queue. This shortens the
      processing path of the tcp_sendmsg function in this scenario.
      
      Fixes: acf568ee ("xfrm: Reinject transport-mode packets through tasklet")
      Signed-off-by: Liu Jian <liujian56@huawei.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
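The queue-plus-lock arrangement described in the commit above can be sketched in userspace. This is a minimal model, not the kernel patch: `struct pkt`, `struct reinject_queue`, `reinject_enqueue()` and `reinject_work()` are hypothetical names, a pthread mutex stands in for the spin lock, and the detach-then-process pattern is an assumption about how such a worker is commonly written.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued packet; not the kernel's sk_buff. */
struct pkt {
    struct pkt *next;
    int id;
};

/* Hypothetical reinject state: a FIFO plus the lock that protects it. */
struct reinject_queue {
    pthread_mutex_t lock;   /* plays the role of the added spin lock */
    struct pkt *head;
    struct pkt *tail;
};

/* Producer side: append one packet while holding the lock. */
void reinject_enqueue(struct reinject_queue *q, struct pkt *p)
{
    p->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    pthread_mutex_unlock(&q->lock);
}

/*
 * Worker side: detach the whole list under the lock, then process it with
 * the lock released, so producers never wait on packet processing.
 * Returns the number of packets handed to finish().
 */
int reinject_work(struct reinject_queue *q, void (*finish)(struct pkt *))
{
    struct pkt *list;
    int n = 0;

    pthread_mutex_lock(&q->lock);
    list = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    while (list) {
        struct pkt *next = list->next;

        finish(list);
        list = next;
        n++;
    }
    return n;
}

static int last_seen;
static void record(struct pkt *p) { last_seen = p->id; }

/* Self-check: three packets are drained in FIFO order by one work run. */
void reinject_demo(void)
{
    struct reinject_queue q = { .lock = PTHREAD_MUTEX_INITIALIZER };
    struct pkt a = { NULL, 1 }, b = { NULL, 2 }, c = { NULL, 3 };

    reinject_enqueue(&q, &a);
    reinject_enqueue(&q, &b);
    reinject_enqueue(&q, &c);
    assert(reinject_work(&q, record) == 3);
    assert(last_seen == 3);
    assert(reinject_work(&q, record) == 0);
}
```

Because the worker runs later in its own context instead of nesting inside the sender's softirq handling, the sender's call chain stays short, which is the effect the commit message attributes to the change.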
  2. 01 Sep, 2022 1 commit
    • xfrm: Update ipcomp_scratches with NULL when freed · 8a04d2fc
      Khalid Masum authored
      Currently, if ipcomp_alloc_scratches() fails to allocate memory,
      ipcomp_scratches holds an obsolete address. So when we try to free the
      percpu scratches using ipcomp_free_scratches(), it tries to vfree a
      nonexistent vm area. Described below:
      
      static void * __percpu *ipcomp_alloc_scratches(void)
      {
              ...
              scratches = alloc_percpu(void *);
              if (!scratches)
                      return NULL;
              /* ipcomp_scratches does not know about this allocation failure,
               * so it still holds the old, obsolete address. */
              ...
      }
      
      So when we free,
      
      static void ipcomp_free_scratches(void)
      {
              ...
              scratches = ipcomp_scratches;
              /* Assigns the obsolete address from ipcomp_scratches. */
      
              if (!scratches)
                      return;
      
              for_each_possible_cpu(i)
                     vfree(*per_cpu_ptr(scratches, i));
              /* Tries to free a nonexistent page, causing the warning:
               * Trying to vfree() nonexistent vm area. */
              ...
      }
      
      Fix this breakage by updating ipcomp_scratches with NULL when scratches
      is freed.
      
      Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
      Reported-by: syzbot+5ec9bb042ddfe9644773@syzkaller.appspotmail.com
      Tested-by: syzbot+5ec9bb042ddfe9644773@syzkaller.appspotmail.com
      Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
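The idea of clearing the global pointer when the scratches are freed can be illustrated with a small userspace model. It is only a sketch under stated assumptions: `scratches_global`, `alloc_scratches()`, `free_scratches()` and `NSLOTS` are hypothetical stand-ins, and `malloc()`/`free()` replace the kernel's percpu and vmalloc allocators.

```c
#include <assert.h>
#include <stdlib.h>

#define NSLOTS 4                 /* stand-in for the number of possible CPUs */

static void **scratches_global;  /* models the global ipcomp_scratches */
static int free_calls;           /* counts real release passes, for the check */

/* Release the scratch buffers and forget the stale address. */
void free_scratches(void)
{
    void **scratches = scratches_global;

    if (!scratches)
        return;

    /* The point of the fix: reset the global so it cannot be reused. */
    scratches_global = NULL;

    for (int i = 0; i < NSLOTS; i++)
        free(scratches[i]);
    free(scratches);
    free_calls++;
}

/* Allocate the table and every slot; on failure, leave no stale pointer. */
void **alloc_scratches(size_t slot_size)
{
    void **scratches = calloc(NSLOTS, sizeof(*scratches));

    if (!scratches)
        return NULL;
    scratches_global = scratches;

    for (int i = 0; i < NSLOTS; i++) {
        scratches[i] = malloc(slot_size);
        if (!scratches[i]) {
            free_scratches();
            return NULL;
        }
    }
    return scratches;
}

/* Self-check: a second free after the first one is a harmless no-op. */
void scratches_demo(void)
{
    assert(alloc_scratches(64) != NULL);
    free_scratches();
    assert(scratches_global == NULL);
    free_scratches();
    assert(free_calls == 1);
}
```

Without the reset in `free_scratches()`, the second call would walk and release memory that is already gone, which corresponds to the vfree warning quoted in the message.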
  3. 29 Aug, 2022 1 commit
  4. 27 Aug, 2022 6 commits
  5. 26 Aug, 2022 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 2e085ec0
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 11 non-merge commits during the last 14 day(s) which contain
      a total of 13 files changed, 61 insertions(+), 24 deletions(-).
      
      The main changes are:
      
      1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi.
      
      2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger.
      
      3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu.
      
      4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui.
      
      5) Fix range tracking for array poke descriptors, from Daniel Borkmann.
      
      6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson.
      
      7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian.
      
      8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima.
      
      9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'wireless-2022-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 4ba9d38b
      David S. Miller authored
      Johannes Berg says:
      
      ====================
      pull-request: wireless-2022-08-26
      
      Here are a couple of fixes for the current cycle,
      see the tag description below.
      
      Just a couple of fixes:
       * two potential leaks
       * use-after-free in certain scan races
       * warning in IBSS code
       * error return from a debugfs file was wrong
       * possible NULL-ptr-deref when station lookup fails
      
      Please pull and let me know if there's any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 25 Aug, 2022 29 commits
    • Bluetooth: hci_sync: hold hdev->lock when cleanup hci_conn · 2da8eb83
      Zhengping Jiang authored
      When disconnecting all devices, hci_conn_failed is used to clean up the
      hci_conn object when the hci_conn object cannot be aborted.
      The function hci_conn_failed requires the caller to hold hdev->lock.
      
      Fixes: 9b3628d7 ("Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted")
      Signed-off-by: Zhengping Jiang <jiangzp@google.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: move from strlcpy with unused retval to strscpy · cb0d160f
      Wolfram Sang authored
      Follow the advice of the below link and prefer 'strscpy' in this
      subsystem. Conversion is 1:1 because the return value is not used.
      Generated by a coccinelle script.
      
      Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
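Why the conversion is only 1:1 when the return value is unused can be seen from a userspace model of the two helpers. The functions below, `model_strlcpy()` and `model_strscpy()`, are hypothetical reimplementations written for illustration and are not the kernel's code; they only mirror the documented return conventions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of strlcpy(): returns strlen(src), even when the copy truncates. */
size_t model_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);   /* always reads the whole source string */

    if (size) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Model of strscpy(): returns bytes copied, or -E2BIG on truncation. */
long model_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;
    for (i = 0; i < size - 1 && src[i]; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] ? -E2BIG : (long)i;
}

/* Self-check: same buffer contents, different return values. */
void strcpy_demo(void)
{
    char a[4], b[4];

    assert(model_strlcpy(a, "bluetooth", sizeof(a)) == 9);
    assert(model_strscpy(b, "bluetooth", sizeof(b)) == -E2BIG);
    assert(strcmp(a, "blu") == 0);
    assert(strcmp(a, b) == 0);
}
```

Both calls leave the same NUL-terminated prefix in the destination, so callers that ignore the result see no difference; callers that inspect it would need to be adapted.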
    • Bluetooth: hci_event: Fix checking conn for le_conn_complete_evt · f48735a9
      Archie Pusaka authored
      To prevent multiple conn complete events, we shouldn't look up the
      conn with hci_lookup_le_connect, since it requires the state to be
      BT_CONNECT. By the time the duplicate event is processed, the state
      might have changed, so we end up processing the new event anyway.
      
      Change the lookup function to hci_conn_hash_lookup_ba.
      
      Fixes: d5ebaa7c ("Bluetooth: hci_event: Ignore multiple conn complete events")
      Signed-off-by: Archie Pusaka <apusaka@chromium.org>
      Reviewed-by: Sonny Sasaka <sonnysasaka@chromium.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: ISO: Fix not handling shutdown condition · c5729093
      Luiz Augusto von Dentz authored
      In order to properly handle shutdown syscall the code shall not assume
      that the how argument is always SHUT_RDWR resulting in SHUTDOWN_MASK as
      that would result in poll to immediately report EPOLLHUP instead of
      properly waiting for disconnect_cfm (Disconnect Complete) which is
      rather important for the likes of BAP as the CIG may need to be
      reprogrammed.
      
      Fixes: ccf74f23 ("Bluetooth: Add BTPROTO_ISO socket type")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_sync: fix double mgmt_pending_free() in remove_adv_monitor() · 3cfbc6ac
      Tetsuo Handa authored
      syzbot is reporting double kfree() at remove_adv_monitor() [1], for
      commit 7cf5c297 ("Bluetooth: hci_sync: Refactor remove Adv
      Monitor") forgot to remove duplicated mgmt_pending_remove() when
      merging "if (err) {" path and "if (!pending) {" path.
      
      Link: https://syzkaller.appspot.com/bug?extid=915a8416bf15895b8e07 [1]
      Reported-by: syzbot <syzbot+915a8416bf15895b8e07@syzkaller.appspotmail.com>
      Fixes: 7cf5c297 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: MGMT: Fix Get Device Flags · 23b72814
      Luiz Augusto von Dentz authored
      Get Device Flags doesn't check whether the device actually uses an RPA, in
      which case it shall only set HCI_CONN_FLAG_REMOTE_WAKEUP if LL Privacy is
      enabled.
      
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: L2CAP: Fix build errors in some archs · b840304f
      Luiz Augusto von Dentz authored
      This attempts to fix the following errors:
      
      In function 'memcmp',
          inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9,
          inlined from 'l2cap_global_chan_by_psm' at
          net/bluetooth/l2cap_core.c:2003:15:
      ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp'
      specified bound 6 exceeds source size 0 [-Werror=stringop-overread]
         44 | #define __underlying_memcmp     __builtin_memcmp
            |                                 ^
      ./include/linux/fortify-string.h:420:16: note: in expansion of macro
      '__underlying_memcmp'
        420 |         return __underlying_memcmp(p, q, size);
            |                ^~~~~~~~~~~~~~~~~~~
      In function 'memcmp',
          inlined from 'bacmp' at ./include/net/bluetooth/bluetooth.h:347:9,
          inlined from 'l2cap_global_chan_by_psm' at
          net/bluetooth/l2cap_core.c:2004:15:
      ./include/linux/fortify-string.h:44:33: error: '__builtin_memcmp'
      specified bound 6 exceeds source size 0 [-Werror=stringop-overread]
         44 | #define __underlying_memcmp     __builtin_memcmp
            |                                 ^
      ./include/linux/fortify-string.h:420:16: note: in expansion of macro
      '__underlying_memcmp'
        420 |         return __underlying_memcmp(p, q, size);
            |                ^~~~~~~~~~~~~~~~~~~
      
      Fixes: 332f1795 ("Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_sync: Fix suspend performance regression · 1fd02d56
      Luiz Augusto von Dentz authored
      This attempts to fix suspend performance when there are no connections by
      not updating the event mask.
      
      Fixes: ef61b6ea ("Bluetooth: Always set event mask on suspend")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: hci_event: Fix vendor (unknown) opcode status handling · b82a26d8
      Hans de Goede authored
      Commit c8992cff ("Bluetooth: hci_event: Use of a function table to
      handle Command Complete") was (presumably) meant to only refactor things
      without any functional changes.
      
      But it does have one undesirable side-effect: previously, *status would
      always be set to skb->data[0], and it might then be overridden by some of
      the opcode-specific handling, whereas now it is set only by the
      opcode-specific handlers. This means that if the opcode is not known,
      *status does not get set at all any more!
      
      This behavior change has broken Bluetooth support for BCM4343A0 HCIs.
      The hci_bcm.c code tries to configure UART-attached HCIs at a higher
      baudrate using vendor-specific opcodes. The BCM4343A0 does not
      support this, and this used to simply fail:
      
      [   25.646442] Bluetooth: hci0: BCM: failed to write clock (-56)
      [   25.646481] Bluetooth: hci0: Failed to set baudrate
      
      After which things would continue with the initial baudrate. But now
      that hci_cmd_complete_evt() no longer sets status for unknown opcodes,
      *status is left at 0. This causes the hci_bcm.c code to think the baudrate
      has been changed on the HCI side and to also adjust the UART baudrate,
      after which communication with the HCI is broken, leading to:
      
      [   28.579042] Bluetooth: hci0: command 0x0c03 tx timeout
      [   36.961601] Bluetooth: hci0: BCM: Reset failed (-110)
      
      The result is non-working Bluetooth. Fix this by restoring the previous
      default "*status = skb->data[0]" handling for unknown opcodes.
      
      Fixes: c8992cff ("Bluetooth: hci_event: Use of a function table to handle Command Complete")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
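The default-status behavior restored by the commit above can be modelled with a small dispatch table. This is a simplified userspace sketch, not the hci_event.c code: `cc_table`, `handle_reset()` and `cmd_complete()` are hypothetical names, and the vendor opcode used in the self-check is chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*cc_handler)(const uint8_t *data, uint8_t *status);

struct cc_entry {
    uint16_t opcode;
    cc_handler func;
};

/* Hypothetical handler for one known opcode; it sets *status itself. */
static void handle_reset(const uint8_t *data, uint8_t *status)
{
    *status = data[0];
}

static const struct cc_entry cc_table[] = {
    { 0x0c03, handle_reset },   /* the opcode that appears in the timeout log */
};

/* Dispatch a Command Complete payload; unknown opcodes get a default status. */
void cmd_complete(uint16_t opcode, const uint8_t *data, uint8_t *status)
{
    size_t i;

    for (i = 0; i < sizeof(cc_table) / sizeof(cc_table[0]); i++) {
        if (cc_table[i].opcode == opcode) {
            cc_table[i].func(data, status);
            return;
        }
    }
    /* Unknown (for example vendor) opcode: use the first payload byte. */
    *status = data[0];
}

/* Self-check: a failing vendor command must not look like success. */
void cmd_complete_demo(void)
{
    const uint8_t failed[] = { 0x01 };  /* non-zero status from the controller */
    uint8_t status = 0;

    cmd_complete(0xfc45, failed, &status);  /* illustrative vendor-range opcode */
    assert(status == 0x01);
}
```

If the fallback assignment were missing, `status` would stay at its initial 0 and the caller would treat the rejected vendor command as successful, which is the failure mode the message describes.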
    • bpf: Don't use tnum_range on array range checking for poke descriptors · a657182a
      Daniel Borkmann authored
      Hsin-Wei reported a KASAN splat triggered by their BPF runtime fuzzer which
      is based on a customized syzkaller:
      
        BUG: KASAN: slab-out-of-bounds in bpf_int_jit_compile+0x1257/0x13f0
        Read of size 8 at addr ffff888004e90b58 by task syz-executor.0/1489
        CPU: 1 PID: 1489 Comm: syz-executor.0 Not tainted 5.19.0 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
        1.13.0-1ubuntu1.1 04/01/2014
        Call Trace:
         <TASK>
         dump_stack_lvl+0x9c/0xc9
         print_address_description.constprop.0+0x1f/0x1f0
         ? bpf_int_jit_compile+0x1257/0x13f0
         kasan_report.cold+0xeb/0x197
         ? kvmalloc_node+0x170/0x200
         ? bpf_int_jit_compile+0x1257/0x13f0
         bpf_int_jit_compile+0x1257/0x13f0
         ? arch_prepare_bpf_dispatcher+0xd0/0xd0
         ? rcu_read_lock_sched_held+0x43/0x70
         bpf_prog_select_runtime+0x3e8/0x640
         ? bpf_obj_name_cpy+0x149/0x1b0
         bpf_prog_load+0x102f/0x2220
         ? __bpf_prog_put.constprop.0+0x220/0x220
         ? find_held_lock+0x2c/0x110
         ? __might_fault+0xd6/0x180
         ? lock_downgrade+0x6e0/0x6e0
         ? lock_is_held_type+0xa6/0x120
         ? __might_fault+0x147/0x180
         __sys_bpf+0x137b/0x6070
         ? bpf_perf_link_attach+0x530/0x530
         ? new_sync_read+0x600/0x600
         ? __fget_files+0x255/0x450
         ? lock_downgrade+0x6e0/0x6e0
         ? fput+0x30/0x1a0
         ? ksys_write+0x1a8/0x260
         __x64_sys_bpf+0x7a/0xc0
         ? syscall_enter_from_user_mode+0x21/0x70
         do_syscall_64+0x3b/0x90
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
        RIP: 0033:0x7f917c4e2c2d
      
      The problem here is that a range of tnum_range(0, map->max_entries - 1) has
      limited ability to represent the concrete tight range with the tnum as the
      set of resulting states from value + mask can result in a superset of the
      actual intended range, and as such a tnum_in(range, reg->var_off) check may
      yield true when it shouldn't, for example tnum_range(0, 2) would result in
      00XX -> v = 0000, m = 0011 such that the intended set of {0, 1, 2} is here
      represented by a less precise superset of {0, 1, 2, 3}. As the register is
      known const scalar, really just use the concrete reg->var_off.value for the
      upper index check.
      
      Fixes: d2e4c1e6 ("bpf: Constant map key tracking for prog array pokes")
      Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Shung-Hsi Yu <shung-hsi.yu@suse.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/r/984b37f9fdf7ac36831d2137415a4a915744c1b6.1661462653.git.daniel@iogearbox.net
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
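The tnum_range(0, 2) example from the message can be reproduced with a userspace model of tristate numbers. The helpers below are written for illustration and only follow the value/mask semantics described in the text; `fls64_model()`, `index_ok_tnum()` and `index_ok_concrete()` are hypothetical names, and the code is a sketch rather than the verifier's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tristate number: bits set in mask are unknown, the rest come from value. */
struct tnum {
    uint64_t value;
    uint64_t mask;
};

/* Position of the highest set bit, 1-based; 0 when x == 0. */
static int fls64_model(uint64_t x)
{
    int bits = 0;

    while (x) {
        bits++;
        x >>= 1;
    }
    return bits;
}

struct tnum tnum_const(uint64_t v)
{
    return (struct tnum){ v, 0 };
}

/* A tnum covering [min, max]; it may also cover values above max. */
struct tnum tnum_range(uint64_t min, uint64_t max)
{
    int bits = fls64_model(min ^ max);
    uint64_t delta;

    if (bits > 63)
        return (struct tnum){ 0, UINT64_MAX };
    delta = (1ULL << bits) - 1;
    return (struct tnum){ min & ~delta, delta };
}

/* True when every value representable by b is also representable by a. */
bool tnum_in(struct tnum a, struct tnum b)
{
    if (b.mask & ~a.mask)
        return false;
    b.value &= ~a.mask;
    return a.value == b.value;
}

/* Bounds check through the tnum range: imprecise. */
bool index_ok_tnum(uint64_t idx, uint32_t max_entries)
{
    return tnum_in(tnum_range(0, max_entries - 1), tnum_const(idx));
}

/* Bounds check on the concrete constant: exact. */
bool index_ok_concrete(uint64_t idx, uint32_t max_entries)
{
    return idx < max_entries;
}

/* Self-check: with three entries, index 3 slips through the tnum check. */
void tnum_demo(void)
{
    struct tnum r = tnum_range(0, 2);

    assert(r.value == 0 && r.mask == 3);
    assert(index_ok_tnum(3, 3));
    assert(!index_ok_concrete(3, 3));
}
```

Since the register is a known constant, comparing its concrete value against the number of entries avoids the rounding up to a power-of-two boundary that the value/mask encoding introduces.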
    • Merge tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4c612826
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from ipsec and netfilter (with one broken Fixes tag).
      
        Current release - new code bugs:
      
         - dsa: don't dereference NULL extack in dsa_slave_changeupper()
      
         - dpaa: fix <1G ethernet on LS1046ARDB
      
         - neigh: don't call kfree_skb() under spin_lock_irqsave()
      
        Previous releases - regressions:
      
         - r8152: fix the RX FIFO settings when suspending
      
         - dsa: microchip: keep compatibility with device tree blobs with no
           phy-mode
      
         - Revert "net: macsec: update SCI upon MAC address change."
      
         - Revert "xfrm: update SA curlft.use_time", comply with RFC 2367
      
        Previous releases - always broken:
      
         - netfilter: conntrack: work around exceeded TCP receive window
      
         - ipsec: fix a null pointer dereference of dst->dev on a metadata dst
           in xfrm_lookup_with_ifid
      
         - moxa: get rid of asymmetry in DMA mapping/unmapping
      
         - dsa: microchip: make learning configurable and keep it off while
           standalone
      
         - ice: xsk: prohibit usage of non-balanced queue id
      
         - rxrpc: fix locking in rxrpc's sendmsg
      
        Misc:
      
         - another chunk of sysctl data race silencing"
      
      * tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
        net: lantiq_xrx200: restore buffer if memory allocation failed
        net: lantiq_xrx200: fix lock under memory pressure
        net: lantiq_xrx200: confirm skb is allocated before using
        net: stmmac: work around sporadic tx issue on link-up
        ionic: VF initial random MAC address if no assigned mac
        ionic: fix up issues with handling EAGAIN on FW cmds
        ionic: clear broken state on generation change
        rxrpc: Fix locking in rxrpc's sendmsg
        net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
        MAINTAINERS: rectify file entry in BONDING DRIVER
        i40e: Fix incorrect address type for IPv6 flow rules
        ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
        net: Fix a data-race around sysctl_somaxconn.
        net: Fix a data-race around netdev_unregister_timeout_secs.
        net: Fix a data-race around gro_normal_batch.
        net: Fix data-races around sysctl_devconf_inherit_init_net.
        net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
        net: Fix a data-race around netdev_budget_usecs.
        net: Fix data-races around sysctl_max_skb_frags.
        net: Fix a data-race around netdev_budget.
        ...
    • Merge branch 'net-lantiq_xrx200-fix-errors-under-memory-pressure' · d974730c
      Jakub Kicinski authored
      Aleksander Jan Bajkowski says:
      
      ====================
      net: lantiq_xrx200: fix errors under memory pressure
      
      This series fixes issues that can occur in the driver under memory pressure.
      Situations in which the system cannot allocate memory are rare, which is
      why these bugs were fixed only recently. The patches have been tested on a
      BT Home router with the Lantiq xRX200 chipset.
      
      Changelog:
        v3: - removed netdev_err() log from the first patch
        v2:
         - the second patch has been changed, so that under memory pressure situation
           the driver will not receive packets indefinitely regardless of the NAPI budget,
         - the third patch has been added.
      ====================
      
      Link: https://lore.kernel.org/r/20220824215408.4695-1-olek2@wp.pl
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lantiq_xrx200: restore buffer if memory allocation failed · c9c3b177
      Aleksander Jan Bajkowski authored
      In a situation where memory allocation fails, an invalid buffer address
      is stored. When this descriptor is used again, the system panics in the
      build_skb() function when accessing memory.
      
      Fixes: 7ea6cd16 ("lantiq: net: fix duplicated skb in rx descriptor ring")
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lantiq_xrx200: fix lock under memory pressure · c4b6e934
      Aleksander Jan Bajkowski authored
      When the xrx200_hw_receive() function returns -ENOMEM, the NAPI poll
      function immediately returns an error.
      This is incorrect for two reasons:
      * the function terminates without enabling interrupts or scheduling NAPI,
      * the error code (-ENOMEM) is returned instead of the number of received
      packets.
      
      After the first memory allocation failure occurs, packet reception
      stalls because the DMA interrupts stay disabled.
      
      Fixes: fe1a5642 ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver")
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
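The two rules stated in the message above (return a packet count, and do not leave interrupts disabled after an early exit) can be sketched with a userspace model of a poll loop. Everything here is hypothetical: `struct chan`, `hw_receive()` and `poll_rx()` are illustrative stand-ins, and the `irq_enabled` flag models completing NAPI and unmasking the DMA interrupt; it is not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device state for the model; not the driver's structures. */
struct chan {
    int pending;        /* packets waiting in the ring */
    int fail_after;     /* receive fails with -ENOMEM once this many succeed */
    bool irq_enabled;
};

/* Stand-in for the per-packet receive step. */
static int hw_receive(struct chan *ch)
{
    if (ch->fail_after == 0)
        return -ENOMEM;
    ch->fail_after--;
    ch->pending--;
    return 0;
}

/*
 * Poll loop: always returns the number of packets received, and re-enables
 * interrupts whenever the budget was not used up, including after an error.
 */
int poll_rx(struct chan *ch, int budget)
{
    int rx = 0;

    while (rx < budget && ch->pending > 0) {
        if (hw_receive(ch) < 0)
            break;              /* stop early, but do not return the error */
        rx++;
    }

    if (rx < budget)
        ch->irq_enabled = true; /* models completing the poll and unmasking */

    return rx;
}

/* Self-check: an allocation failure yields a count and leaves IRQs on. */
void poll_demo(void)
{
    struct chan ch = { .pending = 5, .fail_after = 2, .irq_enabled = false };

    assert(poll_rx(&ch, 8) == 2);
    assert(ch.irq_enabled);
}
```

Bounding the loop by the budget also reflects the v2 changelog note that the driver should not keep receiving packets indefinitely under memory pressure.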
    • net: lantiq_xrx200: confirm skb is allocated before using · c8b04370
      Aleksander Jan Bajkowski authored
      xrx200_hw_receive() assumes build_skb() always works and goes straight
      to skb_reserve(). However, build_skb() can fail under memory pressure.
      
      Add a check in case build_skb() failed to allocate and returned NULL.
      
      Fixes: e0155935 ("net: lantiq_xrx200: convert to build_skb")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: stmmac: work around sporadic tx issue on link-up · a3a57bf0
      Heiner Kallweit authored
      This is a follow-up to the discussion in [0]. It seems to me that
      at least the IP version used on Amlogic SoC's sometimes has a problem
      if register MAC_CTRL_REG is written whilst the chip is still processing
      a previous write. But that's just a guess.
      Adding a delay between two writes to this register helps, but we can
      also simply omit the offending second write. This patch uses the second
      approach and is based on a suggestion from Qi Duan.
      A benefit of this approach is that it saves a few register writes, also
      on chip versions that are not affected.
      
      [0] https://www.spinics.net/lists/netdev/msg831526.html
      
      Fixes: bfab27a1 ("stmmac: add the experimental PCI support")
      Suggested-by: Qi Duan <qi.duan@amlogic.com>
      Suggested-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/e99857ce-bd90-5093-ca8c-8cd480b5a0a2@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · ef332fe1
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-08-24 (ixgbe, i40e)
      
      This series contains updates to ixgbe and i40e drivers.
      
      Jake stops incorrect resetting of SYSTIME registers when starting
      cyclecounter for ixgbe.
      
      Sylwester corrects a check on source IP address when validating destination
      for i40e.
      
      * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        i40e: Fix incorrect address type for IPv6 flow rules
        ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
      ====================
      
      Link: https://lore.kernel.org/r/20220824193748.874343-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ionic-bug-fixes' · 92df825a
      Jakub Kicinski authored
      Shannon Nelson says:
      
      ====================
      ionic: bug fixes
      
      These are a couple of maintenance bug fixes for the Pensando ionic
      networking driver.
      
      Mohamed takes care of a "plays well with others" issue where the
      VF spec is a bit vague on VF mac addresses, but certain customers
      have come to expect behavior based on other vendor drivers.
      
      Shannon addresses a couple of corner cases seen in internal
      stress testing.
      ====================
      
      Link: https://lore.kernel.org/r/20220824165051.6185-1-snelson@pensando.io
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ionic: VF initial random MAC address if no assigned mac · 19058be7
      R Mohamed Shah authored
      Assign a random mac address to the VF interface station
      address if it boots with a zero mac address in order to match
      similar behavior seen in other VF drivers.  Handle the errors
      where the older firmware does not allow the VF to set its own
      station address.
      
      Newer firmware will allow the VF to set the station mac address
      if it hasn't already been set administratively through the PF.
      Setting it will also be allowed if the VF has trust.
      
      Fixes: fbb39807 ("ionic: support sr-iov operations")
      Signed-off-by: R Mohamed Shah <mohamed@pensando.io>
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
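The behavior described above (pick a random station address when the VF boots with a zero MAC, and tolerate firmware that refuses the change) can be sketched as follows. This is a userspace illustration under assumptions: `vf_init_mac()`, `set_addr_fn` and the accept/reject callbacks are hypothetical, `rand()` is only a placeholder random source, and the bit handling follows the usual convention for a unicast, locally administered address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN 6

static bool is_zero_mac(const uint8_t *addr)
{
    static const uint8_t zero[ETH_ALEN];

    return memcmp(addr, zero, ETH_ALEN) == 0;
}

/* Fill addr with a random unicast, locally administered address. */
static void random_mac(uint8_t *addr)
{
    for (int i = 0; i < ETH_ALEN; i++)
        addr[i] = (uint8_t)(rand() & 0xff);  /* placeholder RNG only */
    addr[0] &= 0xfe;    /* clear the multicast bit */
    addr[0] |= 0x02;    /* set the locally administered bit */
}

/*
 * Hypothetical firmware call: returns 0 on success, or -1 when older
 * firmware does not allow the VF to set its own station address.
 */
typedef int (*set_addr_fn)(const uint8_t *addr);

/* Returns true if the interface ends up with a non-zero address. */
bool vf_init_mac(uint8_t *addr, set_addr_fn set_addr)
{
    uint8_t candidate[ETH_ALEN];

    if (!is_zero_mac(addr))
        return true;            /* already assigned, e.g. through the PF */

    random_mac(candidate);
    if (set_addr(candidate) != 0)
        return false;           /* tolerate the refusal; address stays zero */

    memcpy(addr, candidate, ETH_ALEN);
    return true;
}

static int fw_accept(const uint8_t *addr) { (void)addr; return 0; }
static int fw_reject(const uint8_t *addr) { (void)addr; return -1; }

/* Self-check: refusal leaves the zero address, acceptance installs one. */
void vf_mac_demo(void)
{
    uint8_t mac[ETH_ALEN] = { 0 };

    assert(!vf_init_mac(mac, fw_reject) && is_zero_mac(mac));
    assert(vf_init_mac(mac, fw_accept));
    assert((mac[0] & 0x01) == 0 && (mac[0] & 0x02) != 0);
}
```

Setting the locally administered bit guarantees the generated address is never all zeros, so a successful assignment cannot be mistaken for the unassigned state.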
    • ionic: fix up issues with handling EAGAIN on FW cmds · 0fc4dd45
      Shannon Nelson authored
      In looping on FW update tests we occasionally see the
      FW_ACTIVATE_STATUS command fail while it is in its EAGAIN loop
      waiting for the FW activate step to finish inside the FW.  The
      firmware is complaining that the done bit is set when a new
      dev_cmd is going to be processed.
      
      Doing a clean on the cmd registers and doorbell before exiting
      the wait-for-done and cleaning the done bit before the sleep
      prevents this from occurring.
      
      Fixes: fbfb8031 ("ionic: Add hardware init and device commands")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ionic: clear broken state on generation change · 9cb9dadb
      Shannon Nelson authored
      There is a case found in heavy testing where a link flap happens just
      before a firmware Recovery event and the driver gets stuck in the
      BROKEN state.  This comes from the driver getting interrupted by a FW
      generation change when coming back up from the link flap, and the call
      to ionic_start_queues() in ionic_link_status_check() fails.  This can be
      addressed by having the fw_up code clear the BROKEN bit if seen, rather
      than waiting for a user to manually force the interface down and then
      back up.
      
      Fixes: 9e8eaf84 ("ionic: stop watchdog when in broken state")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9cb9dadb
    • David Howells's avatar
      rxrpc: Fix locking in rxrpc's sendmsg · b0f571ec
      David Howells authored
      Fix three bugs in the rxrpc's sendmsg implementation:
      
       (1) rxrpc_new_client_call() should release the socket lock when returning
           an error from rxrpc_get_call_slot().
      
       (2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
           held in the event that we're interrupted by a signal whilst waiting
           for tx space on the socket or relocking the call mutex afterwards.
      
           Fix this by: (a) moving the unlock/lock of the call mutex up to
           rxrpc_send_data() such that the lock is not held around all of
           rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
           whether we're returning with the lock dropped.  Note that this means
           recvmsg() will not block on this call whilst we're waiting.
      
       (3) After dropping and regaining the call mutex, rxrpc_send_data() needs
           to go and recheck the state of the tx_pending buffer and the
           tx_total_len check in case we raced with another sendmsg() on the same
           call.
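
The shape of fix (2) can be sketched in userspace Python. This is a toy model, not the rxrpc code: `Interrupted`, `wait_for_tx_window`, `send_data` and `do_sendmsg` are hypothetical names, and a `threading.Lock` stands in for the call mutex. The point it illustrates is that the function which drops the mutex reports back whether it is still held, so the caller only unlocks what it owns. Releasing an unheld `threading.Lock` raises `RuntimeError`, which loosely corresponds to the unlock imbalance lockdep reports.

```python
import threading


class Interrupted(Exception):
    """Stands in for a signal arriving while waiting (hypothetical)."""


def wait_for_tx_window(interrupted: bool) -> None:
    # Called WITHOUT the call mutex held.
    if interrupted:
        raise Interrupted


def send_data(user_mutex: threading.Lock, interrupted: bool):
    """Return (error, dropped_lock); the caller holds user_mutex on entry."""
    user_mutex.release()  # drop the mutex around the wait
    try:
        wait_for_tx_window(interrupted)
    except Interrupted:
        return "EINTR", True  # tell the caller the mutex is no longer held
    user_mutex.acquire()  # relock; shared state must be rechecked after this
    return None, False


def do_sendmsg(user_mutex: threading.Lock, interrupted: bool):
    user_mutex.acquire()
    err, dropped_lock = send_data(user_mutex, interrupted)
    if not dropped_lock:
        user_mutex.release()  # only unlock if we still hold the mutex
    return err
```

Because the mutex is released for the duration of the wait, anything checked before the wait has to be checked again after relocking, which is what fix (3) addresses.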
      
      Thinking on this some more, it might make sense to have different locks for
      sendmsg() and recvmsg().  There's probably no need to make recvmsg() wait
      for sendmsg().  It does mean that recvmsg() can return MSG_EOR indicating
      that a call is dead before a sendmsg() to that call returns - but that can
      currently happen anyway.
      
      Without fix (2), something like the following can be induced:
      
      	WARNING: bad unlock balance detected!
      	5.16.0-rc6-syzkaller #0 Not tainted
      	-------------------------------------
      	syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
      	[<ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	but there are no more locks to release!
      
      	other info that might help us debug this:
      	no locks held by syz-executor011/3597.
      	...
      	Call Trace:
      	 <TASK>
      	 __dump_stack lib/dump_stack.c:88 [inline]
      	 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      	 print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
      	 __lock_release kernel/locking/lockdep.c:5306 [inline]
      	 lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
      	 __mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
      	 rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
      	 rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
      	 sock_sendmsg_nosec net/socket.c:704 [inline]
      	 sock_sendmsg+0xcf/0x120 net/socket.c:724
      	 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
      	 ___sys_sendmsg+0xf3/0x170 net/socket.c:2463
      	 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
      	 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      	 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
      	 entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      [Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]
      
      Fixes: bc5e3a54 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
      Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
      cc: Hawkins Jiawei <yin31149@gmail.com>
      cc: Khalid Masum <khalid.masum.92@gmail.com>
      cc: Dan Carpenter <dan.carpenter@oracle.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b0f571ec
    • Alexei Starovoitov's avatar
      Merge branch 'Fix incorrect pruning for ARG_CONST_ALLOC_SIZE_OR_ZERO' · cb15c734
      Alexei Starovoitov authored
      Kumar Kartikeya Dwivedi says:
      
      ====================
      
      A fix for a missing mark_chain_precision call that leads to eager pruning and
      loading of invalid programs when the more permissive case is in the straight
      line exploration. Please see the commit log for details, and selftest for an
      example.
      ====================
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      cb15c734
    • Kumar Kartikeya Dwivedi's avatar
      selftests/bpf: Add regression test for pruning fix · 1800b2ac
      Kumar Kartikeya Dwivedi authored
      Add a test to ensure we do mark_chain_precision for the argument type
      ARG_CONST_ALLOC_SIZE_OR_ZERO. For other argument types, this was already
      done, but propagation was missing for this case. Without the fix, this
      test case loads successfully.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20220823185500.467-1-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1800b2ac
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Do mark_chain_precision for ARG_CONST_ALLOC_SIZE_OR_ZERO · 2fc31465
      Kumar Kartikeya Dwivedi authored
      Precision markers need to be propagated whenever we have an ARG_CONST_*
      style argument, as the verifier cannot consider imprecise scalars to be
      equivalent for the purposes of states_equal check when such arguments
      refine the return value (in this case, set mem_size for PTR_TO_MEM). The
      resultant mem_size for the R0 is derived from the constant value, and if
      the verifier incorrectly prunes states considering them equivalent where
      such arguments exist (by seeing that both registers have reg->precise as
      false in regsafe), we can end up with invalid programs passing the
      verifier which can do access beyond what should have been the correct
      mem_size in that explored state.
      
      To show a concrete example of the problem:
      
      0000000000000000 <prog>:
             0:       r2 = *(u32 *)(r1 + 80)
             1:       r1 = *(u32 *)(r1 + 76)
             2:       r3 = r1
             3:       r3 += 4
             4:       if r3 > r2 goto +18 <LBB5_5>
             5:       w2 = 0
             6:       *(u32 *)(r1 + 0) = r2
             7:       r1 = *(u32 *)(r1 + 0)
             8:       r2 = 1
             9:       if w1 == 0 goto +1 <LBB5_3>
            10:       r2 = -1
      
      0000000000000058 <LBB5_3>:
            11:       r1 = 0 ll
            13:       r3 = 0
            14:       call bpf_ringbuf_reserve
            15:       if r0 == 0 goto +7 <LBB5_5>
            16:       r1 = r0
            17:       r1 += 16777215
            18:       w2 = 0
            19:       *(u8 *)(r1 + 0) = r2
            20:       r1 = r0
            21:       r2 = 0
            22:       call bpf_ringbuf_submit
      
      00000000000000b8 <LBB5_5>:
            23:       w0 = 0
            24:       exit
      
      In this case, straight-line exploration verifies the fall-through
      path first, with r2 = -1 (UINT_MAX), and then prunes the search at
      insn 14 for the second leg of the branch at insn 9. At runtime,
      however, w1 at insn 9 is always 0, so the branch is always taken with
      r2 = 1 and bpf_ringbuf_reserve does not fail for a size greater than
      UINT_MAX/4. During regsafe the verifier just sees reg->precise as
      false for r2 in both states, and hence considers them equal for the
      purposes of states_equal.
      
      If we propagated precise markers using the backtracking support, we
      would use the precise marking to then ensure that old r2 (UINT_MAX) was
      within the new r2 (1) and this would never be true, so the verification
      would rightfully fail.
      
      The end result is that the out of bounds access at instruction 19 would
      be permitted without this fix.
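
The pruning decision can be illustrated with a deliberately simplified Python model. It is a sketch under stated assumptions, not the verifier: `Scalar`, `range_within` and `regsafe` here are toy stand-ins that track only unsigned bounds and a precise flag, whereas the real regsafe() also compares other state such as signed bounds and tnums.

```python
from dataclasses import dataclass

UINT_MAX = 0xFFFFFFFF


@dataclass
class Scalar:
    """Toy scalar register state: unsigned bounds plus a precision mark."""
    umin: int
    umax: int
    precise: bool = False


def range_within(old: Scalar, cur: Scalar) -> bool:
    """True if the current range fits inside the already-verified old range."""
    return old.umin <= cur.umin and cur.umax <= old.umax


def regsafe(old: Scalar, cur: Scalar) -> bool:
    """Toy pruning check: imprecise scalars are treated as equivalent."""
    if not old.precise and not cur.precise:
        return True
    return range_within(old, cur)


# State saved at insn 14 on the path verified first: r2 = -1 (UINT_MAX).
old_r2 = Scalar(UINT_MAX, UINT_MAX)
# State arriving at insn 14 on the other leg of the branch: r2 = 1.
new_r2 = Scalar(1, 1)

pruned_without_fix = regsafe(old_r2, new_r2)
pruned_with_fix = regsafe(Scalar(UINT_MAX, UINT_MAX, precise=True), new_r2)
```

In this model the unmarked states compare as equivalent and the second path is pruned, while marking the old r2 as precise forces the range comparison, which fails because 1 does not lie within [UINT_MAX, UINT_MAX].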
      
      Note that reg->precise is always set to true when user does not have
      CAP_BPF (or when subprog count is greater than 1 (i.e. use of any static
      or global functions)), hence this is only a problem when precision marks
      need to be explicitly propagated (i.e. privileged users with CAP_BPF).
      
      A simplified test case has been included in the next patch to prevent
      future regressions.
      
      Fixes: 457f4436 ("bpf: Implement BPF ring buffer and verifier support for it")
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20220823185300.406-2-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      2fc31465
    • Linus Torvalds's avatar
      Merge tag 'cgroup-for-6.0-rc2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 3f5c2005
      Linus Torvalds authored
      Pull another cgroup fix from Tejun Heo:
       "Commit 4f7e7236 ("cgroup: Fix threadgroup_rwsem <->
        cpus_read_lock() deadlock") required the cgroup
        core to grab cpus_read_lock() before invoking ->attach().
      
        Unfortunately, it missed adding cpus_read_lock() in
        cgroup_attach_task_all(). Fix it"
      
      * tag 'cgroup-for-6.0-rc2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
      3f5c2005
    • Tetsuo Handa's avatar
      cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all() · 43626dad
      Tetsuo Handa authored
      syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
      cpuset_attach() [1], for commit 4f7e7236 ("cgroup: Fix
      threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that
      cpuset_attach() is also called from cgroup_attach_task_all().
      Add cpus_read_lock() like what cgroup_procs_write_start() does.
      
      Link: https://syzkaller.appspot.com/bug?extid=29d3a3b4d86c8136ad9e [1]
      Reported-by: syzbot <syzbot+29d3a3b4d86c8136ad9e@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Fixes: 4f7e7236 ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
      Signed-off-by: Tejun Heo <tj@kernel.org>
      43626dad
    • Lorenzo Bianconi's avatar
      net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2 · 0cf731f9
      Lorenzo Bianconi authored
      Properly report hw rx hash for mt7986 chipset according to the new dma
      descriptor layout.
      
      Fixes: 197c9e9b ("net: ethernet: mtk_eth_soc: introduce support for mt7986 chipset")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/091394ea4e705fbb35f828011d98d0ba33808f69.1661257293.git.lorenzo@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      0cf731f9
    • Dan Carpenter's avatar
      wifi: mac80211: potential NULL dereference in ieee80211_tx_control_port() · 55f0a489
      Dan Carpenter authored
      The ieee80211_lookup_ra_sta() function will sometimes set "sta" to NULL
      so add this NULL check to prevent an Oops.
      
      Fixes: 9dd19538 ("wifi: nl80211/mac80211: clarify link ID in control port TX")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/YuKcTAyO94YOy0Bu@kili
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      55f0a489