1. 11 May, 2023 4 commits
    • Merge branch 'af_unix-fix-two-data-races-reported-by-kcsan' · 33dcee99
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      af_unix: Fix two data races reported by KCSAN.
      
      KCSAN reported data races around these two fields for AF_UNIX sockets.
      
        * sk->sk_receive_queue->qlen
        * sk->sk_shutdown
      
      Let's annotate them properly.
      ====================
      
      Link: https://lore.kernel.org/r/20230510003456.42357-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Fix data races around sk->sk_shutdown. · e1d09c2c
      Kuniyuki Iwashima authored
      KCSAN found a data race around sk->sk_shutdown where unix_release_sock()
      and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll()
      and unix_dgram_poll() read it locklessly.
      
      We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE().
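
The annotation pattern described above can be sketched in userspace C. This is only an illustration: the READ_ONCE()/WRITE_ONCE() definitions are simplified stand-ins for the kernel macros, and struct demo_sock, the spinlock helpers, and the demo_* functions are hypothetical names that do not appear in the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel macros (assumption). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

#define SHUTDOWN_MASK 3 /* RCV_SHUTDOWN | SEND_SHUTDOWN */

/* Hypothetical socket: only the fields this sketch needs. */
struct demo_sock {
	atomic_flag state_lock;
	uint8_t shutdown;
};

/* Minimal spinlock standing in for the state lock. */
static void demo_lock(struct demo_sock *sk)
{
	while (atomic_flag_test_and_set_explicit(&sk->state_lock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_sock *sk)
{
	atomic_flag_clear_explicit(&sk->state_lock, memory_order_release);
}

/* Writer: updates the field under the lock, annotated for lockless readers. */
static void demo_release(struct demo_sock *sk)
{
	demo_lock(sk);
	WRITE_ONCE(sk->shutdown, SHUTDOWN_MASK);
	demo_unlock(sk);
}

/* Reader: poll-style check that takes no lock and loads the field once. */
static int demo_poll_hup(struct demo_sock *sk)
{
	uint8_t shutdown = READ_ONCE(sk->shutdown);

	return shutdown == SHUTDOWN_MASK;
}
```

The reader copies the field into a local once and works on that copy, so the compiler cannot reload or tear the value while the writer changes it.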
      
      BUG: KCSAN: data-race in unix_poll / unix_release_sock
      
      write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0:
       unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
       unix_release+0x59/0x80 net/unix/af_unix.c:1042
       __sock_release+0x7d/0x170 net/socket.c:653
       sock_close+0x19/0x30 net/socket.c:1397
       __fput+0x179/0x5e0 fs/file_table.c:321
       ____fput+0x15/0x20 fs/file_table.c:349
       task_work_run+0x116/0x1a0 kernel/task_work.c:179
       resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
       exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
       __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
       syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
       do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1:
       unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170
       sock_poll+0xcf/0x2b0 net/socket.c:1385
       vfs_poll include/linux/poll.h:88 [inline]
       ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855
       ep_send_events fs/eventpoll.c:1694 [inline]
       ep_poll fs/eventpoll.c:1823 [inline]
       do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258
       __do_sys_epoll_wait fs/eventpoll.c:2270 [inline]
       __se_sys_epoll_wait fs/eventpoll.c:2265 [inline]
       __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      value changed: 0x00 -> 0x03
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 3c73419c ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Fix a data race of sk->sk_receive_queue->qlen. · 679ed006
      Kuniyuki Iwashima authored
      KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg()
      updates qlen under the queue lock and sendmsg() checks qlen under
      unix_state_lock(), not the queue lock, so the reader side needs
      READ_ONCE().
      
      BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer
      
      write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0:
       __skb_unlink include/linux/skbuff.h:2347 [inline]
       __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197
       __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263
       __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452
       unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549
       sock_recvmsg_nosec net/socket.c:1019 [inline]
       ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720
       ___sys_recvmsg+0xc8/0x150 net/socket.c:2764
       do_recvmmsg+0x182/0x560 net/socket.c:2858
       __sys_recvmmsg net/socket.c:2937 [inline]
       __do_sys_recvmmsg net/socket.c:2960 [inline]
       __se_sys_recvmmsg net/socket.c:2953 [inline]
       __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1:
       skb_queue_len include/linux/skbuff.h:2127 [inline]
       unix_recvq_full net/unix/af_unix.c:229 [inline]
       unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445
       unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048
       sock_sendmsg_nosec net/socket.c:724 [inline]
       sock_sendmsg+0x148/0x160 net/socket.c:747
       ____sys_sendmsg+0x20e/0x620 net/socket.c:2503
       ___sys_sendmsg+0xc6/0x140 net/socket.c:2557
       __sys_sendmmsg+0x11d/0x370 net/socket.c:2643
       __do_sys_sendmmsg net/socket.c:2672 [inline]
       __se_sys_sendmmsg net/socket.c:2669 [inline]
       __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      value changed: 0x0000000b -> 0x00000001
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: datagram: fix data-races in datagram_poll() · 5bca1d08
      Eric Dumazet authored
      datagram_poll() runs locklessly, so we should add READ_ONCE()
      annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 10 May, 2023 19 commits
    • net: mscc: ocelot: fix stat counter register values · cdc2e28e
      Colin Foster authored
      Commit d4c36765 ("net: mscc: ocelot: keep ocelot_stat_layout by reg
      address, not offset") organized the stats counters for Ocelot chips, namely
      the VSC7512 and VSC7514. A few of the counter offsets were incorrect, and
      were caught by this warning:
      
      WARNING: CPU: 0 PID: 24 at drivers/net/ethernet/mscc/ocelot_stats.c:909
      ocelot_stats_init+0x1fc/0x2d8
      reg 0x5000078 had address 0x220 but reg 0x5000079 has address 0x214,
      bulking broken!
      
      Fix these register offsets.
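
The warning quoted above reports a register whose address (0x214) is lower than its predecessor's (0x220). A minimal sketch of that kind of ordering check follows. It assumes the rule is "addresses must increase in table order", which is inferred from the warning text only; struct demo_stat and demo_first_misordered() are hypothetical names, not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout entry: one counter's register address. */
struct demo_stat {
	uint32_t addr;
};

/*
 * Return the index of the first entry whose address does not increase
 * relative to its predecessor, or -1 if the whole table is ordered.
 */
static int demo_first_misordered(const struct demo_stat *tbl, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (tbl[i].addr <= tbl[i - 1].addr)
			return (int)i;
	return -1;
}
```

A table such as { 0x21c, 0x220, 0x214 } would be flagged at index 2, which mirrors the 0x220 followed by 0x214 pair in the warning.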
      
      Fixes: d4c36765 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
      Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipvlan:Fix out-of-bounds caused by unclear skb->cb · 90cbed52
      t.feng authored
      When an skb is enqueued to the qdisc, fq_skb_cb(skb)->time_to_send is
      written, and that field lives in skb->cb. IPCB(skb_in)->opt, which is
      also backed by skb->cb, is later used in __ip_options_echo(), so the
      memcpy() there can go out of bounds and overflow the stack.
      We should clear skb->cb before ip_local_out() or ip6_local_out().
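
A small userspace sketch of the aliasing problem: two layers interpret the same scratch bytes differently, so stale data from one layer looks like input to the next unless the area is cleared. All names here (struct demo_pkt, demo_qdisc_cb, demo_ip_cb and the demo_* helpers) are hypothetical, and the field offsets are simplified; only the 48-byte size matches skb->cb.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_CB_SIZE 48 /* same size as skb->cb */

/* Hypothetical packet with a scratch area shared by all layers. */
struct demo_pkt {
	unsigned char cb[DEMO_CB_SIZE];
};

/* Hypothetical per-layer views of the same scratch bytes. */
struct demo_qdisc_cb {
	uint64_t time_to_send;
};

struct demo_ip_cb {
	uint8_t optlen;
};

/* Queueing layer: stores a timestamp in the scratch area. */
static void demo_qdisc_enqueue(struct demo_pkt *p, uint64_t when)
{
	struct demo_qdisc_cb q = { .time_to_send = when };

	memcpy(p->cb, &q, sizeof(q));
}

/* IP layer: reads an option length from the very same bytes. */
static uint8_t demo_ip_optlen(const struct demo_pkt *p)
{
	struct demo_ip_cb ip;

	memcpy(&ip, p->cb, sizeof(ip));
	return ip.optlen;
}

/* Clear the scratch area before handing the packet to the IP layer. */
static void demo_clear_cb(struct demo_pkt *p)
{
	memset(p->cb, 0, sizeof(p->cb));
}
```

Without demo_clear_cb(), the timestamp bytes would be read back as a bogus option length, which is the kind of stale value that can drive an out-of-bounds copy.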
      
      v2:
      1. clean the stack info
      2. use IPCB/IP6CB instead of skb->cb
      
      Crash on stable-5.10 (reproduced on a KASAN kernel).
      Stack info:
      [ 2203.651571] BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x589/0x800
      [ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task swapper/3/0
      [ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted 5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
      [ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
      [ 2203.655475] Call Trace:
      [ 2203.655481]  <IRQ>
      [ 2203.655501]  dump_stack+0x9c/0xd3
      [ 2203.655514]  print_address_description.constprop.0+0x19/0x170
      [ 2203.655530]  __kasan_report.cold+0x6c/0x84
      [ 2203.655586]  kasan_report+0x3a/0x50
      [ 2203.655594]  check_memory_region+0xfd/0x1f0
      [ 2203.655601]  memcpy+0x39/0x60
      [ 2203.655608]  __ip_options_echo+0x589/0x800
      [ 2203.655654]  __icmp_send+0x59a/0x960
      [ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
      [ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
      [ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
      [ 2203.655821]  nf_hook_slow+0x69/0x100
      [ 2203.655828]  __ip_local_out+0x21e/0x2b0
      [ 2203.655857]  ip_local_out+0x28/0x90
      [ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
      [ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
      [ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
      [ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
      [ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
      [ 2203.655992]  dev_hard_start_xmit+0x62/0x100
      [ 2203.656000]  sch_direct_xmit+0x215/0x640
      [ 2203.656028]  __qdisc_run+0x153/0x1f0
      [ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
      [ 2203.656173]  ip_finish_output2+0x59b/0xc20
      [ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
      [ 2203.656312]  ip_finish_output+0x168/0x190
      [ 2203.656320]  ip_output+0x12d/0x220
      [ 2203.656357]  __ip_queue_xmit+0x392/0x880
      [ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
      [ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
      [ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
      [ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
      [ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
      [ 2203.656529]  tcp_write_timer+0x55/0x180
      [ 2203.656542]  call_timer_fn+0x3f/0x1d0
      [ 2203.656555]  expire_timers+0x160/0x200
      [ 2203.656562]  run_timer_softirq+0x1f4/0x480
      [ 2203.656606]  __do_softirq+0xfd/0x402
      [ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
      [ 2203.656617]  </IRQ>
      [ 2203.656623]  do_softirq_own_stack+0x37/0x50
      [ 2203.656631]  irq_exit_rcu+0x134/0x1a0
      [ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
      [ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
      [ 2203.656654] RIP: 0010:default_idle+0x13/0x20
      [ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
      [ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
      [ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX: ffffffffaf290191
      [ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI: ffff88811a3c4f60
      [ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff88811a3c4f63
      [ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12: 0000000000000003
      [ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15: 0000000000000000
      [ 2203.656729]  default_idle_call+0x5a/0x150
      [ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
      [ 2203.656780]  do_idle+0xab/0x100
      [ 2203.656786]  cpu_startup_entry+0x19/0x20
      [ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb
      
      [ 2203.657409] The buggy address belongs to the page:
      [ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11a388
      [ 2203.658665] flags: 0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
      [ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208 0000000000000000
      [ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
      [ 2203.658686] page dumped because: kasan: bad access detected
      
      To reproduce (ipvlan with IPVLAN_MODE_L3):
      Env setting:
      =======================================================
      modprobe ipvlan ipvlan_default_mode=1
      sysctl net.ipv4.conf.eth0.forwarding=1
      iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j MASQUERADE
      ip link add gw link eth0 type ipvlan
      ip -4 addr add 20.0.0.254/24 dev gw
      ip netns add net1
      ip link add ipv1 link eth0 type ipvlan
      ip link set ipv1 netns net1
      ip netns exec net1 ip link set ipv1 up
      ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
      ip netns exec net1 route add default gw 20.0.0.254
      ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
      ifconfig gw up
      iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with icmp-port-unreachable
      =======================================================
      Then execute this shell loop (curl any address that eth0 can reach):
      
      for((i=1;i<=100000;i++))
      do
              ip netns exec net1 curl x.x.x.x:8888
      done
      =======================================================
      
      Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.")
      Signed-off-by: "t.feng" <fengtao40@huawei.com>
      Suggested-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • docs: networking: fix x25-iface.rst heading & index order · 77c964da
      Randy Dunlap authored
      Fix the chapter heading for "X.25 Device Driver Interface" so that it
      does not contain a trailing '-' character, which makes Sphinx
      omit this heading from the contents.
      
      Reverse the order of the x25.rst and x25-iface.rst files in the index
      so that the project introduction (x25.rst) comes first.
      
      Fixes: 883780af ("docs: networking: convert x25-iface.txt to ReST")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: linux-doc@vger.kernel.org
      Cc: Martin Schiller <ms@dev.tdt.de>
      Cc: linux-x25@vger.kernel.org
      Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gve: Remove the code of clearing PBA bit · f4c2e67c
      Ziwei Xiao authored
      Clearing the PBA bit from the driver is race prone and it may lead to
      dropped interrupt events. This could potentially lead to the traffic
      being completely halted.
      
      Fixes: 5e8c5adf ("gve: DQO: Add core netdev features")
      Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
      Signed-off-by: Bailey Forrest <bcf@google.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: add annotations around sk->sk_shutdown accesses · e14cadfd
      Eric Dumazet authored
      Now that sk->sk_shutdown is no longer a bitfield, we can add
      standard READ_ONCE()/WRITE_ONCE() annotations to silence
      KCSAN reports like the following:
      
      BUG: KCSAN: data-race in tcp_disconnect / tcp_poll
      
      write to 0xffff88814588582c of 1 bytes by task 3404 on cpu 1:
      tcp_disconnect+0x4d6/0xdb0 net/ipv4/tcp.c:3121
      __inet_stream_connect+0x5dd/0x6e0 net/ipv4/af_inet.c:715
      inet_stream_connect+0x48/0x70 net/ipv4/af_inet.c:727
      __sys_connect_file net/socket.c:2001 [inline]
      __sys_connect+0x19b/0x1b0 net/socket.c:2018
      __do_sys_connect net/socket.c:2028 [inline]
      __se_sys_connect net/socket.c:2025 [inline]
      __x64_sys_connect+0x41/0x50 net/socket.c:2025
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88814588582c of 1 bytes by task 3374 on cpu 0:
      tcp_poll+0x2e6/0x7d0 net/ipv4/tcp.c:562
      sock_poll+0x253/0x270 net/socket.c:1383
      vfs_poll include/linux/poll.h:88 [inline]
      io_poll_check_events io_uring/poll.c:281 [inline]
      io_poll_task_func+0x15a/0x820 io_uring/poll.c:333
      handle_tw_list io_uring/io_uring.c:1184 [inline]
      tctx_task_work+0x1fe/0x4d0 io_uring/io_uring.c:1246
      task_work_run+0x123/0x160 kernel/task_work.c:179
      get_signal+0xe64/0xff0 kernel/signal.c:2635
      arch_do_signal_or_restart+0x89/0x2a0 arch/x86/kernel/signal.c:306
      exit_to_user_mode_loop+0x6f/0xe0 kernel/entry/common.c:168
      exit_to_user_mode_prepare+0x6c/0xb0 kernel/entry/common.c:204
      __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
      syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
      do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x03 -> 0x00
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add vlan_get_protocol_and_depth() helper · 4063384e
      Eric Dumazet authored
      Before the blamed commit, pskb_may_pull() was used instead
      of skb_header_pointer() in __vlan_get_protocol() and friends.

      A few callers depended on skb->head being populated with the MAC header;
      syzbot caught one of them (skb_mac_gso_segment()).
      
      Add vlan_get_protocol_and_depth() to make the intent clearer
      and use it where sensible.
      
      This is a more generic fix than commit e9d3f809
      ("net/af_packet: make sure to pull mac header") which was
      dealing with a similar issue.
      
      kernel BUG at include/linux/skbuff.h:2655 !
      invalid opcode: 0000 [#1] SMP KASAN
      CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
      RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline]
      RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136
      Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
      RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286
      RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9
      R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012
      R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7
      FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      [<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419
      [<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline]
      [<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725
      [<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313
      [<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029
      [<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline]
      [<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142
      [<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline]
      [<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline]
      [<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139
      [<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline]
      [<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline]
      [<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
      [<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      [<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
      [<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 469acedd ("vlan: consolidate VLAN parsing code and limit max parsing depth")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Toke Høiland-Jørgensen <toke@redhat.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: pcs: xpcs: fix incorrect number of interfaces · 43fb622d
      Russell King (Oracle) authored
      In synopsys_xpcs_compat[], the DW_XPCS_2500BASEX entry was setting
      the number of interfaces using the xpcs_2500basex_features array
      rather than xpcs_2500basex_interfaces. This causes us to overflow
      the array of interfaces. Fix this.
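
The class of bug described above, a count taken from one array while the pointer refers to another, can be sketched as follows. The array contents, struct demo_compat, and the demo_* functions are hypothetical and only mirror the shape of the table entry, not the driver's data.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical tables: the two arrays deliberately differ in length. */
static const int demo_features[] = { 1, 2, 3, 4, 5 };
static const int demo_interfaces[] = { 10, 11 };

struct demo_compat {
	const int *interface;
	size_t num_interfaces;
};

/* Count and pointer come from the same array, so iteration stays in bounds. */
static const struct demo_compat demo_entry = {
	.interface = demo_interfaces,
	.num_interfaces = ARRAY_SIZE(demo_interfaces),
};

/* The mismatched count the buggy form would have produced. */
static size_t demo_wrong_count(void)
{
	return ARRAY_SIZE(demo_features);
}

/* Walk the interface list using the stored count. */
static int demo_sum_interfaces(const struct demo_compat *c)
{
	int sum = 0;

	for (size_t i = 0; i < c->num_interfaces; i++)
		sum += c->interface[i];
	return sum;
}
```

With the mismatched count (5 here), the same loop would read three elements past the end of the two-element interface array.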
      
      Fixes: f27abde3 ("net: pcs: add 2500BASEX support for Intel mGbE controller")
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: deal with most data-races in sk_wait_event() · d0ac89f6
      Eric Dumazet authored
      __condition is evaluated twice in the sk_wait_event() macro.

      The first evaluation is lockless, and its reads can race with writes,
      as spotted by syzbot.
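
A rough userspace sketch of why a macro argument can be evaluated twice, once without the lock held. It relies on the GCC/Clang statement-expression extension, and every name in it (struct demo_sk, demo_has_err(), demo_wait_event()) is hypothetical; the real macro also sleeps and handles timeouts, which is omitted here.

```c
/* Simplified stand-in for the kernel macro (assumption). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical socket: a lock flag and one field the condition reads. */
struct demo_sk {
	int locked;
	int err;
};

static int demo_evals; /* counts how often the condition runs */

/* Condition helper: annotated because it may run without the lock. */
static int demo_has_err(struct demo_sk *sk)
{
	demo_evals++;
	return READ_ONCE(sk->err) != 0;
}

/* The condition text is pasted twice: once unlocked, once locked. */
#define demo_wait_event(sk, cond)                              \
	({                                                     \
		int rc_;                                       \
		(sk)->locked = 0; /* release the lock */       \
		rc_ = (cond);     /* first, lockless pass */   \
		/* a real implementation may sleep here */     \
		(sk)->locked = 1; /* take the lock again */    \
		rc_ = (cond);     /* second, locked pass */    \
		rc_;                                           \
	})
```

Because the first pass runs while the lock is dropped, any field the condition reads needs the same lockless-read care as a poll handler.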
      
      BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect
      
      write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
      tcp_disconnect+0x2cd/0xdb0
      inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
      __sys_shutdown_sock net/socket.c:2343 [inline]
      __sys_shutdown net/socket.c:2355 [inline]
      __do_sys_shutdown net/socket.c:2363 [inline]
      __se_sys_shutdown+0xf8/0x140 net/socket.c:2361
      __x64_sys_shutdown+0x31/0x40 net/socket.c:2361
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
      sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
      tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
      tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
      inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      __sys_sendto+0x246/0x300 net/socket.c:2142
      __do_sys_sendto net/socket.c:2154 [inline]
      __se_sys_sendto net/socket.c:2150 [inline]
      __x64_sys_sendto+0x78/0x90 net/socket.c:2150
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00000000 -> 0x00000068
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate sk->sk_err write from do_recvmmsg() · e05a5f51
      Eric Dumazet authored
      do_recvmmsg() can write to sk->sk_err from multiple threads.
      
      As said before, many other points reading or writing sk_err
      need annotations.
      
      Fixes: 34b88a68 ("net: Fix use after free in the recvmmsg exit path")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netlink: annotate accesses to nlk->cb_running · a939d149
      Eric Dumazet authored
      Both netlink_recvmsg() and netlink_native_seq_show() read
      nlk->cb_running locklessly. Use READ_ONCE() there.
      
      Add corresponding WRITE_ONCE() to netlink_dump() and
      __netlink_dump_start()
      
      syzbot reported:
      BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg
      
      write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0:
      __netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399
      netlink_dump_start include/linux/netlink.h:308 [inline]
      rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      sock_write_iter+0x1aa/0x230 net/socket.c:1138
      call_write_iter include/linux/fs.h:1851 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x463/0x760 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1:
      netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022
      sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017
      ____sys_recvmsg+0x2db/0x310 net/socket.c:2718
      ___sys_recvmsg net/socket.c:2762 [inline]
      do_recvmmsg+0x2e5/0x710 net/socket.c:2856
      __sys_recvmmsg net/socket.c:2935 [inline]
      __do_sys_recvmmsg net/socket.c:2958 [inline]
      __se_sys_recvmmsg net/socket.c:2951 [inline]
      __x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00 -> 0x01
      
      Fixes: 16b304f3 ("netlink: Eliminate kmalloc in netlink dump operation.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bonding-overflow' · a5b3363d
      David S. Miller authored
      Hangbin Liu says:
      
      ====================
      bonding: fix send_peer_notif overflow
      
      Bonding send_peer_notif was defined as u8, but its value is
      num_peer_notif multiplied by peer_notif_delay, which is u8 * u32.
      This can overflow send_peer_notif.
      
      Before the fix:
      TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
      4 garp packets sent on active slave eth1
      TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [FAIL]
      24 garp packets sent on active slave eth1
      TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [FAIL]
      
      After the fix:
      TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [ OK ]
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • kselftest: bonding: add num_grat_arp test · 6cbe791c
      Hangbin Liu authored
      TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [ OK ]
      TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [ OK ]
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: forwarding: lib: add netns support for tc rule handle stats get · b6d1599f
      Hangbin Liu authored
      When running the test in a netns, it's not easy to get the tc stats via
      tc_rule_handle_stats_get(). With the new netns parameter, we can get
      stats from a specific netns, like:
      
        num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: bonding: fix the doc of peer_notif_delay · 84df83e0
      Hangbin Liu authored
      Bonding only supports setting peer_notif_delay with miimon set.
      
      Fixes: 0307d589 ("bonding: add documentation for peer_notif_delay")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: fix send_peer_notif overflow · 9949e2ef
      Hangbin Liu authored
      Bonding send_peer_notif was defined as u8. Since commit 07a4ddec
      ("bonding: add an option to specify a delay between peer notifications"),
      bond->send_peer_notif is num_peer_notif multiplied by
      peer_notif_delay, which is u8 * u32. This can easily overflow
      send_peer_notif, e.g.
      
        ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000
      
      To fix the overflow, let's set the send_peer_notif to u32 and limit
      peer_notif_delay to 300s.
      Reported-by: Liang Li <liali@redhat.com>
      Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053
      Fixes: 07a4ddec ("bonding: add an option to specify a delay between peer notifications")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9949e2ef
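The truncation described in the bonding fix above can be reproduced in a few lines of userspace C. This is a minimal sketch, not the driver code: the helper names are hypothetical, and it assumes the counter is the product of num_peer_notif and the delay expressed in miimon intervals (1000 ms / 100 ms = 10 for the example command).

```c
#include <stdint.h>

/* Hypothetical helper: the product is stored in an 8-bit counter,
 * as send_peer_notif was before the fix. */
static uint8_t peer_notif_count_u8(uint8_t num_peer_notif,
                                   uint32_t delay_intervals)
{
    /* The multiplication happens in 32 bits; the cast keeps only the
     * low 8 bits, so anything above 255 wraps around. */
    return (uint8_t)(num_peer_notif * delay_intervals);
}

/* Hypothetical helper: the same product kept in 32 bits, as after the fix. */
static uint32_t peer_notif_count_u32(uint8_t num_peer_notif,
                                     uint32_t delay_intervals)
{
    return (uint32_t)num_peer_notif * delay_intervals;
}
```

With 30 notifications and a delay of 10 intervals, the 8-bit version keeps 300 mod 256 = 44 while the 32-bit version keeps the full 300.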
    • Daniel Golle's avatar
      net: ethernet: mtk_eth_soc: fix NULL pointer dereference · 7c83e28f
      Daniel Golle authored
      Check for a NULL pointer to avoid a kernel crash when the WO firmware
      is missing and only a single WEDv2 device has been initialized, e.g. on
      MT7981, which can connect just one wireless frontend.
      
      Fixes: 86ce0d09 ("net: ethernet: mtk_eth_soc: use WO firmware for MT7981")
      Signed-off-by: Daniel Golle <daniel@makrotopia.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7c83e28f
    • Florian Fainelli's avatar
      net: phy: bcm7xx: Correct read from expansion register · 582dbb2c
      Florian Fainelli authored
      Since the driver works in the "legacy" addressing mode, we need to write
      to the expansion register (0x17) with bits 11:8 set to 0xf to properly
      select the expansion register passed as argument.
      
      Fixes: f68d08c4 ("net: phy: bcm7xxx: Add EPHY entry for 72165")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230508231749.1681169-1-f.fainelli@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      582dbb2c
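The select value described in the bcm7xx fix above can be sketched as a small bit operation. This is an illustration under assumptions, not the driver code: the macro and function names are hypothetical, and it assumes the expansion register number occupies the low byte of the value written to register 0x17.

```c
#include <stdint.h>

#define EXP_SEL_REG      0x17u    /* select register named in the commit text */
#define EXP_SEL_ER_BITS  0x0f00u  /* bits 11:8 set to 0xf */

/* Hypothetical helper: build the value to write to EXP_SEL_REG in order to
 * select expansion register exp_reg in the "legacy" addressing mode. */
static uint16_t exp_select_value(uint16_t exp_reg)
{
    /* Keep only the low byte (assumed register number), then set bits 11:8. */
    return (uint16_t)(EXP_SEL_ER_BITS | (exp_reg & 0x00ffu));
}
```

Writing the bare register number without the 0xf in bits 11:8 would leave the selector pointing at a different register space, which is the kind of mistake the commit describes.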
    • Kuniyuki Iwashima's avatar
      net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs(). · dfd9248c
      Kuniyuki Iwashima authored
      KCSAN found a data race in sock_recv_cmsgs() where the read access
      to sk->sk_stamp needs READ_ONCE().
      
      BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg
      
      write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0:
       sock_write_timestamp include/net/sock.h:2670 [inline]
       sock_recv_cmsgs include/net/sock.h:2722 [inline]
       packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489
       sock_recvmsg_nosec net/socket.c:1019 [inline]
       sock_recvmsg+0x11a/0x130 net/socket.c:1040
       sock_read_iter+0x176/0x220 net/socket.c:1118
       call_read_iter include/linux/fs.h:1845 [inline]
       new_sync_read fs/read_write.c:389 [inline]
       vfs_read+0x5e0/0x630 fs/read_write.c:470
       ksys_read+0x163/0x1a0 fs/read_write.c:613
       __do_sys_read fs/read_write.c:623 [inline]
       __se_sys_read fs/read_write.c:621 [inline]
       __x64_sys_read+0x41/0x50 fs/read_write.c:621
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1:
       sock_recv_cmsgs include/net/sock.h:2721 [inline]
       packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489
       sock_recvmsg_nosec net/socket.c:1019 [inline]
       sock_recvmsg+0x11a/0x130 net/socket.c:1040
       sock_read_iter+0x176/0x220 net/socket.c:1118
       call_read_iter include/linux/fs.h:1845 [inline]
       new_sync_read fs/read_write.c:389 [inline]
       vfs_read+0x5e0/0x630 fs/read_write.c:470
       ksys_read+0x163/0x1a0 fs/read_write.c:613
       __do_sys_read fs/read_write.c:623 [inline]
       __se_sys_read fs/read_write.c:621 [inline]
       __x64_sys_read+0x41/0x50 fs/read_write.c:621
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      value changed: 0xffffffffc4653600 -> 0x0000000000000000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 6c7c98ba ("sock: avoid dirtying sk_stamp, if possible")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230508175543.55756-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dfd9248c
    • Marek Vasut's avatar
      net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register · 8efbdbfa
      Marek Vasut authored
      Initialize the MAC_ONEUS_TIC_COUNTER register with the correct value
      derived from the CSR clock, otherwise EEE is unstable on at least NXP
      i.MX8M Plus with Micrel KSZ9131RNX PHY, to the point where not even an
      ARP request can be sent out.
      
      i.MX 8M Plus Applications Processor Reference Manual, Rev. 1, 06/2021
      11.7.6.1.34 One-microsecond Reference Timer (MAC_ONEUS_TIC_COUNTER)
      defines this register as:
      "
      This register controls the generation of the Reference time (1 microsecond
      tic) for all the LPI timers. This timer has to be programmed by the software
      initially.
      ...
      The application must program this counter so that the number of clock cycles
      of CSR clock is 1us. (Subtract 1 from the value before programming).
      For example if the CSR clock is 100MHz then this field needs to be programmed
      to value 100 - 1 = 99 (which is 0x63).
      This is required to generate the 1US events that are used to update some of
      the EEE related counters.
      "
      
      The reset value is 0x63 on i.MX8M Plus, which means the expected CSR clock
      is 100 MHz. However, the i.MX8M Plus "enet_qos_root_clk" is 266 MHz instead,
      which means the LPI timers reach their count much sooner on this platform.
      
      This is visible using a scope by monitoring e.g. exit from LPI mode on TX_CTL
      line from MAC to PHY. This should take 30us per STMMAC_DEFAULT_TWT_LS setting,
      during which the TX_CTL line transitions from tristate to low, and 30 us later
      from low to high. On i.MX8M Plus, this transition takes 11 us, which matches
      the 30us * 100/266 formula for misconfigured MAC_ONEUS_TIC_COUNTER register.
      
      Configure MAC_ONEUS_TIC_COUNTER based on CSR clock, so that the LPI timers
      have correct 1us reference. This then fixes EEE on i.MX8M Plus with Micrel
      KSZ9131RNX PHY.
      
      Fixes: 477286b5 ("stmmac: add GMAC4 core support")
      Signed-off-by: Marek Vasut <marex@denx.de>
      Tested-by: Harald Seiler <hws@denx.de>
      Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
      Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Verdin iMX8MP
      Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Link: https://lore.kernel.org/r/20230506235845.246105-1-marex@denx.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8efbdbfa
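The arithmetic in the stmmac commit above can be checked with a short sketch. The function names are hypothetical and this is not the driver code; it only applies the rule quoted from the reference manual (CSR clock cycles per microsecond, minus one) and the 30us * 100/266 estimate from the commit text.

```c
#include <stdint.h>

/* Hypothetical helper: value to program so that the counter spans 1 us
 * of CSR clock. Assumes csr_clk_hz is at least 1 MHz. */
static uint32_t oneus_tic_counter(uint32_t csr_clk_hz)
{
    return csr_clk_hz / 1000000u - 1u;
}

/* Hypothetical helper: real duration, in nanoseconds, of a nominal delay
 * when the counter was programmed for assumed_hz but the CSR clock
 * actually runs at actual_hz. */
static uint32_t observed_delay_ns(uint32_t nominal_us,
                                  uint32_t assumed_hz,
                                  uint32_t actual_hz)
{
    /* Clock cycles the timer counts for the nominal delay. */
    uint64_t cycles = (uint64_t)nominal_us * (assumed_hz / 1000000u);

    /* Time those cycles take at the real clock rate. */
    return (uint32_t)(cycles * 1000000000ull / actual_hz);
}
```

For a 100 MHz clock the counter value is 99 (0x63, the reset value); for 266 MHz it is 265. A nominal 30 us delay counted with the 100 MHz setting on a 266 MHz clock lasts about 11.3 us, consistent with the 11 us seen on the scope.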
  3. 09 May, 2023 1 commit
  4. 07 May, 2023 2 commits
    • Christophe JAILLET's avatar
      net: mdio: mvusb: Fix an error handling path in mvusb_mdio_probe() · 27c1eaa0
      Christophe JAILLET authored
      Should of_mdiobus_register() fail, a previous usb_get_dev() call should be
      undone as in the .disconnect function.
      
      Fixes: 04e37d92 ("net: phy: add marvell usb to mdio controller")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27c1eaa0
    • Eric Dumazet's avatar
      net: skb_partial_csum_set() fix against transport header magic value · 424f8416
      Eric Dumazet authored
      skb->transport_header uses the special 0xFFFF value
      to mark if the transport header was set or not.
      
      We must prevent callers from accidentally setting skb->transport_header
      to 0xFFFF. Note that only fuzzers can possibly do this today.
      
      syzbot reported:
      
      WARNING: CPU: 0 PID: 2340 at include/linux/skbuff.h:2847 skb_transport_offset include/linux/skbuff.h:2956 [inline]
      WARNING: CPU: 0 PID: 2340 at include/linux/skbuff.h:2847 virtio_net_hdr_to_skb+0xbcc/0x10c0 include/linux/virtio_net.h:103
      Modules linked in:
      CPU: 0 PID: 2340 Comm: syz-executor.0 Not tainted 6.3.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
      RIP: 0010:skb_transport_header include/linux/skbuff.h:2847 [inline]
      RIP: 0010:skb_transport_offset include/linux/skbuff.h:2956 [inline]
      RIP: 0010:virtio_net_hdr_to_skb+0xbcc/0x10c0 include/linux/virtio_net.h:103
      Code: 41 39 df 0f 82 c3 04 00 00 48 8b 7c 24 10 44 89 e6 e8 08 6e 59 ff 48 85 c0 74 54 e8 ce 36 7e fc e9 37 f8 ff ff e8 c4 36 7e fc <0f> 0b e9 93 f8 ff ff 44 89 f7 44 89 e6 e8 32 38 7e fc 45 39 e6 0f
      RSP: 0018:ffffc90004497880 EFLAGS: 00010293
      RAX: ffffffff84fea55c RBX: 000000000000ffff RCX: ffff888120be2100
      RDX: 0000000000000000 RSI: 000000000000ffff RDI: 000000000000ffff
      RBP: ffffc90004497990 R08: ffffffff84fe9de5 R09: 0000000000000034
      R10: ffffea00048ebd80 R11: 0000000000000034 R12: ffff88811dc2d9c8
      R13: dffffc0000000000 R14: ffff88811dc2d9ae R15: 1ffff11023b85b35
      FS: 00007f9211a59700(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000200002c0 CR3: 00000001215a5000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      packet_snd net/packet/af_packet.c:3076 [inline]
      packet_sendmsg+0x4590/0x61a0 net/packet/af_packet.c:3115
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      __sys_sendto+0x472/0x630 net/socket.c:2144
      __do_sys_sendto net/socket.c:2156 [inline]
      __se_sys_sendto net/socket.c:2152 [inline]
      __x64_sys_sendto+0xe5/0x100 net/socket.c:2152
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f9210c8c169
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f9211a59168 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007f9210dabf80 RCX: 00007f9210c8c169
      RDX: 000000000000ffed RSI: 00000000200000c0 RDI: 0000000000000003
      RBP: 00007f9210ce7ca1 R08: 0000000020000540 R09: 0000000000000014
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffe135d65cf R14: 00007f9211a59300 R15: 0000000000022000
      
      Fixes: 66e4c8d9 ("net: warn if transport header was not set")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      424f8416
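The guard described in the skb_partial_csum_set() commit above can be sketched as a range check on the offset before it is stored in a 16-bit field. This is a userspace illustration with hypothetical names, not the kernel function; it assumes the stored offset is headroom plus start and that 0xFFFF is the reserved "not set" marker, as the commit text says.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRANSPORT_HEADER_UNSET 0xFFFFu  /* magic "not set" value from the commit text */

/* Hypothetical helper: true if headroom + start can be stored in a 16-bit
 * offset without colliding with, or wrapping past, the magic value. */
static bool csum_start_is_safe(uint32_t headroom, uint16_t start)
{
    /* Do the sum in 32 bits so that it cannot wrap before being checked. */
    uint32_t csum_start = headroom + (uint32_t)start;

    return csum_start < TRANSPORT_HEADER_UNSET;
}
```

A caller would reject the request when the helper returns false, rather than storing an offset that later reads back as "transport header not set".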
  5. 06 May, 2023 3 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · ed23734c
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter.
      
        Current release - regressions:
      
         - sched: act_pedit: free pedit keys on bail from offset check
      
        Current release - new code bugs:
      
         - pds_core:
            - Kconfig fixes (DEBUGFS and AUXILIARY_BUS)
            - fix mutex double unlock in error path
      
        Previous releases - regressions:
      
         - sched: cls_api: remove block_cb from driver_list before freeing
      
         - nf_tables: fix ct untracked match breakage
      
         - eth: mtk_eth_soc: drop generic vlan rx offload
      
         - sched: flower: fix error handler on replace
      
        Previous releases - always broken:
      
         - tcp: fix skb_copy_ubufs() vs BIG TCP
      
         - ipv6: fix skb hash for some RST packets
      
         - af_packet: don't send zero-byte data in packet_sendmsg_spkt()
      
         - rxrpc: timeout handling fixes after moving client call connection
           to the I/O thread
      
         - ixgbe: fix panic during XDP_TX with > 64 CPUs
      
         - igc: RMW the SRRCTL register to prevent losing timestamp config
      
         - dsa: mt7530: fix corrupt frames using TRGMII on 40 MHz XTAL MT7621
      
         - r8152:
            - fix flow control issue of RTL8156A
            - fix the poor throughput for 2.5G devices
            - move setting r8153b_rx_agg_chg_indicate() to fix coalescing
            - enable autosuspend
      
         - ncsi: clear Tx enable mode when handling a Config required AEN
      
         - octeontx2-pf: macsec: fixes for CN10KB ASIC rev
      
        Misc:
      
         - 9p: remove INET dependency"
      
      * tag 'net-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
        net: bcmgenet: Remove phy_stop() from bcmgenet_netif_stop()
        pds_core: fix mutex double unlock in error path
        net/sched: flower: fix error handler on replace
        Revert "net/sched: flower: Fix wrong handle assignment during filter change"
        net/sched: flower: fix filter idr initialization
        net: fec: correct the counting of XDP sent frames
        bonding: add xdp_features support
        net: enetc: check the index of the SFI rather than the handle
        sfc: Add back mailing list
        virtio_net: suppress cpu stall when free_unused_bufs
        ice: block LAN in case of VF to VF offload
        net: dsa: mt7530: fix network connectivity with multiple CPU ports
        net: dsa: mt7530: fix corrupt frames using trgmii on 40 MHz XTAL MT7621
        9p: Remove INET dependency
        netfilter: nf_tables: fix ct untracked match breakage
        af_packet: Don't send zero-byte data in packet_sendmsg_spkt().
        igc: read before write to SRRCTL register
        pds_core: add AUXILIARY_BUS and NET_DEVLINK to Kconfig
        pds_core: remove CONFIG_DEBUG_FS from makefile
        ionic: catch failure from devlink_alloc
        ...
      ed23734c
    • Linus Torvalds's avatar
      Merge tag 'i2c-for-6.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · a5e21900
      Linus Torvalds authored
      Pull more i2c updates from Wolfram Sang:
       "Some more driver bugfixes and a DT binding conversion"
      
      * tag 'i2c-for-6.4-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        dt-bindings: i2c: brcm,kona-i2c: convert to YAML
        i2c: gxp: fix build failure without CONFIG_I2C_SLAVE
        i2c: imx-lpi2c: avoid taking clk_prepare mutex in PM callbacks
        i2c: omap: Fix standard mode false ACK readings
        i2c: tegra: Fix PEC support for SMBUS block read
      a5e21900
    • Lukas Bulwahn's avatar
      s390: remove the unneeded select GCC12_NO_ARRAY_BOUNDS · c12753d5
      Lukas Bulwahn authored
      Commit 0da6e5fd ("gcc: disable '-Warray-bounds' for gcc-13 too") makes
      config GCC11_NO_ARRAY_BOUNDS disable -Warray-bounds for gcc version 11
      and upwards, and with that removes the GCC12_NO_ARRAY_BOUNDS config, as
      it is now covered by the semantics of GCC11_NO_ARRAY_BOUNDS.
      
      As GCC11_NO_ARRAY_BOUNDS is yes by default, there is no need for the s390
      architecture to explicitly select GCC11_NO_ARRAY_BOUNDS. Hence, the select
      GCC12_NO_ARRAY_BOUNDS in arch/s390/Kconfig can simply be dropped.
      
      Remove the unneeded "select GCC12_NO_ARRAY_BOUNDS".
      Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c12753d5
  6. 05 May, 2023 11 commits
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 644bca1d
      Jakub Kicinski authored
      There's a fix which landed in net-next, pull it in along
      with the couple of minor cleanups.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      644bca1d
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 418d5c98
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Add Conor Dooley as a DT binding maintainer
      
       - Swap the order of parsing /memreserve/ and /reserved-memory nodes so
         that the /reserved-memory nodes which have more information are
         handled first
      
       - Fix some property dependencies in riscv,pmu binding
      
       - Update maintainers entries on a couple of bindings
      
      * tag 'devicetree-fixes-for-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        MAINTAINERS: add Conor as a dt-bindings maintainer
        dt-bindings: perf: riscv,pmu: fix property dependencies
        dt-bindings: xilinx: Remove Naga from memory and mtd bindings
        of: fdt: Scan /memreserve/ last
        dt-bindings: clock: r9a06g032-sysctrl: Change maintainer to Fabrizio Castro
        dt-bindings: pinctrl: renesas,rzv2m: Change maintainer to Fabrizio Castro
        dt-bindings: pinctrl: renesas,rzn1: Change maintainer to Fabrizio Castro
        dt-bindings: i2c: renesas,rzv2m: Change maintainer to Fabrizio Castro
      418d5c98
    • Linus Torvalds's avatar
      Merge tag 'docs-6.4-2' of git://git.lwn.net/linux · 647681bf
      Linus Torvalds authored
      Pull more documentation updates from Jonathan Corbet:
       "A handful of late-arriving documentation fixes, plus one Spanish
        translation that has been ready for some time but got applied late"
      
      * tag 'docs-6.4-2' of git://git.lwn.net/linux:
        docs/sp_SP: Add translation of process/adding-syscalls
        CREDITS: Update email address for Mat Martineau
        Documentation: update kernel stack for x86_64
        docs: Remove unnecessary unicode character
        docs: fix "Reviewd" typo
        Documentation: timers: hrtimers: Make hybrid union historical
        docs/admin-guide/mm/ksm.rst fix intraface -> interface typo
        doc:it_IT: fix some typos
      647681bf
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · e919a3f7
      Linus Torvalds authored
      Pull more tracing updates from Steven Rostedt:
      
       - Make buffer_percent read/write.
      
         The buffer_percent file is how users can state how long to block on
         the tracing buffer depending on how much is in the buffer. When it
         hits the "buffer_percent" it will wake the task waiting on the
         buffer. For some reason it was set to read-only.
      
         This was not noticed because testing was done as root without
         SELinux, but with SELinux even root is prevented from writing to it
         without having CAP_DAC_OVERRIDE.
      
       - The "touched_functions" was added this merge window, but one of the
         reasons for adding it was not implemented.
      
         That was to show what functions were not only touched, but had either
         a direct trampoline attached to it, or a kprobe or live kernel
         patching that can "hijack" the function to run a different function.
         The point is to know if there are functions in the kernel that may not
         be behaving as the kernel code shows. This can be used for debugging.
      
         TODO: Add this information to kernel oops too.
      
      * tag 'trace-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        ftrace: Add MODIFIED flag to show if IPMODIFY or direct was attached
        tracing: Fix permissions for the buffer_percent file
      e919a3f7
    • Linus Torvalds's avatar
      Merge tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b115d85a
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
      
       - Introduce local{,64}_try_cmpxchg() - a slightly more optimal
         primitive, which will be used in perf events ring-buffer code
      
       - Simplify/modify rwsems on PREEMPT_RT, to address writer starvation
      
       - Misc cleanups/fixes
      
      * tag 'locking-core-2023-05-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/atomic: Correct (cmp)xchg() instrumentation
        locking/x86: Define arch_try_cmpxchg_local()
        locking/arch: Wire up local_try_cmpxchg()
        locking/generic: Wire up local{,64}_try_cmpxchg()
        locking/atomic: Add generic try_cmpxchg{,64}_local() support
        locking/rwbase: Mitigate indefinite writer starvation
        locking/arch: Rename all internal __xchg() names to __arch_xchg()
      b115d85a
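The try_cmpxchg() calling convention mentioned in the locking pull above returns a success flag and, on failure, updates the caller's expected value. A userspace analogue with the same shape is C11 atomic_compare_exchange_strong(); the sketch below uses it purely as an illustration. It is not the kernel API, the function name is hypothetical, and the kernel's local variants target per-CPU data rather than cross-CPU atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: add delta to *v but never exceed cap.
 * Returns the value that was stored. */
static long add_capped(atomic_long *v, long delta, long cap)
{
    long old = atomic_load(v);
    long new_val;

    do {
        new_val = old + delta;
        if (new_val > cap)
            new_val = cap;
        /* On failure, 'old' is refreshed with the current value, so the
         * loop does not need a separate reload before retrying. */
    } while (!atomic_compare_exchange_strong(v, &old, new_val));

    return new_val;
}
```

The loop body has no explicit re-read of the variable, which is the usual reason this form tends to generate tighter code than a compare-and-swap that only returns the old value.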
    • Linus Torvalds's avatar
      Merge branch 'x86-uaccess-cleanup': x86 uaccess header cleanups · d5ed10bb
      Linus Torvalds authored
      Merge my x86 uaccess updates branch.
      
      The LAM ("Linear Address Masking") updates in this release made me
      unhappy about how "access_ok()" was done, and it actually turned out to
      have a couple of small bugs in it too.  This is my cleanup of the code:
      
       - use the sign bit of the __user pointer rather than masking the
         address and checking it against the TASK_SIZE range.
      
         We already did this part for the get/put_user() side, but
         'access_ok()' did the naïve "mask and range check" thing, which not
         only generates nasty code, but also ended up meaning that __access_ok
         itself didn't do a good job, and so copy_from_user_nmi() didn't get
         the check right.
      
       - move all the code that is 64-bit only into the 64-bit version of the
         header file, so that we don't unnecessarily pollute the shared x86
         code and make it look like LAM might work in 32-bit too.
      
       - fix a bug in the address masking (that doesn't end up mattering: in
         this case the fix was to just remove the buggy code entirely).
      
       - a couple of trivial cleanups and added commentary about the
         access_ok() rules.
      
      * x86-uaccess-cleanup:
        x86-64: mm: clarify the 'positive addresses' user address rules
        x86: mm: remove 'sign' games from LAM untagged_addr*() macros
        x86: uaccess: move 32-bit and 64-bit parts into proper <asm/uaccess_N.h> header
        x86: mm: remove architecture-specific 'access_ok()' define
        x86-64: make access_ok() independent of LAM
      d5ed10bb
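The sign-bit test described in the uaccess cleanup above can be sketched in userspace as follows. This is a simplified illustration with a hypothetical function name: it only shows the "positive address" test on a 64-bit value and ignores the size argument that a real access_ok() also has to consider.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: treat a 64-bit address as a user address only if
 * its top (sign) bit is clear. Tag bits below bit 63 do not matter. */
static bool user_addr_sign_ok(uint64_t addr)
{
    return (addr >> 63) == 0;
}
```

Because only bit 63 is inspected, an address carrying tag bits in its upper bits still passes, while any kernel-half address fails, without masking the address or comparing it against a range.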
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-6.4-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 982365a8
      Linus Torvalds authored
      Pull more RISC-V updates from Palmer Dabbelt:
      
       - Support for hibernation
      
       - The .rela.dyn section has been moved to the init area
      
       - A fix for the SBI probing to allow for implementation-defined
         behavior
      
       - Various other fixes and cleanups throughout the tree
      
      * tag 'riscv-for-linus-6.4-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: include cpufeature.h in cpufeature.c
        riscv: Move .rela.dyn to the init sections
        dt-bindings: riscv: explicitly mention assumption of Zicsr & Zifencei support
        riscv: compat_syscall_table: Fixup compile warning
        RISC-V: fixup in-flight collision with ARCH_WANT_OPTIMIZE_VMEMMAP rename
        RISC-V: fix sifive and thead section mismatches in errata
        RISC-V: Align SBI probe implementation with spec
        riscv: mm: remove redundant parameter of create_fdt_early_page_table
        riscv: Adjust dependencies of HAVE_DYNAMIC_FTRACE selection
        RISC-V: Add arch functions to support hibernation/suspend-to-disk
        RISC-V: mm: Enable huge page support to kernel_page_present() function
        RISC-V: Factor out common code of __cpu_resume_enter()
        RISC-V: Change suspend_save_csrs and suspend_restore_csrs to public function
      982365a8
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 493804a6
      Linus Torvalds authored
      Pull more kvm updates from Paolo Bonzini:
       "This includes the 6.4 changes for RISC-V, and a few bugfix patches for
        other architectures. For x86, this closes a longstanding performance
        issue in the newer and (usually) more scalable page table management
        code.
      
        RISC-V:
         - ONE_REG interface to enable/disable SBI extensions
         - Zbb extension for Guest/VM
         - AIA CSR virtualization
      
        x86:
         - Fix a long-standing TDP MMU flaw, where unloading roots on a vCPU
           can result in the root being freed even though the root is
           completely valid and can be reused as-is (with a TLB flush).
      
        s390:
         - A couple of bugfixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: fix race in gmap_make_secure()
        KVM: s390: pv: fix asynchronous teardown for small VMs
        KVM: x86: Preserve TDP MMU roots until they are explicitly invalidated
        RISC-V: KVM: Virtualize per-HART AIA CSRs
        RISC-V: KVM: Use bitmap for irqs_pending and irqs_pending_mask
        RISC-V: KVM: Add ONE_REG interface for AIA CSRs
        RISC-V: KVM: Implement subtype for CSR ONE_REG interface
        RISC-V: KVM: Initial skeletal support for AIA
        RISC-V: KVM: Drop the _MASK suffix from hgatp.VMID mask defines
        RISC-V: Detect AIA CSRs from ISA string
        RISC-V: Add AIA related CSR defines
        RISC-V: KVM: Allow Zbb extension for Guest/VM
        RISC-V: KVM: Add ONE_REG interface to enable/disable SBI extensions
        RISC-V: KVM: Alphabetize selects
        KVM: RISC-V: Retry fault if vma_lookup() results become invalid
      493804a6
    • Linus Torvalds's avatar
      Merge tag 'acpi-6.4-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 7163a211
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "Remove an ACPI backlight quirk for Lenovo ThinkPad W530, added during
        the 6.3 cycle, that turned out to do more harm than help (Hans de
        Goede)"
      
      * tag 'acpi-6.4-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: video: Remove acpi_backlight=video quirk for Lenovo ThinkPad W530
      7163a211
    • Linus Torvalds's avatar
      Merge tag 'thermal-6.4-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 817e1af1
      Linus Torvalds authored
      Pull thermal control fixes from Rafael Wysocki:
       "These fix a NULL pointer dereference in the Intel powerclamp driver
        introduced during the 6.3 cycle and update MAINTAINERS to match recent
        code changes.
      
        Specifics:
      
         - Fix NULL pointer access in the Intel powerclamp thermal driver that
           occurs on attempts to set the cooling device state to 0 in the
           default configuration (Srinivas Pandruvada)
      
         - Drop the stale MAINTAINERS entry for the Intel Menlow thermal
           driver that has been removed recently (Lukas Bulwahn)"
      
      * tag 'thermal-6.4-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        MAINTAINERS: remove section INTEL MENLOW THERMAL DRIVER
        thermal: intel: powerclamp: Fix NULL pointer access issue
      817e1af1
    • Linus Torvalds's avatar
      Merge tag 'phy-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · b49178e6
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - Fix for mediatek driver warning for variable used uninitialized and
         for wrong pll math
      
      * tag 'phy-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: mediatek: hdmi: mt8195: fix wrong pll calculus
        phy: mediatek: hdmi: mt8195: fix uninitialized variable usage in pll_calc
      b49178e6