1. 02 Apr, 2023 4 commits
    • sctp: check send stream number after wait_for_sndbuf · 2584024b
      Xin Long authored
      This patch fixes a corner case where the asoc out stream count may change
      after wait_for_sndbuf.
      
      When the main thread in the client starts a connection, if its out stream
      count is set to N while the in stream count in the server is set to N - 2,
      another thread in the client keeps sending the msgs with stream number
      N - 1, and waits for sndbuf before processing INIT_ACK.
      
      However, after processing INIT_ACK, the out stream count in the client is
      shrunk to N - 2, the same as the in stream count in the server. The crash
      occurs when the thread waiting for sndbuf wakes up and sends the msg in a
      nonexistent stream (N - 1). The call trace is as below:
      
        KASAN: null-ptr-deref in range [0x0000000000000038-0x000000000000003f]
        Call Trace:
         <TASK>
         sctp_cmd_send_msg net/sctp/sm_sideeffect.c:1114 [inline]
         sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1777 [inline]
         sctp_side_effects net/sctp/sm_sideeffect.c:1199 [inline]
         sctp_do_sm+0x197d/0x5310 net/sctp/sm_sideeffect.c:1170
         sctp_primitive_SEND+0x9f/0xc0 net/sctp/primitive.c:163
         sctp_sendmsg_to_asoc+0x10eb/0x1a30 net/sctp/socket.c:1868
         sctp_sendmsg+0x8d4/0x1d90 net/sctp/socket.c:2026
         inet_sendmsg+0x9d/0xe0 net/ipv4/af_inet.c:825
         sock_sendmsg_nosec net/socket.c:722 [inline]
         sock_sendmsg+0xde/0x190 net/socket.c:745
      
      The fix is to add an unlikely check for the send stream number after the
      thread wakes up from the wait_for_sndbuf.
      
      Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations")
      Reported-by: syzbot+47c24ca20a2fa01f082e@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
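The re-check described in the sctp commit above can be sketched in user-space C. This is only an illustration under stated assumptions: `demo_asoc`, `demo_wait_for_sndbuf()` and `demo_send()` are hypothetical stand-ins, not kernel APIs, and the wait is simulated by directly shrinking the stream count the way INIT_ACK processing would.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an association's outbound state. */
struct demo_asoc {
    uint16_t outcnt; /* number of outbound streams currently valid */
};

/*
 * Hypothetical stand-in for the blocking wait. It simulates INIT_ACK being
 * processed while the sender sleeps, which sets the stream count to the
 * negotiated value.
 */
static int demo_wait_for_sndbuf(struct demo_asoc *asoc, uint16_t negotiated)
{
    asoc->outcnt = negotiated;
    return 0;
}

/* Returns 0 if the message may be queued on stream sid, -EINVAL otherwise. */
static int demo_send(struct demo_asoc *asoc, uint16_t sid, uint16_t negotiated)
{
    int err;

    assert(asoc);

    /* Initial validation, done before the sender may sleep. */
    if (sid >= asoc->outcnt)
        return -EINVAL;

    err = demo_wait_for_sndbuf(asoc, negotiated);
    if (err)
        return err;

    /* The stream count may have changed during the wait, so check again. */
    if (sid >= asoc->outcnt)
        return -EINVAL;

    return 0;
}
```

With N = 10 and a negotiated count of 8, a send on stream 9 passes the first check but is rejected by the second one instead of touching a stream that no longer exists.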
    • net: ethernet: mtk_eth_soc: fix remaining throughput regression · e669ce46
      Felix Fietkau authored
      Based on further tests, it seems that the QDMA shaper is not able to
      perform shaping close to the MAC link rate without throughput loss.
      This cannot be compensated by increasing the shaping rate, so it seems
      to be an internal limit.
      
      Fix the remaining throughput regression by detecting that condition and
      limiting shaping to ports with lower link speed.
      
      This patch intentionally ignores link speed gain from TRGMII, because
      even on such links, shaping to 1000 Mbit/s incurs some throughput
      degradation.
      
      Fixes: f63959c7 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues")
      Tested-by: Frank Wunderlich <frank-w@public-files.de>
      Reported-by: Frank Wunderlich <frank-w@public-files.de>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: David S. Miller <davem@davemloft.net>
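One way to read "limiting shaping to ports with lower link speed" is as a simple rate-selection rule. The sketch below is an assumption about the shape of that decision, not the driver's code: `demo_shaper_rate()` is a hypothetical helper, and the convention that a return value of 0 means "leave the queue unshaped" is invented for this example.

```c
#include <assert.h>

/*
 * Hypothetical helper: pick the rate (in Mbit/s) to program into a queue
 * shaper. A return value of 0 means "do not shape" in this sketch.
 *
 * Shaping is skipped when the port is at least as fast as the MAC link,
 * since shaping close to the MAC link rate is what costs throughput.
 */
static int demo_shaper_rate(int mac_speed_mbps, int port_speed_mbps)
{
    assert(port_speed_mbps >= 0);

    if (mac_speed_mbps > 0 && mac_speed_mbps <= port_speed_mbps)
        return 0;

    return port_speed_mbps;
}
```

Under this rule a 100 Mbit/s port behind a 1000 Mbit/s MAC is still shaped to 100, while a 1000 Mbit/s port behind the same MAC is left unshaped.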
    • net: dsa: mv88e6xxx: Reset mv88e6393x force WD event bit · 089b91a0
      Gustav Ekelund authored
      The force watchdog event bit is not cleared during SW reset in the
      mv88e6393x switch. This is a different behavior compared to mv88e6390 which
      clears the force WD event bit as advertised. This causes a force WD event
      to be handled over and over again as the SW reset following the event never
      clears the force WD event bit.
      
      Explicitly clear the watchdog event register to 0 in irq_action when
      handling an event to prevent the switch from sending continuous interrupts.
      Marvell aren't aware of any other stuck bits apart from the force WD
      bit.
      
      Fixes: de776d0d ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
      Signed-off-by: Gustav Ekelund <gustaek@axis.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
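The effect of clearing the watchdog event register in the interrupt handler can be modelled with a simulated register. Everything here is hypothetical: `demo_switch`, `DEMO_WD_FORCE_EVENT` (including its bit position) and `demo_wd_irq_action()` are made up for illustration and do not reflect the real register layout or driver functions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_WD_FORCE_EVENT 0x0004u /* hypothetical bit position */

/* Hypothetical model of a switch whose software reset leaves the bit set. */
struct demo_switch {
    uint16_t wd_event_reg; /* simulated watchdog event register */
    unsigned int handled;  /* number of events handled so far */
};

/* Returns 1 if an event was handled, 0 if nothing was pending. */
static int demo_wd_irq_action(struct demo_switch *sw)
{
    assert(sw);

    if (!sw->wd_event_reg)
        return 0;

    sw->handled++;

    /*
     * A software reset would follow here. In this model it does not clear
     * the force bit, so the register is explicitly written to 0 to stop
     * the same event from being raised again.
     */
    sw->wd_event_reg = 0;
    return 1;
}
```

Calling the handler twice on a switch with the force bit set handles the event once; the second call finds nothing pending.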
    • net: don't let netpoll invoke NAPI if in xmit context · 275b471e
      Jakub Kicinski authored
      Commit 0db3dc73 ("[NETPOLL]: tx lock deadlock fix") narrowed
      down the region under netif_tx_trylock() inside netpoll_send_skb().
      (At that point in time netif_tx_trylock() would lock all queues of
      the device.) Taking the tx lock was problematic because the driver's
      cleanup method may take the same lock. So the change made us hold
      the xmit lock only around xmit, and expected the driver to take
      care of locking within ->ndo_poll_controller().
      
      Unfortunately this only works if netpoll isn't itself called with
      the xmit lock already held. Netpoll code is careful and uses
      trylock(). The drivers, however, may be using plain lock().
      Printing while holding the xmit lock is going to result in rare
      deadlocks.
      
      Luckily we record the xmit lock owners, so we can scan all the queues,
      the same way we scan NAPI owners. If any of the xmit locks is held
      by the local CPU we had better not attempt any polling.
      
      It would be nice if we could narrow down the check to only the NAPIs
      and the queue we're trying to use. I don't see a way to do that now.

      Reported-by: Roman Gushchin <roman.gushchin@linux.dev>
      Fixes: 0db3dc73 ("[NETPOLL]: tx lock deadlock fix")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
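The queue scan described in the netpoll commit above can be illustrated as a loop over recorded lock owners. This is a user-space sketch, not the kernel implementation: `demo_txq`, `DEMO_NO_OWNER` and `demo_local_xmit_active()` are hypothetical names, and the CPU id is passed in as a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NO_OWNER (-1) /* hypothetical marker for an unlocked queue */

/* Hypothetical stand-in for a TX queue that records its xmit lock owner. */
struct demo_txq {
    int xmit_lock_owner; /* CPU id holding the lock, or DEMO_NO_OWNER */
};

/*
 * Returns true if any queue's xmit lock is held by this_cpu. A caller
 * would then skip polling rather than risk deadlocking on a lock that
 * the current CPU already holds.
 */
static bool demo_local_xmit_active(const struct demo_txq *queues, size_t n,
                                   int this_cpu)
{
    assert(queues || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (queues[i].xmit_lock_owner == this_cpu)
            return true;
    }
    return false;
}
```

The check is deliberately coarse: it looks at every queue of the device, matching the commit's note that narrowing it to the queue in use is not currently possible.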
  2. 01 Apr, 2023 2 commits
    • icmp: guard against too small mtu · 7d63b671
      Eric Dumazet authored
      syzbot was able to trigger a panic [1] in icmp_glue_bits(), or
      more exactly in skb_copy_and_csum_bits().
      
      There is no repro yet, but I think the issue is that syzbot
      manages to lower device mtu to a small value, fooling __icmp_send().
      
      __icmp_send() must make sure there is enough room for the
      packet to include at least the headers.
      
      We might in the future refactor skb_copy_and_csum_bits() and its
      callers to no longer crash when something bad happens.
      
      [1]
      kernel BUG at net/core/skbuff.c:3343 !
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 15766 Comm: syz-executor.0 Not tainted 6.3.0-rc4-syzkaller-00039-gffe78bbd #0
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
      RIP: 0010:skb_copy_and_csum_bits+0x798/0x860 net/core/skbuff.c:3343
      Code: f0 c1 c8 08 41 89 c6 e9 73 ff ff ff e8 61 48 d4 f9 e9 41 fd ff ff 48 8b 7c 24 48 e8 52 48 d4 f9 e9 c3 fc ff ff e8 c8 27 84 f9 <0f> 0b 48 89 44 24 28 e8 3c 48 d4 f9 48 8b 44 24 28 e9 9d fb ff ff
      RSP: 0018:ffffc90000007620 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000000000001e8 RCX: 0000000000000100
      RDX: ffff8880276f6280 RSI: ffffffff87fdd138 RDI: 0000000000000005
      RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
      R10: 00000000000001e8 R11: 0000000000000001 R12: 000000000000003c
      R13: 0000000000000000 R14: ffff888028244868 R15: 0000000000000b0e
      FS: 00007fbc81f1c700(0000) GS:ffff88802ca00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2df43000 CR3: 00000000744db000 CR4: 0000000000150ef0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <IRQ>
      icmp_glue_bits+0x7b/0x210 net/ipv4/icmp.c:353
      __ip_append_data+0x1d1b/0x39f0 net/ipv4/ip_output.c:1161
      ip_append_data net/ipv4/ip_output.c:1343 [inline]
      ip_append_data+0x115/0x1a0 net/ipv4/ip_output.c:1322
      icmp_push_reply+0xa8/0x440 net/ipv4/icmp.c:370
      __icmp_send+0xb80/0x1430 net/ipv4/icmp.c:765
      ipv4_send_dest_unreach net/ipv4/route.c:1239 [inline]
      ipv4_link_failure+0x5a9/0x9e0 net/ipv4/route.c:1246
      dst_link_failure include/net/dst.h:423 [inline]
      arp_error_report+0xcb/0x1c0 net/ipv4/arp.c:296
      neigh_invalidate+0x20d/0x560 net/core/neighbour.c:1079
      neigh_timer_handler+0xc77/0xff0 net/core/neighbour.c:1166
      call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
      expire_timers+0x29b/0x4b0 kernel/time/timer.c:1751
      __run_timers kernel/time/timer.c:2022 [inline]
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot+d373d60fddbdc915e666@syzkaller.appspotmail.com
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230330174502.1915328-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
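The requirement that there be "enough room for the packet to include at least the headers" can be sketched as a room calculation with an early bail-out. The constants and the helper below are assumptions made for illustration: `demo_icmp_room()` is hypothetical, the 576-byte cap and the "more than one IP header of payload" threshold are choices made for this sketch, and the header sizes are the minimal IPv4 and ICMP header lengths.

```c
#include <assert.h>

#define DEMO_IPHDR_LEN   20  /* minimal IPv4 header, no options */
#define DEMO_ICMPHDR_LEN 8   /* ICMP header */
#define DEMO_MAX_REPLY   576 /* assumed cap on the ICMP error size */

/*
 * Hypothetical helper: returns how many bytes of the offending packet can
 * be quoted in the ICMP error, or -1 if the MTU is too small to carry a
 * meaningful message.
 */
static int demo_icmp_room(int mtu, int ip_optlen)
{
    int room;

    assert(ip_optlen >= 0);

    room = mtu > DEMO_MAX_REPLY ? DEMO_MAX_REPLY : mtu;
    room -= DEMO_IPHDR_LEN + ip_optlen; /* outer IP header and options */
    room -= DEMO_ICMPHDR_LEN;           /* ICMP header */

    /* Require room for more than one quoted IP header, else give up. */
    if (room <= DEMO_IPHDR_LEN)
        return -1;

    return room;
}
```

With a tiny MTU such as 40 the computed room is 12 bytes, so the helper refuses to build the reply instead of passing a bogus length further down.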
    • Revert "net: netcp: MAX_SKB_FRAGS is now 'int'" · adef41b0
      Jakub Kicinski authored
      This reverts commit c5b959ee.
      
      Reverted change is required after commit 3948b059 ("net: introduce
      a config option to tweak MAX_SKB_FRAGS") which does not exist
      in this tree, yet. It's only present in -next trees at the time
      of writing.

      Reported-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/all/20230331214444.GA1426512@dev-arch.thelio-3990X/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 31 Mar, 2023 11 commits
  4. 30 Mar, 2023 23 commits