1. 15 Dec, 2021 3 commits
    • Maxim Galaganov's avatar
      mptcp: fix deadlock in __mptcp_push_pending() · 3d79e375
      Maxim Galaganov authored
      __mptcp_push_pending() may call mptcp_flush_join_list() with subflow
      socket lock held. If such call hits mptcp_sockopt_sync_all() then
      subsequently __mptcp_sockopt_sync() could try to lock the subflow
      socket for itself, causing a deadlock.
      
      sysrq: Show Blocked State
      task:ss-server       state:D stack:    0 pid:  938 ppid:     1 flags:0x00000000
      Call Trace:
       <TASK>
       __schedule+0x2d6/0x10c0
       ? __mod_memcg_state+0x4d/0x70
       ? csum_partial+0xd/0x20
       ? _raw_spin_lock_irqsave+0x26/0x50
       schedule+0x4e/0xc0
       __lock_sock+0x69/0x90
       ? do_wait_intr_irq+0xa0/0xa0
       __lock_sock_fast+0x35/0x50
       mptcp_sockopt_sync_all+0x38/0xc0
       __mptcp_push_pending+0x105/0x200
       mptcp_sendmsg+0x466/0x490
       sock_sendmsg+0x57/0x60
       __sys_sendto+0xf0/0x160
       ? do_wait_intr_irq+0xa0/0xa0
       ? fpregs_restore_userregs+0x12/0xd0
       __x64_sys_sendto+0x20/0x30
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f9ba546c2d0
      RSP: 002b:00007ffdc3b762d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007f9ba56c8060 RCX: 00007f9ba546c2d0
      RDX: 000000000000077a RSI: 0000000000e5e180 RDI: 0000000000000234
      RBP: 0000000000cc57f0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f9ba56c8060
      R13: 0000000000b6ba60 R14: 0000000000cc7840 R15: 41d8685b1d7901b8
       </TASK>
      
      Fix the issue by using __mptcp_flush_join_list() instead of plain
      mptcp_flush_join_list() inside __mptcp_push_pending(), as suggested by
      Florian. The sockopt sync will be deferred to the workqueue.
      
      Fixes: 1b3e7ede ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY")
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/244Suggested-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarMaxim Galaganov <max@internet.ru>
      Signed-off-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      3d79e375
    • Florian Westphal's avatar
      mptcp: clear 'kern' flag from fallback sockets · d6692b3b
      Florian Westphal authored
      The mptcp ULP extension relies on sk->sk_sock_kern being set correctly:
      It prevents setsockopt(fd, IPPROTO_TCP, TCP_ULP, "mptcp", 6); from
      working for plain tcp sockets (any userspace-exposed socket).
      
      But in case of fallback, accept() can return a plain tcp sk.
      In such case, sk is still tagged as 'kernel' and setsockopt will work.
      
      This will crash the kernel, The subflow extension has a NULL ctx->conn
      mptcp socket:
      
      BUG: KASAN: null-ptr-deref in subflow_data_ready+0x181/0x2b0
      Call Trace:
       tcp_data_ready+0xf8/0x370
       [..]
      
      Fixes: cf7da0d6 ("mptcp: Create SUBFLOW socket for incoming connections")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d6692b3b
    • Florian Westphal's avatar
      mptcp: remove tcp ulp setsockopt support · 404cd9a2
      Florian Westphal authored
      TCP_ULP setsockopt cannot be used for mptcp because its already
      used internally to plumb subflow (tcp) sockets to the mptcp layer.
      
      syzbot managed to trigger a crash for mptcp connections that are
      in fallback mode:
      
      KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027]
      CPU: 1 PID: 1083 Comm: syz-executor.3 Not tainted 5.16.0-rc2-syzkaller #0
      RIP: 0010:tls_build_proto net/tls/tls_main.c:776 [inline]
      [..]
       __tcp_set_ulp net/ipv4/tcp_ulp.c:139 [inline]
       tcp_set_ulp+0x428/0x4c0 net/ipv4/tcp_ulp.c:160
       do_tcp_setsockopt+0x455/0x37c0 net/ipv4/tcp.c:3391
       mptcp_setsockopt+0x1b47/0x2400 net/mptcp/sockopt.c:638
      
      Remove support for TCP_ULP setsockopt.
      
      Fixes: d9e4c129 ("mptcp: only admit explicitly supported sockopt")
      Reported-by: syzbot+1fd9b69cde42967d1add@syzkaller.appspotmail.com
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      404cd9a2
  2. 14 Dec, 2021 19 commits
  3. 13 Dec, 2021 10 commits
    • Stefan Assmann's avatar
      iavf: do not override the adapter state in the watchdog task (again) · fe523d7c
      Stefan Assmann authored
      The watchdog task incorrectly changes the state to __IAVF_RESETTING,
      instead of letting the reset task take care of that. This was already
      resolved by commit 22c8fd71 ("iavf: do not override the adapter
      state in the watchdog task") but the problem was reintroduced by the
      recent code refactoring in commit 45eebd62 ("iavf: Refactor iavf
      state machine tracking").
      
      Fixes: 45eebd62 ("iavf: Refactor iavf state machine tracking")
      Signed-off-by: default avatarStefan Assmann <sassmann@kpanic.de>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      fe523d7c
    • Dan Carpenter's avatar
      iavf: missing unlocks in iavf_watchdog_task() · bc2f39a6
      Dan Carpenter authored
      This code was re-organized and there some unlocks missing now.
      
      Fixes: 898ef1cb ("iavf: Combine init and watchdog state machines")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Tested-by: default avatarKonrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      bc2f39a6
    • David Wu's avatar
      net: stmmac: Add GFP_DMA32 for rx buffers if no 64 capability · 884d2b84
      David Wu authored
      Use page_pool_alloc_pages instead of page_pool_dev_alloc_pages, which
      can give the gfp parameter, in the case of not supporting 64-bit width,
      using 32-bit address memory can reduce a copy from swiotlb.
      Signed-off-by: default avatarDavid Wu <david.wu@rock-chips.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      884d2b84
    • Russell King (Oracle)'s avatar
      net: phy: add a note about refcounting · d33dae51
      Russell King (Oracle) authored
      Recently, a patch has been submitted to "fix" the refcounting for a DT
      node in of_mdiobus_link_mdiodev(). This is not a leaked refcount. The
      refcount is passed to the new device.
      
      Sadly, coccicheck identifies this location as a leaked refcount, which
      means we're likely to keep getting patches to "fix" this. However,
      fixing this will cause breakage. Add a comment to state that the lack
      of of_node_put() here is intentional.
      Signed-off-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d33dae51
    • Wang Qing's avatar
      net: ethernet: ti: add missing of_node_put before return · be565ec7
      Wang Qing authored
      Fix following coccicheck warning:
      WARNING: Function "for_each_child_of_node"
      should have of_node_put() before return.
      
      Early exits from for_each_child_of_node should decrement the
      node reference counter.
      Signed-off-by: default avatarWang Qing <wangqing@vivo.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      be565ec7
    • Hangbin Liu's avatar
      selftest/net/forwarding: declare NETIFS p9 p10 · 71da1aec
      Hangbin Liu authored
      The recent GRE selftests defined NUM_NETIFS=10. If the users copy
      forwarding.config.sample to forwarding.config directly, they will get
      error "Command line is not complete" when run the GRE tests, because
      create_netif_veth() failed with no interface name defined.
      
      Fix it by extending the NETIFS with p9 and p10.
      
      Fixes: 2800f248 ("selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnel")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      71da1aec
    • Marek Behún's avatar
      net: dsa: mv88e6xxx: Unforce speed & duplex in mac_link_down() · 9d591fc0
      Marek Behún authored
      Commit 64d47d50 ("net: dsa: mv88e6xxx: configure interface settings
      in mac_config") removed forcing of speed and duplex from
      mv88e6xxx_mac_config(), where the link is forced down, and left it only
      in mv88e6xxx_mac_link_up(), by which time link is unforced.
      
      It seems that (at least on 88E6190) when changing cmode to 2500base-x,
      if the link is not forced down, but the speed or duplex are still
      forced, the forcing of new settings for speed & duplex doesn't take in
      mv88e6xxx_mac_link_up().
      
      Fix this by unforcing speed & duplex in mv88e6xxx_mac_link_down().
      
      Fixes: 64d47d50 ("net: dsa: mv88e6xxx: configure interface settings in mac_config")
      Signed-off-by: default avatarMarek Behún <kabel@kernel.org>
      Reviewed-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9d591fc0
    • Willem de Bruijn's avatar
      selftests/net: toeplitz: fix udp option · a8d13611
      Willem de Bruijn authored
      Tiny fix. Option -u ("use udp") does not take an argument.
      
      It can cause the next argument to silently be ignored.
      
      Fixes: 5ebfb4cc ("selftests/net: toeplitz test")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8d13611
    • Miaoqian Lin's avatar
      net: bcmgenet: Fix NULL vs IS_ERR() checking · ab8eb798
      Miaoqian Lin authored
      The phy_attach() function does not return NULL. It returns error pointers.
      Signed-off-by: default avatarMiaoqian Lin <linmq006@gmail.com>
      Acked-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ab8eb798
    • Davide Caratti's avatar
      net/sched: sch_ets: don't remove idle classes from the round-robin list · c062f2a0
      Davide Caratti authored
      Shuang reported that the following script:
      
       1) tc qdisc add dev ddd0 handle 10: parent 1: ets bands 8 strict 4 priomap 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7
       2) mausezahn ddd0  -A 10.10.10.1 -B 10.10.10.2 -c 0 -a own -b 00:c1:a0:c1:a0:00 -t udp &
       3) tc qdisc change dev ddd0 handle 10: ets bands 4 strict 2 quanta 2500 2500 priomap 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
      
      crashes systematically when line 2) is commented:
      
       list_del corruption, ffff8e028404bd30->next is LIST_POISON1 (dead000000000100)
       ------------[ cut here ]------------
       kernel BUG at lib/list_debug.c:47!
       invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
       CPU: 0 PID: 954 Comm: tc Not tainted 5.16.0-rc4+ #478
       Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
       RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x47
       Code: fe ff 0f 0b 48 89 c1 4c 89 c6 48 c7 c7 08 42 1b 87 e8 1d c5 fe ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 98 42 1b 87 e8 09 c5 fe ff <0f> 0b 48 c7 c7 48 43 1b 87 e8 fb c4 fe ff 0f 0b 48 89 f2 48 89 fe
       RSP: 0018:ffffae46807a3888 EFLAGS: 00010246
       RAX: 000000000000004e RBX: 0000000000000007 RCX: 0000000000000202
       RDX: 0000000000000000 RSI: ffffffff871ac536 RDI: 00000000ffffffff
       RBP: ffffae46807a3a10 R08: 0000000000000000 R09: c0000000ffff7fff
       R10: 0000000000000001 R11: ffffae46807a36a8 R12: ffff8e028404b800
       R13: ffff8e028404bd30 R14: dead000000000100 R15: ffff8e02fafa2400
       FS:  00007efdc92e4480(0000) GS:ffff8e02fb600000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000682f48 CR3: 00000001058be000 CR4: 0000000000350ef0
       Call Trace:
        <TASK>
        ets_qdisc_change+0x58b/0xa70 [sch_ets]
        tc_modify_qdisc+0x323/0x880
        rtnetlink_rcv_msg+0x169/0x4a0
        netlink_rcv_skb+0x50/0x100
        netlink_unicast+0x1a5/0x280
        netlink_sendmsg+0x257/0x4d0
        sock_sendmsg+0x5b/0x60
        ____sys_sendmsg+0x1f2/0x260
        ___sys_sendmsg+0x7c/0xc0
        __sys_sendmsg+0x57/0xa0
        do_syscall_64+0x3a/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
       RIP: 0033:0x7efdc8031338
       Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 43 2c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 41 89 d4 55
       RSP: 002b:00007ffdf1ce9828 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 0000000061b37a97 RCX: 00007efdc8031338
       RDX: 0000000000000000 RSI: 00007ffdf1ce9890 RDI: 0000000000000003
       RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000078a940
       R10: 000000000000000c R11: 0000000000000246 R12: 0000000000000001
       R13: 0000000000688880 R14: 0000000000000000 R15: 0000000000000000
        </TASK>
       Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt iTCO_vendor_support intel_rapl_msr intel_rapl_common joydev pcspkr i2c_i801 virtio_balloon i2c_smbus lpc_ich ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel serio_raw ghash_clmulni_intel ahci libahci libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: sch_ets]
       ---[ end trace f35878d1912655c2 ]---
       RIP: 0010:__list_del_entry_valid.cold.1+0x12/0x47
       Code: fe ff 0f 0b 48 89 c1 4c 89 c6 48 c7 c7 08 42 1b 87 e8 1d c5 fe ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 98 42 1b 87 e8 09 c5 fe ff <0f> 0b 48 c7 c7 48 43 1b 87 e8 fb c4 fe ff 0f 0b 48 89 f2 48 89 fe
       RSP: 0018:ffffae46807a3888 EFLAGS: 00010246
       RAX: 000000000000004e RBX: 0000000000000007 RCX: 0000000000000202
       RDX: 0000000000000000 RSI: ffffffff871ac536 RDI: 00000000ffffffff
       RBP: ffffae46807a3a10 R08: 0000000000000000 R09: c0000000ffff7fff
       R10: 0000000000000001 R11: ffffae46807a36a8 R12: ffff8e028404b800
       R13: ffff8e028404bd30 R14: dead000000000100 R15: ffff8e02fafa2400
       FS:  00007efdc92e4480(0000) GS:ffff8e02fb600000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000682f48 CR3: 00000001058be000 CR4: 0000000000350ef0
       Kernel panic - not syncing: Fatal exception in interrupt
       Kernel Offset: 0x4e00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
       ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      we can remove 'q->classes[i].alist' only if DRR class 'i' was part of the
      active list. In the ETS scheduler DRR classes belong to that list only if
      the queue length is greater than zero: we need to test for non-zero value
      of 'q->classes[i].qdisc->q.qlen' before removing from the list, similarly
      to what has been done elsewhere in the ETS code.
      
      Fixes: de6d2592 ("net/sched: sch_ets: don't peek at classes beyond 'nbands'")
      Reported-by: default avatarShuang Li <shuali@redhat.com>
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c062f2a0
  4. 12 Dec, 2021 7 commits
  5. 11 Dec, 2021 1 commit