1. 20 Dec, 2023 1 commit
    • Pablo Neira Ayuso's avatar
      netfilter: nf_tables: set transport offset from mac header for netdev/egress · 0ae8e4cc
      Pablo Neira Ayuso authored
      Before this patch, transport offset (pkt->thoff) provides an offset
      relative to the network header. This is fine for the inet families
      because skb->data points to the network header in such case. However,
      from netdev/egress, skb->data points to the mac header (if available),
      thus, pkt->thoff is missing the mac header length.
      
      Add skb_network_offset() to the transport offset (pkt->thoff) for
      netdev, so transport header mangling works as expected. Adjust payload
      fast eval function to use skb->data now that pkt->thoff provides an
      absolute offset. This explains why users report that matching on
      egress/netdev works but payload mangling does not.
      
      This patch implicitly fixes payload mangling for IPv4 packets in
      netdev/egress given skb_store_bits() requires an offset from skb->data
      to reach the transport header.
      
      I suspect that nft_exthdr and the trace infra were also broken from
      netdev/egress because they also take skb->data as start, and pkt->thoff
      was not correct.
      
      Note that IPv6 is fine because ipv6_find_hdr() already provides a
      transport offset starting from skb->data, which includes
      skb_network_offset().
      
      The bridge family also uses nft_set_pktinfo_ipv4_validate(), but there
      skb_network_offset() is zero, so the update in this patch does not alter
      the existing behaviour.
      
      Fixes: 42df6e1d ("netfilter: Introduce egress hook")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      0ae8e4cc
  2. 19 Dec, 2023 6 commits
    • Paolo Abeni's avatar
      Merge branch 'check-vlan-filter-feature-in-vlan_vids_add_by_dev-and-vlan_vids_del_by_dev' · 8353c2ab
      Paolo Abeni authored
      Liu Jian says:
      
      ====================
      check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()
      
      v2->v3:
      	Filter using vlan_hw_filter_capable().
      	Add one basic test.
      ====================
      
      Link: https://lore.kernel.org/r/20231216075219.2379123-1-liujian56@huawei.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      8353c2ab
    • Liu Jian's avatar
      selftests: add vlan hw filter tests · 2258b666
      Liu Jian authored
      Add one basic vlan hw filter test.
      Signed-off-by: default avatarLiu Jian <liujian56@huawei.com>
      Reviewed-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2258b666
    • Liu Jian's avatar
      net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() · 01a564ba
      Liu Jian authored
      I got the below warning trace:
      
      WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
      CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
      Call Trace:
       rtnl_dellink
       rtnetlink_rcv_msg
       netlink_rcv_skb
       netlink_unicast
       netlink_sendmsg
       __sock_sendmsg
       ____sys_sendmsg
       ___sys_sendmsg
       __sys_sendmsg
       do_syscall_64
       entry_SYSCALL_64_after_hwframe
      
      It can be repoduced via:
      
          ip netns add ns1
          ip netns exec ns1 ip link add bond0 type bond mode 0
          ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
          ip netns exec ns1 ip link set bond_slave_1 master bond0
      [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
      [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
      [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
      [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
      [5] ip netns exec ns1 ip link del veth2
          ip netns del ns1
      
      This is all caused by command [1] turning off the rx-vlan-filter function
      of bond0. The reason is the same as commit 01f4fd27 ("bonding: Fix
      incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
      [2] [3] add the same vid to slave and master respectively, causing
      command [4] to empty slave->vlan_info. The following command [5] triggers
      this problem.
      
      To fix this problem, we should add VLAN_FILTER feature checks in
      vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
      addition or deletion of vlan_vid information.
      
      Fixes: 348a1443 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
      Signed-off-by: default avatarLiu Jian <liujian56@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      01a564ba
    • Jijie Shao's avatar
      net: hns3: add new maintainer for the HNS3 ethernet driver · fa94a0c8
      Jijie Shao authored
      Jijie Shao will be responsible for
      maintaining the hns3 driver's code in the future,
      so add Jijie to the hns3 driver's matainer list.
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Link: https://lore.kernel.org/r/20231216070413.233668-1-shaojijie@huawei.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      fa94a0c8
    • Yury Norov's avatar
      net: mana: select PAGE_POOL · 340943fb
      Yury Norov authored
      Mana uses PAGE_POOL API. x86_64 defconfig doesn't select it:
      
      ld: vmlinux.o: in function `mana_create_page_pool.isra.0':
      mana_en.c:(.text+0x9ae36f): undefined reference to `page_pool_create'
      ld: vmlinux.o: in function `mana_get_rxfrag':
      mana_en.c:(.text+0x9afed1): undefined reference to `page_pool_alloc_pages'
      make[3]: *** [/home/yury/work/linux/scripts/Makefile.vmlinux:37: vmlinux] Error 1
      make[2]: *** [/home/yury/work/linux/Makefile:1154: vmlinux] Error 2
      make[1]: *** [/home/yury/work/linux/Makefile:234: __sub-make] Error 2
      make[1]: Leaving directory '/home/yury/work/build-linux-x86_64'
      make: *** [Makefile:234: __sub-make] Error 2
      
      So we need to select it explicitly.
      Signed-off-by: default avatarYury Norov <yury.norov@gmail.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org> # build-tested
      Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter")
      Link: https://lore.kernel.org/r/20231215203353.635379-1-yury.norov@gmail.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      340943fb
    • Ronald Wahl's avatar
      net: ks8851: Fix TX stall caused by TX buffer overrun · 3dc5d445
      Ronald Wahl authored
      There is a bug in the ks8851 Ethernet driver that more data is written
      to the hardware TX buffer than actually available. This is caused by
      wrong accounting of the free TX buffer space.
      
      The driver maintains a tx_space variable that represents the TX buffer
      space that is deemed to be free. The ks8851_start_xmit_spi() function
      adds an SKB to a queue if tx_space is large enough and reduces tx_space
      by the amount of buffer space it will later need in the TX buffer and
      then schedules a work item. If there is not enough space then the TX
      queue is stopped.
      
      The worker function ks8851_tx_work() dequeues all the SKBs and writes
      the data into the hardware TX buffer. The last packet will trigger an
      interrupt after it was send. Here it is assumed that all data fits into
      the TX buffer.
      
      In the interrupt routine (which runs asynchronously because it is a
      threaded interrupt) tx_space is updated with the current value from the
      hardware. Also the TX queue is woken up again.
      
      Now it could happen that after data was sent to the hardware and before
      handling the TX interrupt new data is queued in ks8851_start_xmit_spi()
      when the TX buffer space had still some space left. When the interrupt
      is actually handled tx_space is updated from the hardware but now we
      already have new SKBs queued that have not been written to the hardware
      TX buffer yet. Since tx_space has been overwritten by the value from the
      hardware the space is not accounted for.
      
      Now we have more data queued then buffer space available in the hardware
      and ks8851_tx_work() will potentially overrun the hardware TX buffer. In
      many cases it will still work because often the buffer is written out
      fast enough so that no overrun occurs but for example if the peer
      throttles us via flow control then an overrun may happen.
      
      This can be fixed in different ways. The most simple way would be to set
      tx_space to 0 before writing data to the hardware TX buffer preventing
      the queuing of more SKBs until the TX interrupt has been handled. I have
      chosen a slightly more efficient (and still rather simple) way and
      track the amount of data that is already queued and not yet written to
      the hardware. When new SKBs are to be queued the already queued amount
      of data is honoured when checking free TX buffer space.
      
      I tested this with a setup of two linked KS8851 running iperf3 between
      the two in bidirectional mode. Before the fix I got a stall after some
      minutes. With the fix I saw now issues anymore after hours.
      
      Fixes: 3ba81f3e ("net: Micrel KS8851 SPI network driver")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Ben Dooks <ben.dooks@codethink.co.uk>
      Cc: Tristram Ha <Tristram.Ha@microchip.com>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: default avatarRonald Wahl <ronald.wahl@raritan.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231214181112.76052-1-rwahl@gmx.deSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      3dc5d445
  3. 17 Dec, 2023 5 commits
    • David S. Miller's avatar
      Merge branch 'mptcp-misc-fixes' · 979e9017
      David S. Miller authored
      Matthieu Baerts says:
      
      ====================
      mptcp: misc. fixes for v6.7
      
      Here are a few fixes related to MPTCP:
      
      Patch 1 avoids skipping some subtests of the MPTCP Join selftest by
      mistake when using older versions of GCC. This fixes a patch introduced
      in v6.4, backported up to v6.1.
      
      Patch 2 fixes an inconsistent state when using MPTCP + FastOpen. A fix
      for v6.2.
      
      Patch 3 adds a description for MPTCP Kunit test modules to avoid a
      warning.
      
      Patch 4 adds an entry to the mailmap file for Geliang's email addresses.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      979e9017
    • Geliang Tang's avatar
      mailmap: add entries for Geliang Tang · 356c71c4
      Geliang Tang authored
      Map Geliang's old mail addresses to his @linux.dev one.
      Suggested-by: default avatarMat Martineau <martineau@kernel.org>
      Signed-off-by: default avatarGeliang Tang <geliang.tang@linux.dev>
      Reviewed-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      Signed-off-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      356c71c4
    • Matthieu Baerts's avatar
      mptcp: fill in missing MODULE_DESCRIPTION() · a8f570b2
      Matthieu Baerts authored
      W=1 builds warn on missing MODULE_DESCRIPTION, add them here in MPTCP.
      
      Only two were missing: two modules with different KUnit tests for MPTCP.
      Reviewed-by: default avatarMat Martineau <martineau@kernel.org>
      Signed-off-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8f570b2
    • Paolo Abeni's avatar
      mptcp: fix inconsistent state on fastopen race · 4fd19a30
      Paolo Abeni authored
      The netlink PM can race with fastopen self-connect attempts, shutting
      down the first subflow via:
      
      MPTCP_PM_CMD_DEL_ADDR -> mptcp_nl_remove_id_zero_address ->
        mptcp_pm_nl_rm_subflow_received -> mptcp_close_ssk
      
      and transitioning such subflow to FIN_WAIT1 status before the syn-ack
      packet is processed. The MPTCP code does not react to such state change,
      leaving the connection in not-fallback status and the subflow handshake
      uncompleted, triggering the following splat:
      
        WARNING: CPU: 0 PID: 10630 at net/mptcp/subflow.c:1405 subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
        Modules linked in:
        CPU: 0 PID: 10630 Comm: kworker/u4:11 Not tainted 6.6.0-syzkaller-14500-g1c410411 #0
        Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
        Workqueue: bat_events batadv_nc_worker
        RIP: 0010:subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
        Code: 18 89 ee e8 e3 d2 21 f7 40 84 ed 75 1f e8 a9 d7 21 f7 44 89 fe bf 07 00 00 00 e8 0c d3 21 f7 41 83 ff 07 74 07 e8 91 d7 21 f7 <0f> 0b e8 8a d7 21 f7 48 89 df e8 d2 b2 ff ff 31 ff 89 c5 89 c6 e8
        RSP: 0018:ffffc90000007448 EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff888031efc700 RCX: ffffffff8a65baf4
        RDX: ffff888043222140 RSI: ffffffff8a65baff RDI: 0000000000000005
        RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
        R10: 000000000000000b R11: 0000000000000000 R12: 1ffff92000000e89
        R13: ffff88807a534d80 R14: ffff888021c11a00 R15: 000000000000000b
        FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007fa19a0ffc81 CR3: 000000007a2db000 CR4: 00000000003506f0
        DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Call Trace:
         <IRQ>
         tcp_data_ready+0x14c/0x5b0 net/ipv4/tcp_input.c:5128
         tcp_data_queue+0x19c3/0x5190 net/ipv4/tcp_input.c:5208
         tcp_rcv_state_process+0x11ef/0x4e10 net/ipv4/tcp_input.c:6844
         tcp_v4_do_rcv+0x369/0xa10 net/ipv4/tcp_ipv4.c:1929
         tcp_v4_rcv+0x3888/0x3b30 net/ipv4/tcp_ipv4.c:2329
         ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205
         ip_local_deliver_finish+0x2e4/0x510 net/ipv4/ip_input.c:233
         NF_HOOK include/linux/netfilter.h:314 [inline]
         NF_HOOK include/linux/netfilter.h:308 [inline]
         ip_local_deliver+0x1b6/0x550 net/ipv4/ip_input.c:254
         dst_input include/net/dst.h:461 [inline]
         ip_rcv_finish+0x1c4/0x2e0 net/ipv4/ip_input.c:449
         NF_HOOK include/linux/netfilter.h:314 [inline]
         NF_HOOK include/linux/netfilter.h:308 [inline]
         ip_rcv+0xce/0x440 net/ipv4/ip_input.c:569
         __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527
         __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641
         process_backlog+0x101/0x6b0 net/core/dev.c:5969
         __napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6531
         napi_poll net/core/dev.c:6600 [inline]
         net_rx_action+0x956/0xe90 net/core/dev.c:6733
         __do_softirq+0x21a/0x968 kernel/softirq.c:553
         do_softirq kernel/softirq.c:454 [inline]
         do_softirq+0xaa/0xe0 kernel/softirq.c:441
         </IRQ>
         <TASK>
         __local_bh_enable_ip+0xf8/0x120 kernel/softirq.c:381
         spin_unlock_bh include/linux/spinlock.h:396 [inline]
         batadv_nc_purge_paths+0x1ce/0x3c0 net/batman-adv/network-coding.c:471
         batadv_nc_worker+0x9b1/0x10e0 net/batman-adv/network-coding.c:722
         process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
         process_scheduled_works kernel/workqueue.c:2703 [inline]
         worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
         kthread+0x33c/0x440 kernel/kthread.c:388
         ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
         ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
         </TASK>
      
      To address the issue, catch the racing subflow state change and
      use it to cause the MPTCP fallback. Such fallback is also used to
      cause the first subflow state propagation to the msk socket via
      mptcp_set_connected(). After this change, the first subflow can
      additionally propagate the TCP_FIN_WAIT1 state, so rename the
      helper accordingly.
      
      Finally, if the state propagation is delayed to the msk release
      callback, the first subflow can change to a different state in between.
      Cache the relevant target state in a new msk-level field and use
      such value to update the msk state at release time.
      
      Fixes: 1e777f39 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
      Cc: stable@vger.kernel.org
      Reported-by: <syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarMat Martineau <martineau@kernel.org>
      Signed-off-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4fd19a30
    • Geliang Tang's avatar
      selftests: mptcp: join: fix subflow_send_ack lookup · c8f021ee
      Geliang Tang authored
      MPC backups tests will skip unexpected sometimes (For example, when
      compiling kernel with an older version of gcc, such as gcc-8), since
      static functions like mptcp_subflow_send_ack also be listed in
      /proc/kallsyms, with a 't' in front of it, not 'T' ('T' is for a global
      function):
      
       > grep "mptcp_subflow_send_ack" /proc/kallsyms
      
       0000000000000000 T __pfx___mptcp_subflow_send_ack
       0000000000000000 T __mptcp_subflow_send_ack
       0000000000000000 t __pfx_mptcp_subflow_send_ack
       0000000000000000 t mptcp_subflow_send_ack
      
      In this case, mptcp_lib_kallsyms_doesnt_have "mptcp_subflow_send_ack$"
      will be false, MPC backups tests will skip. This is not what we expected.
      
      The correct logic here should be: if mptcp_subflow_send_ack is not a
      global function in /proc/kallsyms, do these MPC backups tests. So a 'T'
      must be added in front of mptcp_subflow_send_ack.
      
      Fixes: 632978f0 ("selftests: mptcp: join: skip MPC backups tests if not supported")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGeliang Tang <geliang.tang@linux.dev>
      Reviewed-by: default avatarMat Martineau <martineau@kernel.org>
      Signed-off-by: default avatarMatthieu Baerts <matttbe@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c8f021ee
  4. 16 Dec, 2023 1 commit
    • Daniel Golle's avatar
      net: phy: skip LED triggers on PHYs on SFP modules · b1dfc0f7
      Daniel Golle authored
      Calling led_trigger_register() when attaching a PHY located on an SFP
      module potentially (and practically) leads into a deadlock.
      Fix this by not calling led_trigger_register() for PHYs localted on SFP
      modules as such modules actually never got any LEDs.
      
      ======================================================
      WARNING: possible circular locking dependency detected
      6.7.0-rc4-next-20231208+ #0 Tainted: G           O
      ------------------------------------------------------
      kworker/u8:2/43 is trying to acquire lock:
      ffffffc08108c4e8 (triggers_list_lock){++++}-{3:3}, at: led_trigger_register+0x4c/0x1a8
      
      but task is already holding lock:
      ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #3 (&sfp->sm_mutex){+.+.}-{3:3}:
             __mutex_lock+0x88/0x7a0
             mutex_lock_nested+0x20/0x28
             cleanup_module+0x2ae0/0x3120 [sfp]
             sfp_register_bus+0x5c/0x9c
             sfp_register_socket+0x48/0xd4
             cleanup_module+0x271c/0x3120 [sfp]
             platform_probe+0x64/0xb8
             really_probe+0x17c/0x3c0
             __driver_probe_device+0x78/0x164
             driver_probe_device+0x3c/0xd4
             __driver_attach+0xec/0x1f0
             bus_for_each_dev+0x60/0xa0
             driver_attach+0x20/0x28
             bus_add_driver+0x108/0x208
             driver_register+0x5c/0x118
             __platform_driver_register+0x24/0x2c
             init_module+0x28/0xa7c [sfp]
             do_one_initcall+0x70/0x2ec
             do_init_module+0x54/0x1e4
             load_module+0x1b78/0x1c8c
             __do_sys_init_module+0x1bc/0x2cc
             __arm64_sys_init_module+0x18/0x20
             invoke_syscall.constprop.0+0x4c/0xdc
             do_el0_svc+0x3c/0xbc
             el0_svc+0x34/0x80
             el0t_64_sync_handler+0xf8/0x124
             el0t_64_sync+0x150/0x154
      
      -> #2 (rtnl_mutex){+.+.}-{3:3}:
             __mutex_lock+0x88/0x7a0
             mutex_lock_nested+0x20/0x28
             rtnl_lock+0x18/0x20
             set_device_name+0x30/0x130
             netdev_trig_activate+0x13c/0x1ac
             led_trigger_set+0x118/0x234
             led_trigger_write+0x104/0x17c
             sysfs_kf_bin_write+0x64/0x80
             kernfs_fop_write_iter+0x128/0x1b4
             vfs_write+0x178/0x2a4
             ksys_write+0x58/0xd4
             __arm64_sys_write+0x18/0x20
             invoke_syscall.constprop.0+0x4c/0xdc
             do_el0_svc+0x3c/0xbc
             el0_svc+0x34/0x80
             el0t_64_sync_handler+0xf8/0x124
             el0t_64_sync+0x150/0x154
      
      -> #1 (&led_cdev->trigger_lock){++++}-{3:3}:
             down_write+0x4c/0x13c
             led_trigger_write+0xf8/0x17c
             sysfs_kf_bin_write+0x64/0x80
             kernfs_fop_write_iter+0x128/0x1b4
             vfs_write+0x178/0x2a4
             ksys_write+0x58/0xd4
             __arm64_sys_write+0x18/0x20
             invoke_syscall.constprop.0+0x4c/0xdc
             do_el0_svc+0x3c/0xbc
             el0_svc+0x34/0x80
             el0t_64_sync_handler+0xf8/0x124
             el0t_64_sync+0x150/0x154
      
      -> #0 (triggers_list_lock){++++}-{3:3}:
             __lock_acquire+0x12a0/0x2014
             lock_acquire+0x100/0x2ac
             down_write+0x4c/0x13c
             led_trigger_register+0x4c/0x1a8
             phy_led_triggers_register+0x9c/0x214
             phy_attach_direct+0x154/0x36c
             phylink_attach_phy+0x30/0x60
             phylink_sfp_connect_phy+0x140/0x510
             sfp_add_phy+0x34/0x50
             init_module+0x15c/0xa7c [sfp]
             cleanup_module+0x1d94/0x3120 [sfp]
             cleanup_module+0x2bb4/0x3120 [sfp]
             process_one_work+0x1f8/0x4ec
             worker_thread+0x1e8/0x3d8
             kthread+0x104/0x110
             ret_from_fork+0x10/0x20
      
      other info that might help us debug this:
      
      Chain exists of:
        triggers_list_lock --> rtnl_mutex --> &sfp->sm_mutex
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&sfp->sm_mutex);
                                     lock(rtnl_mutex);
                                     lock(&sfp->sm_mutex);
        lock(triggers_list_lock);
      
       *** DEADLOCK ***
      
      4 locks held by kworker/u8:2/43:
       #0: ffffff80c000f938 ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec
       #1: ffffffc08214bde8 ((work_completion)(&(&sfp->timeout)->work)){+.+.}-{0:0}, at: process_one_work+0x150/0x4ec
       #2: ffffffc0810902f8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x18/0x20
       #3: ffffff80c5c6f318 (&sfp->sm_mutex){+.+.}-{3:3}, at: cleanup_module+0x2ba8/0x3120 [sfp]
      
      stack backtrace:
      CPU: 0 PID: 43 Comm: kworker/u8:2 Tainted: G           O       6.7.0-rc4-next-20231208+ #0
      Hardware name: Bananapi BPI-R4 (DT)
      Workqueue: events_power_efficient cleanup_module [sfp]
      Call trace:
       dump_backtrace+0xa8/0x10c
       show_stack+0x14/0x1c
       dump_stack_lvl+0x5c/0xa0
       dump_stack+0x14/0x1c
       print_circular_bug+0x328/0x430
       check_noncircular+0x124/0x134
       __lock_acquire+0x12a0/0x2014
       lock_acquire+0x100/0x2ac
       down_write+0x4c/0x13c
       led_trigger_register+0x4c/0x1a8
       phy_led_triggers_register+0x9c/0x214
       phy_attach_direct+0x154/0x36c
       phylink_attach_phy+0x30/0x60
       phylink_sfp_connect_phy+0x140/0x510
       sfp_add_phy+0x34/0x50
       init_module+0x15c/0xa7c [sfp]
       cleanup_module+0x1d94/0x3120 [sfp]
       cleanup_module+0x2bb4/0x3120 [sfp]
       process_one_work+0x1f8/0x4ec
       worker_thread+0x1e8/0x3d8
       kthread+0x104/0x110
       ret_from_fork+0x10/0x20
      Signed-off-by: default avatarDaniel Golle <daniel@makrotopia.org>
      Fixes: 01e5b728 ("net: phy: Add a binding for PHY LEDs")
      Link: https://lore.kernel.org/r/102a9dce38bdf00215735d04cd4704458273ad9c.1702339354.git.daniel@makrotopia.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b1dfc0f7
  5. 15 Dec, 2023 12 commits
    • Andy Gospodarek's avatar
      bnxt_en: do not map packet buffers twice · 23c93c3b
      Andy Gospodarek authored
      Remove double-mapping of DMA buffers as it can prevent page pool entries
      from being freed.  Mapping is managed by page pool infrastructure and
      was previously managed by the driver in __bnxt_alloc_rx_page before
      allowing the page pool infrastructure to manage it.
      
      Fixes: 578fcfd2 ("bnxt_en: Let the page pool manage the DMA mapping")
      Reviewed-by: default avatarSomnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: default avatarAndy Gospodarek <andrew.gospodarek@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Reviewed-by: default avatarDavid Wei <dw@davidwei.uk>
      Link: https://lore.kernel.org/r/20231214213138.98095-1-michael.chan@broadcom.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      23c93c3b
    • Eric Dumazet's avatar
      net/rose: fix races in rose_kill_by_device() · 64b8bc7d
      Eric Dumazet authored
      syzbot found an interesting netdev refcounting issue in
      net/rose/af_rose.c, thanks to CONFIG_NET_DEV_REFCNT_TRACKER=y [1]
      
      Problem is that rose_kill_by_device() can change rose->device
      while other threads do not expect the pointer to be changed.
      
      We have to first collect sockets in a temporary array,
      then perform the changes while holding the socket
      lock and rose_list_lock spinlock (in this order)
      
      Change rose_release() to also acquire rose_list_lock
      before releasing the netdev refcount.
      
      [1]
      
      [ 1185.055088][ T7889] ref_tracker: reference already released.
      [ 1185.061476][ T7889] ref_tracker: allocated in:
      [ 1185.066081][ T7889]  rose_bind+0x4ab/0xd10
      [ 1185.070446][ T7889]  __sys_bind+0x1ec/0x220
      [ 1185.074818][ T7889]  __x64_sys_bind+0x72/0xb0
      [ 1185.079356][ T7889]  do_syscall_64+0x40/0x110
      [ 1185.083897][ T7889]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
      [ 1185.089835][ T7889] ref_tracker: freed in:
      [ 1185.094088][ T7889]  rose_release+0x2f5/0x570
      [ 1185.098629][ T7889]  __sock_release+0xae/0x260
      [ 1185.103262][ T7889]  sock_close+0x1c/0x20
      [ 1185.107453][ T7889]  __fput+0x270/0xbb0
      [ 1185.111467][ T7889]  task_work_run+0x14d/0x240
      [ 1185.116085][ T7889]  get_signal+0x106f/0x2790
      [ 1185.120622][ T7889]  arch_do_signal_or_restart+0x90/0x7f0
      [ 1185.126205][ T7889]  exit_to_user_mode_prepare+0x121/0x240
      [ 1185.131846][ T7889]  syscall_exit_to_user_mode+0x1e/0x60
      [ 1185.137293][ T7889]  do_syscall_64+0x4d/0x110
      [ 1185.141783][ T7889]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
      [ 1185.148085][ T7889] ------------[ cut here ]------------
      
      WARNING: CPU: 1 PID: 7889 at lib/ref_tracker.c:255 ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
      Modules linked in:
      CPU: 1 PID: 7889 Comm: syz-executor.2 Not tainted 6.7.0-rc4-syzkaller-00162-g65c95f78 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
      RIP: 0010:ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
      Code: 00 44 8b 6b 18 31 ff 44 89 ee e8 21 62 f5 fc 45 85 ed 0f 85 a6 00 00 00 e8 a3 66 f5 fc 48 8b 34 24 48 89 ef e8 27 5f f1 05 90 <0f> 0b 90 bb ea ff ff ff e9 52 fd ff ff e8 84 66 f5 fc 4c 8d 6d 44
      RSP: 0018:ffffc90004917850 EFLAGS: 00010202
      RAX: 0000000000000201 RBX: ffff88802618f4c0 RCX: 0000000000000000
      RDX: 0000000000000202 RSI: ffffffff8accb920 RDI: 0000000000000001
      RBP: ffff8880269ea5b8 R08: 0000000000000001 R09: fffffbfff23e35f6
      R10: ffffffff91f1afb7 R11: 0000000000000001 R12: 1ffff92000922f0c
      R13: 0000000005a2039b R14: ffff88802618f4d8 R15: 00000000ffffffff
      FS: 00007f0a720ef6c0(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f43a819d988 CR3: 0000000076c64000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      netdev_tracker_free include/linux/netdevice.h:4127 [inline]
      netdev_put include/linux/netdevice.h:4144 [inline]
      netdev_put include/linux/netdevice.h:4140 [inline]
      rose_kill_by_device net/rose/af_rose.c:195 [inline]
      rose_device_event+0x25d/0x330 net/rose/af_rose.c:218
      notifier_call_chain+0xb6/0x3b0 kernel/notifier.c:93
      call_netdevice_notifiers_info+0xbe/0x130 net/core/dev.c:1967
      call_netdevice_notifiers_extack net/core/dev.c:2005 [inline]
      call_netdevice_notifiers net/core/dev.c:2019 [inline]
      __dev_notify_flags+0x1f5/0x2e0 net/core/dev.c:8646
      dev_change_flags+0x122/0x170 net/core/dev.c:8682
      dev_ifsioc+0x9ad/0x1090 net/core/dev_ioctl.c:529
      dev_ioctl+0x224/0x1090 net/core/dev_ioctl.c:786
      sock_do_ioctl+0x198/0x270 net/socket.c:1234
      sock_ioctl+0x22e/0x6b0 net/socket.c:1339
      vfs_ioctl fs/ioctl.c:51 [inline]
      __do_sys_ioctl fs/ioctl.c:871 [inline]
      __se_sys_ioctl fs/ioctl.c:857 [inline]
      __x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7f0a7147cba9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f0a720ef0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00007f0a7159bf80 RCX: 00007f0a7147cba9
      RDX: 0000000020000040 RSI: 0000000000008914 RDI: 0000000000000004
      RBP: 00007f0a714c847a R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000000b R14: 00007f0a7159bf80 R15: 00007ffc8bb3a5f8
      </TASK>
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Bernard Pidoux <f6bvp@free.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      64b8bc7d
    • Zhipeng Lu's avatar
      ethernet: atheros: fix a memleak in atl1e_setup_ring_resources · 309fdb1c
      Zhipeng Lu authored
      In the error handling of 'offset > adapter->ring_size', the
      tx_ring->tx_buffer allocated by kzalloc should be freed,
      instead of 'goto failed' instantly.
      
      Fixes: a6a53252 ("atl1e: Atheros L1E Gigabit Ethernet driver")
      Signed-off-by: default avatarZhipeng Lu <alexious@zju.edu.cn>
      Reviewed-by: default avatarSuman Ghosh <sumang@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      309fdb1c
    • Eric Dumazet's avatar
      net: sched: ife: fix potential use-after-free · 19391a2c
      Eric Dumazet authored
      ife_decode() calls pskb_may_pull() two times, we need to reload
      ifehdr after the second one, or risk use-after-free as reported
      by syzbot:
      
      BUG: KASAN: slab-use-after-free in __ife_tlv_meta_valid net/ife/ife.c:108 [inline]
      BUG: KASAN: slab-use-after-free in ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
      Read of size 2 at addr ffff88802d7300a4 by task syz-executor.5/22323
      
      CPU: 0 PID: 22323 Comm: syz-executor.5 Not tainted 6.7.0-rc3-syzkaller-00804-g074ac38d #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      __ife_tlv_meta_valid net/ife/ife.c:108 [inline]
      ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
      tcf_ife_decode net/sched/act_ife.c:739 [inline]
      tcf_ife_act+0x4e3/0x1cd0 net/sched/act_ife.c:879
      tc_act include/net/tc_wrapper.h:221 [inline]
      tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
      tcf_exts_exec include/net/pkt_cls.h:344 [inline]
      mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
      tc_classify include/net/tc_wrapper.h:227 [inline]
      __tcf_classify net/sched/cls_api.c:1703 [inline]
      tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
      hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
      hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
      dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
      __dev_xmit_skb net/core/dev.c:3828 [inline]
      __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
      dev_queue_xmit include/linux/netdevice.h:3165 [inline]
      packet_xmit+0x237/0x350 net/packet/af_packet.c:276
      packet_snd net/packet/af_packet.c:3081 [inline]
      packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      __sys_sendto+0x255/0x340 net/socket.c:2190
      __do_sys_sendto net/socket.c:2202 [inline]
      __se_sys_sendto net/socket.c:2198 [inline]
      __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7fe9acc7cae9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fe9ada450c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007fe9acd9bf80 RCX: 00007fe9acc7cae9
      RDX: 000000000000fce0 RSI: 00000000200002c0 RDI: 0000000000000003
      RBP: 00007fe9accc847a R08: 0000000020000140 R09: 0000000000000014
      R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000000b R14: 00007fe9acd9bf80 R15: 00007ffd5427ae78
      </TASK>
      
      Allocated by task 22323:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      ____kasan_kmalloc mm/kasan/common.c:374 [inline]
      __kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383
      kasan_kmalloc include/linux/kasan.h:198 [inline]
      __do_kmalloc_node mm/slab_common.c:1007 [inline]
      __kmalloc_node_track_caller+0x5a/0x90 mm/slab_common.c:1027
      kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
      __alloc_skb+0x12b/0x330 net/core/skbuff.c:651
      alloc_skb include/linux/skbuff.h:1298 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      packet_alloc_skb net/packet/af_packet.c:2930 [inline]
      packet_snd net/packet/af_packet.c:3024 [inline]
      packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      __sys_sendto+0x255/0x340 net/socket.c:2190
      __do_sys_sendto net/socket.c:2202 [inline]
      __se_sys_sendto net/socket.c:2198 [inline]
      __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      Freed by task 22323:
      kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
      kasan_set_track+0x25/0x30 mm/kasan/common.c:52
      kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
      ____kasan_slab_free mm/kasan/common.c:236 [inline]
      ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
      kasan_slab_free include/linux/kasan.h:164 [inline]
      slab_free_hook mm/slub.c:1800 [inline]
      slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
      slab_free mm/slub.c:3809 [inline]
      __kmem_cache_free+0xc0/0x180 mm/slub.c:3822
      skb_kfree_head net/core/skbuff.c:950 [inline]
      skb_free_head+0x110/0x1b0 net/core/skbuff.c:962
      pskb_expand_head+0x3c5/0x1170 net/core/skbuff.c:2130
      __pskb_pull_tail+0xe1/0x1830 net/core/skbuff.c:2655
      pskb_may_pull_reason include/linux/skbuff.h:2685 [inline]
      pskb_may_pull include/linux/skbuff.h:2693 [inline]
      ife_decode+0x394/0x4f0 net/ife/ife.c:82
      tcf_ife_decode net/sched/act_ife.c:727 [inline]
      tcf_ife_act+0x43b/0x1cd0 net/sched/act_ife.c:879
      tc_act include/net/tc_wrapper.h:221 [inline]
      tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
      tcf_exts_exec include/net/pkt_cls.h:344 [inline]
      mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
      tc_classify include/net/tc_wrapper.h:227 [inline]
      __tcf_classify net/sched/cls_api.c:1703 [inline]
      tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
      hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
      hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
      dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
      __dev_xmit_skb net/core/dev.c:3828 [inline]
      __dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
      dev_queue_xmit include/linux/netdevice.h:3165 [inline]
      packet_xmit+0x237/0x350 net/packet/af_packet.c:276
      packet_snd net/packet/af_packet.c:3081 [inline]
      packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      __sys_sendto+0x255/0x340 net/socket.c:2190
      __do_sys_sendto net/socket.c:2202 [inline]
      __se_sys_sendto net/socket.c:2198 [inline]
      __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      The buggy address belongs to the object at ffff88802d730000
      which belongs to the cache kmalloc-8k of size 8192
      The buggy address is located 164 bytes inside of
      freed 8192-byte region [ffff88802d730000, ffff88802d732000)
      
      The buggy address belongs to the physical page:
      page:ffffea0000b5cc00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d730
      head:ffffea0000b5cc00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
      flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff)
      page_type: 0xffffffff()
      raw: 00fff00000000840 ffff888013042280 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      page_owner tracks the page as allocated
      page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 22323, tgid 22320 (syz-executor.5), ts 950317230369, free_ts 950233467461
      set_page_owner include/linux/page_owner.h:31 [inline]
      post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1544
      prep_new_page mm/page_alloc.c:1551 [inline]
      get_page_from_freelist+0xa28/0x3730 mm/page_alloc.c:3319
      __alloc_pages+0x22e/0x2420 mm/page_alloc.c:4575
      alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
      alloc_slab_page mm/slub.c:1870 [inline]
      allocate_slab mm/slub.c:2017 [inline]
      new_slab+0x283/0x3c0 mm/slub.c:2070
      ___slab_alloc+0x979/0x1500 mm/slub.c:3223
      __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
      __slab_alloc_node mm/slub.c:3375 [inline]
      slab_alloc_node mm/slub.c:3468 [inline]
      __kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517
      __do_kmalloc_node mm/slab_common.c:1006 [inline]
      __kmalloc_node_track_caller+0x4a/0x90 mm/slab_common.c:1027
      kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
      __alloc_skb+0x12b/0x330 net/core/skbuff.c:651
      alloc_skb include/linux/skbuff.h:1298 [inline]
      alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
      sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
      packet_alloc_skb net/packet/af_packet.c:2930 [inline]
      packet_snd net/packet/af_packet.c:3024 [inline]
      packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      __sys_sendto+0x255/0x340 net/socket.c:2190
      page last free stack trace:
      reset_page_owner include/linux/page_owner.h:24 [inline]
      free_pages_prepare mm/page_alloc.c:1144 [inline]
      free_unref_page_prepare+0x53c/0xb80 mm/page_alloc.c:2354
      free_unref_page+0x33/0x3b0 mm/page_alloc.c:2494
      __unfreeze_partials+0x226/0x240 mm/slub.c:2655
      qlink_free mm/kasan/quarantine.c:168 [inline]
      qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
      kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
      __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
      kasan_slab_alloc include/linux/kasan.h:188 [inline]
      slab_post_alloc_hook mm/slab.h:763 [inline]
      slab_alloc_node mm/slub.c:3478 [inline]
      slab_alloc mm/slub.c:3486 [inline]
      __kmem_cache_alloc_lru mm/slub.c:3493 [inline]
      kmem_cache_alloc_lru+0x219/0x6f0 mm/slub.c:3509
      alloc_inode_sb include/linux/fs.h:2937 [inline]
      ext4_alloc_inode+0x28/0x650 fs/ext4/super.c:1408
      alloc_inode+0x5d/0x220 fs/inode.c:261
      new_inode_pseudo fs/inode.c:1006 [inline]
      new_inode+0x22/0x260 fs/inode.c:1032
      __ext4_new_inode+0x333/0x5200 fs/ext4/ialloc.c:958
      ext4_symlink+0x5d7/0xa20 fs/ext4/namei.c:3398
      vfs_symlink fs/namei.c:4464 [inline]
      vfs_symlink+0x3e5/0x620 fs/namei.c:4448
      do_symlinkat+0x25f/0x310 fs/namei.c:4490
      __do_sys_symlinkat fs/namei.c:4506 [inline]
      __se_sys_symlinkat fs/namei.c:4503 [inline]
      __x64_sys_symlinkat+0x97/0xc0 fs/namei.c:4503
      do_syscall_x64 arch/x86/entry/common.c:51 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
      
      Fixes: d57493d6 ("net: sched: ife: check on metadata length")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Alexander Aring <aahringo@redhat.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      19391a2c
    • Shigeru Yoshida's avatar
      net: Return error from sk_stream_wait_connect() if sk_wait_event() fails · cac23b7d
      Shigeru Yoshida authored
      The following NULL pointer dereference issue occurred:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      <...>
      RIP: 0010:ccid_hc_tx_send_packet net/dccp/ccid.h:166 [inline]
      RIP: 0010:dccp_write_xmit+0x49/0x140 net/dccp/output.c:356
      <...>
      Call Trace:
       <TASK>
       dccp_sendmsg+0x642/0x7e0 net/dccp/proto.c:801
       inet_sendmsg+0x63/0x90 net/ipv4/af_inet.c:846
       sock_sendmsg_nosec net/socket.c:730 [inline]
       __sock_sendmsg+0x83/0xe0 net/socket.c:745
       ____sys_sendmsg+0x443/0x510 net/socket.c:2558
       ___sys_sendmsg+0xe5/0x150 net/socket.c:2612
       __sys_sendmsg+0xa6/0x120 net/socket.c:2641
       __do_sys_sendmsg net/socket.c:2650 [inline]
       __se_sys_sendmsg net/socket.c:2648 [inline]
       __x64_sys_sendmsg+0x45/0x50 net/socket.c:2648
       do_syscall_x64 arch/x86/entry/common.c:51 [inline]
       do_syscall_64+0x43/0x110 arch/x86/entry/common.c:82
       entry_SYSCALL_64_after_hwframe+0x63/0x6b
      
      sk_wait_event() returns an error (-EPIPE) if disconnect() is called on the
      socket waiting for the event. However, sk_stream_wait_connect() returns
      success, i.e. zero, even if sk_wait_event() returns -EPIPE, so a function
      that waits for a connection with sk_stream_wait_connect() may misbehave.
      
      In the case of the above DCCP issue, dccp_sendmsg() is waiting for the
      connection. If disconnect() is called in concurrently, the above issue
      occurs.
      
      This patch fixes the issue by returning error from sk_stream_wait_connect()
      if sk_wait_event() fails.
      
      Fixes: 419ce133 ("tcp: allow again tcp_disconnect() when threads are waiting")
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reported-by: syzbot+c71bc336c5061153b502@syzkaller.appspotmail.com
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cac23b7d
    • Suman Ghosh's avatar
      octeontx2-pf: Fix graceful exit during PFC configuration failure · 8c97ab54
      Suman Ghosh authored
      During PFC configuration failure the code was not handling a graceful
      exit. This patch fixes the same and add proper code for a graceful exit.
      
      Fixes: 99c969a8 ("octeontx2-pf: Add egress PFC support")
      Signed-off-by: default avatarSuman Ghosh <sumang@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c97ab54
    • duanqiangwen's avatar
      net: libwx: fix memory leak on free page · 738b54b9
      duanqiangwen authored
      ifconfig ethx up, will set page->refcount larger than 1,
      and then ifconfig ethx down, calling __page_frag_cache_drain()
      to free pages, it is not compatible with page pool.
      So deleting codes which changing page->refcount.
      
      Fixes: 3c47e8ae ("net: libwx: Support to receive packets in NAPI")
      Signed-off-by: default avatarduanqiangwen <duanqiangwen@net-swift.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      738b54b9
    • Jakub Kicinski's avatar
      Merge tag 'wireless-2023-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · 0225191a
      Jakub Kicinski authored
      Johannes Berg says:
      
      ====================
       * add (and fix) certificate for regdb handover to Chen-Yu Tsai
       * fix rfkill GPIO handling
       * a few driver (iwlwifi, mt76) crash fixes
       * logic fixes in the stack
      
      * tag 'wireless-2023-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        wifi: cfg80211: fix certs build to not depend on file order
        wifi: mt76: fix crash with WED rx support enabled
        wifi: iwlwifi: pcie: avoid a NULL pointer dereference
        wifi: mac80211: mesh_plink: fix matches_local logic
        wifi: mac80211: mesh: check element parsing succeeded
        wifi: mac80211: check defragmentation succeeded
        wifi: mac80211: don't re-add debugfs during reconfig
        net: rfkill: gpio: set GPIO direction
        wifi: mac80211: check if the existing link config remains unchanged
        wifi: cfg80211: Add my certificate
        wifi: iwlwifi: pcie: add another missing bh-disable for rxq->lock
        wifi: ieee80211: don't require protected vendor action frames
      ====================
      
      Link: https://lore.kernel.org/r/20231214111515.60626-3-johannes@sipsolutions.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      0225191a
    • Jakub Kicinski's avatar
      Merge tag 'mlx5-fixes-2023-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · e9b797dc
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2023-12-13
      
      This series provides bug fixes to mlx5 driver.
      
      * tag 'mlx5-fixes-2023-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        net/mlx5e: Correct snprintf truncation handling for fw_version buffer used by representors
        net/mlx5e: Correct snprintf truncation handling for fw_version buffer
        net/mlx5e: Fix error codes in alloc_branch_attr()
        net/mlx5e: Fix error code in mlx5e_tc_action_miss_mapping_get()
        net/mlx5: Refactor mlx5_flow_destination->rep pointer to vport num
        net/mlx5: Fix fw tracer first block check
        net/mlx5e: XDP, Drop fragmented packets larger than MTU size
        net/mlx5e: Decrease num_block_tc when unblock tc offload
        net/mlx5e: Fix overrun reported by coverity
        net/mlx5e: fix a potential double-free in fs_udp_create_groups
        net/mlx5e: Fix a race in command alloc flow
        net/mlx5e: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()
        net/mlx5e: fix double free of encap_header
        Revert "net/mlx5e: fix double free of encap_header"
        Revert "net/mlx5e: fix double free of encap_header in update funcs"
      ====================
      
      Link: https://lore.kernel.org/r/20231214012505.42666-1-saeed@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e9b797dc
    • Jakub Kicinski's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 2c1a4185
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-12-13 (ice, i40e)
      
      This series contains updates to ice and i40e drivers.
      
      Michal Schmidt prevents possible out-of-bounds access for ice.
      
      Ivan Vecera corrects value for MDIO clause 45 on i40e.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        i40e: Fix ST code value for Clause 45
        ice: fix theoretical out-of-bounds access in ethtool link modes
      ====================
      
      Link: https://lore.kernel.org/r/20231213220827.1311772-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2c1a4185
    • Vladimir Oltean's avatar
      net: mscc: ocelot: fix pMAC TX RMON stats for bucket 256-511 and above · 70f010da
      Vladimir Oltean authored
      The typo from ocelot_port_rmon_stats_cb() was also carried over to
      ocelot_port_pmac_rmon_stats_cb() as well, leading to incorrect TX RMON
      stats for the pMAC too.
      
      Fixes: ab3f97a9 ("net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <florian.fainelli@broadcom.com>
      Link: https://lore.kernel.org/r/20231214000902.545625-2-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      70f010da
    • Vladimir Oltean's avatar
      net: mscc: ocelot: fix eMAC TX RMON stats for bucket 256-511 and above · 52eda464
      Vladimir Oltean authored
      There is a typo in the driver due to which we report incorrect TX RMON
      counters for the 256-511 octet bucket and all the other buckets larger
      than that.
      
      Bug found with the selftest at
      https://patchwork.kernel.org/project/netdevbpf/patch/20231211223346.2497157-9-tobias@waldekranz.com/
      
      Fixes: e32036e1 ("net: mscc: ocelot: add support for all sorts of standardized counters present in DSA")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: default avatarFlorian Fainelli <florian.fainelli@broadcom.com>
      Link: https://lore.kernel.org/r/20231214000902.545625-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      52eda464
  6. 14 Dec, 2023 15 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · c7402612
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
      "Current release - regressions:
      
         - tcp: fix tcp_disordered_ack() vs usec TS resolution
      
        Current release - new code bugs:
      
         - dpll: sanitize possible null pointer dereference in
           dpll_pin_parent_pin_set()
      
         - eth: octeon_ep: initialise control mbox tasks before using APIs
      
        Previous releases - regressions:
      
         - io_uring/af_unix: disable sending io_uring over sockets
      
         - eth: mlx5e:
             - TC, don't offload post action rule if not supported
             - fix possible deadlock on mlx5e_tx_timeout_work
      
         - eth: iavf: fix iavf_shutdown to call iavf_remove instead iavf_close
      
         - eth: bnxt_en: fix skb recycling logic in bnxt_deliver_skb()
      
         - eth: ena: fix DMA syncing in XDP path when SWIOTLB is on
      
         - eth: team: fix use-after-free when an option instance allocation
           fails
      
        Previous releases - always broken:
      
         - neighbour: don't let neigh_forced_gc() disable preemption for long
      
         - net: prevent mss overflow in skb_segment()
      
         - ipv6: support reporting otherwise unknown prefix flags in
           RTM_NEWPREFIX
      
         - tcp: remove acked SYN flag from packet in the transmit queue
           correctly
      
         - eth: octeontx2-af:
             - fix a use-after-free in rvu_nix_register_reporters
             - fix promisc mcam entry action
      
         - eth: dwmac-loongson: make sure MDIO is initialized before use
      
         - eth: atlantic: fix double free in ring reinit logic"
      
      * tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
        net: atlantic: fix double free in ring reinit logic
        appletalk: Fix Use-After-Free in atalk_ioctl
        net: stmmac: Handle disabled MDIO busses from devicetree
        net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX
        dpaa2-switch: do not ask for MDB, VLAN and FDB replay
        dpaa2-switch: fix size of the dma_unmap
        net: prevent mss overflow in skb_segment()
        vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space()
        Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set"
        MIPS: dts: loongson: drop incorrect dwmac fallback compatible
        stmmac: dwmac-loongson: drop useless check for compatible fallback
        stmmac: dwmac-loongson: Make sure MDIO is initialized before use
        tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set
        dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set()
        net: ena: Fix XDP redirection error
        net: ena: Fix DMA syncing in XDP path when SWIOTLB is on
        net: ena: Fix xdp drops handling due to multibuf packets
        net: ena: Destroy correct number of xdp queues upon failure
        net: Remove acked SYN flag from packet in the transmit queue correctly
        qed: Fix a potential use-after-free in qed_cxt_tables_alloc
        ...
      c7402612
    • Linus Torvalds's avatar
      Merge tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · bdb2701f
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
        "Some fixes to quota accounting code, mostly around error handling and
         correctness:
      
         - free reserves on various error paths, after IO errors or
           transaction abort
      
         - don't clear reserved range at the folio release time, it'll be
           properly cleared after final write
      
         - fix integer overflow due to int used when passing around size of
           freed reservations
      
         - fix a regression in squota accounting that missed some cases with
           delayed refs"
      
      * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: ensure releasing squota reserve on head refs
        btrfs: don't clear qgroup reserved bit in release_folio
        btrfs: free qgroup pertrans reserve on transaction abort
        btrfs: fix qgroup_free_reserved_data int overflow
        btrfs: free qgroup reserve when ORDERED_IOERR is set
      bdb2701f
    • Igor Russkikh's avatar
      net: atlantic: fix double free in ring reinit logic · 7bb26ea7
      Igor Russkikh authored
      Driver has a logic leak in ring data allocation/free,
      where double free may happen in aq_ring_free if system is under
      stress and driver init/deinit is happening.
      
      The probability is higher to get this during suspend/resume cycle.
      
      Verification was done simulating same conditions with
      
          stress -m 2000 --vm-bytes 20M --vm-hang 10 --backoff 1000
          while true; do sudo ifconfig enp1s0 down; sudo ifconfig enp1s0 up; done
      
      Fixed by explicitly clearing pointers to NULL on deallocation
      
      Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
      Reported-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Closes: https://lore.kernel.org/netdev/CAHk-=wiZZi7FcvqVSUirHBjx0bBUZ4dFrMDVLc3+3HCrtq0rBA@mail.gmail.com/Signed-off-by: default avatarIgor Russkikh <irusskikh@marvell.com>
      Link: https://lore.kernel.org/r/20231213094044.22988-1-irusskikh@marvell.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      7bb26ea7
    • Hyunwoo Kim's avatar
      appletalk: Fix Use-After-Free in atalk_ioctl · 189ff167
      Hyunwoo Kim authored
      Because atalk_ioctl() accesses sk->sk_receive_queue
      without holding a sk->sk_receive_queue.lock, it can
      cause a race with atalk_recvmsg().
      A use-after-free for skb occurs with the following flow.
      ```
      atalk_ioctl() -> skb_peek()
      atalk_recvmsg() -> skb_recv_datagram() -> skb_free_datagram()
      ```
      Add sk->sk_receive_queue.lock to atalk_ioctl() to fix this issue.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarHyunwoo Kim <v4bel@theori.io>
      Link: https://lore.kernel.org/r/20231213041056.GA519680@v4bel-B760M-AORUS-ELITE-AXSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      189ff167
    • Andrew Halaney's avatar
      net: stmmac: Handle disabled MDIO busses from devicetree · e23c0d21
      Andrew Halaney authored
      Many hardware configurations have the MDIO bus disabled, and are instead
      using some other MDIO bus to talk to the MAC's phy.
      
      of_mdiobus_register() returns -ENODEV in this case. Let's handle it
      gracefully instead of failing to probe the MAC.
      
      Fixes: 47dd7a54 ("net: add support for STMicroelectronics Ethernet controllers.")
      Signed-off-by: default avatarAndrew Halaney <ahalaney@redhat.com>
      Reviewed-by: default avatarSerge Semin <fancer.lancer@gmail.com>
      Link: https://lore.kernel.org/r/20231212-b4-stmmac-handle-mdio-enodev-v2-1-600171acf79f@redhat.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e23c0d21
    • Sneh Shah's avatar
      net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX · 981d947b
      Sneh Shah authored
      In 10M SGMII mode all the packets are being dropped due to wrong Rx clock.
      SGMII 10MBPS mode needs RX clock divider programmed to avoid drops in Rx.
      Update configure SGMII function with Rx clk divider programming.
      
      Fixes: 463120c3 ("net: stmmac: dwmac-qcom-ethqos: add support for SGMII")
      Tested-by: default avatarAndrew Halaney <ahalaney@redhat.com>
      Signed-off-by: default avatarSneh Shah <quic_snehshah@quicinc.com>
      Reviewed-by: default avatarBjorn Andersson <quic_bjorande@quicinc.com>
      Link: https://lore.kernel.org/r/20231212092208.22393-1-quic_snehshah@quicinc.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      981d947b
    • Johannes Berg's avatar
      wifi: cfg80211: fix certs build to not depend on file order · 3c2a8ebe
      Johannes Berg authored
      The file for the new certificate (Chen-Yu Tsai's) didn't
      end with a comma, so depending on the file order in the
      build rule, we'd end up with invalid C when concatenating
      the (now two) certificates. Fix that.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarBiju Das <biju.das.jz@bp.renesas.com>
      Reported-by: default avatarNaresh Kamboju <naresh.kamboju@linaro.org>
      Fixes: fb768d3b ("wifi: cfg80211: Add my certificate")
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      3c2a8ebe
    • Jakub Kicinski's avatar
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 89e0c646
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-12-12 (iavf)
      
      This series contains updates to iavf driver only.
      
      Piotr reworks Flow Director states to deal with issues in restoring
      filters.
      
      Slawomir fixes shutdown processing as it was missing needed calls.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        iavf: Fix iavf_shutdown to call iavf_remove instead iavf_close
        iavf: Handle ntuple on/off based on new state machines for flow director
        iavf: Introduce new state machines for flow director
      ====================
      
      Link: https://lore.kernel.org/r/20231212203613.513423-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      89e0c646
    • Jakub Kicinski's avatar
      Merge branch 'dpaa2-switch-various-fixes' · dc84bb19
      Jakub Kicinski authored
      Ioana Ciornei says:
      
      ====================
      dpaa2-switch: various fixes
      
      The first patch fixes the size passed to two dma_unmap_single() calls
      which was wrongly put as the size of the pointer.
      
      The second patch is new to this series and reverts the behavior of the
      dpaa2-switch driver to not ask for object replay upon offloading so that
      we avoid the errors encountered when a VLAN is installed multiple times
      on the same port.
      ====================
      
      Link: https://lore.kernel.org/r/20231212164326.2753457-1-ioana.ciornei@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      dc84bb19
    • Ioana Ciornei's avatar
      dpaa2-switch: do not ask for MDB, VLAN and FDB replay · f24a49a3
      Ioana Ciornei authored
      Starting with commit 4e51bf44 ("net: bridge: move the switchdev
      object replay helpers to "push" mode") the switchdev_bridge_port_offload()
      helper was extended with the intention to provide switchdev drivers easy
      access to object addition and deletion replays. This works by calling
      the replay helpers with non-NULL notifier blocks.
      
      In the same commit, the dpaa2-switch driver was updated so that it
      passes valid notifier blocks to the helper. At that moment, no
      regression was identified through testing.
      
      In the meantime, the blamed commit changed the behavior in terms of
      which ports get hit by the replay. Before this commit, only the initial
      port which identified itself as offloaded through
      switchdev_bridge_port_offload() got a replay of all port objects and
      FDBs. After this, the newly joining port will trigger a replay of
      objects on all bridge ports and on the bridge itself.
      
      This behavior leads to errors in dpaa2_switch_port_vlans_add() when a
      VLAN gets installed on the same interface multiple times.
      
      The intended mechanism to address this is to pass a non-NULL ctx to the
      switchdev_bridge_port_offload() helper and then check it against the
      port's private structure. But since the driver does not have any use for
      the replayed port objects and FDBs until it gains support for LAG
      offload, it's better to fix the issue by reverting the dpaa2-switch
      driver to not ask for replay. The pointers will be added back when we
      are prepared to ignore replays on unrelated ports.
      
      Fixes: b28d580e ("net: bridge: switchdev: replay all VLAN groups")
      Signed-off-by: default avatarIoana Ciornei <ioana.ciornei@nxp.com>
      Link: https://lore.kernel.org/r/20231212164326.2753457-3-ioana.ciornei@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      f24a49a3
    • Ioana Ciornei's avatar
      dpaa2-switch: fix size of the dma_unmap · 2aad7d41
      Ioana Ciornei authored
      The size of the DMA unmap was wrongly put as a sizeof of a pointer.
      Change the value of the DMA unmap to be the actual macro used for the
      allocation and the DMA map.
      
      Fixes: 1110318d ("dpaa2-switch: add tc flower hardware offload on ingress traffic")
      Signed-off-by: default avatarIoana Ciornei <ioana.ciornei@nxp.com>
      Link: https://lore.kernel.org/r/20231212164326.2753457-2-ioana.ciornei@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2aad7d41
    • Eric Dumazet's avatar
      net: prevent mss overflow in skb_segment() · 23d05d56
      Eric Dumazet authored
      Once again syzbot is able to crash the kernel in skb_segment() [1]
      
      GSO_BY_FRAGS is a forbidden value, but unfortunately the following
      computation in skb_segment() can reach it quite easily :
      
      	mss = mss * partial_segs;
      
      65535 = 3 * 5 * 17 * 257, so many initial values of mss can lead to
      a bad final result.
      
      Make sure to limit segmentation so that the new mss value is smaller
      than GSO_BY_FRAGS.
      
      [1]
      
      general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
      CPU: 1 PID: 5079 Comm: syz-executor993 Not tainted 6.7.0-rc4-syzkaller-00141-g1ae4cd3c #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
      RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
      Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
      RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
      RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
      RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
      R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
      R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
      FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      udp6_ufo_fragment+0xa0e/0xd00 net/ipv6/udp_offload.c:109
      ipv6_gso_segment+0x534/0x17e0 net/ipv6/ip6_offload.c:120
      skb_mac_gso_segment+0x290/0x610 net/core/gso.c:53
      __skb_gso_segment+0x339/0x710 net/core/gso.c:124
      skb_gso_segment include/net/gso.h:83 [inline]
      validate_xmit_skb+0x36c/0xeb0 net/core/dev.c:3626
      __dev_queue_xmit+0x6f3/0x3d60 net/core/dev.c:4338
      dev_queue_xmit include/linux/netdevice.h:3134 [inline]
      packet_xmit+0x257/0x380 net/packet/af_packet.c:276
      packet_snd net/packet/af_packet.c:3087 [inline]
      packet_sendmsg+0x24c6/0x5220 net/packet/af_packet.c:3119
      sock_sendmsg_nosec net/socket.c:730 [inline]
      __sock_sendmsg+0xd5/0x180 net/socket.c:745
      __sys_sendto+0x255/0x340 net/socket.c:2190
      __do_sys_sendto net/socket.c:2202 [inline]
      __se_sys_sendto net/socket.c:2198 [inline]
      __x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
      entry_SYSCALL_64_after_hwframe+0x63/0x6b
      RIP: 0033:0x7f8692032aa9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fff8d685418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8692032aa9
      RDX: 0000000000010048 RSI: 00000000200000c0 RDI: 0000000000000003
      RBP: 00000000000f4240 R08: 0000000020000540 R09: 0000000000000014
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff8d685480
      R13: 0000000000000001 R14: 00007fff8d685480 R15: 0000000000000003
      </TASK>
      Modules linked in:
      ---[ end trace 0000000000000000 ]---
      RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
      Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
      RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
      RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
      RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
      R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
      R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
      FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: 3953c46c ("sk_buff: allow segmenting based on frag sizes")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20231212164621.4131800-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      23d05d56
    • Nikolay Kuratov's avatar
      vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space() · 60316d7f
      Nikolay Kuratov authored
      We need to do signed arithmetic if we expect condition
      `if (bytes < 0)` to be possible
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE
      
      Fixes: 06a8fc78 ("VSOCK: Introduce virtio_vsock_common.ko")
      Signed-off-by: default avatarNikolay Kuratov <kniv@yandex-team.ru>
      Reviewed-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Link: https://lore.kernel.org/r/20231211162317.4116625-1-kniv@yandex-team.ruSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      60316d7f
    • Rahul Rameshbabu's avatar
      net/mlx5e: Correct snprintf truncation handling for fw_version buffer used by representors · b13559b7
      Rahul Rameshbabu authored
      snprintf returns the length of the formatted string, excluding the trailing
      null, without accounting for truncation. This means that is the return
      value is greater than or equal to the size parameter, the fw_version string
      was truncated.
      
      Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf
      Fixes: 1b2bd0c0 ("net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors")
      Signed-off-by: default avatarRahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      b13559b7
    • Rahul Rameshbabu's avatar
      net/mlx5e: Correct snprintf truncation handling for fw_version buffer · ad436b9c
      Rahul Rameshbabu authored
      snprintf returns the length of the formatted string, excluding the trailing
      null, without accounting for truncation. This means that is the return
      value is greater than or equal to the size parameter, the fw_version string
      was truncated.
      Reported-by: default avatarDavid Laight <David.Laight@ACULAB.COM>
      Closes: https://lore.kernel.org/netdev/81cae734ee1b4cde9b380a9a31006c1a@AcuMS.aculab.com/
      Link: https://docs.kernel.org/core-api/kernel-api.html#c.snprintf
      Fixes: 41e63c2b ("net/mlx5e: Check return value of snprintf writing to fw_version buffer")
      Signed-off-by: default avatarRahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      ad436b9c