1. 17 Oct, 2024 13 commits
    • Merge tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 07d6bf63
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Current release - new code bugs:
      
         - eth: mlx5: HWS, don't destroy more bwc queue locks than allocated
      
        Previous releases - regressions:
      
         - ipv4: give an IPv4 dev to blackhole_netdev
      
         - udp: compute L4 checksum as usual when not segmenting the skb
      
         - tcp/dccp: don't use timer_pending() in reqsk_queue_unlink().
      
         - eth: mlx5e: don't call cleanup on profile rollback failure
      
         - eth: microchip: vcap api: fix memory leaks in
           vcap_api_encode_rule_test()
      
         - eth: enetc: disable Tx BD rings after they are empty
      
         - eth: macb: avoid 20s boot delay by skipping MDIO bus registration
           for fixed-link PHY
      
        Previous releases - always broken:
      
         - posix-clock: fix missing timespec64 check in pc_clock_settime()
      
         - genetlink: hold RCU in genlmsg_mcast()
      
         - mptcp: prevent MPC handshake on port-based signal endpoints
      
         - eth: vmxnet3: fix packet corruption in vmxnet3_xdp_xmit_frame
      
         - eth: stmmac: dwmac-tegra: fix link bring-up sequence
      
         - eth: bcmasp: fix potential memory leak in bcmasp_xmit()
      
        Misc:
      
         - add Andrew Lunn as a co-maintainer of all networking drivers"
      
      * tag 'net-6.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
        net/mlx5e: Don't call cleanup on profile rollback failure
        net/mlx5: Unregister notifier on eswitch init failure
        net/mlx5: Fix command bitmask initialization
        net/mlx5: Check for invalid vector index on EQ creation
        net/mlx5: HWS, use lock classes for bwc locks
        net/mlx5: HWS, don't destroy more bwc queue locks than allocated
        net/mlx5: HWS, fixed double free in error flow of definer layout
        net/mlx5: HWS, removed wrong access to a number of rules variable
        mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow
        net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init
        vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame
        net: dsa: vsc73xx: fix reception from VLAN-unaware bridges
        net: ravb: Only advertise Rx/Tx timestamps if hardware supports it
        net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test()
        net: phy: mdio-bcm-unimac: Add BCM6846 support
        dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio
        udp: Compute L4 checksum as usual when not segmenting the skb
        genetlink: hold RCU in genlmsg_mcast()
        net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361
        tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink().
        ...
    • Merge branch 'mlx5-misc-fixes-2024-10-15' · cb560795
      Paolo Abeni authored
      Tariq Toukan says:
      
      ====================
      mlx5 misc fixes 2024-10-15
      
      This patchset provides misc bug fixes from the team to the mlx5 core and
      Eth drivers.
      
      Series generated against:
      commit 174714f0 ("selftests: drivers: net: fix name not defined")
      ====================
      
      Link: https://patch.msgid.link/20241015093208.197603-1-tariqt@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5e: Don't call cleanup on profile rollback failure · 4dbc1d1a
      Cosmin Ratiu authored
      When profile rollback fails in mlx5e_netdev_change_profile, the netdev
      profile var is left set to NULL. Avoid a crash when unloading the driver
      by not calling profile->cleanup in such a case.
      
      This was encountered while testing. The original trigger was that
      the wq rescuer thread creation got interrupted (presumably due to
      Ctrl+C-ing modprobe), which gets converted to ENOMEM (-12) by
      mlx5e_priv_init. The profile rollback also fails for the same reason
      (signal still active), so the profile is left as NULL, leading to a crash
      later in _mlx5e_remove.
      
       [  732.473932] mlx5_core 0000:08:00.1: E-Switch: Unload vfs: mode(OFFLOADS), nvfs(2), necvfs(0), active vports(2)
       [  734.525513] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
       [  734.557372] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
       [  734.559187] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: new profile init failed, -12
       [  734.560153] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR
       [  734.589378] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12
       [  734.591136] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12
       [  745.537492] BUG: kernel NULL pointer dereference, address: 0000000000000008
       [  745.538222] #PF: supervisor read access in kernel mode
      <snipped>
       [  745.551290] Call Trace:
       [  745.551590]  <TASK>
       [  745.551866]  ? __die+0x20/0x60
       [  745.552218]  ? page_fault_oops+0x150/0x400
       [  745.555307]  ? exc_page_fault+0x79/0x240
       [  745.555729]  ? asm_exc_page_fault+0x22/0x30
       [  745.556166]  ? mlx5e_remove+0x6b/0xb0 [mlx5_core]
       [  745.556698]  auxiliary_bus_remove+0x18/0x30
       [  745.557134]  device_release_driver_internal+0x1df/0x240
       [  745.557654]  bus_remove_device+0xd7/0x140
       [  745.558075]  device_del+0x15b/0x3c0
       [  745.558456]  mlx5_rescan_drivers_locked.part.0+0xb1/0x2f0 [mlx5_core]
       [  745.559112]  mlx5_unregister_device+0x34/0x50 [mlx5_core]
       [  745.559686]  mlx5_uninit_one+0x46/0xf0 [mlx5_core]
       [  745.560203]  remove_one+0x4e/0xd0 [mlx5_core]
       [  745.560694]  pci_device_remove+0x39/0xa0
       [  745.561112]  device_release_driver_internal+0x1df/0x240
       [  745.561631]  driver_detach+0x47/0x90
       [  745.562022]  bus_remove_driver+0x84/0x100
       [  745.562444]  pci_unregister_driver+0x3b/0x90
       [  745.562890]  mlx5_cleanup+0xc/0x1b [mlx5_core]
       [  745.563415]  __x64_sys_delete_module+0x14d/0x2f0
       [  745.563886]  ? kmem_cache_free+0x1b0/0x460
       [  745.564313]  ? lockdep_hardirqs_on_prepare+0xe2/0x190
       [  745.564825]  do_syscall_64+0x6d/0x140
       [  745.565223]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
       [  745.565725] RIP: 0033:0x7f1579b1288b
      
      Fixes: 3ef14e46 ("net/mlx5e: Separate between netdev objects and mlx5e profiles initialization")
      Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
      Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: Unregister notifier on eswitch init failure · 1da9cfd6
      Cosmin Ratiu authored
      It otherwise remains registered and a subsequent attempt at eswitch
      enabling might trigger warnings of the sort:
      
      [  682.589148] ------------[ cut here ]------------
      [  682.590204] notifier callback eswitch_vport_event [mlx5_core] already registered
      [  682.590256] WARNING: CPU: 13 PID: 2660 at kernel/notifier.c:31 notifier_chain_register+0x3e/0x90
      [...snipped]
      [  682.610052] Call Trace:
      [  682.610369]  <TASK>
      [  682.610663]  ? __warn+0x7c/0x110
      [  682.611050]  ? notifier_chain_register+0x3e/0x90
      [  682.611556]  ? report_bug+0x148/0x170
      [  682.611977]  ? handle_bug+0x36/0x70
      [  682.612384]  ? exc_invalid_op+0x13/0x60
      [  682.612817]  ? asm_exc_invalid_op+0x16/0x20
      [  682.613284]  ? notifier_chain_register+0x3e/0x90
      [  682.613789]  atomic_notifier_chain_register+0x25/0x40
      [  682.614322]  mlx5_eswitch_enable_locked+0x1d4/0x3b0 [mlx5_core]
      [  682.614965]  mlx5_eswitch_enable+0xc9/0x100 [mlx5_core]
      [  682.615551]  mlx5_device_enable_sriov+0x25/0x340 [mlx5_core]
      [  682.616170]  mlx5_core_sriov_configure+0x50/0x170 [mlx5_core]
      [  682.616789]  sriov_numvfs_store+0xb0/0x1b0
      [  682.617248]  kernfs_fop_write_iter+0x117/0x1a0
      [  682.617734]  vfs_write+0x231/0x3f0
      [  682.618138]  ksys_write+0x63/0xe0
      [  682.618536]  do_syscall_64+0x4c/0x100
      [  682.618958]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
      
      Fixes: 7624e58a ("net/mlx5: E-switch, register event handler before arming the event")
      Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: Fix command bitmask initialization · d62b1404
      Shay Drory authored
      The command bitmask has a dedicated bit for the MANAGE_PAGES command. This
      bit isn't initialized during command bitmask initialization, only during
      MANAGE_PAGES.
      
      In addition, mlx5_cmd_trigger_completions() tries to trigger
      completion for the MANAGE_PAGES command as well.
      
      Hence, if a health error occurs before any MANAGE_PAGES command
      has been invoked (for example, during mlx5_enable_hca()),
      mlx5_cmd_trigger_completions() will try to trigger completion for the
      MANAGE_PAGES command, which results in a null-ptr-deref error. [1]
      
      Fix it by initializing the command bitmask correctly.
      
      While at it, rewrite the code so that it is easier to understand.
      
      [1]
      BUG: KASAN: null-ptr-deref in mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core]
      Write of size 4 at addr 0000000000000214 by task kworker/u96:2/12078
      CPU: 10 PID: 12078 Comm: kworker/u96:2 Not tainted 6.9.0-rc2_for_upstream_debug_2024_04_07_19_01 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      Workqueue: mlx5_health0000:08:00.0 mlx5_fw_fatal_reporter_err_work [mlx5_core]
      Call Trace:
       <TASK>
       dump_stack_lvl+0x7e/0xc0
       kasan_report+0xb9/0xf0
       kasan_check_range+0xec/0x190
       mlx5_cmd_trigger_completions+0x1db/0x600 [mlx5_core]
       mlx5_cmd_flush+0x94/0x240 [mlx5_core]
       enter_error_state+0x6c/0xd0 [mlx5_core]
       mlx5_fw_fatal_reporter_err_work+0xf3/0x480 [mlx5_core]
       process_one_work+0x787/0x1490
       ? lockdep_hardirqs_on_prepare+0x400/0x400
       ? pwq_dec_nr_in_flight+0xda0/0xda0
       ? assign_work+0x168/0x240
       worker_thread+0x586/0xd30
       ? rescuer_thread+0xae0/0xae0
       kthread+0x2df/0x3b0
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork+0x2d/0x70
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork_asm+0x11/0x20
       </TASK>
      
      Fixes: 9b98d395 ("net/mlx5: Start health poll at earlier stage of driver load")
      Signed-off-by: Shay Drory <shayd@nvidia.com>
      Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: Check for invalid vector index on EQ creation · d4f25be2
      Maher Sanalla authored
      Currently, the mlx5 driver does not enforce that the vector index is lower
      than the maximum number of supported completion vectors when a new
      completion EQ is requested. Thus, mlx5_comp_eqn_get() fails when trying to
      acquire an IRQ with an improper vector index.
      
      To prevent this, check in mlx5_comp_eqn_get() that the vector index is
      valid and lower than the maximum before handling the request.
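
      The bounds check described above can be sketched in userspace C as
      follows. This is only an illustration: struct demo_eq_table,
      demo_comp_eqn_get() and the choice of -EINVAL are assumptions made for
      the example and are not taken from the mlx5 driver.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a completion-EQ table (not the driver's struct). */
struct demo_eq_table {
	unsigned int max_comp_eqs;	/* number of supported completion vectors */
};

/*
 * Reject an out-of-range vector index before doing any further work.
 * Returns 0 and stores a placeholder EQ number on success, or a negative
 * errno (assumed here to be -EINVAL) when the index is not valid.
 */
static int demo_comp_eqn_get(const struct demo_eq_table *table,
			     unsigned int vecidx, int *eqn)
{
	assert(table && eqn);

	if (vecidx >= table->max_comp_eqs)
		return -EINVAL;

	/* Placeholder: a real driver would look up or create the EQ here. */
	*eqn = (int)vecidx;
	return 0;
}
```

      With max_comp_eqs set to 4, index 3 is accepted and index 4 is rejected
      without touching the output argument.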
      
      Fixes: f14c1a14 ("net/mlx5: Allocate completion EQs dynamically")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: HWS, use lock classes for bwc locks · 9addffa3
      Cosmin Ratiu authored
      The HWS BWC API uses one lock per queue and usually acquires one of
      them, except when doing changes which require locking all queues in
      order. Naturally, lockdep isn't too happy about acquiring the same lock
      class multiple times, so inform it that each queue lock is a different
      class to avoid false positives.
      
      Fixes: 2ca62599 ("net/mlx5: HWS, added send engine and context handling")
      Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: HWS, don't destroy more bwc queue locks than allocated · 45bcbd49
      Cosmin Ratiu authored
      hws_send_queues_bwc_locks_destroy destroyed more queue locks than
      allocated, leading to memory corruption (occasionally) and warnings such
      as DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock)) in __mutex_destroy because
      sometimes, the 'mutex' being destroyed was random memory.
      The severity of this problem is proportional to the number of queues
      configured because the code overreaches beyond the end of the
      bwc_send_queue_locks array by 2x its length.
      
      Fix that by using the correct number of bwc queues.
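
      A minimal userspace sketch of the pattern behind this fix: derive the
      lock count from one helper and use it in both the init and the destroy
      path. All demo_* names are hypothetical, and the "half of all queues"
      ratio in demo_bwc_queues() is an assumption chosen only to make the
      example concrete; it is not a statement about the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-queue mutex. */
struct demo_lock {
	int live;	/* 1 once initialized, 0 after destroy */
};

struct demo_ctx {
	size_t total_queues;		/* all send queues */
	struct demo_lock *bwc_locks;	/* one lock per BWC queue only */
};

/* Single source of truth for the array length (ratio is an assumption). */
static size_t demo_bwc_queues(const struct demo_ctx *ctx)
{
	return ctx->total_queues / 2;
}

static int demo_bwc_locks_init(struct demo_ctx *ctx)
{
	size_t i, n = demo_bwc_queues(ctx);

	ctx->bwc_locks = calloc(n, sizeof(*ctx->bwc_locks));
	if (!ctx->bwc_locks)
		return -1;
	for (i = 0; i < n; i++)
		ctx->bwc_locks[i].live = 1;
	return 0;
}

/* Destroys exactly as many locks as were allocated; returns that count. */
static size_t demo_bwc_locks_destroy(struct demo_ctx *ctx)
{
	size_t i, n = demo_bwc_queues(ctx);	/* not ctx->total_queues */

	for (i = 0; i < n; i++) {
		assert(ctx->bwc_locks[i].live);	/* never touch foreign memory */
		ctx->bwc_locks[i].live = 0;
	}
	free(ctx->bwc_locks);
	ctx->bwc_locks = NULL;
	return n;
}
```

      Iterating up to total_queues in the destroy path instead would walk past
      the end of the array, which is the class of bug described above.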
      
      Fixes: 2ca62599 ("net/mlx5: HWS, added send engine and context handling")
      Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: HWS, fixed double free in error flow of definer layout · 5aa2184e
      Yevgeny Kliteynik authored
      Fix error flow bug that could lead to double free of a buffer
      during a failure to calculate a suitable definer layout.
      
      Fixes: 74a778b4 ("net/mlx5: HWS, added definers handling")
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/mlx5: HWS, removed wrong access to a number of rules variable · 65b4eb9f
      Yevgeny Kliteynik authored
      Remove the wrong access to the num_of_rules field of the matcher.
      This is a plain u32 variable, but it was accessed as if it were atomic.
      
      This fixes the following CI warnings:
        mlx5hws_bwc.c:708:17: warning: large atomic operation may incur significant performance penalty;
        the access size (4 bytes) exceeds the max lock-free size (0 bytes) [-Watomic-alignment]
      
      Fixes: 510f9f61 ("net/mlx5: HWS, added API and enabled HWS support")
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202409291101.6NdtMFVC-lkp@intel.com/
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • mptcp: pm: fix UaF read in mptcp_pm_nl_rm_addr_or_subflow · 7decd1f5
      Matthieu Baerts (NGI0) authored
      Syzkaller reported this splat:
      
        ==================================================================
        BUG: KASAN: slab-use-after-free in mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
        Read of size 4 at addr ffff8880569ac858 by task syz.1.2799/14662
      
        CPU: 0 UID: 0 PID: 14662 Comm: syz.1.2799 Not tainted 6.12.0-rc2-syzkaller-00307-g36c25451 #0
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
        Call Trace:
         <TASK>
         __dump_stack lib/dump_stack.c:94 [inline]
         dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
         print_address_description mm/kasan/report.c:377 [inline]
         print_report+0xc3/0x620 mm/kasan/report.c:488
         kasan_report+0xd9/0x110 mm/kasan/report.c:601
         mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
         mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
         mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
         mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
         genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
         genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
         genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
         netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
         genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
         netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
         netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
         netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
         sock_sendmsg_nosec net/socket.c:729 [inline]
         __sock_sendmsg net/socket.c:744 [inline]
         ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
         ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
         __sys_sendmsg+0x117/0x1f0 net/socket.c:2690
         do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
         __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
         do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
         entry_SYSENTER_compat_after_hwframe+0x84/0x8e
        RIP: 0023:0xf7fe4579
        Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
        RSP: 002b:00000000f574556c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
        RAX: ffffffffffffffda RBX: 000000000000000b RCX: 0000000020000140
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
        RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
        R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
         </TASK>
      
        Allocated by task 5387:
         kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
         kasan_save_track+0x14/0x30 mm/kasan/common.c:68
         poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
         __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:394
         kmalloc_noprof include/linux/slab.h:878 [inline]
         kzalloc_noprof include/linux/slab.h:1014 [inline]
         subflow_create_ctx+0x87/0x2a0 net/mptcp/subflow.c:1803
         subflow_ulp_init+0xc3/0x4d0 net/mptcp/subflow.c:1956
         __tcp_set_ulp net/ipv4/tcp_ulp.c:146 [inline]
         tcp_set_ulp+0x326/0x7f0 net/ipv4/tcp_ulp.c:167
         mptcp_subflow_create_socket+0x4ae/0x10a0 net/mptcp/subflow.c:1764
         __mptcp_subflow_connect+0x3cc/0x1490 net/mptcp/subflow.c:1592
         mptcp_pm_create_subflow_or_signal_addr+0xbda/0x23a0 net/mptcp/pm_netlink.c:642
         mptcp_pm_nl_fully_established net/mptcp/pm_netlink.c:650 [inline]
         mptcp_pm_nl_work+0x3a1/0x4f0 net/mptcp/pm_netlink.c:943
         mptcp_worker+0x15a/0x1240 net/mptcp/protocol.c:2777
         process_one_work+0x958/0x1b30 kernel/workqueue.c:3229
         process_scheduled_works kernel/workqueue.c:3310 [inline]
         worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
         kthread+0x2c1/0x3a0 kernel/kthread.c:389
         ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
         ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
        Freed by task 113:
         kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
         kasan_save_track+0x14/0x30 mm/kasan/common.c:68
         kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
         poison_slab_object mm/kasan/common.c:247 [inline]
         __kasan_slab_free+0x51/0x70 mm/kasan/common.c:264
         kasan_slab_free include/linux/kasan.h:230 [inline]
         slab_free_hook mm/slub.c:2342 [inline]
         slab_free mm/slub.c:4579 [inline]
         kfree+0x14f/0x4b0 mm/slub.c:4727
         kvfree+0x47/0x50 mm/util.c:701
         kvfree_rcu_list+0xf5/0x2c0 kernel/rcu/tree.c:3423
         kvfree_rcu_drain_ready kernel/rcu/tree.c:3563 [inline]
         kfree_rcu_monitor+0x503/0x8b0 kernel/rcu/tree.c:3632
         kfree_rcu_shrink_scan+0x245/0x3a0 kernel/rcu/tree.c:3966
         do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
         shrink_slab+0x32b/0x12a0 mm/shrinker.c:662
         shrink_one+0x47e/0x7b0 mm/vmscan.c:4818
         shrink_many mm/vmscan.c:4879 [inline]
         lru_gen_shrink_node mm/vmscan.c:4957 [inline]
         shrink_node+0x2452/0x39d0 mm/vmscan.c:5937
         kswapd_shrink_node mm/vmscan.c:6765 [inline]
         balance_pgdat+0xc19/0x18f0 mm/vmscan.c:6957
         kswapd+0x5ea/0xbf0 mm/vmscan.c:7226
         kthread+0x2c1/0x3a0 kernel/kthread.c:389
         ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
         ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
      
        Last potentially related work creation:
         kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
         __kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:541
         kvfree_call_rcu+0x74/0xbe0 kernel/rcu/tree.c:3810
         subflow_ulp_release+0x2ae/0x350 net/mptcp/subflow.c:2009
         tcp_cleanup_ulp+0x7c/0x130 net/ipv4/tcp_ulp.c:124
         tcp_v4_destroy_sock+0x1c5/0x6a0 net/ipv4/tcp_ipv4.c:2541
         inet_csk_destroy_sock+0x1a3/0x440 net/ipv4/inet_connection_sock.c:1293
         tcp_done+0x252/0x350 net/ipv4/tcp.c:4870
         tcp_rcv_state_process+0x379b/0x4f30 net/ipv4/tcp_input.c:6933
         tcp_v4_do_rcv+0x1ad/0xa90 net/ipv4/tcp_ipv4.c:1938
         sk_backlog_rcv include/net/sock.h:1115 [inline]
         __release_sock+0x31b/0x400 net/core/sock.c:3072
         __tcp_close+0x4f3/0xff0 net/ipv4/tcp.c:3142
         __mptcp_close_ssk+0x331/0x14d0 net/mptcp/protocol.c:2489
         mptcp_close_ssk net/mptcp/protocol.c:2543 [inline]
         mptcp_close_ssk+0x150/0x220 net/mptcp/protocol.c:2526
         mptcp_pm_nl_rm_addr_or_subflow+0x2be/0xcc0 net/mptcp/pm_netlink.c:878
         mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
         mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
         mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
         genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
         genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
         genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
         netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
         genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
         netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
         netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
         netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
         sock_sendmsg_nosec net/socket.c:729 [inline]
         __sock_sendmsg net/socket.c:744 [inline]
         ____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
         ___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
         __sys_sendmsg+0x117/0x1f0 net/socket.c:2690
         do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
         __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
         do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
         entry_SYSENTER_compat_after_hwframe+0x84/0x8e
      
        The buggy address belongs to the object at ffff8880569ac800
         which belongs to the cache kmalloc-512 of size 512
        The buggy address is located 88 bytes inside of
         freed 512-byte region [ffff8880569ac800, ffff8880569aca00)
      
        The buggy address belongs to the physical page:
        page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x569ac
        head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
        flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
        page_type: f5(slab)
        raw: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
        raw: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
        head: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
        head: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
        head: 04fff00000000002 ffffea00015a6b01 ffffffffffffffff 0000000000000000
        head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
        page dumped because: kasan: bad access detected
        page_owner tracks the page as allocated
        page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 10238, tgid 10238 (kworker/u32:6), ts 597403252405, free_ts 597177952947
         set_page_owner include/linux/page_owner.h:32 [inline]
         post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1537
         prep_new_page mm/page_alloc.c:1545 [inline]
         get_page_from_freelist+0x101e/0x3070 mm/page_alloc.c:3457
         __alloc_pages_noprof+0x223/0x25a0 mm/page_alloc.c:4733
         alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265
         alloc_slab_page mm/slub.c:2412 [inline]
         allocate_slab mm/slub.c:2578 [inline]
         new_slab+0x2ba/0x3f0 mm/slub.c:2631
         ___slab_alloc+0xd1d/0x16f0 mm/slub.c:3818
         __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908
         __slab_alloc_node mm/slub.c:3961 [inline]
         slab_alloc_node mm/slub.c:4122 [inline]
         __kmalloc_cache_noprof+0x2c5/0x310 mm/slub.c:4290
         kmalloc_noprof include/linux/slab.h:878 [inline]
         kzalloc_noprof include/linux/slab.h:1014 [inline]
         mld_add_delrec net/ipv6/mcast.c:743 [inline]
         igmp6_leave_group net/ipv6/mcast.c:2625 [inline]
         igmp6_group_dropped+0x4ab/0xe40 net/ipv6/mcast.c:723
         __ipv6_dev_mc_dec+0x281/0x360 net/ipv6/mcast.c:979
         addrconf_leave_solict net/ipv6/addrconf.c:2253 [inline]
         __ipv6_ifa_notify+0x3f6/0xc30 net/ipv6/addrconf.c:6283
         addrconf_ifdown.isra.0+0xef9/0x1a20 net/ipv6/addrconf.c:3982
         addrconf_notify+0x220/0x19c0 net/ipv6/addrconf.c:3781
         notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
         call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1996
         call_netdevice_notifiers_extack net/core/dev.c:2034 [inline]
         call_netdevice_notifiers net/core/dev.c:2048 [inline]
         dev_close_many+0x333/0x6a0 net/core/dev.c:1589
        page last free pid 13136 tgid 13136 stack trace:
         reset_page_owner include/linux/page_owner.h:25 [inline]
         free_pages_prepare mm/page_alloc.c:1108 [inline]
         free_unref_page+0x5f4/0xdc0 mm/page_alloc.c:2638
         stack_depot_save_flags+0x2da/0x900 lib/stackdepot.c:666
         kasan_save_stack+0x42/0x60 mm/kasan/common.c:48
         kasan_save_track+0x14/0x30 mm/kasan/common.c:68
         unpoison_slab_object mm/kasan/common.c:319 [inline]
         __kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345
         kasan_slab_alloc include/linux/kasan.h:247 [inline]
         slab_post_alloc_hook mm/slub.c:4085 [inline]
         slab_alloc_node mm/slub.c:4134 [inline]
         kmem_cache_alloc_noprof+0x121/0x2f0 mm/slub.c:4141
         skb_clone+0x190/0x3f0 net/core/skbuff.c:2084
         do_one_broadcast net/netlink/af_netlink.c:1462 [inline]
         netlink_broadcast_filtered+0xb11/0xef0 net/netlink/af_netlink.c:1540
         netlink_broadcast+0x39/0x50 net/netlink/af_netlink.c:1564
         uevent_net_broadcast_untagged lib/kobject_uevent.c:331 [inline]
         kobject_uevent_net_broadcast lib/kobject_uevent.c:410 [inline]
         kobject_uevent_env+0xacd/0x1670 lib/kobject_uevent.c:608
         device_del+0x623/0x9f0 drivers/base/core.c:3882
         snd_card_disconnect.part.0+0x58a/0x7c0 sound/core/init.c:546
         snd_card_disconnect+0x1f/0x30 sound/core/init.c:495
         snd_usx2y_disconnect+0xe9/0x1f0 sound/usb/usx2y/usbusx2y.c:417
         usb_unbind_interface+0x1e8/0x970 drivers/usb/core/driver.c:461
         device_remove drivers/base/dd.c:569 [inline]
         device_remove+0x122/0x170 drivers/base/dd.c:561
      
      That's because 'subflow' is used just after 'mptcp_close_ssk(subflow)',
      which will initiate the release of its memory. Even if it is very likely
      the release and the re-utilisation will be done later on, it is of
      course better to avoid any issues and read the content of 'subflow'
      before closing it.
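
      The "read before close" ordering described above can be illustrated
      with a small userspace sketch. The struct, its fields and the demo_*
      functions are hypothetical stand-ins, and free() merely models a close
      that may release the object; none of this is the MPTCP code itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a subflow context. */
struct demo_subflow {
	int request_join;	/* illustrative flag needed after the close */
	int id;			/* illustrative identifier */
};

/* Models a close operation that may release the object's memory. */
static void demo_close_subflow(struct demo_subflow *sf)
{
	free(sf);
}

/*
 * Copy every field that is still needed into locals first, then close.
 * After demo_close_subflow() the pointer must not be dereferenced again.
 * Returns the id that was read before the close.
 */
static int demo_remove_subflow(struct demo_subflow *sf, int *accepted)
{
	int was_join = sf->request_join;	/* read before close */
	int id = sf->id;			/* read before close */

	demo_close_subflow(sf);

	if (was_join)
		(*accepted)--;
	return id;
}
```

      The buggy ordering would call demo_close_subflow() first and read
      sf->request_join afterwards, which is exactly what a use-after-free
      detector flags.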
      
      Fixes: 1c1f7213 ("mptcp: pm: only decrement add_addr_accepted for MPJ req")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+3c8b7a8e7df6a2a226ca@syzkaller.appspotmail.com
      Closes: https://lore.kernel.org/670d7337.050a0220.4cbc0.004f.GAE@google.com
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://patch.msgid.link/20241015-net-mptcp-uaf-pm-rm-v1-1-c4ee5d987a64@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ethernet: mtk_eth_soc: fix memory corruption during fq dma init · 88806efc
      Felix Fietkau authored
      The loop responsible for allocating up to MTK_FQ_DMA_LENGTH buffers must
      only touch as many descriptors, otherwise it ends up corrupting unrelated
      memory. Fix the loop iteration count accordingly.
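
      One way to picture the corrected iteration count is the userspace
      sketch below: descriptors are filled in chunks, and the inner loop is
      clamped to the number of descriptors that actually remain. The
      DEMO_FQ_DMA_LENGTH value and demo_fq_init() are assumptions for
      illustration and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FQ_DMA_LENGTH 4	/* buffers per chunk, illustrative value */

/*
 * Fill desc_count descriptors chunk by chunk. Each chunk covers at most
 * DEMO_FQ_DMA_LENGTH descriptors, and the last chunk is clamped so the
 * loop never writes past the end of the array.
 * Returns the number of descriptors written.
 */
static size_t demo_fq_init(int *desc, size_t desc_count)
{
	size_t written = 0;

	assert(desc != NULL);

	while (written < desc_count) {
		size_t remaining = desc_count - written;
		size_t len = remaining < DEMO_FQ_DMA_LENGTH ?
			     remaining : DEMO_FQ_DMA_LENGTH;
		size_t i;

		for (i = 0; i < len; i++)
			desc[written + i] = 1;	/* mark descriptor as set up */
		written += len;
	}
	return written;
}
```

      With 10 descriptors and a chunk length of 4, the final chunk touches
      only 2 entries; an unclamped inner loop would write 2 entries beyond
      the array.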
      
      Fixes: c57e5581 ("net: ethernet: mtk_eth_soc: handle dma buffer size soc specific")
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20241015081755.31060-1-nbd@nbd.name
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: Fix packet corruption in vmxnet3_xdp_xmit_frame · 4678adf9
      Daniel Borkmann authored
      Andrew and Nikolay reported connectivity issues with Cilium's service
      load-balancing in case of vmxnet3.
      
      If a BPF program for native XDP adds an encapsulation header such as
      IPIP and transmits the packet out the same interface, then in case
      of vmxnet3 a corrupted packet is being sent and subsequently dropped
      on the path.
      
      vmxnet3_xdp_xmit_frame() which is called e.g. via vmxnet3_run_xdp()
      through vmxnet3_xdp_xmit_back() calculates an incorrect DMA address:
      
        page = virt_to_page(xdpf->data);
        tbi->dma_addr = page_pool_get_dma_addr(page) +
                        VMXNET3_XDP_HEADROOM;
        dma_sync_single_for_device(&adapter->pdev->dev,
                                   tbi->dma_addr, buf_size,
                                   DMA_TO_DEVICE);
      
      The above assumes a fixed offset (VMXNET3_XDP_HEADROOM), but the XDP
      BPF program could have moved xdp->data. While the passed buf_size is
      correct (xdpf->len), the dma_addr needs to have a dynamic offset which
      can be calculated as xdpf->data - (void *)xdpf, that is, xdp->data -
      xdp->data_hard_start.
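
      The offset calculation described above can be sketched in plain C. This
      is only an illustration under stated assumptions: struct demo_xdp_frame,
      DEMO_XDP_HEADROOM and both helper functions are hypothetical stand-ins,
      not the vmxnet3 or XDP definitions. The one property borrowed from the
      text is that the frame metadata sits at the hard start of the buffer,
      so data minus the frame address gives the packet offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an XDP frame: the metadata lives at the very
 * start of the buffer (the hard start) and data points at the packet. */
struct demo_xdp_frame {
	void *data;   /* current start of the packet data */
	uint16_t len; /* packet length in bytes */
};

#define DEMO_XDP_HEADROOM 256u /* assumed fixed headroom, for illustration */

/* Old behaviour: assume the packet always starts at the fixed headroom. */
static uint64_t demo_dma_addr_fixed(uint64_t page_dma_base)
{
	return page_dma_base + DEMO_XDP_HEADROOM;
}

/* Described fix: derive the offset from where data actually points. */
static uint64_t demo_dma_addr_dynamic(uint64_t page_dma_base,
				      const struct demo_xdp_frame *xdpf)
{
	ptrdiff_t off = (const char *)xdpf->data - (const char *)xdpf;

	assert(off >= 0); /* data never precedes the hard start */
	return page_dma_base + (uint64_t)off;
}
```

      If a program prepends a 20-byte outer header, data moves 20 bytes
      toward the hard start. The dynamic variant follows it, while the fixed
      variant would still point 20 bytes into the new packet.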
      
      Fixes: 54f00cce ("vmxnet3: Add XDP support.")
      Reported-by: Andrew Sauber <andrew.sauber@isovalent.com>
      Reported-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Nikolay Nikolaev <nikolay.nikolaev@isovalent.com>
      Acked-by: Anton Protopopov <aspsk@isovalent.com>
      Cc: William Tu <witu@nvidia.com>
      Cc: Ronak Doshi <ronak.doshi@broadcom.com>
      Link: https://patch.msgid.link/a0888656d7f09028f9984498cc698bb5364d89fc.1728931137.git.daniel@iogearbox.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      4678adf9
  2. 16 Oct, 2024 16 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · c964ced7
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Several miscellaneous fixes. A lot of bnxt_re activity, there will be
        more rc patches there coming.
      
         - Many bnxt_re bug fixes - Memory leaks, KASAN, NULL pointer deref,
           soft lockups, error unwinding and some small functional issues
      
         - Error unwind bug in rdma netlink
      
         - Two issues with incorrect VLAN detection for iWarp
      
         - skb_splice_from_iter() splat in siw
      
         - Give SRP slab caches unique names to resolve the merge window
           WARN_ON regression"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/bnxt_re: Fix the GID table length
        RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages
        RDMA/bnxt_re: Change the sequence of updating the CQ toggle value
        RDMA/bnxt_re: Fix an error path in bnxt_re_add_device
        RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop
        RDMA/bnxt_re: Fix a possible NULL pointer dereference
        RDMA/bnxt_re: Return more meaningful error
        RDMA/bnxt_re: Fix incorrect dereference of srq in async event
        RDMA/bnxt_re: Fix out of bound check
        RDMA/bnxt_re: Fix the max CQ WQEs for older adapters
        RDMA/srpt: Make slab cache names unique
        RDMA/irdma: Fix misspelling of "accept*"
        RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP
        RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES
        RDMA/core: Fix ENODEV error for iWARP test over vlan
        RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event
        RDMA/bnxt_re: Fix the max WQEs used in Static WQE mode
        RDMA/bnxt_re: Add a check for memory allocation
        RDMA/bnxt_re: Fix incorrect AVID type in WQE structure
        RDMA/bnxt_re: Fix a possible memory leak
      c964ced7
    • Linus Torvalds's avatar
      Merge tag 'for-6.12-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 667b1d41
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
      
       - regression fix: dirty extents tracked in xarray for qgroups must be
         adjusted for 32bit platforms
      
       - fix potentially freeing uninitialized name in fscrypt structure
      
       - fix warning about unneeded variable in a send callback
      
      * tag 'for-6.12-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix uninitialized pointer free on read_alloc_one_name() error
        btrfs: send: cleanup unneeded return variable in changed_verity()
        btrfs: fix uninitialized pointer free in add_inode_ref()
        btrfs: use sector numbers as keys for the dirty extents xarray
      667b1d41
    • Linus Torvalds's avatar
      Merge tag 'v6.12-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd · 9f635d44
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
      
       - fix race between session setup and session logoff
      
       - add supplementary group support
      
      * tag 'v6.12-rc3-ksmbd-fixes' of git://git.samba.org/ksmbd:
        ksmbd: add support for supplementary groups
        ksmbd: fix user-after-free from session log off
      9f635d44
    • Linus Torvalds's avatar
      Merge tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 6f6fc393
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - Remove bogus testmgr ENOENT error messages
      
       - Ensure algorithm is still alive before marking it as tested
      
       - Disable buggy hash algorithms in marvell/cesa
      
      * tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: marvell/cesa - Disable hash algorithms
        crypto: testmgr - Hide ENOENT errors better
        crypto: api - Fix liveliness check in crypto_alg_tested
      6f6fc393
    • Linus Torvalds's avatar
      Merge tag 'sched_ext-for-6.12-rc3-fixes' of... · dff65843
      Linus Torvalds authored
      Merge tag 'sched_ext-for-6.12-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
      
      Pull sched_ext fixes from Tejun Heo:
      
       - More issues reported in the enable/disable paths on large machines
         with many tasks due to scx_tasks_lock being held too long. Break up
         the task iterations
      
       - Remove ops.select_cpu() dependency in bypass mode so that a
         misbehaving implementation can't live-lock the machine by pushing all
         tasks to few CPUs in bypass mode
      
       - Other misc fixes
      
      * tag 'sched_ext-for-6.12-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
        sched_ext: Remove unnecessary cpu_relax()
        sched_ext: Don't hold scx_tasks_lock for too long
        sched_ext: Move scx_tasks_lock handling into scx_task_iter helpers
        sched_ext: bypass mode shouldn't depend on ops.select_cpu()
        sched_ext: Move scx_buildin_idle_enabled check to scx_bpf_select_cpu_dfl()
        sched_ext: Start schedulers with consistent p->scx.slice values
        Revert "sched_ext: Use shorter slice while bypassing"
        sched_ext: use correct function name in pick_task_scx() warning message
        selftests: sched_ext: Add sched_ext as proper selftest target
      dff65843
    • Vladimir Oltean's avatar
      net: dsa: vsc73xx: fix reception from VLAN-unaware bridges · 11d06f0a
      Vladimir Oltean authored
      Similar to the situation described for sja1105 in commit 1f9fc48f
      ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"), the
      vsc73xx driver uses tag_8021q and doesn't need the ds->untag_bridge_pvid
      request. In fact, this option breaks packet reception.
      
      The ds->untag_bridge_pvid option strips VLANs from packets received on
      VLAN-unaware bridge ports. But those VLANs should already be stripped
      by tag_vsc73xx_8021q.c as part of vsc73xx_rcv() - they are not VLANs in
      VLAN-unaware mode, but DSA tags. Thus, dsa_software_vlan_untag() tries
      to untag a VLAN that doesn't exist, corrupting the packet.
      
      Fixes: 93e4649e ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges")
      Tested-by: Pawel Dembicki <paweldembicki@gmail.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
      Link: https://patch.msgid.link/20241014153041.1110364-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      11d06f0a
    • Niklas Söderlund's avatar
      net: ravb: Only advertise Rx/Tx timestamps if hardware supports it · 126e7996
      Niklas Söderlund authored
      Recent work moving the reporting of Rx software timestamps to the core
      [1] highlighted an issue where hardware time stamping was advertised
      for the platforms where it is not supported.
      
      Fix this by advertising support for hardware timestamps only if the
      hardware supports it. Due to the Tx implementation in RAVB, software
      Tx timestamping is also only considered if the hardware supports
      hardware timestamps. This should be addressed in the future, but this
      fix only reflects what the driver currently implements.
      
      1. Commit 277901ee ("ravb: Remove setting of RX software timestamp")
      
      Fixes: 7e09a052 ("ravb: Exclude gPTP feature support for RZ/G2L")
      Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
      Tested-by: Paul Barker <paul.barker.ct@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://patch.msgid.link/20241014124343.3875285-1-niklas.soderlund+renesas@ragnatech.se
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      126e7996
    • Jinjie Ruan's avatar
      net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test() · 217a3d98
      Jinjie Ruan authored
      Commit a3c1e451 ("net: microchip: vcap: Fix use-after-free error in
      kunit test") fixed the use-after-free error, but introduced the memory
      leaks below by removing a necessary vcap_free_rule() call. Add it back
      to fix them.
      
      	unreferenced object 0xffffff80ca58b700 (size 192):
      	  comm "kunit_try_catch", pid 1215, jiffies 4294898264
      	  hex dump (first 32 bytes):
      	    00 12 7a 00 05 00 00 00 0a 00 00 00 64 00 00 00  ..z.........d...
      	    00 00 00 00 00 00 00 00 00 04 0b cc 80 ff ff ff  ................
      	  backtrace (crc 9c09c3fe):
      	    [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
      	    [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
      	    [<0000000040a01b8d>] vcap_alloc_rule+0x3cc/0x9c4
      	    [<000000003fe86110>] vcap_api_encode_rule_test+0x1ac/0x16b0
      	    [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
      	    [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
      	    [<00000000c5d82c9a>] kthread+0x2e8/0x374
      	    [<00000000f4287308>] ret_from_fork+0x10/0x20
      	unreferenced object 0xffffff80cc0b0400 (size 64):
      	  comm "kunit_try_catch", pid 1215, jiffies 4294898265
      	  hex dump (first 32 bytes):
      	    80 04 0b cc 80 ff ff ff 18 b7 58 ca 80 ff ff ff  ..........X.....
      	    39 00 00 00 02 00 00 00 06 05 04 03 02 01 ff ff  9...............
      	  backtrace (crc daf014e9):
      	    [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
      	    [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
      	    [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
      	    [<00000000dfdb1e81>] vcap_api_encode_rule_test+0x224/0x16b0
      	    [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
      	    [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
      	    [<00000000c5d82c9a>] kthread+0x2e8/0x374
      	    [<00000000f4287308>] ret_from_fork+0x10/0x20
      	unreferenced object 0xffffff80cc0b0700 (size 64):
      	  comm "kunit_try_catch", pid 1215, jiffies 4294898265
      	  hex dump (first 32 bytes):
      	    80 07 0b cc 80 ff ff ff 28 b7 58 ca 80 ff ff ff  ........(.X.....
      	    3c 00 00 00 00 00 00 00 01 2f 03 b3 ec ff ff ff  <......../......
      	  backtrace (crc 8d877792):
      	    [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
      	    [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
      	    [<000000006eadfab7>] vcap_rule_add_action+0x2d0/0x52c
      	    [<00000000323475d1>] vcap_api_encode_rule_test+0x4d4/0x16b0
      	    [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
      	    [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
      	    [<00000000c5d82c9a>] kthread+0x2e8/0x374
      	    [<00000000f4287308>] ret_from_fork+0x10/0x20
      	unreferenced object 0xffffff80cc0b0900 (size 64):
      	  comm "kunit_try_catch", pid 1215, jiffies 4294898266
      	  hex dump (first 32 bytes):
      	    80 09 0b cc 80 ff ff ff 80 06 0b cc 80 ff ff ff  ................
      	    7d 00 00 00 01 00 00 00 00 00 00 00 ff 00 00 00  }...............
      	  backtrace (crc 34181e56):
      	    [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
      	    [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
      	    [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
      	    [<00000000991e3564>] vcap_val_rule+0xcf0/0x13e8
      	    [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0
      	    [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
      	    [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
      	    [<00000000c5d82c9a>] kthread+0x2e8/0x374
      	    [<00000000f4287308>] ret_from_fork+0x10/0x20
      	unreferenced object 0xffffff80cc0b0980 (size 64):
      	  comm "kunit_try_catch", pid 1215, jiffies 4294898266
      	  hex dump (first 32 bytes):
      	    18 b7 58 ca 80 ff ff ff 00 09 0b cc 80 ff ff ff  ..X.............
      	    67 00 00 00 00 00 00 00 01 01 74 88 c0 ff ff ff  g.........t.....
      	  backtrace (crc 275fd9be):
      	    [<0000000052a0be73>] kmemleak_alloc+0x34/0x40
      	    [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4
      	    [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528
      	    [<000000001396a1a2>] test_add_def_fields+0xb0/0x100
      	    [<000000006e7621f0>] vcap_val_rule+0xa98/0x13e8
      	    [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0
      	    [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac
      	    [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec
      	    [<00000000c5d82c9a>] kthread+0x2e8/0x374
      	    [<00000000f4287308>] ret_from_fork+0x10/0x20
      	......
      
      Cc: stable@vger.kernel.org
      Fixes: a3c1e451 ("net: microchip: vcap: Fix use-after-free error in kunit test")
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
      Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
      Link: https://patch.msgid.link/20241014121922.1280583-1-ruanjinjie@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      217a3d98
    • Jakub Kicinski's avatar
      Merge branch 'net-phy-mdio-bcm-unimac-add-bcm6846-variant' · 9626c182
      Jakub Kicinski authored
      Linus Walleij says:
      
      ====================
      net: phy: mdio-bcm-unimac: Add BCM6846 variant
      
      As pointed out by Florian:
      https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/
      
      The BCM6846 has a few extra registers and cannot reuse the
      compatible string from other variants of the Unimac
      MDIO block: we need to be able to tell them apart.
      ====================
      
      Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-0-c703ca83e962@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9626c182
    • Linus Walleij's avatar
      net: phy: mdio-bcm-unimac: Add BCM6846 support · 906b77ca
      Linus Walleij authored
      Add Unimac mdio compatible string for the special BCM6846
      variant.
      
      This variant has a few extra registers compared to other
      versions.

      Suggested-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Link: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-2-c703ca83e962@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      906b77ca
    • Linus Walleij's avatar
      dt-bindings: net: brcm,unimac-mdio: Add bcm6846-mdio · 6ed97afd
      Linus Walleij authored
      The MDIO block in the BCM6846 is not identical to any of the
      previous versions, but has extended registers not present in
      the other variants. For this reason we need to use a new
      compatible especially for this SoC.

      Suggested-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Link: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Acked-by: Rob Herring (Arm) <robh@kernel.org>
      Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-1-c703ca83e962@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6ed97afd
    • Jakub Sitnicki's avatar
      udp: Compute L4 checksum as usual when not segmenting the skb · d96016a7
      Jakub Sitnicki authored
      If:
      
        1) the user requested USO, but
        2) there is not enough payload for GSO to kick in, and
        3) the egress device doesn't offer checksum offload, then
      
      we want to compute the L4 checksum in software early on.
      
      In the case when we are not taking the GSO path, but it has been requested,
      the software checksum fallback in skb_segment doesn't get a chance to
      compute the full checksum, if the egress device can't do it. As a result we
      end up sending UDP datagrams with only a partial checksum filled in, which
      the peer will discard.
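
      The three conditions above can be read as a small decision function.
      The sketch below is an assumption-laden illustration, not the UDP
      transmit path: the enum, the function name and its parameters are all
      hypothetical, and the test "payload_len > gso_size" is assumed here as
      the point at which GSO kicks in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes for who completes the L4 checksum. */
enum demo_csum_action {
	DEMO_CSUM_BY_GSO,       /* segmentation path finishes the checksum */
	DEMO_CSUM_OFFLOAD,      /* egress device fills the checksum in */
	DEMO_CSUM_SOFTWARE_NOW, /* compute the full checksum in software early */
};

static enum demo_csum_action
demo_udp_csum_action(bool uso_requested, size_t payload_len, size_t gso_size,
		     bool dev_has_csum_offload)
{
	bool will_segment;

	/* A USO request only makes sense with a non-zero segment size. */
	assert(!uso_requested || gso_size > 0);

	/* Assumption: GSO only kicks in when the payload exceeds one segment. */
	will_segment = uso_requested && payload_len > gso_size;

	if (will_segment)
		return DEMO_CSUM_BY_GSO;
	if (dev_has_csum_offload)
		return DEMO_CSUM_OFFLOAD;

	/* Not segmenting and no offload: nothing later will finish the
	 * checksum, so it has to be computed here. */
	return DEMO_CSUM_SOFTWARE_NOW;
}
```

      The case the commit describes is the last branch reached with
      uso_requested set: a short payload on a device without checksum
      offload.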
      
      Fixes: 10154dbd ("udp: Allow GSO transmit from devices with no checksum offload")
      Reported-by: Ivan Babrou <ivan@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Cc: stable@vger.kernel.org
      Link: https://patch.msgid.link/20241011-uso-swcsum-fixup-v2-1-6e1ddc199af9@cloudflare.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d96016a7
    • Eric Dumazet's avatar
      genetlink: hold RCU in genlmsg_mcast() · 56440d7e
      Eric Dumazet authored
      While running net selftests with CONFIG_PROVE_RCU_LIST=y I saw
      one lockdep splat [1].
      
      genlmsg_mcast() uses for_each_net_rcu(), and must therefore hold RCU.
      
      Instead of letting all callers guard genlmsg_multicast_allns()
      with a rcu_read_lock()/rcu_read_unlock() pair, do it in genlmsg_mcast().
      
      This also means the @flags parameter is useless; we always need to use
      GFP_ATOMIC.
      
      [1]
      [10882.424136] =============================
      [10882.424166] WARNING: suspicious RCU usage
      [10882.424309] 6.12.0-rc2-virtme #1156 Not tainted
      [10882.424400] -----------------------------
      [10882.424423] net/netlink/genetlink.c:1940 RCU-list traversed in non-reader section!!
      [10882.424469]
      other info that might help us debug this:
      
      [10882.424500]
      rcu_scheduler_active = 2, debug_locks = 1
      [10882.424744] 2 locks held by ip/15677:
      [10882.424791] #0: ffffffffb6b491b0 (cb_lock){++++}-{3:3}, at: genl_rcv (net/netlink/genetlink.c:1219)
      [10882.426334] #1: ffffffffb6b49248 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg (net/netlink/genetlink.c:61 net/netlink/genetlink.c:57 net/netlink/genetlink.c:1209)
      [10882.426465]
      stack backtrace:
      [10882.426805] CPU: 14 UID: 0 PID: 15677 Comm: ip Not tainted 6.12.0-rc2-virtme #1156
      [10882.426919] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
      [10882.427046] Call Trace:
      [10882.427131]  <TASK>
      [10882.427244] dump_stack_lvl (lib/dump_stack.c:123)
      [10882.427335] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
      [10882.427387] genlmsg_multicast_allns (net/netlink/genetlink.c:1940 (discriminator 7) net/netlink/genetlink.c:1977 (discriminator 7))
      [10882.427436] l2tp_tunnel_notify.constprop.0 (net/l2tp/l2tp_netlink.c:119) l2tp_netlink
      [10882.427683] l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:253) l2tp_netlink
      [10882.427748] genl_family_rcv_msg_doit (net/netlink/genetlink.c:1115)
      [10882.427834] genl_rcv_msg (net/netlink/genetlink.c:1195 net/netlink/genetlink.c:1210)
      [10882.427877] ? __pfx_l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:186) l2tp_netlink
      [10882.427927] ? __pfx_genl_rcv_msg (net/netlink/genetlink.c:1201)
      [10882.427959] netlink_rcv_skb (net/netlink/af_netlink.c:2551)
      [10882.428069] genl_rcv (net/netlink/genetlink.c:1220)
      [10882.428095] netlink_unicast (net/netlink/af_netlink.c:1332 net/netlink/af_netlink.c:1357)
      [10882.428140] netlink_sendmsg (net/netlink/af_netlink.c:1901)
      [10882.428210] ____sys_sendmsg (net/socket.c:729 (discriminator 1) net/socket.c:744 (discriminator 1) net/socket.c:2607 (discriminator 1))
      
      Fixes: 33f72e6f ("l2tp : multicast notification to the registered listeners")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Cc: Tom Parkin <tparkin@katalix.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Link: https://patch.msgid.link/20241011171217.3166614-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      56440d7e
    • Peter Rashleigh's avatar
      net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361 · 1833d8a2
      Peter Rashleigh authored
      According to the Marvell datasheet the 88E6361 has two VTU pages
      (4k VIDs per page) so the max_vid should be 8191, not 4095.
      
      In the current implementation mv88e6xxx_vtu_walk() gives unexpected
      results because of this error. I verified that mv88e6xxx_vtu_walk()
      works correctly on the MV88E6361 with this patch in place.
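
      The arithmetic behind the new value is short enough to spell out. In
      this sketch the macro and function names are hypothetical. Only the
      figures (two VTU pages, 4k VIDs per page) come from the text above.

```c
#include <assert.h>

#define DEMO_VIDS_PER_VTU_PAGE 4096u /* "4k VIDs per page" */

/* Highest valid VID when VIDs are numbered from 0 across all pages. */
static unsigned int demo_max_vid(unsigned int vtu_pages)
{
	assert(vtu_pages >= 1);
	return vtu_pages * DEMO_VIDS_PER_VTU_PAGE - 1;
}
```

      With one page this gives 4095, the old value. With the two pages the
      datasheet describes it gives 8191.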
      
      Fixes: 12899f29 ("net: dsa: mv88e6xxx: enable support for 88E6361 switch")
      Signed-off-by: Peter Rashleigh <peter@rashleigh.ca>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://patch.msgid.link/20241014204342.5852-1-peter@rashleigh.ca
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1833d8a2
    • Kuniyuki Iwashima's avatar
      tcp/dccp: Don't use timer_pending() in reqsk_queue_unlink(). · e8c526f2
      Kuniyuki Iwashima authored
      Martin KaFai Lau reported use-after-free [0] in reqsk_timer_handler().
      
        """
        We are seeing a use-after-free from a bpf prog attached to
        trace_tcp_retransmit_synack. The program passes the req->sk to the
        bpf_sk_storage_get_tracing kernel helper which does check for null
        before using it.
        """
      
      The commit 83fccfc3 ("inet: fix potential deadlock in
      reqsk_queue_unlink()") added timer_pending() in reqsk_queue_unlink() not
      to call del_timer_sync() from reqsk_timer_handler(), but it introduced a
      small race window.
      
      Before the timer is called, expire_timers() calls detach_timer(timer, true)
      to clear timer->entry.pprev and marks it as not pending.
      
      If reqsk_queue_unlink() checks timer_pending() just after expire_timers()
      calls detach_timer(), TCP will miss del_timer_sync(); the reqsk timer will
      continue running and send multiple SYN+ACKs until it expires.
      
      The reported UAF could happen if req->sk is close()d earlier than the timer
      expiration, which is 63s by default.
      
      The scenario would be
      
        1. inet_csk_complete_hashdance() calls inet_csk_reqsk_queue_drop(),
           but del_timer_sync() is missed
      
        2. reqsk timer is executed and scheduled again
      
        3. req->sk is accept()ed and reqsk_put() decrements rsk_refcnt, but
           reqsk timer still has another one, and inet_csk_accept() does not
           clear req->sk for non-TFO sockets
      
        4. sk is close()d
      
        5. reqsk timer is executed again, and BPF touches req->sk
      
      Let's not use timer_pending() by passing the caller context to
      __inet_csk_reqsk_queue_drop().
      
      Note that reqsk timer is pinned, so the issue does not happen in most
      use cases. [1]
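
      The difference between checking timer_pending() and passing the caller
      context can be modelled in a few lines of userspace C. This is a toy
      model, not the inet code: struct demo_timer and the demo_* functions
      are hypothetical, and "active" simply stands for "the handler can
      still run or re-arm itself".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy timer: pending is cleared by the timer core just before the handler
 * runs, while active stays true until the timer is really stopped. */
struct demo_timer {
	bool pending;
	bool active;
};

/* Stand-in for a synchronous delete: the timer is fully stopped after it. */
static void demo_del_timer_sync(struct demo_timer *t)
{
	t->pending = false;
	t->active = false;
}

/* Racy variant: skips the delete whenever the timer looks not pending,
 * which is also true in the window right before the handler runs. */
static void demo_unlink_racy(struct demo_timer *t)
{
	if (t->pending)
		demo_del_timer_sync(t);
}

/* Context-based variant: only the timer handler itself skips the
 * synchronous delete, since it must not wait for its own completion. */
static void demo_unlink(struct demo_timer *t, bool from_timer)
{
	if (!from_timer)
		demo_del_timer_sync(t);
}
```

      Starting from a detached but still active timer (pending false, active
      true), demo_unlink_racy() leaves it running, whereas
      demo_unlink(t, false) stops it.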
      
      [0]
      BUG: KFENCE: use-after-free read in bpf_sk_storage_get_tracing+0x2e/0x1b0
      
      Use-after-free read at 0x00000000a891fb3a (in kfence-#1):
      bpf_sk_storage_get_tracing+0x2e/0x1b0
      bpf_prog_5ea3e95db6da0438_tcp_retransmit_synack+0x1d20/0x1dda
      bpf_trace_run2+0x4c/0xc0
      tcp_rtx_synack+0xf9/0x100
      reqsk_timer_handler+0xda/0x3d0
      run_timer_softirq+0x292/0x8a0
      irq_exit_rcu+0xf5/0x320
      sysvec_apic_timer_interrupt+0x6d/0x80
      asm_sysvec_apic_timer_interrupt+0x16/0x20
      intel_idle_irq+0x5a/0xa0
      cpuidle_enter_state+0x94/0x273
      cpu_startup_entry+0x15e/0x260
      start_secondary+0x8a/0x90
      secondary_startup_64_no_verify+0xfa/0xfb
      
      kfence-#1: 0x00000000a72cc7b6-0x00000000d97616d9, size=2376, cache=TCPv6
      
      allocated by task 0 on cpu 9 at 260507.901592s:
      sk_prot_alloc+0x35/0x140
      sk_clone_lock+0x1f/0x3f0
      inet_csk_clone_lock+0x15/0x160
      tcp_create_openreq_child+0x1f/0x410
      tcp_v6_syn_recv_sock+0x1da/0x700
      tcp_check_req+0x1fb/0x510
      tcp_v6_rcv+0x98b/0x1420
      ipv6_list_rcv+0x2258/0x26e0
      napi_complete_done+0x5b1/0x2990
      mlx5e_napi_poll+0x2ae/0x8d0
      net_rx_action+0x13e/0x590
      irq_exit_rcu+0xf5/0x320
      common_interrupt+0x80/0x90
      asm_common_interrupt+0x22/0x40
      cpuidle_enter_state+0xfb/0x273
      cpu_startup_entry+0x15e/0x260
      start_secondary+0x8a/0x90
      secondary_startup_64_no_verify+0xfa/0xfb
      
      freed by task 0 on cpu 9 at 260507.927527s:
      rcu_core_si+0x4ff/0xf10
      irq_exit_rcu+0xf5/0x320
      sysvec_apic_timer_interrupt+0x6d/0x80
      asm_sysvec_apic_timer_interrupt+0x16/0x20
      cpuidle_enter_state+0xfb/0x273
      cpu_startup_entry+0x15e/0x260
      start_secondary+0x8a/0x90
      secondary_startup_64_no_verify+0xfa/0xfb
      
      Fixes: 83fccfc3 ("inet: fix potential deadlock in reqsk_queue_unlink()")
      Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
      Closes: https://lore.kernel.org/netdev/eb6684d0-ffd9-4bdc-9196-33f690c25824@linux.dev/
      Link: https://lore.kernel.org/netdev/b55e2ca0-42f2-4b7c-b445-6ffd87ca74a0@linux.dev/ [1]
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
      Link: https://patch.msgid.link/20241014223312.4254-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e8c526f2
    • Wang Hai's avatar
      net: bcmasp: fix potential memory leak in bcmasp_xmit() · fed07d3e
      Wang Hai authored
      bcmasp_xmit() returns NETDEV_TX_OK without freeing the skb when the
      mapping fails. Add dev_kfree_skb() to fix it.
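
      The ownership rule behind this fix (returning "OK" tells the caller
      the buffer has been consumed, so the drop path must free it) can be
      shown with a userspace model. Everything here is hypothetical:
      struct demo_buf, DEMO_TX_OK, the live-buffer counter and the one-slot
      "ring" stand in for the skb, NETDEV_TX_OK and the driver state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum { DEMO_TX_OK = 0 }; /* "buffer consumed", like a TX_OK return */

struct demo_buf {
	size_t len;
};

static int demo_live_bufs;              /* buffers allocated and not freed */
static struct demo_buf *demo_ring_slot; /* one-entry stand-in for a TX ring */

static struct demo_buf *demo_alloc_buf(size_t len)
{
	struct demo_buf *b = malloc(sizeof(*b));

	assert(b != NULL);
	b->len = len;
	demo_live_bufs++;
	return b;
}

static void demo_free_buf(struct demo_buf *b)
{
	free(b);
	demo_live_bufs--;
}

/* Transmit path: on a mapping failure the packet is dropped, and because
 * the return value claims ownership, the buffer is freed here. */
static int demo_xmit(struct demo_buf *b, bool mapping_ok)
{
	if (!mapping_ok) {
		demo_free_buf(b); /* without this line the buffer leaks */
		return DEMO_TX_OK;
	}
	demo_ring_slot = b; /* freed later, on completion */
	return DEMO_TX_OK;
}

/* Completion path: releases the buffer queued by a successful xmit. */
static void demo_tx_complete(void)
{
	if (demo_ring_slot) {
		demo_free_buf(demo_ring_slot);
		demo_ring_slot = NULL;
	}
}
```

      After a failed demo_xmit() the live-buffer count returns to zero at
      once. After a successful one it returns to zero only when
      demo_tx_complete() runs.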
      
      Fixes: 490cb412 ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Link: https://patch.msgid.link/20241014145901.48940-1-wanghai38@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fed07d3e
  3. 15 Oct, 2024 11 commits
    • Wang Hai's avatar
      net: systemport: fix potential memory leak in bcm_sysport_xmit() · c401ed1c
      Wang Hai authored
      bcm_sysport_xmit() returns NETDEV_TX_OK without freeing the skb when
      dma_map_single() fails. Add dev_kfree_skb() to fix it.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Link: https://patch.msgid.link/20241014145115.44977-1-wanghai38@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c401ed1c
    • Linus Torvalds's avatar
      Merge tag 'trace-ringbuffer-v6.12-rc3' of... · 2f87d091
      Linus Torvalds authored
      Merge tag 'trace-ringbuffer-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull ring-buffer fixes from Steven Rostedt:
      
       - Fix ref counter of buffers assigned at boot up
      
         A tracing instance can be created from the kernel command line. If it
         maps to memory, it is considered permanent and should not be deleted,
         or bad things can happen. If it is not mapped to memory, the user may
         delete it via rmdir from the instances directory. The ref counting
         assumed that 0 meant free to remove and greater than zero meant not,
         but this was not the case. When an instance is created, it should
         have a reference count of 1, and if it should not be removed, the
         count must be greater than 1. The boot up code set normal instances
         with a ref count of 0, which could get removed if something accessed
         them and then released them. Memory mapped instances had a ref count
         of 1, which meant they could be deleted, and bad things happen. Keep
         the ref count of normal instances at 1, and set the ref count of
         memory mapped instances to 2.
      
       - Protect sub buffer size (order) updates from other modifications
      
         When a ring buffer is changing the size of its sub-buffers, no other
         operations should be performed on the ring buffer. That includes
         reading it. But the locking only grabbed the buffer->mutex that keeps
         some operations from touching the ring buffer. It must also hold the
         cpu_buffer->reader_lock when updates happen, as other paths use that
         lock for some operations on the ring buffer.
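
      The reference-count convention from the first item above (a count of 1
      means removable, anything greater means busy or permanent) can be
      sketched as follows. The struct and helpers are hypothetical and only
      model the counting rule, not the tracing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tracing instance carrying only its reference count. */
struct demo_instance {
	int ref;
};

/* Creation holds one reference; a memory mapped (permanent) instance
 * holds one extra so it can never drop back to the removable state. */
static void demo_instance_init(struct demo_instance *inst, bool memory_mapped)
{
	inst->ref = memory_mapped ? 2 : 1;
}

static void demo_instance_get(struct demo_instance *inst)
{
	inst->ref++;
}

static void demo_instance_put(struct demo_instance *inst)
{
	assert(inst->ref > 1); /* never drop the creation reference here */
	inst->ref--;
}

/* Removable only when nothing but the creation reference is left. */
static bool demo_instance_can_remove(const struct demo_instance *inst)
{
	return inst->ref == 1;
}
```

      A normal instance is removable again once every temporary user has
      released it, while a memory mapped instance never reaches a count of 1
      through get/put pairs.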
      
      * tag 'trace-ringbuffer-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        ring-buffer: Fix reader locking when changing the sub buffer order
        ring-buffer: Fix refcount setting of boot mapped buffers
      2f87d091
    • Linus Torvalds's avatar
      Merge tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs · bdc72765
      Linus Torvalds authored
      Pull bcachefs fixes from Kent Overstreet:
      
       - New metadata version inode_has_child_snapshots
      
         This fixes bugs with handling of unlinked inodes + snapshots, in
         particular when an inode is reattached after taking a snapshot;
         deleted inodes now get correctly cleaned up across snapshots.
      
       - Disk accounting rewrite fixes
           - validation fixes for when a device has been removed
           - fix journal replay failing with "journal_reclaim_would_deadlock"
      
       - Some more small fixes for erasure coding + device removal
      
       - Assorted small syzbot fixes
      
      * tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs: (27 commits)
        bcachefs: Fix sysfs warning in fstests generic/730,731
        bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev
        bcachefs: Fix kasan splat in new_stripe_alloc_buckets()
        bcachefs: Add missing validation for bch_stripe.csum_granularity_bits
        bcachefs: Fix missing bounds checks in bch2_alloc_read()
        bcachefs: fix uaf in bch2_dio_write_done()
        bcachefs: Improve check_snapshot_exists()
        bcachefs: Fix bkey_nocow_lock()
        bcachefs: Fix accounting replay flags
        bcachefs: Fix invalid shift in member_to_text()
        bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID
        bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry
        bcachefs: Check if stuck in journal_res_get()
        closures: Add closure_wait_event_timeout()
        bcachefs: Fix state lock involved deadlock
        bcachefs: Fix NULL pointer dereference in bch2_opt_to_text
        bcachefs: Release transaction before wake up
        bcachefs: add check for btree id against max in try read node
        bcachefs: Disk accounting device validation fixes
        bcachefs: bch2_inode_or_descendents_is_open()
        ...
    • net: ethernet: rtsn: fix potential memory leak in rtsn_start_xmit() · c186b7a7
      Wang Hai authored
      rtsn_start_xmit() returns NETDEV_TX_OK without freeing the skb
      when skb->len is too long. Add dev_kfree_skb_any() to fix it.
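
The leak pattern described here can be modeled in userspace. The following is a minimal sketch, not the driver code: `struct buf`, `buf_alloc()`, `buf_free()`, `start_xmit()`, `TX_OK`, and `MAX_FRAME_LEN` are hypothetical stand-ins for `struct sk_buff`, the skb allocator, `dev_kfree_skb_any()`, the transmit handler, `NETDEV_TX_OK`, and a hardware length limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these are the kernel's definitions. */
#define TX_OK 0            /* models NETDEV_TX_OK */
#define MAX_FRAME_LEN 1518 /* hypothetical hardware limit */

struct buf {               /* models struct sk_buff */
	size_t len;
};

static int live_bufs;      /* outstanding allocations, used to detect leaks */

static struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->len = len;
	live_bufs++;
	return b;
}

static void buf_free(struct buf *b) /* models dev_kfree_skb_any() */
{
	free(b);
	live_bufs--;
}

/* Returning TX_OK tells the caller that the buffer was consumed, so every
 * path that returns it must either queue the buffer or free it.
 */
static int start_xmit(struct buf *b)
{
	if (b->len > MAX_FRAME_LEN) {
		buf_free(b); /* the fix: drop the oversized frame instead of leaking it */
		return TX_OK;
	}

	/* Normal path: a real driver would hand the buffer to the hardware and
	 * free it on TX completion; this model frees it immediately.
	 */
	buf_free(b);
	return TX_OK;
}
```

Without the `buf_free()` call in the error branch, `live_bufs` would stay at 1 after an oversized frame, which is the leak the commit addresses.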
      
      Fixes: b0d3969d ("net: ethernet: rtsn: Add support for Renesas Ethernet-TSN")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://patch.msgid.link/20241014144250.38802-1-wanghai38@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: xilinx: axienet: fix potential memory leak in axienet_start_xmit() · 99714e37
      Wang Hai authored
      axienet_start_xmit() returns NETDEV_TX_OK without freeing the skb
      when dma_map_single() fails. Add dev_kfree_skb_any() to fix it.
      
      Fixes: 71791dc8 ("net: axienet: Check for DMA mapping errors")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
      Link: https://patch.msgid.link/20241014143704.31938-1-wanghai38@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mptcp-prevent-mpc-handshake-on-port-based-signal-endpoints' · 56f51dfd
      Jakub Kicinski authored
      Matthieu Baerts says:
      
      ====================
      mptcp: prevent MPC handshake on port-based signal endpoints
      
      MPTCP connection requests toward a listening socket created by the
      in-kernel PM for a port-based signal endpoint will never be accepted,
      so they need to be explicitly rejected.
      
      - Patch 1: Explicitly reject such requests. A fix for >= v5.12.
      
      - Patch 2: Cover this case in the MPTCP selftests to avoid regressions.
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      v1: https://lore.kernel.org/20240908180620.822579-1-xiyou.wangcong@gmail.com
      
      Link: https://lore.kernel.org/a5289a0d-2557-40b8-9575-6f1a0bbf06e4@redhat.com
      ====================
      
      Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-0-7faea8e6b6ae@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: mptcp: join: test for prohibited MPC to port-based endp · 5afca7e9
      Paolo Abeni authored
      Explicitly verify that MPC connection attempts towards a port-based
      signal endpoint fail with a reset.
      
      Note that this new test is a bit different from the other ones, as it
      does not use 'run_tests'. It therefore needs the capture capability and
      the logic for picking the right port, which have been extracted into
      three new helpers. The info about the capture can also be printed from a
      single point, which simplifies the exit paths in do_transfer().
      
      The 'Fixes' tag here below is the same as the one from the previous
      commit: this patch here is not fixing anything wrong in the selftests,
      but it validates the previous fix for an issue introduced by this commit
      ID.
      
      Fixes: 1729cf18 ("mptcp: create the listening socket for new port")
      Cc: stable@vger.kernel.org
      Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-2-7faea8e6b6ae@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mptcp: prevent MPC handshake on port-based signal endpoints · 3d041393
      Paolo Abeni authored
      Syzkaller reported a lockdep splat:
      
        ============================================
        WARNING: possible recursive locking detected
        6.11.0-rc6-syzkaller-00019-g67784a74 #0 Not tainted
        --------------------------------------------
        syz-executor364/5113 is trying to acquire lock:
        ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
        ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
      
        but task is already holding lock:
        ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
        ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
      
        other info that might help us debug this:
         Possible unsafe locking scenario:
      
               CPU0
               ----
          lock(k-slock-AF_INET);
          lock(k-slock-AF_INET);
      
         *** DEADLOCK ***
      
         May be due to missing lock nesting notation
      
        7 locks held by syz-executor364/5113:
         #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
         #0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x153/0x1b10 net/mptcp/protocol.c:1806
         #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
         #1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x11f/0x530 net/mptcp/protocol.c:1727
         #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
         #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
         #2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5f/0x1b80 net/ipv4/ip_output.c:470
         #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
         #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
         #3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1390 net/ipv4/ip_output.c:228
         #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: local_lock_acquire include/linux/local_lock_internal.h:29 [inline]
         #4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x33b/0x15b0 net/core/dev.c:6104
         #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
         #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
         #5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x230/0x5f0 net/ipv4/ip_input.c:232
         #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
         #6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
      
        stack backtrace:
        CPU: 0 UID: 0 PID: 5113 Comm: syz-executor364 Not tainted 6.11.0-rc6-syzkaller-00019-g67784a74 #0
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
        Call Trace:
         <IRQ>
         __dump_stack lib/dump_stack.c:93 [inline]
         dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
         check_deadlock kernel/locking/lockdep.c:3061 [inline]
         validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3855
         __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
         lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
         __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
         _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
         spin_lock include/linux/spinlock.h:351 [inline]
         sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
         mptcp_sk_clone_init+0x32/0x13c0 net/mptcp/protocol.c:3279
         subflow_syn_recv_sock+0x931/0x1920 net/mptcp/subflow.c:874
         tcp_check_req+0xfe4/0x1a20 net/ipv4/tcp_minisocks.c:853
         tcp_v4_rcv+0x1c3e/0x37f0 net/ipv4/tcp_ipv4.c:2267
         ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
         ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
         NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
         NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
         __netif_receive_skb_one_core net/core/dev.c:5661 [inline]
         __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
         process_backlog+0x662/0x15b0 net/core/dev.c:6108
         __napi_poll+0xcb/0x490 net/core/dev.c:6772
         napi_poll net/core/dev.c:6841 [inline]
         net_rx_action+0x89b/0x1240 net/core/dev.c:6963
         handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
         do_softirq+0x11b/0x1e0 kernel/softirq.c:455
         </IRQ>
         <TASK>
         __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
         local_bh_enable include/linux/bottom_half.h:33 [inline]
         rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline]
         __dev_queue_xmit+0x1763/0x3e90 net/core/dev.c:4450
         dev_queue_xmit include/linux/netdevice.h:3105 [inline]
         neigh_hh_output include/net/neighbour.h:526 [inline]
         neigh_output include/net/neighbour.h:540 [inline]
         ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:235
         ip_local_out net/ipv4/ip_output.c:129 [inline]
         __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:535
         __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
         tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6542 [inline]
         tcp_rcv_state_process+0x2c32/0x4570 net/ipv4/tcp_input.c:6729
         tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1934
         sk_backlog_rcv include/net/sock.h:1111 [inline]
         __release_sock+0x214/0x350 net/core/sock.c:3004
         release_sock+0x61/0x1f0 net/core/sock.c:3558
         mptcp_sendmsg_fastopen+0x1ad/0x530 net/mptcp/protocol.c:1733
         mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1812
         sock_sendmsg_nosec net/socket.c:730 [inline]
         __sock_sendmsg+0x1a6/0x270 net/socket.c:745
         ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
         ___sys_sendmsg net/socket.c:2651 [inline]
         __sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
         __do_sys_sendmmsg net/socket.c:2766 [inline]
         __se_sys_sendmmsg net/socket.c:2763 [inline]
         __x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
         do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
         entry_SYSCALL_64_after_hwframe+0x77/0x7f
        RIP: 0033:0x7f04fb13a6b9
        Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
        RSP: 002b:00007ffd651f42d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
        RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f04fb13a6b9
        RDX: 0000000000000001 RSI: 0000000020000d00 RDI: 0000000000000004
        RBP: 00007ffd651f4310 R08: 0000000000000001 R09: 0000000000000001
        R10: 0000000020000080 R11: 0000000000000246 R12: 00000000000f4240
        R13: 00007f04fb187449 R14: 00007ffd651f42f4 R15: 00007ffd651f4300
         </TASK>
      
      As noted by Cong Wang, the splat is a false positive, but the code
      path leading to the report is an unexpected one: a client is
      attempting an MPC handshake towards the in-kernel listener created
      by the in-kernel PM for a port-based signal endpoint.
      
      Such a connection will never be accepted; many of them can fill the
      listener queue and prevent the creation of MPJ subflows via
      that listener, which is its intended role.
      
      Explicitly detect this scenario at initial-syn time and drop the
      incoming MPC request.
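
The check described above can be sketched as a small userspace model. This is an illustration under stated assumptions, not the code from the patch: `struct listener`, its `pm_listener` flag, `enum syn_kind`, `enum verdict`, and `check_initial_syn()` are hypothetical names chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the MPTCP option carried by an initial SYN. */
enum syn_kind { SYN_MP_CAPABLE, SYN_MP_JOIN };

/* VERDICT_CONTINUE means "not rejected by this check"; later checks may
 * still refuse the request.
 */
enum verdict { VERDICT_CONTINUE, VERDICT_REJECT };

struct listener {
	/* Assumed flag: true when the in-kernel PM created this listener
	 * for a port-based signal endpoint.
	 */
	bool pm_listener;
};

/* Decide at initial-SYN time whether the request can proceed.
 * A PM listener exists only so that MPJ subflows can be created through
 * it, so an MPC handshake aimed at it is dropped right away instead of
 * occupying a slot in the listener queue.
 */
static enum verdict check_initial_syn(const struct listener *l,
				      enum syn_kind kind)
{
	if (l->pm_listener && kind == SYN_MP_CAPABLE)
		return VERDICT_REJECT;

	return VERDICT_CONTINUE;
}
```

Rejecting at this point keeps the request out of the accept queue entirely, which is what protects the listener's capacity for MPJ subflows.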
      
      Fixes: 1729cf18 ("mptcp: create the listening socket for new port")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+f4aacdfef2c6a6529c3e@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=f4aacdfef2c6a6529c3e
      Cc: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
      Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-1-7faea8e6b6ae@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: Fix searching in list of known pnetids in smc_pnet_add_pnetid · 82ac39eb
      Li RongQing authored
      The pnetid of pi (not of the newly allocated pe) should be compared.
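
A minimal userspace sketch of this kind of search follows. It is not the SMC code: `struct pnetid_entry`, `add_pnetid()`, `list_len()`, and `PNETID_LEN` are hypothetical, and a plain singly linked list stands in for the kernel list. Only the variable names `pe` (new entry) and `pi` (list iterator) are taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PNETID_LEN 16 /* hypothetical identifier length for this model */

struct pnetid_entry {
	struct pnetid_entry *next;
	char pnetid[PNETID_LEN + 1];
	int refcnt;
};

/* Count the entries in the list. */
static int list_len(const struct pnetid_entry *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/* Add pnetid to the list, or take a reference if it is already known. */
static struct pnetid_entry *add_pnetid(struct pnetid_entry **head,
				       const char *pnetid)
{
	struct pnetid_entry *pe, *pi;

	pe = calloc(1, sizeof(*pe)); /* zeroed, so pnetid stays NUL-terminated */
	if (!pe)
		return NULL;
	strncpy(pe->pnetid, pnetid, PNETID_LEN);
	pe->refcnt = 1;

	for (pi = *head; pi; pi = pi->next) {
		/* Compare against the iterator pi. Comparing against the new
		 * entry pe would never look at what the list contains.
		 */
		if (strncmp(pnetid, pi->pnetid, PNETID_LEN) == 0) {
			pi->refcnt++;
			free(pe); /* the preallocated entry is not needed */
			return pi;
		}
	}

	pe->next = *head;
	*head = pe;
	return pe;
}
```

With the comparison on `pi`, adding the same identifier twice yields one list entry with a reference count of 2 rather than two entries.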
      
      Fixes: e888a2e8 ("net/smc: introduce list of pnetids for Ethernet devices")
      Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
      Link: https://patch.msgid.link/20241014115321.33234-1-lirongqing@baidu.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: macb: Avoid 20s boot delay by skipping MDIO bus registration for fixed-link PHY · d0c3601f
      Oleksij Rempel authored
      A boot delay was introduced by commit 79540d13 ("net: macb: Fix
      handling of fixed-link node"). This delay was caused by the call to
      `mdiobus_register()` in cases where a fixed-link PHY was present. The
      MDIO bus registration triggered unnecessary PHY address scans, leading
      to a 20-second delay due to attempts to detect Clause 45 (C45)
      compatible PHYs, despite no MDIO bus being attached.
      
      The commit 79540d13 ("net: macb: Fix handling of fixed-link node")
      was originally introduced to fix a regression caused by commit
      7897b071 ("net: macb: convert to phylink"), which caused the driver
      to misinterpret fixed-link nodes as PHY nodes. This resulted in warnings
      like:
      mdio_bus f0028000.ethernet-ffffffff: fixed-link has invalid PHY address
      mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 0
      ...
      mdio_bus f0028000.ethernet-ffffffff: scan phy fixed-link at address 31
      
      This patch reworks the logic to avoid allocating and registering the
      MDIO bus when:
        - The device tree contains a fixed-link node.
        - There is no "mdio" child node in the device tree.
      
      If a child node named "mdio" exists, the MDIO bus will be registered to
      support PHYs attached to the MACB's MDIO bus. Otherwise, with only a
      fixed-link, the MDIO bus is skipped.
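
The decision described above reduces to a small predicate. The following userspace sketch assumes a hypothetical `struct dt_node` that summarizes the two device tree properties involved; `needs_mdio_bus()` is likewise an illustrative name, not a function from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the MAC's device tree node. */
struct dt_node {
	bool has_fixed_link; /* a fixed-link node is present */
	bool has_mdio_child; /* a child node named "mdio" is present */
};

/* Return true when the MDIO bus should be allocated and registered. */
static bool needs_mdio_bus(const struct dt_node *np)
{
	/* An "mdio" child means PHYs hang off the MAC's MDIO bus,
	 * so the bus is needed even if a fixed-link is also present.
	 */
	if (np->has_mdio_child)
		return true;

	/* With only a fixed-link there is nothing to scan, so the bus
	 * (and the slow PHY address scan) is skipped.
	 */
	return !np->has_fixed_link;
}
```

Only the combination "fixed-link present, no mdio child" skips the bus; every other combination keeps the previous behavior of registering it.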
      
      Tested on a sama5d35 based system with a ksz8863 switch attached to
      macb0.
      
      Fixes: 79540d13 ("net: macb: Fix handling of fixed-link node")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Cc: stable@vger.kernel.org
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://patch.msgid.link/20241013052916.3115142-1-o.rempel@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: aeroflex: fix potential memory leak in greth_start_xmit_gbit() · cf57b5d7
      Wang Hai authored
      greth_start_xmit_gbit() returns NETDEV_TX_OK without freeing the skb
      when skb->len is too long. Add dev_kfree_skb() to fix it.
      
      Fixes: d4c41139 ("net: Add Aeroflex Gaisler 10/100/1G Ethernet MAC driver")
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
      Link: https://patch.msgid.link/20241012110434.49265-1-wanghai38@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>