1. 22 Oct, 2023 3 commits
    • Shigeru Yoshida's avatar
      net: usb: smsc95xx: Fix uninit-value access in smsc95xx_read_reg · 51a32e82
      Shigeru Yoshida authored
      syzbot reported the following uninit-value access issue [1]:
      
      smsc95xx 1-1:0.0 (unnamed net_device) (uninitialized): Failed to read reg index 0x00000030: -32
      smsc95xx 1-1:0.0 (unnamed net_device) (uninitialized): Error reading E2P_CMD
      =====================================================
      BUG: KMSAN: uninit-value in smsc95xx_reset+0x409/0x25f0 drivers/net/usb/smsc95xx.c:896
       smsc95xx_reset+0x409/0x25f0 drivers/net/usb/smsc95xx.c:896
       smsc95xx_bind+0x9bc/0x22e0 drivers/net/usb/smsc95xx.c:1131
       usbnet_probe+0x100b/0x4060 drivers/net/usb/usbnet.c:1750
       usb_probe_interface+0xc75/0x1210 drivers/usb/core/driver.c:396
       really_probe+0x506/0xf40 drivers/base/dd.c:658
       __driver_probe_device+0x2a7/0x5d0 drivers/base/dd.c:800
       driver_probe_device+0x72/0x7b0 drivers/base/dd.c:830
       __device_attach_driver+0x55a/0x8f0 drivers/base/dd.c:958
       bus_for_each_drv+0x3ff/0x620 drivers/base/bus.c:457
       __device_attach+0x3bd/0x640 drivers/base/dd.c:1030
       device_initial_probe+0x32/0x40 drivers/base/dd.c:1079
       bus_probe_device+0x3d8/0x5a0 drivers/base/bus.c:532
       device_add+0x16ae/0x1f20 drivers/base/core.c:3622
       usb_set_configuration+0x31c9/0x38c0 drivers/usb/core/message.c:2207
       usb_generic_driver_probe+0x109/0x2a0 drivers/usb/core/generic.c:238
       usb_probe_device+0x290/0x4a0 drivers/usb/core/driver.c:293
       really_probe+0x506/0xf40 drivers/base/dd.c:658
       __driver_probe_device+0x2a7/0x5d0 drivers/base/dd.c:800
       driver_probe_device+0x72/0x7b0 drivers/base/dd.c:830
       __device_attach_driver+0x55a/0x8f0 drivers/base/dd.c:958
       bus_for_each_drv+0x3ff/0x620 drivers/base/bus.c:457
       __device_attach+0x3bd/0x640 drivers/base/dd.c:1030
       device_initial_probe+0x32/0x40 drivers/base/dd.c:1079
       bus_probe_device+0x3d8/0x5a0 drivers/base/bus.c:532
       device_add+0x16ae/0x1f20 drivers/base/core.c:3622
       usb_new_device+0x15f6/0x22f0 drivers/usb/core/hub.c:2589
       hub_port_connect drivers/usb/core/hub.c:5440 [inline]
       hub_port_connect_change drivers/usb/core/hub.c:5580 [inline]
       port_event drivers/usb/core/hub.c:5740 [inline]
       hub_event+0x53bc/0x7290 drivers/usb/core/hub.c:5822
       process_one_work kernel/workqueue.c:2630 [inline]
       process_scheduled_works+0x104e/0x1e70 kernel/workqueue.c:2703
       worker_thread+0xf45/0x1490 kernel/workqueue.c:2784
       kthread+0x3e8/0x540 kernel/kthread.c:388
       ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147
       ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      Local variable buf.i225 created at:
       smsc95xx_read_reg drivers/net/usb/smsc95xx.c:90 [inline]
       smsc95xx_reset+0x203/0x25f0 drivers/net/usb/smsc95xx.c:892
       smsc95xx_bind+0x9bc/0x22e0 drivers/net/usb/smsc95xx.c:1131
      
      CPU: 1 PID: 773 Comm: kworker/1:2 Not tainted 6.6.0-rc1-syzkaller-00125-ge42bebf6 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023
      Workqueue: usb_hub_wq hub_event
      =====================================================
      
      Similar to e9c65989 ("net: usb: smsc75xx: Fix uninit-value access in
      __smsc75xx_read_reg"), this issue is caused because usbnet_read_cmd() reads
      less bytes than requested (zero byte in the reproducer). In this case,
      'buf' is not properly filled.
      
      This patch fixes the issue by returning -ENODATA if usbnet_read_cmd() reads
      less bytes than requested.
      
      sysbot reported similar uninit-value access issue [2]. The root cause is
      the same as mentioned above, and this patch addresses it as well.
      
      Fixes: 2f7ca802 ("net: Add SMSC LAN9500 USB2.0 10/100 ethernet adapter driver")
      Reported-and-tested-by: syzbot+c74c24b43c9ae534f0e0@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+2c97a98a5ba9ea9c23bd@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=c74c24b43c9ae534f0e0 [1]
      Closes: https://syzkaller.appspot.com/bug?extid=2c97a98a5ba9ea9c23bd [2]
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      51a32e82
    • Su Hui's avatar
      net: chelsio: cxgb4: add an error code check in t4_load_phy_fw · 9f771493
      Su Hui authored
      t4_set_params_timeout() can return -EINVAL if failed, add check
      for this.
      Signed-off-by: default avatarSu Hui <suhui@nfschina.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9f771493
    • Christophe JAILLET's avatar
      net: ieee802154: adf7242: Fix some potential buffer overflow in adf7242_stats_show() · ca082f01
      Christophe JAILLET authored
      strncat() usage in adf7242_debugfs_init() is wrong.
      The size given to strncat() is the maximum number of bytes that can be
      written, excluding the trailing NULL.
      
      Here, the size that is passed, DNAME_INLINE_LEN, does not take into account
      the size of "adf7242-" that is already in the array.
      
      In order to fix it, use snprintf() instead.
      
      Fixes: 7302b9d9 ("ieee802154/adf7242: Driver for ADF7242 MAC IEEE802154")
      Signed-off-by: default avatarChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ca082f01
  2. 21 Oct, 2023 7 commits
  3. 20 Oct, 2023 8 commits
    • Mateusz Palczewski's avatar
      igb: Fix potential memory leak in igb_add_ethtool_nfc_entry · 8c0b48e0
      Mateusz Palczewski authored
      Add check for return of igb_update_ethtool_nfc_entry so that in case
      of any potential errors the memory alocated for input will be freed.
      
      Fixes: 0e71def2 ("igb: add support of RX network flow classification")
      Reviewed-by: default avatarWojciech Drewek <wojciech.drewek@intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c0b48e0
    • Kunwu Chan's avatar
      treewide: Spelling fix in comment · fb71ba0e
      Kunwu Chan authored
      reques -> request
      
      Fixes: 09dde54c ("PS3: gelic: Add wireless support for PS3")
      Signed-off-by: default avatarKunwu Chan <chentao@kylinos.cn>
      Reviewed-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb71ba0e
    • Ivan Vecera's avatar
      i40e: Fix I40E_FLAG_VF_VLAN_PRUNING value · 665e7d83
      Ivan Vecera authored
      Commit c87c938f ("i40e: Add VF VLAN pruning") added new
      PF flag I40E_FLAG_VF_VLAN_PRUNING but its value collides with
      existing I40E_FLAG_TOTAL_PORT_SHUTDOWN_ENABLED flag.
      
      Move the affected flag at the end of the flags and fix its value.
      
      Reproducer:
      [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close on
      [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning on
      [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
      [ 6323.142585] i40e 0000:02:00.0: Setting link-down-on-close not supported on this port (because total-port-shutdown is enabled)
      netlink error: Operation not supported
      [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning off
      [root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
      
      The link-down-on-close flag cannot be modified after setting vf-vlan-pruning
      because vf-vlan-pruning shares the same bit with total-port-shutdown flag
      that prevents any modification of link-down-on-close flag.
      
      Fixes: c87c938f ("i40e: Add VF VLAN pruning")
      Cc: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Cc: Simon Horman <horms@kernel.org>
      Signed-off-by: default avatarIvan Vecera <ivecera@redhat.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      665e7d83
    • Michal Schmidt's avatar
      iavf: initialize waitqueues before starting watchdog_task · 7db31110
      Michal Schmidt authored
      It is not safe to initialize the waitqueues after queueing the
      watchdog_task. It will be using them.
      
      The chance of this causing a real problem is very small, because
      there will be some sleeping before any of the waitqueues get used.
      I got a crash only after inserting an artificial sleep in iavf_probe.
      
      Queue the watchdog_task as the last step in iavf_probe. Add a comment to
      prevent repeating the mistake.
      
      Fixes: fe2647ab ("i40evf: prevent VF close returning before state transitions to DOWN")
      Signed-off-by: default avatarMichal Schmidt <mschmidt@redhat.com>
      Reviewed-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Reviewed-by: default avatarPrzemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7db31110
    • Mirsad Goran Todorovac's avatar
      r8169: fix the KCSAN reported data race in rtl_rx while reading desc->opts1 · f97eee48
      Mirsad Goran Todorovac authored
      KCSAN reported the following data-race bug:
      
      ==================================================================
      BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
      
      race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
      rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
      __napi_poll (net/core/dev.c:6527)
      net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
      __do_softirq (kernel/softirq.c:553)
      __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
      irq_exit_rcu (kernel/softirq.c:647)
      sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
      asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
      cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
      cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
      call_cpuidle (kernel/sched/idle.c:135)
      do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
      cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
      start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
      secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
      
      value changed: 0x80003fff -> 0x3402805f
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c0 #41
      Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
      ==================================================================
      
      drivers/net/ethernet/realtek/r8169_main.c:
      ==========================================
         4429
       → 4430                 status = le32_to_cpu(desc->opts1);
         4431                 if (status & DescOwn)
         4432                         break;
         4433
         4434                 /* This barrier is needed to keep us from reading
         4435                  * any other fields out of the Rx descriptor until
         4436                  * we know the status of DescOwn
         4437                  */
         4438                 dma_rmb();
         4439
         4440                 if (unlikely(status & RxRES)) {
         4441                         if (net_ratelimit())
         4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",
      
      Marco Elver explained that dma_rmb() doesn't prevent the compiler to tear up the access to
      desc->opts1 which can be written to concurrently. READ_ONCE() should prevent that from
      happening:
      
         4429
       → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
         4431                 if (status & DescOwn)
         4432                         break;
         4433
      
      As the consequence of this fix, this KCSAN warning was eliminated.
      
      Fixes: 6202806e ("r8169: drop member opts1_mask from struct rtl8169_private")
      Suggested-by: default avatarMarco Elver <elver@google.com>
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Cc: nic_swsd@realtek.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/Signed-off-by: default avatarMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
      Acked-by: default avatarMarco Elver <elver@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f97eee48
    • Mirsad Goran Todorovac's avatar
      r8169: fix the KCSAN reported data-race in rtl_tx while reading TxDescArray[entry].opts1 · dcf75a0f
      Mirsad Goran Todorovac authored
      KCSAN reported the following data-race:
      
      ==================================================================
      BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
      
      race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
      rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
      __napi_poll (net/core/dev.c:6527)
      net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
      __do_softirq (kernel/softirq.c:553)
      __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
      irq_exit_rcu (kernel/softirq.c:647)
      sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
      asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
      cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
      cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
      call_cpuidle (kernel/sched/idle.c:135)
      do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
      cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
      start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
      secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
      
      value changed: 0xb0000042 -> 0x00000000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c0 #41
      Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
      ==================================================================
      
      The read side is in
      
      drivers/net/ethernet/realtek/r8169_main.c
      =========================================
         4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
         4356                    int budget)
         4357 {
         4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
         4359         struct sk_buff *skb;
         4360
         4361         dirty_tx = tp->dirty_tx;
         4362
         4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
         4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
         4365                 u32 status;
         4366
       → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
         4368                 if (status & DescOwn)
         4369                         break;
         4370
         4371                 skb = tp->tx_skb[entry].skb;
         4372                 rtl8169_unmap_tx_skb(tp, entry);
         4373
         4374                 if (skb) {
         4375                         pkts_compl++;
         4376                         bytes_compl += skb->len;
         4377                         napi_consume_skb(skb, budget);
         4378                 }
         4379                 dirty_tx++;
         4380         }
         4381
         4382         if (tp->dirty_tx != dirty_tx) {
         4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
         4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
         4385
         4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
         4387                                               rtl_tx_slots_avail(tp),
         4388                                               R8169_TX_START_THRS);
         4389                 /*
         4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
         4391                  * too close. Let's kick an extra TxPoll request when a burst
         4392                  * of start_xmit activity is detected (if it is not detected,
         4393                  * it is slow enough). -- FR
         4394                  * If skb is NULL then we come here again once a tx irq is
         4395                  * triggered after the last fragment is marked transmitted.
         4396                  */
         4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
         4398                         rtl8169_doorbell(tp);
         4399         }
         4400 }
      
      tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
      this KCSAN warning.
      
         4366
       → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
         4368                 if (status & DescOwn)
         4369                         break;
         4370
      
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Cc: nic_swsd@realtek.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Marco Elver <elver@google.com>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/Signed-off-by: default avatarMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
      Acked-by: default avatarMarco Elver <elver@google.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dcf75a0f
    • Mirsad Goran Todorovac's avatar
      r8169: fix the KCSAN reported data-race in rtl_tx() while reading tp->cur_tx · c1c0ce31
      Mirsad Goran Todorovac authored
      KCSAN reported the following data-race:
      
      ==================================================================
      BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]
      
      write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
      rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
      dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
      sch_direct_xmit (net/sched/sch_generic.c:342)
      __dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
      ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
      __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
      ip_finish_output (net/ipv4/ip_output.c:328)
      ip_output (net/ipv4/ip_output.c:435)
      ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
      udp_send_skb (net/ipv4/udp.c:963)
      udp_sendmsg (net/ipv4/udp.c:1246)
      inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
      sock_sendmsg (net/socket.c:730 net/socket.c:753)
      __sys_sendto (net/socket.c:2177)
      __x64_sys_sendto (net/socket.c:2185)
      do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      
      read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
      rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
      __napi_poll (net/core/dev.c:6527)
      net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
      __do_softirq (kernel/softirq.c:553)
      __irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
      irq_exit_rcu (kernel/softirq.c:647)
      common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
      asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
      cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
      cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
      call_cpuidle (kernel/sched/idle.c:135)
      do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
      cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
      start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
      secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
      
      value changed: 0x002f4815 -> 0x002f4816
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c0 #41
      Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
      ==================================================================
      
      The write side of drivers/net/ethernet/realtek/r8169_main.c is:
      ==================
         4251         /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
         4252         smp_wmb();
         4253
       → 4254         WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
         4255
         4256         stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
         4257                                                 R8169_TX_STOP_THRS,
         4258                                                 R8169_TX_START_THRS);
      
      The read side is the function rtl_tx():
      
         4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
         4356                    int budget)
         4357 {
         4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
         4359         struct sk_buff *skb;
         4360
         4361         dirty_tx = tp->dirty_tx;
         4362
         4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
         4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
         4365                 u32 status;
         4366
         4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
         4368                 if (status & DescOwn)
         4369                         break;
         4370
         4371                 skb = tp->tx_skb[entry].skb;
         4372                 rtl8169_unmap_tx_skb(tp, entry);
         4373
         4374                 if (skb) {
         4375                         pkts_compl++;
         4376                         bytes_compl += skb->len;
         4377                         napi_consume_skb(skb, budget);
         4378                 }
         4379                 dirty_tx++;
         4380         }
         4381
         4382         if (tp->dirty_tx != dirty_tx) {
         4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
         4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
         4385
         4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
         4387                                               rtl_tx_slots_avail(tp),
         4388                                               R8169_TX_START_THRS);
         4389                 /*
         4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
         4391                  * too close. Let's kick an extra TxPoll request when a burst
         4392                  * of start_xmit activity is detected (if it is not detected,
         4393                  * it is slow enough). -- FR
         4394                  * If skb is NULL then we come here again once a tx irq is
         4395                  * triggered after the last fragment is marked transmitted.
         4396                  */
       → 4397                 if (tp->cur_tx != dirty_tx && skb)
         4398                         rtl8169_doorbell(tp);
         4399         }
         4400 }
      
      Obviously from the code, an earlier detected data-race for tp->cur_tx was fixed in the
      line 4363:
      
         4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
      
      but the same solution is required for protecting the other access to tp->cur_tx:
      
       → 4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
         4398                         rtl8169_doorbell(tp);
      
      The write in the line 4254 is protected with WRITE_ONCE(), but the read in the line 4397
      might have suffered read tearing under some compiler optimisations.
      
      The fix eliminated the KCSAN data-race report for this bug.
      
      It is yet to be evaluated what happens if tp->cur_tx changes between the test in line 4363
      and line 4397. This test should certainly not be cached by the compiler in some register
      for such a long time, while asynchronous writes to tp->cur_tx might have occurred in line
      4254 in the meantime.
      
      Fixes: 94d8a98e ("r8169: reduce number of workaround doorbell rings")
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Cc: nic_swsd@realtek.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Marco Elver <elver@google.com>
      Cc: netdev@vger.kernel.org
      Link: https://lore.kernel.org/lkml/dc7fc8fa-4ea4-e9a9-30a6-7c83e6b53188@alu.unizg.hr/Signed-off-by: default avatarMirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
      Acked-by: default avatarMarco Elver <elver@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1c0ce31
    • Maciej Fijalkowski's avatar
      i40e: xsk: remove count_mask · 913eda2b
      Maciej Fijalkowski authored
      Cited commit introduced a neat way of updating next_to_clean that does
      not require boundary checks on each increment. This was done by masking
      the new value with (ring length - 1) mask. Problem is that this is
      applicable only for power of 2 ring sizes, for every other size this
      assumption can not be made. In turn, it leads to cleaning descriptors
      out of order as well as splats:
      
      [ 1388.411915] Workqueue: events xp_release_deferred
      [ 1388.411919] RIP: 0010:xp_free+0x1a/0x50
      [ 1388.411921] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 8b 57 70 48 8d 47 70 48 89 e5 48 39 d0 74 06 <5d> c3 cc cc cc cc 48 8b 57 60 83 82 b8 00 00 00 01 48 8b 57 60 48
      [ 1388.411922] RSP: 0018:ffa0000000a83cb0 EFLAGS: 00000206
      [ 1388.411923] RAX: ff11000119aa5030 RBX: 000000000000001d RCX: ff110001129b6e50
      [ 1388.411924] RDX: ff11000119aa4fa0 RSI: 0000000055555554 RDI: ff11000119aa4fc0
      [ 1388.411925] RBP: ffa0000000a83cb0 R08: 0000000000000000 R09: 0000000000000000
      [ 1388.411926] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000115829b80
      [ 1388.411927] R13: 000000000000005f R14: 0000000000000000 R15: ff11000119aa4fc0
      [ 1388.411928] FS:  0000000000000000(0000) GS:ff11000277e00000(0000) knlGS:0000000000000000
      [ 1388.411929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1388.411930] CR2: 00007f1f564e6c14 CR3: 000000000783c005 CR4: 0000000000771ef0
      [ 1388.411931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1388.411931] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1388.411932] PKRU: 55555554
      [ 1388.411933] Call Trace:
      [ 1388.411934]  <IRQ>
      [ 1388.411935]  ? show_regs+0x6e/0x80
      [ 1388.411937]  ? watchdog_timer_fn+0x1d2/0x240
      [ 1388.411939]  ? __pfx_watchdog_timer_fn+0x10/0x10
      [ 1388.411941]  ? __hrtimer_run_queues+0x10e/0x290
      [ 1388.411945]  ? clockevents_program_event+0xae/0x130
      [ 1388.411947]  ? hrtimer_interrupt+0x105/0x240
      [ 1388.411949]  ? __sysvec_apic_timer_interrupt+0x54/0x150
      [ 1388.411952]  ? sysvec_apic_timer_interrupt+0x7f/0x90
      [ 1388.411955]  </IRQ>
      [ 1388.411955]  <TASK>
      [ 1388.411956]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
      [ 1388.411958]  ? xp_free+0x1a/0x50
      [ 1388.411960]  i40e_xsk_clean_rx_ring+0x5d/0x100 [i40e]
      [ 1388.411968]  i40e_clean_rx_ring+0x14c/0x170 [i40e]
      [ 1388.411977]  i40e_queue_pair_disable+0xda/0x260 [i40e]
      [ 1388.411986]  i40e_xsk_pool_setup+0x192/0x1d0 [i40e]
      [ 1388.411993]  i40e_reconfig_rss_queues+0x1f0/0x1450 [i40e]
      [ 1388.412002]  xp_disable_drv_zc+0x73/0xf0
      [ 1388.412004]  ? mutex_lock+0x17/0x50
      [ 1388.412007]  xp_release_deferred+0x2b/0xc0
      [ 1388.412010]  process_one_work+0x178/0x350
      [ 1388.412011]  ? __pfx_worker_thread+0x10/0x10
      [ 1388.412012]  worker_thread+0x2f7/0x420
      [ 1388.412014]  ? __pfx_worker_thread+0x10/0x10
      [ 1388.412015]  kthread+0xf8/0x130
      [ 1388.412017]  ? __pfx_kthread+0x10/0x10
      [ 1388.412019]  ret_from_fork+0x3d/0x60
      [ 1388.412021]  ? __pfx_kthread+0x10/0x10
      [ 1388.412023]  ret_from_fork_asm+0x1b/0x30
      [ 1388.412026]  </TASK>
      
      It comes from picking wrong ring entries when cleaning xsk buffers
      during pool detach.
      
      Remove the count_mask logic and use they boundary check when updating
      next_to_process (which used to be a next_to_clean).
      
      Fixes: c8a8ca34 ("i40e: remove unnecessary memory writes of the next to clean pointer")
      Reported-by: default avatarTushar Vyavahare <tushar.vyavahare@intel.com>
      Tested-by: default avatarTushar Vyavahare <tushar.vyavahare@intel.com>
      Signed-off-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20231018163908.40841-1-maciej.fijalkowski@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      913eda2b
  4. 19 Oct, 2023 22 commits