1. 20 Jul, 2023 7 commits
    • Eric Dumazet's avatar
      can: raw: fix lockdep issue in raw_release() · 11c9027c
      Eric Dumazet authored
      syzbot complained about a lockdep issue [1]
      
      Since raw_bind() and raw_setsockopt() first get RTNL
      before locking the socket, we must adopt the same order in raw_release()
      
      [1]
      WARNING: possible circular locking dependency detected
      6.5.0-rc1-syzkaller-00192-g78adb4bc #0 Not tainted
      ------------------------------------------------------
      syz-executor.0/14110 is trying to acquire lock:
      ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1708 [inline]
      ffff88804e4b6130 (sk_lock-AF_CAN){+.+.}-{0:0}, at: raw_bind+0xb1/0xab0 net/can/raw.c:435
      
      but task is already holding lock:
      ffffffff8e3df368 (rtnl_mutex){+.+.}-{3:3}, at: raw_bind+0xa7/0xab0 net/can/raw.c:434
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (rtnl_mutex){+.+.}-{3:3}:
      __mutex_lock_common kernel/locking/mutex.c:603 [inline]
      __mutex_lock+0x181/0x1340 kernel/locking/mutex.c:747
      raw_release+0x1c6/0x9b0 net/can/raw.c:391
      __sock_release+0xcd/0x290 net/socket.c:654
      sock_close+0x1c/0x20 net/socket.c:1386
      __fput+0x3fd/0xac0 fs/file_table.c:384
      task_work_run+0x14d/0x240 kernel/task_work.c:179
      resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
      exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
      exit_to_user_mode_prepare+0x210/0x240 kernel/entry/common.c:204
      __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
      syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:297
      do_syscall_64+0x44/0xb0 arch/x86/entry/common.c:86
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      -> #0 (sk_lock-AF_CAN){+.+.}-{0:0}:
      check_prev_add kernel/locking/lockdep.c:3142 [inline]
      check_prevs_add kernel/locking/lockdep.c:3261 [inline]
      validate_chain kernel/locking/lockdep.c:3876 [inline]
      __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144
      lock_acquire kernel/locking/lockdep.c:5761 [inline]
      lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726
      lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492
      lock_sock include/net/sock.h:1708 [inline]
      raw_bind+0xb1/0xab0 net/can/raw.c:435
      __sys_bind+0x1ec/0x220 net/socket.c:1792
      __do_sys_bind net/socket.c:1803 [inline]
      __se_sys_bind net/socket.c:1801 [inline]
      __x64_sys_bind+0x72/0xb0 net/socket.c:1801
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      other info that might help us debug this:
      
      Possible unsafe locking scenario:
      
      CPU0 CPU1
      ---- ----
      lock(rtnl_mutex);
              lock(sk_lock-AF_CAN);
              lock(rtnl_mutex);
      lock(sk_lock-AF_CAN);
      
      *** DEADLOCK ***
      
      1 lock held by syz-executor.0/14110:
      
      stack backtrace:
      CPU: 0 PID: 14110 Comm: syz-executor.0 Not tainted 6.5.0-rc1-syzkaller-00192-g78adb4bc #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      check_noncircular+0x311/0x3f0 kernel/locking/lockdep.c:2195
      check_prev_add kernel/locking/lockdep.c:3142 [inline]
      check_prevs_add kernel/locking/lockdep.c:3261 [inline]
      validate_chain kernel/locking/lockdep.c:3876 [inline]
      __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144
      lock_acquire kernel/locking/lockdep.c:5761 [inline]
      lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726
      lock_sock_nested+0x3a/0xf0 net/core/sock.c:3492
      lock_sock include/net/sock.h:1708 [inline]
      raw_bind+0xb1/0xab0 net/can/raw.c:435
      __sys_bind+0x1ec/0x220 net/socket.c:1792
      __do_sys_bind net/socket.c:1803 [inline]
      __se_sys_bind net/socket.c:1801 [inline]
      __x64_sys_bind+0x72/0xb0 net/socket.c:1801
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7fd89007cb29
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fd890d2a0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000031
      RAX: ffffffffffffffda RBX: 00007fd89019bf80 RCX: 00007fd89007cb29
      RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000003
      RBP: 00007fd8900c847a R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 000000000000000b R14: 00007fd89019bf80 R15: 00007ffebf8124f8
      </TASK>
      
      Fixes: ee8b94c8 ("can: raw: fix receiver memory leak")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Ziyang Xuan <william.xuanziyang@huawei.com>
      Cc: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: stable@vger.kernel.org
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Link: https://lore.kernel.org/all/20230720114438.172434-1-edumazet@google.comSigned-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      11c9027c
    • Marc Kleine-Budde's avatar
      can: gs_usb: gs_can_close(): add missing set of CAN state to CAN_STATE_STOPPED · f8a2da6e
      Marc Kleine-Budde authored
      After an initial link up the CAN device is in ERROR-ACTIVE mode. Due
      to a missing CAN_STATE_STOPPED in gs_can_close() it doesn't change to
      STOPPED after a link down:
      
      | ip link set dev can0 up
      | ip link set dev can0 down
      | ip --details link show can0
      | 13: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10
      |     link/can  promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
      |     can state ERROR-ACTIVE restart-ms 1000
      
      Add missing assignment of CAN_STATE_STOPPED in gs_can_close().
      
      Cc: stable@vger.kernel.org
      Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices")
      Link: https://lore.kernel.org/all/20230718-gs_usb-fix-can-state-v1-1-f19738ae2c23@pengutronix.deSigned-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      f8a2da6e
    • Daniel Golle's avatar
      net: ethernet: mtk_eth_soc: always mtk_get_ib1_pkt_type · 9f9d4c1a
      Daniel Golle authored
      entries and bind debugfs files would display wrong data on NETSYS_V2 and
      later because instead of using mtk_get_ib1_pkt_type the driver would use
      MTK_FOE_IB1_PACKET_TYPE which corresponds to NETSYS_V1(.x) SoCs.
      Use mtk_get_ib1_pkt_type so entries and bind records display correctly.
      
      Fixes: 03a3180e ("net: ethernet: mtk_eth_soc: introduce flow offloading support for mt7986")
      Signed-off-by: default avatarDaniel Golle <daniel@makrotopia.org>
      Acked-by: default avatarLorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/c0ae03d0182f4d27b874cbdf0059bc972c317f3c.1689727134.git.daniel@makrotopia.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9f9d4c1a
    • Jakub Kicinski's avatar
      Merge branch 'r8169-revert-two-changes-that-caused-regressions' · 88f2e009
      Jakub Kicinski authored
      Heiner Kallweit says:
      
      ====================
      r8169: revert two changes that caused regressions
      
      This reverts two changes that caused regressions.
      ====================
      
      Link: https://lore.kernel.org/r/ddadceae-19c9-81b8-47b5-a4ff85e2563a@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      88f2e009
    • Heiner Kallweit's avatar
      Revert "r8169: disable ASPM during NAPI poll" · e31a9fed
      Heiner Kallweit authored
      This reverts commit e1ed3e4d.
      
      Turned out the change causes a performance regression.
      
      Link: https://lore.kernel.org/netdev/20230713124914.GA12924@green245/T/
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/055c6bc2-74fa-8c67-9897-3f658abb5ae7@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e31a9fed
    • Heiner Kallweit's avatar
      r8169: revert 2ab19de6 ("r8169: remove ASPM restrictions now that ASPM is... · cf2ffdea
      Heiner Kallweit authored
      r8169: revert 2ab19de6 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll")
      
      There have been reports that on a number of systems this change breaks
      network connectivity. Therefore effectively revert it. Mainly affected
      seem to be systems where BIOS denies ASPM access to OS.
      Due to later changes we can't do a direct revert.
      
      Fixes: 2ab19de6 ("r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll")
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/netdev/e47bac0d-e802-65e1-b311-6acb26d5cf10@freenet.de/T/
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217596Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/57f13ec0-b216-d5d8-363d-5b05528ec5fb@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      cf2ffdea
    • Kuniyuki Iwashima's avatar
      Revert "tcp: avoid the lookup process failing to get sk in ehash table" · 81b3ade5
      Kuniyuki Iwashima authored
      This reverts commit 3f4ca5fa.
      
      Commit 3f4ca5fa ("tcp: avoid the lookup process failing to get sk in
      ehash table") reversed the order in how a socket is inserted into ehash
      to fix an issue that ehash-lookup could fail when reqsk/full sk/twsk are
      swapped.  However, it introduced another lookup failure.
      
      The full socket in ehash is allocated from a slab with SLAB_TYPESAFE_BY_RCU
      and does not have SOCK_RCU_FREE, so the socket could be reused even while
      it is being referenced on another CPU doing RCU lookup.
      
      Let's say a socket is reused and inserted into the same hash bucket during
      lookup.  After the blamed commit, a new socket is inserted at the end of
      the list.  If that happens, we will skip sockets placed after the previous
      position of the reused socket, resulting in ehash lookup failure.
      
      As described in Documentation/RCU/rculist_nulls.rst, we should insert a
      new socket at the head of the list to avoid such an issue.
      
      This issue, the swap-lookup-failure, and another variant reported in [0]
      can all be handled properly by adding a locked ehash lookup suggested by
      Eric Dumazet [1].
      
      However, this issue could occur for every packet, thus more likely than
      the other two races, so let's revert the change for now.
      
      Link: https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@baidu.com/ [0]
      Link: https://lore.kernel.org/netdev/CANn89iK8snOz8TYOhhwfimC7ykYA78GA3Nyv8x06SZYa1nKdyA@mail.gmail.com/ [1]
      Fixes: 3f4ca5fa ("tcp: avoid the lookup process failing to get sk in ehash table")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717215918.15723-1-kuniyu@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      81b3ade5
  2. 19 Jul, 2023 16 commits
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e80698b7
      Jakub Kicinski authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2023-07-19
      
      We've added 4 non-merge commits during the last 1 day(s) which contain
      a total of 3 files changed, 55 insertions(+), 10 deletions(-).
      
      The main changes are:
      
      1) Fix stack depth check in presence of async callbacks,
         from Kumar Kartikeya Dwivedi.
      
      2) Fix BTI type used for freplace attached functions,
         from Alexander Duyck.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf, arm64: Fix BTI type used for freplace attached functions
        selftests/bpf: Add more tests for check_max_stack_depth bug
        bpf: Repeat check_max_stack_depth for async callbacks
        bpf: Fix subprog idx logic in check_max_stack_depth
      ====================
      
      Link: https://lore.kernel.org/r/20230719174502.74023-1-alexei.starovoitov@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e80698b7
    • Yuanjun Gong's avatar
      ipv4: ip_gre: fix return value check in erspan_xmit() · aa7cb378
      Yuanjun Gong authored
      goto free_skb if an unexpected result is returned by pskb_tirm()
      in erspan_xmit().
      Signed-off-by: default avatarYuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aa7cb378
    • Yuanjun Gong's avatar
      ipv4: ip_gre: fix return value check in erspan_fb_xmit() · 02d84f3e
      Yuanjun Gong authored
      goto err_free_skb if an unexpected result is returned by pskb_tirm()
      in erspan_fb_xmit().
      Signed-off-by: default avatarYuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      02d84f3e
    • Yuanjun Gong's avatar
      drivers:net: fix return value check in ocelot_fdma_receive_skb · bce56033
      Yuanjun Gong authored
      ocelot_fdma_receive_skb should return false if an unexpected
      value is returned by pskb_trim.
      Signed-off-by: default avatarYuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bce56033
    • Yuanjun Gong's avatar
      drivers: net: fix return value check in emac_tso_csum() · 78a93c31
      Yuanjun Gong authored
      in emac_tso_csum(), return an error code if an unexpected value
      is returned by pskb_trim().
      Signed-off-by: default avatarYuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      78a93c31
    • Yuanjun Gong's avatar
      net:ipv6: check return value of pskb_trim() · 4258faa1
      Yuanjun Gong authored
      goto tx_err if an unexpected result is returned by pskb_tirm()
      in ip6erspan_tunnel_xmit().
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: default avatarYuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4258faa1
    • Wang Ming's avatar
      net: ipv4: Use kfree_sensitive instead of kfree · daa75144
      Wang Ming authored
      key might contain private part of the key, so better use
      kfree_sensitive to free it.
      
      Fixes: 38320c70 ("[IPSEC]: Use crypto_aead and authenc in ESP")
      Signed-off-by: default avatarWang Ming <machel@vivo.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      daa75144
    • Jakub Kicinski's avatar
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 7f5acea7
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-07-17 (iavf)
      
      This series contains updates to iavf driver only.
      
      Ding Hui fixes use-after-free issue by calling netif_napi_del() for all
      allocated q_vectors. He also resolves out-of-bounds issue by not
      updating to new values when timeout is encountered.
      
      Marcin and Ahmed change the way resets are handled so that the callback
      operating under the RTNL lock will wait for the reset to finish, the
      rtnl_lock sensitive functions in reset flow will schedule the netdev update
      for later in order to remove circular dependency with the critical lock.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        iavf: fix reset task race with iavf_remove()
        iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies
        Revert "iavf: Do not restart Tx queues after reset task failure"
        Revert "iavf: Detach device during reset task"
        iavf: Wait for reset in callbacks which trigger it
        iavf: use internal state to free traffic IRQs
        iavf: Fix out-of-bounds when setting channels on remove
        iavf: Fix use-after-free in free_netdev
      ====================
      
      Link: https://lore.kernel.org/r/20230717175205.3217774-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      7f5acea7
    • Jakub Kicinski's avatar
      Merge branch 'tcp-annotate-data-races-in-tcp_rsk-req' · e9b2bd96
      Jakub Kicinski authored
      Eric Dumazet says:
      
      ====================
      tcp: annotate data-races in tcp_rsk(req)
      
      Small series addressing two syzbot reports around tcp_rsk(req)
      ====================
      
      Link: https://lore.kernel.org/r/20230717144445.653164-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e9b2bd96
    • Eric Dumazet's avatar
      tcp: annotate data-races around tcp_rsk(req)->ts_recent · eba20811
      Eric Dumazet authored
      TCP request sockets are lockless, tcp_rsk(req)->ts_recent
      can change while being read by another cpu as syzbot noticed.
      
      This is harmless, but we should annotate the known races.
      
      Note that tcp_check_req() changes req->ts_recent a bit early,
      we might change this in the future.
      
      BUG: KCSAN: data-race in tcp_check_req / tcp_check_req
      
      write to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 1:
      tcp_check_req+0x694/0xc70 net/ipv4/tcp_minisocks.c:762
      tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
      ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
      ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
      dst_input include/net/dst.h:468 [inline]
      ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
      __netif_receive_skb_one_core net/core/dev.c:5493 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
      process_backlog+0x21f/0x380 net/core/dev.c:5935
      __napi_poll+0x60/0x3b0 net/core/dev.c:6498
      napi_poll net/core/dev.c:6565 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6698
      __do_softirq+0xc1/0x265 kernel/softirq.c:571
      do_softirq+0x7e/0xb0 kernel/softirq.c:472
      __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:396
      local_bh_enable+0x1f/0x20 include/linux/bottom_half.h:33
      rcu_read_unlock_bh include/linux/rcupdate.h:843 [inline]
      __dev_queue_xmit+0xabb/0x1d10 net/core/dev.c:4271
      dev_queue_xmit include/linux/netdevice.h:3088 [inline]
      neigh_hh_output include/net/neighbour.h:528 [inline]
      neigh_output include/net/neighbour.h:542 [inline]
      ip_finish_output2+0x700/0x840 net/ipv4/ip_output.c:229
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:317
      NF_HOOK_COND include/linux/netfilter.h:292 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:431
      dst_output include/net/dst.h:458 [inline]
      ip_local_out net/ipv4/ip_output.c:126 [inline]
      __ip_queue_xmit+0xa4d/0xa70 net/ipv4/ip_output.c:533
      ip_queue_xmit+0x38/0x40 net/ipv4/ip_output.c:547
      __tcp_transmit_skb+0x1194/0x16e0 net/ipv4/tcp_output.c:1399
      tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
      tcp_write_xmit+0x13ff/0x2fd0 net/ipv4/tcp_output.c:2693
      __tcp_push_pending_frames+0x6a/0x1a0 net/ipv4/tcp_output.c:2877
      tcp_push_pending_frames include/net/tcp.h:1952 [inline]
      __tcp_sock_set_cork net/ipv4/tcp.c:3336 [inline]
      tcp_sock_set_cork+0xe8/0x100 net/ipv4/tcp.c:3343
      rds_tcp_xmit_path_complete+0x3b/0x40 net/rds/tcp_send.c:52
      rds_send_xmit+0xf8d/0x1420 net/rds/send.c:422
      rds_send_worker+0x42/0x1d0 net/rds/threads.c:200
      process_one_work+0x3e6/0x750 kernel/workqueue.c:2408
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2555
      kthread+0x1d7/0x210 kernel/kthread.c:379
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      read to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 0:
      tcp_check_req+0x32a/0xc70 net/ipv4/tcp_minisocks.c:622
      tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
      ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
      ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
      dst_input include/net/dst.h:468 [inline]
      ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
      __netif_receive_skb_one_core net/core/dev.c:5493 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
      process_backlog+0x21f/0x380 net/core/dev.c:5935
      __napi_poll+0x60/0x3b0 net/core/dev.c:6498
      napi_poll net/core/dev.c:6565 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6698
      __do_softirq+0xc1/0x265 kernel/softirq.c:571
      run_ksoftirqd+0x17/0x20 kernel/softirq.c:939
      smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
      kthread+0x1d7/0x210 kernel/kthread.c:379
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      value changed: 0x1cd237f1 -> 0x1cd237f2
      
      Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717144445.653164-3-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      eba20811
    • Eric Dumazet's avatar
      tcp: annotate data-races around tcp_rsk(req)->txhash · 5e526552
      Eric Dumazet authored
      TCP request sockets are lockless, some of their fields
      can change while being read by another cpu as syzbot noticed.
      
      This is usually harmless, but we should annotate the known
      races.
      
      This patch takes care of tcp_rsk(req)->txhash,
      a separate one is needed for tcp_rsk(req)->ts_recent.
      
      BUG: KCSAN: data-race in tcp_make_synack / tcp_rtx_synack
      
      write to 0xffff8881362304bc of 4 bytes by task 32083 on cpu 1:
      tcp_rtx_synack+0x9d/0x2a0 net/ipv4/tcp_output.c:4213
      inet_rtx_syn_ack+0x38/0x80 net/ipv4/inet_connection_sock.c:880
      tcp_check_req+0x379/0xc70 net/ipv4/tcp_minisocks.c:665
      tcp_v6_rcv+0x125b/0x1b20 net/ipv6/tcp_ipv6.c:1673
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      netif_receive_skb_internal net/core/dev.c:5652 [inline]
      netif_receive_skb+0x4a/0x310 net/core/dev.c:5711
      tun_rx_batched+0x3bf/0x400
      tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997
      tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043
      call_write_iter include/linux/fs.h:1871 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x4ab/0x7d0 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff8881362304bc of 4 bytes by task 32078 on cpu 0:
      tcp_make_synack+0x367/0xb40 net/ipv4/tcp_output.c:3663
      tcp_v6_send_synack+0x72/0x420 net/ipv6/tcp_ipv6.c:544
      tcp_conn_request+0x11a8/0x1560 net/ipv4/tcp_input.c:7059
      tcp_v6_conn_request+0x13f/0x180 net/ipv6/tcp_ipv6.c:1175
      tcp_rcv_state_process+0x156/0x1de0 net/ipv4/tcp_input.c:6494
      tcp_v6_do_rcv+0x98a/0xb70 net/ipv6/tcp_ipv6.c:1509
      tcp_v6_rcv+0x17b8/0x1b20 net/ipv6/tcp_ipv6.c:1735
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      netif_receive_skb_internal net/core/dev.c:5652 [inline]
      netif_receive_skb+0x4a/0x310 net/core/dev.c:5711
      tun_rx_batched+0x3bf/0x400
      tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997
      tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043
      call_write_iter include/linux/fs.h:1871 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x4ab/0x7d0 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x91d25731 -> 0xe79325cd
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 32078 Comm: syz-executor.4 Not tainted 6.5.0-rc1-syzkaller-00033-geb26cbb1 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
      
      Fixes: 58d607d3 ("tcp: provide skb->hash to synack packets")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717144445.653164-2-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5e526552
    • Subbaraya Sundeep's avatar
      octeontx2-pf: mcs: Generate hash key using ecb(aes) · e7002b3b
      Subbaraya Sundeep authored
      Hardware generated encryption and ICV tags are found to
      be wrong when tested with IEEE MACSEC test vectors.
      This is because as per the HRM, the hash key (derived by
      AES-ECB block encryption of an all 0s block with the SAK)
      has to be programmed by the software in
      MCSX_RS_MCS_CPM_TX_SLAVE_SA_PLCY_MEM_4X register.
      Hence fix this by generating hash key in software and
      configuring in hardware.
      
      Fixes: c54ffc73 ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
      Signed-off-by: default avatarSubbaraya Sundeep <sbhatta@marvell.com>
      Reviewed-by: default avatarKalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Link: https://lore.kernel.org/r/1689574603-28093-1-git-send-email-sbhatta@marvell.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e7002b3b
    • Florian Kauer's avatar
      igc: Prevent garbled TX queue with XDP ZEROCOPY · 78adb4bc
      Florian Kauer authored
      In normal operation, each populated queue item has
      next_to_watch pointing to the last TX desc of the packet,
      while each cleaned item has it set to 0. In particular,
      next_to_use that points to the next (necessarily clean)
      item to use has next_to_watch set to 0.
      
      When the TX queue is used both by an application using
      AF_XDP with ZEROCOPY as well as a second non-XDP application
      generating high traffic, the queue pointers can get in
      an invalid state where next_to_use points to an item
      where next_to_watch is NOT set to 0.
      
      However, the implementation assumes at several places
      that this is never the case, so if it does hold,
      bad things happen. In particular, within the loop inside
      of igc_clean_tx_irq(), next_to_clean can overtake next_to_use.
      Finally, this prevents any further transmission via
      this queue and it never gets unblocked or signaled.
      Secondly, if the queue is in this garbled state,
      the inner loop of igc_clean_tx_ring() will never terminate,
      completely hogging a CPU core.
      
      The reason is that igc_xdp_xmit_zc() reads next_to_use
      before acquiring the lock, and writing it back
      (potentially unmodified) later. If it got modified
      before locking, the outdated next_to_use is written
      pointing to an item that was already used elsewhere
      (and thus next_to_watch got written).
      
      Fixes: 9acf59a7 ("igc: Enable TX via AF_XDP zero-copy")
      Signed-off-by: default avatarFlorian Kauer <florian.kauer@linutronix.de>
      Reviewed-by: Kurt Kanzenbach's avatarKurt Kanzenbach <kurt@linutronix.de>
      Tested-by: Kurt Kanzenbach's avatarKurt Kanzenbach <kurt@linutronix.de>
      Acked-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Tested-by: default avatarNaama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20230717175444.3217831-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      78adb4bc
    • Jakub Kicinski's avatar
      Merge tag 'linux-can-fixes-for-6.5-20230717' of... · 936fd2c5
      Jakub Kicinski authored
      Merge tag 'linux-can-fixes-for-6.5-20230717' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2023-07-17
      
      The 1st patch is by Ziyang Xuan and fixes a possible memory leak in
      the receiver handling in the CAN RAW protocol.
      
      YueHaibing contributes a use after free in bcm_proc_show() of the
      Broad Cast Manager (BCM) CAN protocol.
      
      The next 2 patches are by me and fix a possible null pointer
      dereference in the RX path of the gs_usb driver with activated
      hardware timestamps and the candlelight firmware.
      
      The last patch is by Fedor Ross, Marek Vasut and me and targets the
      mcp251xfd driver. The polling timeout of __mcp251xfd_chip_set_mode()
      is increased to fix bus joining on busy CAN buses and very low bit
      rate.
      
      * tag 'linux-can-fixes-for-6.5-20230717' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout
        can: gs_usb: fix time stamp counter initialization
        can: gs_usb: gs_can_open(): improve error handling
        can: bcm: Fix UAF in bcm_proc_show()
        can: raw: fix receiver memory leak
      ====================
      
      Link: https://lore.kernel.org/r/20230717180938.230816-1-mkl@pengutronix.deSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      936fd2c5
    • John Fastabend's avatar
      mailmap: Add entry for old intel email · 195e903b
      John Fastabend authored
      Fix old email to avoid bouncing email from net/drivers and older
      netdev work. Anyways my @intel email hasn't been active for years.
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/r/20230717173306.38407-1-john.fastabend@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      195e903b
    • Shannon Nelson's avatar
      mailmap: add entries for past lives · d1998e50
      Shannon Nelson authored
      Update old emails for my current work email.
      Signed-off-by: default avatarShannon Nelson <shannon.nelson@amd.com>
      Link: https://lore.kernel.org/r/20230717193242.43670-1-shannon.nelson@amd.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      d1998e50
  3. 18 Jul, 2023 12 commits
  4. 17 Jul, 2023 5 commits
    • Fedor Ross's avatar
      can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout · 9efa1a54
      Fedor Ross authored
      The mcp251xfd controller needs an idle bus to enter 'Normal CAN 2.0
      mode' or . The maximum length of a CAN frame is 736 bits (64 data
      bytes, CAN-FD, EFF mode, worst case bit stuffing and interframe
      spacing). For low bit rates like 10 kbit/s the arbitrarily chosen
      MCP251XFD_POLL_TIMEOUT_US of 1 ms is too small.
      
      Otherwise during polling for the CAN controller to enter 'Normal CAN
      2.0 mode' the timeout limit is exceeded and the configuration fails
      with:
      
      | $ ip link set dev can1 up type can bitrate 10000
      | [  731.911072] mcp251xfd spi2.1 can1: Controller failed to enter mode CAN 2.0 Mode (6) and stays in Configuration Mode (4) (con=0x068b0760, osc=0x00000468).
      | [  731.927192] mcp251xfd spi2.1 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying.
      | [  731.938101] A link change request failed with some changes committed already. Interface can1 may have been left with an inconsistent configuration, please check.
      | RTNETLINK answers: Connection timed out
      
      Make MCP251XFD_POLL_TIMEOUT_US timeout calculation dynamic. Use
      maximum of 1ms and bit time of 1 full 64 data bytes CAN-FD frame in
      EFF mode, worst case bit stuffing and interframe spacing at the
      current bit rate.
      
      For easier backporting define the macro MCP251XFD_FRAME_LEN_MAX_BITS
      that holds the max frame length in bits, which is 736. This can be
      replaced by can_frame_bits(true, true, true, true, CANFD_MAX_DLEN) in
      a cleanup patch later.
      
      Fixes: 55e5b97f ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
      Signed-off-by: default avatarFedor Ross <fedor.ross@ifm.com>
      Signed-off-by: default avatarMarek Vasut <marex@denx.de>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/all/20230717-mcp251xfd-fix-increase-poll-timeout-v5-1-06600f34c684@pengutronix.deSigned-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      9efa1a54
    • Ahmed Zaki's avatar
      iavf: fix reset task race with iavf_remove() · c34743da
      Ahmed Zaki authored
      The reset task is currently scheduled from the watchdog or adminq tasks.
      First, all direct calls to schedule the reset task are replaced with the
      iavf_schedule_reset(), which is modified to accept the flag showing the
      type of reset.
      
      To prevent the reset task from starting once iavf_remove() starts, we need
      to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now
      easily added to iavf_schedule_reset().
      
      Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task.
      It is redundant since all callers who set the flag immediately schedules
      the reset task.
      
      Fixes: 3ccd54ef ("iavf: Fix init state closure on remove")
      Fixes: 14756b2a ("iavf: Fix __IAVF_RESETTING state usage")
      Signed-off-by: default avatarAhmed Zaki <ahmed.zaki@intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      c34743da
    • Ahmed Zaki's avatar
      iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies · d1639a17
      Ahmed Zaki authored
      A driver's lock (crit_lock) is used to serialize all the driver's tasks.
      Lockdep, however, shows a circular dependency between rtnl and
      crit_lock. This happens when an ndo that already holds the rtnl requests
      the driver to reset, since the reset task (in some paths) tries to grab
      rtnl to either change real number of queues of update netdev features.
      
        [566.241851] ======================================================
        [566.241893] WARNING: possible circular locking dependency detected
        [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G           OE
        [566.241984] ------------------------------------------------------
        [566.242025] repro.sh/2604 is trying to acquire lock:
        [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf]
        [566.242167]
                     but task is already holding lock:
        [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf]
        [566.242300]
                     which lock already depends on the new lock.
      
        [566.242353]
                     the existing dependency chain (in reverse order) is:
        [566.242401]
                     -> #1 (rtnl_mutex){+.+.}-{3:3}:
        [566.242451]        __mutex_lock+0xc1/0xbb0
        [566.242489]        iavf_init_interrupt_scheme+0x179/0x440 [iavf]
        [566.242560]        iavf_watchdog_task+0x80b/0x1400 [iavf]
        [566.242627]        process_one_work+0x2b3/0x560
        [566.242663]        worker_thread+0x4f/0x3a0
        [566.242696]        kthread+0xf2/0x120
        [566.242730]        ret_from_fork+0x29/0x50
        [566.242763]
                     -> #0 (&adapter->crit_lock){+.+.}-{3:3}:
        [566.242815]        __lock_acquire+0x15ff/0x22b0
        [566.242869]        lock_acquire+0xd2/0x2c0
        [566.242901]        __mutex_lock+0xc1/0xbb0
        [566.242934]        iavf_close+0x3c/0x240 [iavf]
        [566.242997]        __dev_close_many+0xac/0x120
        [566.243036]        dev_close_many+0x8b/0x140
        [566.243071]        unregister_netdevice_many_notify+0x165/0x7c0
        [566.243116]        unregister_netdevice_queue+0xd3/0x110
        [566.243157]        iavf_remove+0x6c1/0x730 [iavf]
        [566.243217]        pci_device_remove+0x33/0xa0
        [566.243257]        device_release_driver_internal+0x1bc/0x240
        [566.243299]        pci_stop_bus_device+0x6c/0x90
        [566.243338]        pci_stop_and_remove_bus_device+0xe/0x20
        [566.243380]        pci_iov_remove_virtfn+0xd1/0x130
        [566.243417]        sriov_disable+0x34/0xe0
        [566.243448]        ice_free_vfs+0x2da/0x330 [ice]
        [566.244383]        ice_sriov_configure+0x88/0xad0 [ice]
        [566.245353]        sriov_numvfs_store+0xde/0x1d0
        [566.246156]        kernfs_fop_write_iter+0x15e/0x210
        [566.246921]        vfs_write+0x288/0x530
        [566.247671]        ksys_write+0x74/0xf0
        [566.248408]        do_syscall_64+0x58/0x80
        [566.249145]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
        [566.249886]
                       other info that might help us debug this:
      
        [566.252014]  Possible unsafe locking scenario:
      
        [566.253432]        CPU0                    CPU1
        [566.254118]        ----                    ----
        [566.254800]   lock(rtnl_mutex);
        [566.255514]                                lock(&adapter->crit_lock);
        [566.256233]                                lock(rtnl_mutex);
        [566.256897]   lock(&adapter->crit_lock);
        [566.257388]
                        *** DEADLOCK ***
      
      The deadlock can be triggered by a script that is continuously resetting
      the VF adapter while doing other operations requiring RTNL, e.g:
      
      	while :; do
      		ip link set $VF up
      		ethtool --set-channels $VF combined 2
      		ip link set $VF down
      		ip link set $VF up
      		ethtool --set-channels $VF combined 4
      		ip link set $VF down
      	done
      
      Any operation that triggers a reset can substitute "ethtool --set-channles"
      
      As a fix, add a new task "finish_config" that do all the work which
      needs rtnl lock. With the exception of iavf_remove(), all work that
      require rtnl should be called from this task.
      
      As for iavf_remove(), at the point where we need to call
      unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config
      task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent
      finish_config work cannot restart after that since the task is guarded
      by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config().
      
      Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
      Signed-off-by: default avatarAhmed Zaki <ahmed.zaki@intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      d1639a17
    • Marcin Szycik's avatar
      Revert "iavf: Do not restart Tx queues after reset task failure" · d916d273
      Marcin Szycik authored
      This reverts commit 08f1c147.
      
      Netdev is no longer being detached during reset, so this fix can be
      reverted. We leave the removal of "hacky" IFF_UP flag update.
      Signed-off-by: default avatarMarcin Szycik <marcin.szycik@linux.intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      d916d273
    • Marcin Szycik's avatar
      Revert "iavf: Detach device during reset task" · d2806d96
      Marcin Szycik authored
      This reverts commit aa626da9.
      
      Detaching device during reset was not fully fixing the rtnl locking issue,
      as there could be a situation where callback was already in progress before
      detaching netdev.
      
      Furthermore, detaching netdevice causes TX timeouts if traffic is running.
      To reproduce:
      
      ip netns exec ns1 iperf3 -c $PEER_IP -t 600 --logfile /dev/null &
      while :; do
              for i in 200 7000 400 5000 300 3000 ; do
      		ip netns exec ns1 ip link set $VF1 mtu $i
                      sleep 2
              done
              sleep 10
      done
      
      Currently, callbacks such as iavf_change_mtu() wait for the reset.
      If the reset fails to acquire the rtnl_lock, they schedule the netdev
      update for later while continuing the reset flow. Operations like MTU
      changes are performed under the rtnl_lock. Therefore, when the operation
      finishes, another callback that uses rtnl_lock can start.
      Signed-off-by: default avatarDawid Wesierski <dawidx.wesierski@intel.com>
      Signed-off-by: default avatarMarcin Szycik <marcin.szycik@linux.intel.com>
      Signed-off-by: default avatarMateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      d2806d96