1. 20 Sep, 2022 23 commits
  2. 19 Sep, 2022 5 commits
  3. 16 Sep, 2022 9 commits
    • tcp: Use WARN_ON_ONCE() in tcp_read_skb() · 96628951
      Peilin Ye authored
      Prevent tcp_read_skb() from flooding the syslog.
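A warn-once guard can be sketched in userspace C as below. This is a hypothetical stand-in, not the kernel's WARN_ON_ONCE() implementation: the macro name `WARN_ON_ONCE_SKETCH`, the counter `warnings_emitted`, and the helper `process_packets()` are all invented for illustration. Each expansion of the macro owns a static flag, so a given call site reports at most once however often the condition fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Number of warnings actually printed; exposed so callers can inspect it. */
static int warnings_emitted;

/*
 * Hypothetical userspace stand-in for a warn-once macro: every expansion
 * has its own static flag, so one call site prints at most one warning.
 */
#define WARN_ON_ONCE_SKETCH(cond)                                      \
    do {                                                               \
        static bool warned_;                                           \
        if ((cond) && !warned_) {                                      \
            warned_ = true;                                            \
            warnings_emitted++;                                        \
            fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
        }                                                              \
    } while (0)

/* Simulates a hot path that keeps hitting the same unexpected state. */
static int process_packets(int count)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++)
        WARN_ON_ONCE_SKETCH(true); /* an unguarded warning would print `count` times */
    return warnings_emitted;
}
```

Calling `process_packets(1000)` repeatedly leaves `warnings_emitted` at 1, which is the log-flooding protection the commit subject refers to.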
      Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-unsync-addresses-from-ports' · 34d2d336
      David S. Miller authored
      From: Benjamin Poirier <bpoirier@nvidia.com>
      To: netdev@vger.kernel.org
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>,
      	Veaceslav Falico <vfalico@gmail.com>,
      	Andy Gospodarek <andy@greyhouse.net>,
      	"David S. Miller" <davem@davemloft.net>,
      	Eric Dumazet <edumazet@google.com>,
      	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
      	Jiri Pirko <jiri@resnulli.us>, Shuah Khan <shuah@kernel.org>,
      	Jonathan Toppins <jtoppins@redhat.com>,
      	linux-kselftest@vger.kernel.org
      Subject: [PATCH net v3 0/4] Unsync addresses from ports when stopping aggregated devices
      Date: Wed,  7 Sep 2022 16:56:38 +0900
      Message-ID: <20220907075642.475236-1-bpoirier@nvidia.com>
      
      This series fixes similar problems in the bonding and team drivers.
      
      Because of missing dev_{uc,mc}_unsync() calls, addresses added to
      underlying devices may be leftover after the aggregated device is deleted.
      Add the missing calls and a few related tests.
      
      v2:
      * fix selftest installation, see patch 3
      
      v3:
      * Split lacpdu_multicast changes to their own patch, #1
      * In ndo_{add,del}_slave methods, only perform address list changes when
        the aggregated device is up (patches 2 & 3)
      * Add selftest function related to the above change (patch 4)
      ====================
      Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Add tests for bonding and team address list management · bbb774d9
      Benjamin Poirier authored
      Test that the bonding and team drivers clean up an underlying device's
      address lists (dev->uc, dev->mc) when the aggregated device is deleted.
      
      Test addition and removal of the LACPDU multicast address on underlying
      devices by the bonding driver.
      
      v2:
      * add lag_lib.sh to TEST_FILES
      
      v3:
      * extend bond_listen_lacpdu_multicast test to init_state up and down cases
      * remove some superfluous shell syntax and 'set dev ... up' commands
      Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: team: Unsync device addresses on ndo_stop · bd602342
      Benjamin Poirier authored
      Netdev drivers are expected to call dev_{uc,mc}_sync() in their
      ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
      This is mentioned in the kerneldoc for those dev_* functions.
      
      The team driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
      ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
      already been emptied in unregister_netdevice_many() before ndo_uninit is
      called. This mistake can result in addresses being leftover on former team
      ports after a team device has been deleted; see test_LAG_cleanup() in the
      last patch in this series.
      
      Add unsync calls at their expected location, team_close().
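The ordering problem described above can be modelled in a few lines of userspace C. This is a toy model under stated assumptions, not kernel code: `struct addr_list`, `model_sync()`, `model_unsync()` and `teardown_leftovers()` are hypothetical names, small integers stand in for MAC addresses, and the teardown sequence (stop, then flushing of the upper device's lists, then uninit) follows the description in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ADDRS 8

/* Toy address list: small integers stand in for MAC addresses. */
struct addr_list {
    int addrs[MAX_ADDRS];
    size_t len;
};

static bool list_contains(const struct addr_list *l, int addr)
{
    for (size_t i = 0; i < l->len; i++)
        if (l->addrs[i] == addr)
            return true;
    return false;
}

static void list_add(struct addr_list *l, int addr)
{
    if (list_contains(l, addr))
        return;
    assert(l->len < MAX_ADDRS);
    l->addrs[l->len++] = addr;
}

static void list_del(struct addr_list *l, int addr)
{
    for (size_t i = 0; i < l->len; i++) {
        if (l->addrs[i] == addr) {
            l->len--;
            l->addrs[i] = l->addrs[l->len]; /* order is irrelevant here */
            return;
        }
    }
}

/* Model of "sync": push every address of the upper device to the port. */
static void model_sync(struct addr_list *port, const struct addr_list *upper)
{
    for (size_t i = 0; i < upper->len; i++)
        list_add(port, upper->addrs[i]);
}

/* Model of "unsync": remove from the port what the upper device still lists. */
static void model_unsync(struct addr_list *port, const struct addr_list *upper)
{
    for (size_t i = 0; i < upper->len; i++)
        list_del(port, upper->addrs[i]);
}

/* Runs the modelled teardown; returns how many addresses stay on the port. */
static size_t teardown_leftovers(bool unsync_in_stop)
{
    struct addr_list upper = { .len = 0 }, port = { .len = 0 };

    list_add(&upper, 0xaa);
    list_add(&upper, 0xbb);
    model_sync(&port, &upper);          /* device is up, rx mode was set */

    if (unsync_in_stop)
        model_unsync(&port, &upper);    /* stop handler: lists still populated */

    upper.len = 0;                      /* unregistration empties the upper lists */

    if (!unsync_in_stop)
        model_unsync(&port, &upper);    /* uninit handler: nothing left to walk */

    return port.len;
}
```

In this model `teardown_leftovers(false)` leaves both addresses on the port, while `teardown_leftovers(true)` leaves none, which mirrors why the unsync calls belong in the stop path.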
      
      v3:
      * When adding or deleting a port, only sync/unsync addresses if the team
        device is up. In other cases, it is taken care of at the right time by
        ndo_open/ndo_set_rx_mode/ndo_stop.
      
      Fixes: 3d249d4c ("net: introduce ethernet teaming device")
      Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bonding: Unsync device addresses on ndo_stop · 86247aba
      Benjamin Poirier authored
      Netdev drivers are expected to call dev_{uc,mc}_sync() in their
      ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
      This is mentioned in the kerneldoc for those dev_* functions.
      
      The bonding driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
      ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
      already been emptied in unregister_netdevice_many() before ndo_uninit is
      called. This mistake can result in addresses being leftover on former bond
      slaves after a bond has been deleted; see test_LAG_cleanup() in the last
      patch in this series.
      
      Add unsync calls, via bond_hw_addr_flush(), at their expected location,
      bond_close().
      Add dev_mc_add() call to bond_open() to match the above change.
      
      v3:
      * When adding or deleting a slave, only sync/unsync, add/del addresses if
        the bond is up. In other cases, it is taken care of at the right time by
        ndo_open/ndo_set_rx_mode/ndo_stop.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bonding: Share lacpdu_mcast_addr definition · 1d9a143e
      Benjamin Poirier authored
      There are already a few definitions of arrays containing
      MULTICAST_LACPDU_ADDR and the next patch will add one more use. These all
      contain the same constant data so define one common instance for all
      bonding code.
      Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 21be1ad6
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-09-08 (ice, iavf)
      
      This series contains updates to ice and iavf drivers.
      
      Dave removes extra unplug of auxiliary bus on reset which caused a
      scheduling while atomic to be reported for ice.
      
      Ding Hui defers setting of queues for TCs to ensure valid configuration
      and restores old config if invalid for ice.
      
      Sylwester fixes a check of setting MAC address to occur after result is
      received from PF for iavf driver.
      
      Brett changes check of ring tail to use software cached value as not all
      devices have access to register tail for iavf driver.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: marvell: prestera: add support for for Aldrin2 · 9124dbcc
      Oleksandr Mazur authored
      Aldrin2 (98DX8525) is a Marvell Prestera PP, with 100G support.
      Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
      
      V2:
        - retarget to net tree instead of net-next;
        - fix missed colon in patch subject ('net marvell' vs 'net: mavell');
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ieee802154: fix uninit value bug in dgram_sendmsg · 94160108
      Haimin Zhang authored
      There is an uninit value bug in the dgram_sendmsg function in
      net/ieee802154/socket.c when the length of valid data pointed to by
      msg->msg_name isn't verified.
      
      We introduce a helper function, ieee802154_sockaddr_check_size, to
      check namelen. First we check that addr_type is present in
      ieee802154_addr_sa. Then we check namelen according to addr_type.
      
      Also fixed in raw_bind, dgram_bind, dgram_connect.
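A two-step length check of this kind might look like the sketch below. It is an illustration only: the structures, the `SK_ADDR_*` constants and `sockaddr_size_ok()` are hypothetical and do not reproduce the kernel's `sockaddr_ieee802154` layout or the actual helper. The first step confirms that the caller supplied enough bytes to cover the type field; the second step requires enough bytes for the address variant that the type selects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address kinds, for illustration only. */
enum { SK_ADDR_NONE = 0, SK_ADDR_SHORT = 2, SK_ADDR_LONG = 3 };

/* Hypothetical layout: a type tag followed by a type-dependent address. */
struct sketch_addr {
    int addr_type;
    uint16_t pan_id;
    union {
        uint8_t hwaddr[8];    /* used when addr_type == SK_ADDR_LONG */
        uint16_t short_addr;  /* used when addr_type == SK_ADDR_SHORT */
    } u;
};

struct sketch_sockaddr {
    uint16_t family;
    struct sketch_addr addr;
};

/* Returns true when `len` bytes are enough for the address sa claims to hold. */
static bool sockaddr_size_ok(const struct sketch_sockaddr *sa, size_t len)
{
    const size_t type_end = offsetof(struct sketch_sockaddr, addr.addr_type) +
                            sizeof(sa->addr.addr_type);
    const size_t addr_start = offsetof(struct sketch_sockaddr, addr.u);

    assert(sa != NULL);

    /* Step 1: the type field itself must lie inside the supplied bytes. */
    if (len < type_end)
        return false;

    /* Step 2: the length must cover the variant selected by the type. */
    switch (sa->addr.addr_type) {
    case SK_ADDR_NONE:
        return true;
    case SK_ADDR_SHORT:
        return len >= addr_start + sizeof(sa->addr.u.short_addr);
    case SK_ADDR_LONG:
        return len >= addr_start + sizeof(sa->addr.u.hwaddr);
    default:
        return false; /* unknown type: reject rather than guess a size */
    }
}
```

Rejecting unknown types in the `default` branch is a choice made for this sketch; the commit message does not say how the real helper treats them.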
      Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Sep, 2022 3 commits
    • Documentation: mptcp: fix pm_type formatting · 0727a9a5
      Matthieu Baerts authored
      When looking at the rendered HTML version, we can see 'pm_type' is not
      displayed with a bold font:
      
        https://docs.kernel.org/5.19/networking/mptcp-sysctl.html
      
      The empty line under 'pm_type' is then removed to have the same style as
      the others.
      
      Fixes: 6bb63ccc ("mptcp: Add a per-namespace sysctl to set the default path manager type")
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/r/20220906180404.1255873-2-matthieu.baerts@tessares.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • mptcp: fix fwd memory accounting on coalesce · 7288ff6e
      Paolo Abeni authored
      The intel bot reported a memory accounting related splat:
      
      [  240.473094] ------------[ cut here ]------------
      [  240.478507] page_counter underflow: -4294828518 nr_pages=4294967290
      [  240.485500] WARNING: CPU: 2 PID: 14986 at mm/page_counter.c:56 page_counter_cancel+0x96/0xc0
      [  240.570849] CPU: 2 PID: 14986 Comm: mptcp_connect Tainted: G S                5.19.0-rc4-00739-gd24141fe #1
      [  240.581637] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.63 10/05/2017
      [  240.590600] RIP: 0010:page_counter_cancel+0x96/0xc0
      [  240.596179] Code: 00 00 00 45 31 c0 48 89 ef 5d 4c 89 c6 41 5c e9 40 fd ff ff 4c 89 e2 48 c7 c7 20 73 39 84 c6 05 d5 b1 52 04 01 e8 e7 95 f3 01 <0f> 0b eb a9 48 89 ef e8 1e 25 fc ff eb c3 66 66 2e 0f 1f 84 00 00
      [  240.615639] RSP: 0018:ffffc9000496f7c8 EFLAGS: 00010082
      [  240.621569] RAX: 0000000000000000 RBX: ffff88819c9c0120 RCX: 0000000000000000
      [  240.629404] RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff5200092deeb
      [  240.637239] RBP: ffff88819c9c0120 R08: 0000000000000001 R09: ffff888366527a2b
      [  240.645069] R10: ffffed106cca4f45 R11: 0000000000000001 R12: 00000000fffffffa
      [  240.652903] R13: ffff888366536118 R14: 00000000fffffffa R15: ffff88819c9c0000
      [  240.660738] FS:  00007f3786e72540(0000) GS:ffff888366500000(0000) knlGS:0000000000000000
      [  240.669529] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  240.675974] CR2: 00007f966b346000 CR3: 0000000168cea002 CR4: 00000000003706e0
      [  240.683807] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  240.691641] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  240.699468] Call Trace:
      [  240.702613]  <TASK>
      [  240.705413]  page_counter_uncharge+0x29/0x80
      [  240.710389]  drain_stock+0xd0/0x180
      [  240.714585]  refill_stock+0x278/0x580
      [  240.718951]  __sk_mem_reduce_allocated+0x222/0x5c0
      [  240.729248]  __mptcp_update_rmem+0x235/0x2c0
      [  240.734228]  __mptcp_move_skbs+0x194/0x6c0
      [  240.749764]  mptcp_recvmsg+0xdfa/0x1340
      [  240.763153]  inet_recvmsg+0x37f/0x500
      [  240.782109]  sock_read_iter+0x24a/0x380
      [  240.805353]  new_sync_read+0x420/0x540
      [  240.838552]  vfs_read+0x37f/0x4c0
      [  240.842582]  ksys_read+0x170/0x200
      [  240.864039]  do_syscall_64+0x5c/0x80
      [  240.872770]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [  240.878526] RIP: 0033:0x7f3786d9ae8e
      [  240.882805] Code: c0 e9 b6 fe ff ff 50 48 8d 3d 6e 18 0a 00 e8 89 e8 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
      [  240.902259] RSP: 002b:00007fff7be81e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
      [  240.910533] RAX: ffffffffffffffda RBX: 0000000000002000 RCX: 00007f3786d9ae8e
      [  240.918368] RDX: 0000000000002000 RSI: 00007fff7be87ec0 RDI: 0000000000000005
      [  240.926206] RBP: 0000000000000005 R08: 00007f3786e6a230 R09: 00007f3786e6a240
      [  240.934046] R10: fffffffffffff288 R11: 0000000000000246 R12: 0000000000002000
      [  240.941884] R13: 00007fff7be87ec0 R14: 00007fff7be87ec0 R15: 0000000000002000
      [  240.949741]  </TASK>
      [  240.952632] irq event stamp: 27367
      [  240.956735] hardirqs last  enabled at (27366): [<ffffffff81ba50ea>] mem_cgroup_uncharge_skmem+0x6a/0x80
      [  240.966848] hardirqs last disabled at (27367): [<ffffffff81b8fd42>] refill_stock+0x282/0x580
      [  240.976017] softirqs last  enabled at (27360): [<ffffffff83a4d8ef>] mptcp_recvmsg+0xaf/0x1340
      [  240.985273] softirqs last disabled at (27364): [<ffffffff83a4d30c>] __mptcp_move_skbs+0x18c/0x6c0
      [  240.994872] ---[ end trace 0000000000000000 ]---
      
      After commit d24141fe ("mptcp: drop SK_RECLAIM_* macros"),
      if rmem_fwd_alloc becomes negative, mptcp_rmem_uncharge() can
      try to reclaim a negative amount of pages, since the expression:
      
      	reclaimable >= PAGE_SIZE
      
      will evaluate to true for any negative value of the int
      'reclaimable': 'PAGE_SIZE' is an unsigned long and
      the negative integer will be promoted to a (very large)
      unsigned long value.
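The conversion can be reproduced in plain C. In the sketch below, `SKETCH_PAGE_SIZE`, `buggy_can_reclaim()` and `safe_can_reclaim()` are hypothetical names, and 4096 is only an assumed page size. The cast in the first function spells out the conversion that the compiler applies implicitly when an int is compared with an unsigned long. The signed comparison in the second function is shown for contrast only; it is not the fix this commit applies.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed page size for the sketch; unsigned long, like the kernel macro. */
#define SKETCH_PAGE_SIZE 4096UL

static_assert(sizeof(unsigned long) >= sizeof(int),
              "sketch assumes unsigned long is at least as wide as int");

/*
 * Equivalent to `reclaimable >= SKETCH_PAGE_SIZE`: the usual arithmetic
 * conversions turn the int into an unsigned long before comparing, so a
 * negative value wraps around to a very large positive one.
 */
static bool buggy_can_reclaim(int reclaimable)
{
    return (unsigned long)reclaimable >= SKETCH_PAGE_SIZE;
}

/* Contrast case: keep the comparison in the signed domain. */
static bool safe_can_reclaim(int reclaimable)
{
    return reclaimable >= (int)SKETCH_PAGE_SIZE;
}
```

With a negative input such as -6, `buggy_can_reclaim()` returns true while `safe_can_reclaim()` returns false; for non-negative inputs the two agree.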
      
      Also after the mentioned commit, kfree_skb_partial()
      in mptcp_try_coalesce() reclaims most of the just-released fwd
      memory, so that the subsequent charging of the skb delta size
      leads to negative fwd memory values.
      
      At that point a racing recvmsg() can trigger the splat.
      
      Address the issue by switching the order of the memory accounting
      operations. The fwd memory can still transiently reach negative
      values, but only within an atomic scope, where no other code
      path can touch or use such a value.
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Fixes: d24141fe ("mptcp: drop SK_RECLAIM_* macros")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Link: https://lore.kernel.org/r/20220906180404.1255873-1-matthieu.baerts@tessares.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: phy: aquantia: wait for the suspend/resume operations to finish · ca2dccde
      Ioana Ciornei authored
      The Aquantia datasheet notes that after issuing a Processor-Intensive
      MDIO operation, like changing the low-power state of the device, the
      driver should wait for the operation to finish before issuing a new MDIO
      command.
      
      A new function, aqr107_wait_processor_intensive_op(), is added which can
      be used after these kinds of MDIO operations. At the moment, it is only
      called at the end of the suspend/resume calls.
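The general wait-until-done pattern can be sketched in userspace C as a bounded polling loop. Everything here is hypothetical: `struct sim_phy`, `sim_read_status()`, `SKETCH_BUSY_BIT` and `sketch_wait_op_done()` simulate a device that reports a busy bit for a number of reads, and they do not reflect the Aquantia register layout or the driver's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUSY_BIT 0x0001u

/* Simulated PHY that stays busy for a fixed number of status reads. */
struct sim_phy {
    int busy_reads_left;
};

/* Stand-in for an MDIO status read; returns the busy bit while busy. */
static uint16_t sim_read_status(struct sim_phy *phy)
{
    if (phy->busy_reads_left > 0) {
        phy->busy_reads_left--;
        return SKETCH_BUSY_BIT;
    }
    return 0;
}

/*
 * Polls the status until the busy bit clears or max_polls reads have been
 * made. Returns 0 once the operation has finished, -1 on timeout.
 */
static int sketch_wait_op_done(struct sim_phy *phy, int max_polls)
{
    assert(phy != NULL && max_polls > 0);

    for (int i = 0; i < max_polls; i++) {
        if (!(sim_read_status(phy) & SKETCH_BUSY_BIT))
            return 0;
        /* A real driver would sleep here between polls. */
    }
    return -1;
}
```

Bounding the loop matters: a device that never clears its busy bit then yields an error to the caller instead of hanging the suspend or resume path.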
      
      The issue was identified on a board featuring the AQR113C PHY, on
      which commands like 'ip link (..) up / down' issued without any delay
      between them would leave the link on the PHY down.
      The issue was easy to reproduce with a one-liner:
       $ ip link set dev ethX down; ip link set dev ethX up; \
       ip link set dev ethX down; ip link set dev ethX up;
      
      Fixes: ac9e81c2 ("net: phy: aquantia: add suspend / resume callbacks for AQR107 family")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220906130451.1483448-1-ioana.ciornei@nxp.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>