1. 13 Sep, 2021 13 commits
  2. 12 Sep, 2021 3 commits
  3. 11 Sep, 2021 1 commit
    • net: stmmac: allow CSR clock of 300MHz · 08dad2f4
      Jesper Nilsson authored
      The Synopsys Ethernet IP uses the CSR clock as a base clock for MDC.
      The divisor used is set in the MAC_MDIO_Address register field CR
      (Clock Rate).
      
      The divisor is there to change the CSR clock into a clock that falls
      below the IEEE 802.3 specified max frequency of 2.5MHz.
      
      If the CSR clock is 300MHz, the code falls back to using the reset
      value in the MAC_MDIO_Address register, as described in the comment
      above this code.
      
      However, 300MHz is actually an allowed value and the proper divider
      can be estimated quite easily (it's just 1Hz difference!)
      
      A CSR frequency of 300MHz with the maximum clock rate value of 0x5
      (STMMAC_CSR_250_300M, a divisor of 124) gives roughly 2.42MHz,
      which is below the IEEE 802.3 specified maximum.
      
      For the ARTPEC-8 SoC, the CSR clock is this problematic 300MHz,
      and unfortunately, the reset-value of the MAC_MDIO_Address CR field
      is 0x0.
      
      A CR value of 0x0 selects a divisor of 42, which gives an MDC
      frequency of ~7.14MHz, well above the 2.5MHz maximum.
      
      Allow CSR clock of 300MHz by making the comparison inclusive.
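      
      The arithmetic above can be checked with a small sketch. The helper
      names mdc_hz() and mdc_within_spec() are hypothetical and not part of
      the driver; only the 300MHz clock, the divisors 124 and 42, and the
      2.5MHz limit come from the message.
      
      ```c
      #include <assert.h>
      
      /* IEEE 802.3 upper bound for MDC, as cited in the commit message. */
      #define MDC_MAX_HZ 2500000.0
      
      /* MDC frequency in Hz for a given CSR clock and divisor. */
      static double mdc_hz(double csr_hz, unsigned int divisor)
      {
      	assert(divisor > 0);
      	return csr_hz / divisor;
      }
      
      /* Nonzero when the resulting MDC clock stays within the limit. */
      static int mdc_within_spec(double csr_hz, unsigned int divisor)
      {
      	return mdc_hz(csr_hz, divisor) <= MDC_MAX_HZ;
      }
      ```
      
      With a 300MHz CSR clock, a divisor of 124 passes the check and a
      divisor of 42 does not.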
      Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 10 Sep, 2021 5 commits
  5. 09 Sep, 2021 10 commits
    • net: ni65: Avoid typecast of pointer to u32 · e0119126
      Guenter Roeck authored
      Building alpha:allmodconfig results in the following error.
      
      drivers/net/ethernet/amd/ni65.c: In function 'ni65_stop_start':
      drivers/net/ethernet/amd/ni65.c:751:37: error:
      	cast from pointer to integer of different size
      		buffer[i] = (u32) isa_bus_to_virt(tmdp->u.buffer);
      
      'buffer[]' is declared as unsigned long, so replace the typecast to u32
      with a typecast to unsigned long to fix the problem.
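      
      A userspace sketch of why the cast width matters. It uses uintptr_t
      as a portable stand-in for the kernel's pointer-sized unsigned long;
      the function names are hypothetical.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>
      
      /* Round-trip a pointer through an integer wide enough to hold it. */
      static int survives_roundtrip(const void *p)
      {
      	uintptr_t wide;
      
      	assert(p != NULL);
      	wide = (uintptr_t)p;
      	return (const void *)wide == p;
      }
      
      /* Nonzero if the address would also fit in a 32-bit integer. */
      static int fits_in_u32(const void *p)
      {
      	return (uintptr_t)p <= UINT32_MAX;
      }
      ```
      
      On a 64-bit target such as alpha, an address for which fits_in_u32()
      is false would be truncated by a cast to u32.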
      
      Cc: Arnd Bergmann <arnd@kernel.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sfx-xdp-fallback-tx-queues' · e3a843f9
      David S. Miller authored
      Íñigo Huguet says:
      
      ====================
      sfc: fallback for lack of xdp tx queues
      
      If there are not enough hardware resources to allocate one tx queue per
      CPU for XDP, XDP_TX and XDP_REDIRECT actions were unavailable, and using
      them resulted each time in the packet being dropped and this message in
      the logs: XDP TX failed (-22)
      
      These patches implement 2 fallback solutions for 2 different situations
      that might happen:
      1. There are not enough free resources for all the tx queues, but there
         are some free resources available
      2. There are not enough free resources at all for tx queues.
      
      Both solutions are based on sharing tx queues, using __netif_tx_lock for
      synchronization. In the second case, as there are no XDP TX queues to
      share, network stack queues are used instead, but since we're taking
      __netif_tx_lock, concurrent access to the queues is correctly protected.
      
      The solution for this second case might affect performance both of XDP
      traffic and normal traffic due to lock contention if both are used
      intensively. That's why I call it a "last resort" fallback: it's not a
      desirable situation, but at least we have XDP TX working.
      
      Some tests have shown good results and indicate that the non-fallback
      case is not being damaged by these changes. They are also promising for
      the fallback cases. This is the test:
      1. From another machine, send a high volume of packets with pktgen, script
         samples/pktgen/pktgen_sample04_many_flows.sh
      2. On the tested machine, run samples/bpf/xdp_rxq_info with arguments
         "-a XDP_TX --swapmac" and see the results
      3. On the tested machine, also run pktgen_sample04 to create high TX
         normal traffic, and see how xdp_rxq_info results vary
      
      Note that this test doesn't check the worst situations for the fallback
      solutions because XDP_TX will only be executed from the same CPUs that
      are processed by sfc, and not from every CPU in the system, so the
      performance drop due to the highest locking contention doesn't happen.
      I'd like to test that, as well, but I don't have access right now to a
      proper environment.
      
      Test results:
      
      Without doing TX:
      Before changes: ~2,900,000 pps
      After changes, 1 queues/core: ~2,900,000 pps
      After changes, 2 queues/core: ~2,900,000 pps
      After changes, 8 queues/core: ~2,900,000 pps
      After changes, borrowing from network stack: ~2,900,000 pps
      
      With multiflow TX at the same time:
      Before changes: ~1,700,000 - 2,900,000 pps
      After changes, 1 queues/core: ~1,700,000 - 2,900,000 pps
      After changes, 2 queues/core: ~1,700,000 pps
      After changes, 8 queues/core: ~1,700,000 pps
      After changes, borrowing from network stack: 1,150,000 pps
      
      Sporadic "XDP TX failed (-5)" warnings are shown when running the XDP
      program and pktgen simultaneously. This was expected because XDP doesn't
      have any buffering system if the NIC is under very high pressure.
      Thousands of these warnings are shown in the case of borrowing net stack
      queues. As I said before, this was also expected.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: last resort fallback for lack of xdp tx queues · 6215b608
      Íñigo Huguet authored
      The previous patch addressed the situation of having some free resources
      for xdp tx but not enough for one tx queue per CPU. This patch addresses
      the worst case of not having resources at all for xdp tx.
      
      Instead of using queues dedicated to xdp, normal queues used by network
      stack are shared for both cases, using __netif_tx_lock for
      synchronization. Also queue stop/restart must be considered in the xdp
      path to avoid freezing the queue.
      
      This is not the ideal situation we might want to be in, and a performance
      penalty is expected both for normal and xdp traffic, but at least XDP
      will work in all possible situations (with a warning in the logs),
      easing somewhat the pain of not knowing in what situations we can use it
      and in what situations we cannot.
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fallback for lack of xdp tx queues · 41544618
      Íñigo Huguet authored
      If there were not enough resources to allocate one TX queue per core for
      XDP TX, it was completely disabled.
      
      This patch implements a fallback solution for sharing the available
      queues using __netif_tx_lock for synchronization. In the normal case that
      there is one TX queue per CPU, no locking is done, as it was before.
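      
      A minimal sketch of the sharing idea, not the driver's code: the
      struct, the function name and the modulo mapping are assumptions made
      for illustration. Locking is only flagged as needed when there are
      fewer queues than CPUs.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      
      /* Hypothetical result of choosing an XDP TX queue for a CPU. */
      struct xdp_txq_choice {
      	unsigned int queue;	/* index of the queue to use */
      	bool need_lock;		/* true when CPUs may share the queue */
      };
      
      static struct xdp_txq_choice pick_xdp_txq(unsigned int cpu,
      					  unsigned int n_cpus,
      					  unsigned int n_queues)
      {
      	struct xdp_txq_choice c;
      
      	assert(n_queues > 0 && cpu < n_cpus);
      	/* One queue per CPU: no sharing, so no locking, as before. */
      	c.need_lock = n_queues < n_cpus;
      	c.queue = c.need_lock ? cpu % n_queues : cpu;
      	return c;
      }
      ```
      
      In this model, a caller would take the queue lock only when need_lock
      is set, leaving the one-queue-per-CPU case lock-free.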
      
      With this fallback solution, XDP TX will work in many more cases that
      were failing, especially in machines with many CPUs. It's hard for XDP
      users to know what features are supported across different NICs and
      configurations, so they will benefit from having wider support.
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: platform: fix build warning when with !CONFIG_PM_SLEEP · 2a48d96f
      Joakim Zhang authored
      Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid
      build warning with !CONFIG_PM_SLEEP:
      
      >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
           796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
               |            ^~~~~~~~~~~~~~~~~~~~~~~~~
      >> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
           775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
               |            ^~~~~~~~~~~~~~~~~~~~~~~~~~
         cc1: all warnings being treated as errors
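      
      A userspace sketch of the technique, assuming GCC or Clang. The
      MAYBE_UNUSED macro stands in for the kernel's __maybe_unused, and
      DEMO_PM_SLEEP, struct demo_device and the demo_* functions are
      hypothetical names.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      
      /* Stand-in for the kernel's __maybe_unused (GCC/Clang attribute). */
      #define MAYBE_UNUSED __attribute__((__unused__))
      
      struct demo_device { int clocks_on; };
      
      /* Only referenced when DEMO_PM_SLEEP is defined. */
      static int MAYBE_UNUSED demo_noirq_suspend(struct demo_device *dev)
      {
      	assert(dev != NULL);
      	dev->clocks_on = 0;
      	return 0;
      }
      
      static int MAYBE_UNUSED demo_noirq_resume(struct demo_device *dev)
      {
      	assert(dev != NULL);
      	dev->clocks_on = 1;
      	return 0;
      }
      
      /* Returns how many hooks ran; 0 when DEMO_PM_SLEEP is not defined. */
      int demo_run_pm_hooks(struct demo_device *dev)
      {
      #ifdef DEMO_PM_SLEEP
      	return (demo_noirq_suspend(dev) == 0) + (demo_noirq_resume(dev) == 0);
      #else
      	(void)dev;
      	return 0;
      #endif
      }
      ```
      
      Without the attribute, building this with -Wunused-function and
      DEMO_PM_SLEEP undefined would warn about both static functions.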
      
      Fixes: 276aae37 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/l2tp: Fix reference count leak in l2tp_udp_recv_core · 9b6ff7eb
      Xiyu Yang authored
      A reference count leak may occur in an error handling path. If
      tunnel->version == L2TP_HDR_VER_3 and l2tp_v3_ensure_opt_in_linear()
      returns nonzero, the function jumps directly to the label invalid
      without decrementing the reference count of the l2tp_session object
      session, which was incremented earlier by l2tp_tunnel_get_session().
      This may result in refcount leaks.
      
      Fix this issue by decreasing the reference count before jumping to the
      label invalid.
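      
      The pattern can be sketched in plain C. This is an illustrative model
      with hypothetical demo_* names and a plain integer counter, not the
      l2tp code itself.
      
      ```c
      #include <assert.h>
      
      struct demo_session { int refcnt; };
      
      static void demo_session_get(struct demo_session *s)
      {
      	s->refcnt++;
      }
      
      static void demo_session_put(struct demo_session *s)
      {
      	assert(s->refcnt > 0);
      	s->refcnt--;
      }
      
      /* Returns 0 on success, -1 when the packet is rejected. */
      static int demo_recv(struct demo_session *s, int header_ok)
      {
      	demo_session_get(s);		/* the lookup takes a reference */
      
      	if (!header_ok) {
      		demo_session_put(s);	/* drop it before bailing out */
      		goto invalid;
      	}
      
      	/* ... deliver the packet ... */
      	demo_session_put(s);
      	return 0;
      
      invalid:
      	return -1;
      }
      ```
      
      Omitting the put on the error branch would leave refcnt one higher
      after every rejected packet.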
      
      Fixes: 4522a70d ("l2tp: fix reading optional fields of L2TPv3")
      Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
      Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/af_unix: fix a data-race in unix_dgram_poll · 04f08eb4
      Eric Dumazet authored
      syzbot reported another data-race in af_unix [1]
      
      Let's change __skb_insert() to use WRITE_ONCE() when changing
      skb head qlen.
      
      Also, change unix_dgram_poll() to use lockless version
      of unix_recvq_full()
      
      It is very possible we can switch all/most unix_recvq_full()
      to the lockless version; this will be done in a future kernel version.
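      
      A userspace analogue of the pairing, for illustration only. It uses
      C11 relaxed atomics as a stand-in for the kernel's WRITE_ONCE() and
      READ_ONCE(), which are different primitives; struct demo_queue and
      the demo_* functions are hypothetical.
      
      ```c
      #include <assert.h>
      #include <stdatomic.h>
      
      struct demo_queue {
      	atomic_uint qlen;	/* written by senders, read by pollers */
      	unsigned int max_qlen;
      };
      
      /* Writer side; in the kernel this runs under the queue lock. */
      static void demo_enqueue(struct demo_queue *q)
      {
      	unsigned int cur;
      
      	assert(q);
      	cur = atomic_load_explicit(&q->qlen, memory_order_relaxed);
      	atomic_store_explicit(&q->qlen, cur + 1, memory_order_relaxed);
      }
      
      /* Lockless reader: a single marked load, no lock taken. */
      static int demo_recvq_full_lockless(struct demo_queue *q)
      {
      	return atomic_load_explicit(&q->qlen, memory_order_relaxed) >
      	       q->max_qlen;
      }
      ```
      
      The reader may see a slightly stale length, which is acceptable for a
      poll-style check, but it never sees a torn value.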
      
      [1] HEAD commit: 8596e589
      
      BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll
      
      write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0:
       __skb_insert include/linux/skbuff.h:1938 [inline]
       __skb_queue_before include/linux/skbuff.h:2043 [inline]
       __skb_queue_tail include/linux/skbuff.h:2076 [inline]
       skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264
       unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850
       sock_sendmsg_nosec net/socket.c:703 [inline]
       sock_sendmsg net/socket.c:723 [inline]
       ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
       ___sys_sendmsg net/socket.c:2446 [inline]
       __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532
       __do_sys_sendmmsg net/socket.c:2561 [inline]
       __se_sys_sendmmsg net/socket.c:2558 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1:
       skb_queue_len include/linux/skbuff.h:1869 [inline]
       unix_recvq_full net/unix/af_unix.c:194 [inline]
       unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777
       sock_poll+0x23e/0x260 net/socket.c:1288
       vfs_poll include/linux/poll.h:90 [inline]
       ep_item_poll fs/eventpoll.c:846 [inline]
       ep_send_events fs/eventpoll.c:1683 [inline]
       ep_poll fs/eventpoll.c:1798 [inline]
       do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226
       __do_sys_epoll_wait fs/eventpoll.c:2238 [inline]
       __se_sys_epoll_wait fs/eventpoll.c:2233 [inline]
       __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x0000001b -> 0x00000001
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G        W         5.14.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 86b18aaa ("skbuff: fix a data race in skb_queue_len()")
      Cc: Qian Cai <cai@lca.pw>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: fix use after free on rmmod · d82d5303
      Tong Zhang authored
      plat_dev->dev->platform_data is released by platform_device_unregister(),
      so the subsequent use of pclk and hclk is a use-after-free. Since device
      unregister won't need a clk device, we adjust the function call sequence
      to fix this issue.
      
      [   31.261225] BUG: KASAN: use-after-free in macb_remove+0x77/0xc6 [macb_pci]
      [   31.275563] Freed by task 306:
      [   30.276782]  platform_device_release+0x25/0x80
      Suggested-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: check failover_pending in login response · 273c29e9
      Sukadev Bhattiprolu authored
      If a failover occurs before a login response is received, the login
      response buffer may be undefined. Check that there was no failover
      before accessing the login response buffer.
      
      Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol")
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost_net: fix OoB on sendmsg() failure. · 3c4cea8f
      Paolo Abeni authored
      If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp'
      and 'done_idx' indexes are left unchanged. If such failure happens
      when batched_xdp == VHOST_NET_BATCH, the next call to
      vhost_net_build_xdp() will access and write memory outside the xdp
      buffers area.
      
      Since sendmsg() can only error with EBADFD, this change addresses the
      issue by explicitly freeing the XDP buffers batch on error.
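      
      A simplified model of the invariant, with hypothetical names
      (DEMO_BATCH stands in for VHOST_NET_BATCH, and batched for
      batched_xdp). It is a sketch of the idea, not the vhost_net code.
      
      ```c
      #include <assert.h>
      
      #define DEMO_BATCH 4	/* stand-in for VHOST_NET_BATCH */
      
      struct demo_batcher {
      	int bufs[DEMO_BATCH];
      	int batched;	/* next free slot */
      };
      
      /* Store one buffer; returns 1 when the caller should flush now. */
      static int demo_add(struct demo_batcher *b, int buf)
      {
      	assert(b->batched < DEMO_BATCH);	/* the invariant at stake */
      	b->bufs[b->batched++] = buf;
      	return b->batched == DEMO_BATCH;
      }
      
      /* Flush the batch; returns 0 on success, -1 when sending failed. */
      static int demo_flush(struct demo_batcher *b, int send_ok)
      {
      	int err = send_ok ? 0 : -1;
      
      	/* Sent or dropped, the buffers are gone: always reset the index. */
      	b->batched = 0;
      	return err;
      }
      ```
      
      If a failed flush left batched at DEMO_BATCH, the next demo_add()
      would write past the end of bufs.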
      
      Fixes: 0a0be13b ("vhost_net: batch submitting XDP buffers to underlayer sockets")
      Suggested-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 08 Sep, 2021 8 commits
    • net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume · 276aae37
      Joakim Zhang authored
      Commit 5f585913 ("net: stmmac: delete the eee_ctrl_timer after
      napi disabled") tried to fix a system hang caused by eee_ctrl_timer.
      Unfortunately, it only resolves the hang for the system reboot stress
      test. The hang can still be reproduced easily during a system
      suspend/resume stress test with NFS mounted on the i.MX8MP EVK board.
      
      In the stmmac driver, the EEE feature is integrated with the phylink
      framework. On system suspend, phylink_stop() queues delayed work that
      invokes stmmac_mac_link_down(), which deactivates eee_ctrl_timer
      synchronously. The above commit tried to fix the issue by deactivating
      eee_ctrl_timer explicitly, but that is not enough. The eee_ctrl_timer
      expiry callback, stmmac_eee_ctrl_timer(), can enable hardware EEE mode
      again. What is unexpected is that the LPI interrupt
      (MAC_Interrupt_Enable.LPIEN bit) is always asserted. This interrupt can
      be issued when the MAC enters or exits the LPI state, and at that time
      the clock could already have been disabled. The result is a system hang
      when the driver tries to touch a register from the interrupt handler.
      
      The reason the above commit fixes the system hang in stmmac_release()
      is that it deactivates eee_ctrl_timer not just after napi is disabled,
      but also after the irq is freed.
      
      In conclusion, the hardware can generate an LPI interrupt when the clock
      has been disabled during suspend or resume, since the hardware is in EEE
      mode and the LPI interrupt is enabled.
      
      Interrupts from the MAC, MTL and DMA levels are enabled and never
      disabled on system suspend, so postponing clock management from the
      suspend stage to the noirq suspend stage should be safer.
      
      Fixes: 5f585913 ("net: stmmac: delete the eee_ctrl_timer after napi disabled")
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: initialize all filter table slots · b5c10223
      Alex Elder authored
      There is an off-by-one problem in ipa_table_init_add(), when
      initializing filter tables.
      
      In that function, the number of filter table entries is determined
      based on the number of set bits in the filter map.  However that
      count does *not* include the extra "slot" in the filter table that
      holds the filter map itself.  Meanwhile, ipa_table_addr() *does*
      include the filter map in the memory it returns, but because the
      count it's provided doesn't include it, it includes one too few
      table entries.
      
      Fix this by including the extra slot for the filter map in the count
      computed in ipa_table_init_add().
      
      Note: ipa_filter_reset_table() does not have this problem; it resets
      filter table entries one by one, but does not overwrite the filter
      bitmap.
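      
      The corrected count can be sketched as follows. The function names
      are hypothetical and the 32-bit map width is an assumption; only the
      "set bits plus one slot for the map" rule comes from the message.
      
      ```c
      #include <assert.h>
      #include <stdint.h>
      
      /* Count the set bits in the filter map. */
      static unsigned int demo_popcount32(uint32_t v)
      {
      	unsigned int n = 0;
      
      	while (v) {
      		v &= v - 1;	/* clear the lowest set bit */
      		n++;
      	}
      	return n;
      }
      
      /* Slots to initialize: one per set bit, plus one for the map itself. */
      static unsigned int demo_filter_slots(uint32_t filter_map)
      {
      	assert(filter_map != 0);	/* assumption: at least one filter */
      	return demo_popcount32(filter_map) + 1;
      }
      ```
      
      For a map of 0x5 (two endpoints), three slots are initialized rather
      than two.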
      
      Fixes: 2b9feef2 ("soc: qcom: ipa: filter and routing tables")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: Update SFP selected interface on advertising changes · ea269a6f
      Nathan Rossi authored
      Currently changes to the advertising state via ethtool do not cause any
      reselection of the configured interface mode after the SFP is already
      inserted and initially configured.
      
      While it is not typical to change the advertised link modes for an
      interface using an SFP, in certain use cases it is desirable. In the case
      of an SFP port that is capable of handling both SFP and SFP+ modules, it
      will automatically select between 1G and 10G modes depending on the
      supported mode of the SFP. However, if the SFP module is capable of
      working in multiple modes (e.g. an SFP+ DAC that can operate at 1G or
      10G), one end of the cable may be attached to an SFP 1000base-x port, so
      the SFP+ end must be manually configured to the 1000base-x mode in order
      for the link to be established.
      
      This change causes the ethtool setting of advertised mode changes to
      reselect the interface mode so that the link can be established.
      Additionally when a module is inserted the advertising mode is reset to
      match the supported modes of the module.
      Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ne2000: fix unused function warning · d7e203ff
      Arnd Bergmann authored
      Geert noticed a warning on MIPS TX49xx, Atari and presumably other
      platforms when the driver is built-in but NETDEV_LEGACY_INIT is
      disabled:
      
      drivers/net/ethernet/8390/ne.c:909:20: warning: ‘ne_add_devices’ defined but not used [-Wunused-function]
      
      Merge the two module init functions into a single one with an
      IS_ENABLED() check to replace the incorrect #ifdef.
      
      Fixes: 4228c394 ("make legacy ISA probe optional")
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2021-09-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · c324f023
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2021-09-07
      
      This series introduces some fixes to mlx5 driver.
      Please pull and let me know if there is any problem.
      
      Included here, a patch which solves a build warning reported on
      linux-kernel mailing list [1]:
      Fix commit ("net/mlx5: Bridge, fix uninitialized variable usage")
      
      I hope this series can make it to rc1.
      
      [1] https://www.spinics.net/lists/netdev/msg765481.html
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ibmvnic: check failover_pending in login response · d437f5aa
      Sukadev Bhattiprolu authored
      If a failover occurs before a login response is received, the login
      response buffer may be undefined. Check that there was no failover
      before accessing the login response buffer.
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mctp: perform route destruction under RCU read lock · 581edcd0
      Jeremy Kerr authored
      The kernel test robot reports:
      
        [  843.509974][  T345] =============================
        [  843.524220][  T345] WARNING: suspicious RCU usage
        [  843.538791][  T345] 5.14.0-rc2-00606-g889b7da2 #1 Not tainted
        [  843.553617][  T345] -----------------------------
        [  843.567412][  T345] net/mctp/route.c:310 RCU-list traversed in non-reader section!!
      
      - we're missing the rcu read lock acquire around the destruction path.
      
      This change adds the acquire/release - the path is already atomic, and
      we're using the _rcu list iterators.
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: don't duplicate ccid when cloning dccp sock · d9ea761f
      Lin, Zhenpeng authored
      Commit 2677d206 ("dccp: don't free ccid2_hc_tx_sock ...") fixed
      a UAF but reintroduced CVE-2017-6074.
      
      When the sock is cloned, two dccps_hc_tx_ccid pointers will reference
      the same ccid. So one can free the ccid object twice from two socks
      after cloning.
      
      This issue was also found by "Hadar Manor" and assigned
      CVE-2020-16119, which was fixed in Ubuntu's kernel. So here I port
      the patch from Ubuntu to fix it.
      
      The patch prevents cloned socks from referencing the same ccid.
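      
      An illustrative model of the aliasing problem and one way to avoid
      it. The demo_* types and functions are hypothetical, and clearing the
      pointer in the clone is an assumption about the general approach, not
      a copy of the patch.
      
      ```c
      #include <assert.h>
      #include <stdlib.h>
      
      struct demo_ccid { int state; };
      struct demo_sock { struct demo_ccid *tx_ccid; };
      
      /* Clone a sock without sharing the parent's ccid pointer. */
      static struct demo_sock demo_clone(const struct demo_sock *parent)
      {
      	struct demo_sock child;
      
      	assert(parent != NULL);
      	child = *parent;	/* shallow copy: pointer is now aliased */
      	child.tx_ccid = NULL;	/* break the alias before anyone frees it */
      	return child;
      }
      
      /* Release a sock's ccid; free(NULL) is a no-op. */
      static void demo_destroy(struct demo_sock *sk)
      {
      	free(sk->tx_ccid);
      	sk->tx_ccid = NULL;
      }
      ```
      
      Without the NULL assignment, destroying both parent and child would
      free the same object twice.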
      
      Fixes: 2677d206 ("dccp: don't free ccid2_hc_tx_sock ...")
      Signed-off-by: Zhenpeng Lin <zplin@psu.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>