1. 17 Apr, 2021 31 commits
    • r8169: keep pause settings on interface down/up cycle · 11ac4e66
      Heiner Kallweit authored
      Currently, if the user changes the pause settings, the default settings
      will be restored after an interface down/up cycle, and also when
      resuming from suspend. This doesn't seem to provide the best user
      experience. Change this to keep user settings, and just ensure that in
      jumbo mode pause is disabled.
      Small drawback: when switching the MTU back from jumbo to non-jumbo,
      pause remains disabled (but the user can enable it using ethtool).
      I think that's not too common a scenario, and acceptable.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11ac4e66
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8203c7ce
      Jakub Kicinski authored
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
       - keep the ZC code, drop the code related to reinit
      net/bridge/netfilter/ebtables.c
       - fix build after move to net_generic
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8203c7ce
    • Merge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 88a5af94
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes for 5.12-rc8, including fixes from netfilter, and
        bpf. BPF verifier changes stand out, otherwise things have slowed
        down.
      
        Current release - regressions:
      
         - gro: ensure frag0 meets IP header alignment
      
         - Revert "net: stmmac: re-init rx buffers when mac resume back"
      
         - ethernet: macb: fix the restore of cmp registers
      
        Previous releases - regressions:
      
         - ixgbe: Fix NULL pointer dereference in ethtool loopback test
      
         - ixgbe: fix unbalanced device enable/disable in suspend/resume
      
         - phy: marvell: fix detection of PHY on Topaz switches
      
         - make tcp_allowed_congestion_control readonly in non-init netns
      
         - xen-netback: Check for hotplug-status existence before watching
      
        Previous releases - always broken:
      
         - bpf: mitigate a speculative oob read of up to map value size by
           tightening the masking window
      
         - sctp: fix race condition in sctp_destroy_sock
      
         - sit, ip6_tunnel: Unregister catch-all devices
      
         - netfilter: nftables: clone set element expression template
      
         - netfilter: flowtable: fix NAT IPv6 offload mangling
      
         - net: geneve: check skb is large enough for IPv4/IPv6 header
      
         - netlink: don't call ->netlink_bind with table lock held"
      
      * tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
        netlink: don't call ->netlink_bind with table lock held
        MAINTAINERS: update my email
        bpf: Update selftests to reflect new error states
        bpf: Tighten speculative pointer arithmetic mask
        bpf: Move sanitize_val_alu out of op switch
        bpf: Refactor and streamline bounds check into helper
        bpf: Improve verifier error messages for users
        bpf: Rework ptr_limit into alu_limit and add common error path
        bpf: Ensure off_reg has no mixed signed bounds for all types
        bpf: Move off_reg into sanitize_ptr_alu
        bpf: Use correct permission flag for mixed signed bounds arithmetic
        ch_ktls: do not send snd_una update to TCB in middle
        ch_ktls: tcb close causes tls connection failure
        ch_ktls: fix device connection close
        ch_ktls: Fix kernel panic
        i40e: fix the panic when running bpf in xdpdrv mode
        net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
        net/mlx5e: Fix setting of RS FEC mode
        net/mlx5: Fix setting of devlink traps in switchdev mode
        Revert "net: stmmac: re-init rx buffers when mac resume back"
        ...
      88a5af94
    • Merge tag 'libnvdimm-fixes-for-5.12-rc8' of... · bdfd99e6
      Linus Torvalds authored
      Merge tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
      
      Pull libnvdimm fixes from Dan Williams:
       "The largest change is for a regression that landed during -rc1 for
        block-device read-only handling. Vaibhav found a new use for the
        ability (originally introduced by virtio_pmem) to call back to the
        platform to flush data, but also found an original bug in that
        implementation. Lastly, Arnd cleans up some compile warnings in dax.
      
        This has all appeared in -next with no reported issues.
      
        Summary:
      
         - Fix a regression of read-only handling in the pmem driver
      
         - Fix a compile warning
      
         - Fix support for platform cache flush commands on powerpc/papr"
      
      * tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC
        libnvdimm: Notify disk drivers to revalidate region read-only
        dax: avoid -Wempty-body warnings
      bdfd99e6
    • Merge tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · 7c226774
      Linus Torvalds authored
      Pull CXL memory class fixes from Dan Williams:
       "A collection of fixes for the CXL memory class driver introduced in
        this release cycle.
      
        The driver was primarily developed on a work-in-progress QEMU
        emulation of the interface and we have since found a couple places
        where it hid spec compliance bugs in the driver, or had a spec
        implementation bug itself.
      
        The biggest change here is replacing a percpu_ref with an rwsem to
        clean up a couple of bugs in the error unwind path during ioctl device
        init. Lastly there were some minor cleanups to not export the
        power-management sysfs-ABI for the ioctl device, use the proper sysfs
        helper for emitting values, and prevent subtle bugs as new
        administration commands are added to the supported list.
      
        The bulk of it has appeared in -next save for the top commit which was
        found today and validated on a fixed-up QEMU model.
      
        Summary:
      
         - Fix support for CXL memory devices with registers offset from the
           BAR base.
      
         - Fix the reporting of device capacity.
      
         - Fix the driver commands list definition to be disconnected from the
           UAPI command list.
      
         - Replace percpu_ref with rwsem to fix initialization error path.
      
         - Fix leaks in the driver initialization error path.
      
         - Drop the power/ directory from CXL device sysfs.
      
         - Use the recommended sysfs helper for attribute 'show'
           implementations"
      
      * tag 'cxl-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
        cxl/mem: Fix memory device capacity probing
        cxl/mem: Fix register block offset calculation
        cxl/mem: Force array size of mem_commands[] to CXL_MEM_COMMAND_ID_MAX
        cxl/mem: Disable cxl device power management
        cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures
        cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
        cxl/mem: Use sysfs_emit() for attribute show routines
      7c226774
    • Merge branch 'akpm' (patches from Andrew) · fdb5d6ca
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "12 patches.
      
        Subsystems affected by this patch series: mm (documentation, kasan,
        and pagemap), csky, ia64, gcov, and lib"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        lib: remove "expecting prototype" kernel-doc warnings
        gcov: clang: fix clang-11+ build
        mm: ptdump: fix build failure
        mm/mapping_dirty_helpers: guard hugepage pud's usage
        ia64: tools: remove duplicate definition of ia64_mf() on ia64
        ia64: tools: remove inclusion of ia64-specific version of errno.h header
        ia64: fix discontig.c section mismatches
        ia64: remove duplicate entries in generic_defconfig
        csky: change a Kconfig symbol name to fix e1000 build error
        kasan: remove redundant config option
        kasan: fix hwasan build for gcc
        mm: eliminate "expecting prototype" kernel-doc warnings
      fdb5d6ca
    • cxl/mem: Fix memory device capacity probing · fae8817a
      Dan Williams authored
      The CXL Identify Memory Device output payload emits capacity in 256MB
      units. The driver is treating the capacity field as bytes. This was
      missed because QEMU reports bytes when it should report bytes / 256MB.
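
As a rough illustration of the unit mismatch described above (not the driver's actual code), converting a capacity field expressed in 256MB multiples into bytes might look like the sketch below. The names `CAPACITY_MULTIPLIER` and `capacity_units_to_bytes()` are hypothetical, and 256MB is assumed to mean 256 MiB.

```c
#include <stdint.h>

/* Assumption: one capacity unit is 256 MiB (hypothetical constant name). */
#define CAPACITY_MULTIPLIER (256ULL * 1024 * 1024)

/* Convert a capacity reported in 256MB units into bytes. */
uint64_t capacity_units_to_bytes(uint64_t units)
{
	return units * CAPACITY_MULTIPLIER;
}
```

Under these assumptions, a field value of 4 stands for 1 GiB, whereas reading the field as bytes would yield a 4-byte device.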
      
      Fixes: 8adaf747 ("cxl/mem: Find device capabilities")
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Cc: Ben Widawsky <ben.widawsky@intel.com>
      Link: https://lore.kernel.org/r/161862021044.3259705.7008520073059739760.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      fae8817a
    • Merge branch 'mptcp-fixes-and-tracepoints' · 474f4593
      David S. Miller authored
      Mat Martineau says:
      
      ====================
      mptcp: Fixes and tracepoints from the mptcp tree
      
      Here's one more batch of changes that we've tested out in the MPTCP tree.
      
      Patch 1 makes the MPTCP KUnit config symbol more consistent with other
      subsystems.
      
      Patch 2 fixes a couple of format specifiers in pr_debug()s
      
      Patches 3-7 add four helpful tracepoints for MPTCP.
      
      Patch 8 is a one-line refactor to use an available helper macro.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      474f4593
    • mptcp: use mptcp_for_each_subflow in mptcp_close · 44227915
      Geliang Tang authored
      This patch used the macro helper mptcp_for_each_subflow() instead of
      list_for_each_entry() in mptcp_close.
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44227915
    • mptcp: add tracepoint in subflow_check_data_avail · d96a838a
      Geliang Tang authored
      This patch added a tracepoint in subflow_check_data_avail() to show the
      mapping status.
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d96a838a
    • mptcp: add tracepoint in ack_update_msk · ed66bfb4
      Geliang Tang authored
      This patch added a tracepoint in ack_update_msk() to track the
      incoming data_ack and window/snd_una updates.
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ed66bfb4
    • mptcp: add tracepoint in get_mapping_status · 0918e34b
      Geliang Tang authored
      This patch added a tracepoint in the mapping status function
      get_mapping_status() to dump every mpext field.
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0918e34b
    • mptcp: add tracepoint in mptcp_subflow_get_send · e10a9892
      Geliang Tang authored
      This patch added a tracepoint in the packet scheduler function
      mptcp_subflow_get_send().
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e10a9892
    • mptcp: export mptcp_subflow_active · 43f1140b
      Geliang Tang authored
      This patch moved the static function mptcp_subflow_active to protocol.h
      as an inline one.
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      43f1140b
    • mptcp: fix format specifiers for unsigned int · e4b61351
      Geliang Tang authored
      Some of the sequence numbers are printed as negative values in the debug
      log:
      
      [   46.250932] MPTCP: DSS
      [   46.250940] MPTCP: data_fin=0 dsn64=0 use_map=0 ack64=1 use_ack=1
      [   46.250948] MPTCP: data_ack=2344892449471675613
      [   46.251012] MPTCP: msk=000000006e157e3f status=10
      [   46.251023] MPTCP: msk=000000006e157e3f snd_data_fin_enable=0 pending=0 snd_nxt=2344892449471700189 write_seq=2344892449471700189
      [   46.251343] MPTCP: msk=00000000ec44a129 ssk=00000000f7abd481 sending dfrag at seq=-1658937016627538668 len=100 already sent=0
      [   46.251360] MPTCP: data_seq=16787807057082012948 subflow_seq=1 data_len=100 dsn64=1
      
      This patch used the format specifier %u instead of %d for the unsigned int
      values to fix it.
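
The effect can be reproduced outside the kernel. The sketch below is an illustration only, with a hypothetical helper name `format_seq()`; it uses the 64-bit specifiers `%lld` and `%llu` because the value taken from the log above is 64 bits wide, and it assumes a two's complement representation for the cast.

```c
#include <stdio.h>

/* Format the same 64-bit sequence number as signed and as unsigned. */
void format_seq(unsigned long long seq, char *as_signed, char *as_unsigned,
		size_t len)
{
	/* The cast models what a signed specifier makes of the bit
	 * pattern (assumes two's complement). */
	snprintf(as_signed, len, "%lld", (long long)seq);
	snprintf(as_unsigned, len, "%llu", seq);
}
```

Fed with 16787807057082012948, the signed rendering is -1658937016627538668, which matches the `seq=` and `data_seq=` pair in the log excerpt.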
      
      Fixes: d9ca1de8 ("mptcp: move page frag allocation in mptcp_sendmsg()")
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4b61351
    • kunit: mptcp: adhere to KUNIT formatting standard · 3fcc8a25
      Nico Pache authored
      Drop 'S' from end of CONFIG_MPTCP_KUNIT_TESTS in order to adhere to the
      KUNIT *_KUNIT_TEST config name format.
      
      Fixes: a00a5822 ("mptcp: move crypto test to KUNIT")
      Reviewed-by: David Gow <davidgow@google.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Nico Pache <npache@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3fcc8a25
    • Merge branch 'enetc-xdp-fixes' · 820dd7a2
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Fixups for XDP on NXP ENETC
      
      After some more XDP testing on the NXP LS1028A, this is a set of 10 bug
      fixes, simplifications and tweaks, ranging from addressing Toke's feedback
      (the network stack can run concurrently with XDP on the same TX rings)
      to fixing some OOM conditions seen under TX congestion.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      820dd7a2
    • net: enetc: apply the MDIO workaround for XDP_REDIRECT too · 24e39309
      Vladimir Oltean authored
      Described in fd5736bf ("enetc: Workaround for MDIO register access
      issue") is a workaround for a hardware bug that requires a register
      access of the MDIO controller to never happen concurrently with a
      register access of a port PF. To avoid that, a mutual exclusion scheme
      with rwlocks was implemented - the port PF accessors are the 'read'
      side, and the MDIO accessors are the 'write' side.
      
      When we do XDP_REDIRECT between two ENETC interfaces, all is fine
      because the MDIO lock is already taken from the NAPI poll loop.
      
      But when the ingress interface is not ENETC, just the egress is, the
      MDIO lock is not taken, so we might access the port PF registers
      concurrently with MDIO, which will make the link flap due to wrong
      values returned from the PHY.
      
      To avoid this, let's just slap an enetc_lock_mdio/enetc_unlock_mdio at
      the beginning and ending of enetc_xdp_xmit. The fact that the MDIO lock
      is designed as a rwlock is important here, because the read side is
      reentrant (that is one of the main reasons why we chose it). Usually,
      the way we benefit from its reentrancy is by running the data path
      concurrently on both CPUs, but in this case, we benefit from the
      reentrancy by taking the lock even when the lock is already taken
      (and that's the situation where ENETC is both the ingress and the egress
      interface for XDP_REDIRECT, which was fine before and still is fine now).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      24e39309
    • net: enetc: fix buffer leaks with XDP_TX enqueue rejections · 92ff9a6e
      Vladimir Oltean authored
      If the TX ring is congested, enetc_xdp_tx() returns false for the
      current XDP frame (represented as an array of software BDs).
      
      This array of software TX BDs is constructed in enetc_rx_swbd_to_xdp_tx_swbd
      from software BDs freshly cleaned from the RX ring. The issue is that we
      scrub the RX software BDs too soon, more precisely before we know that
      we can enqueue the TX BDs successfully into the TX ring.
      
      If we can't enqueue them (and enetc_xdp_tx returns false), we call
      enetc_xdp_drop which attempts to recycle the buffers held by the RX
      software BDs. But because we scrubbed those RX BDs already, two things
      happen:
      
      (a) we leak their memory
      (b) we populate the RX software BD ring with an all-zero rx_swbd
          structure, which makes the buffer refill path allocate more memory.
      
      enetc_refill_rx_ring
      -> if (unlikely(!rx_swbd->page))
         -> enetc_new_page
      
      That is a recipe for fast OOM.
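
The ordering problem can be modelled in a few lines. This is a toy sketch, not driver code: `struct swbd` and `rejected_tx_new_pages()` are hypothetical, and a page is reduced to a non-zero integer so that 0 plays the role of the all-zero rx_swbd.

```c
struct swbd {
	int page; /* 0 models "no buffer attached" */
};

/* Simulate one rejected XDP_TX enqueue followed by drop and refill.
 * Returns how many new pages the refill step had to allocate. */
int rejected_tx_new_pages(int scrub_before_enqueue)
{
	struct swbd rx = { .page = 1 };
	struct swbd tx = rx; /* RX BD converted into a TX BD */
	int new_pages = 0;

	if (scrub_before_enqueue)
		rx.page = 0; /* scrubbed too soon */

	/* The TX ring is congested: the enqueue is rejected and the
	 * TX BD is discarded. */
	(void)tx;

	/* The drop path recycles whatever the RX BD still holds; the
	 * refill path allocates when the slot is empty. */
	if (!rx.page) {
		rx.page = 2;
		new_pages++;
	}
	return new_pages;
}
```

In this model, scrubbing first costs one fresh allocation per rejected frame (and the original page is lost), while scrubbing only after a successful enqueue costs none.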
      
      Fixes: 7ed2bc80 ("net: enetc: add support for XDP_TX")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      92ff9a6e
    • net: enetc: handle the invalid XDP action the same way as XDP_DROP · 975acc83
      Vladimir Oltean authored
      When the XDP program returns an invalid action, we should free the RX
      buffer.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      975acc83
    • net: enetc: use dedicated TX rings for XDP · 7eab503b
      Vladimir Oltean authored
      It is possible for one CPU to perform TX hashing (see netdev_pick_tx)
      between the 8 ENETC TX rings, and the TX hashing to select TX queue 1.
      
      At the same time, it is possible for the other CPU to already use TX
      ring 1 for XDP (either XDP_TX or XDP_REDIRECT). Since there is no mutual
      exclusion between XDP and the network stack, we run into an issue
      because the ENETC TX procedure is not reentrant.
      
      The obvious approach would be to just make XDP take the lock of the
      network stack's TX queue corresponding to the ring it's about to enqueue
      in.
      
      For XDP_REDIRECT, this is quite straightforward, a lock at the beginning
      and end of enetc_xdp_xmit() should do the trick.
      
      But for XDP_TX, it's a bit more complicated. For one, we do TX batching
      all by ourselves for frames with the XDP_TX verdict. This is something
      we would like to keep the way it is, for performance reasons. But
      batching means that the network stack's lock should be kept from the
      first enqueued XDP_TX frame and until we ring the doorbell. That is
      mostly fine, except for cases when in the same NAPI loop we have mixed
      XDP_TX and XDP_REDIRECT frames. So if enetc_xdp_xmit() gets called while
      we are holding the lock from the RX NAPI, then bam, deadlock. The naive
      answer could be 'just flush the XDP_TX frames first, then release the
      network stack's TX queue lock, then call xdp_do_flush_map()'. But even
      xdp_do_redirect() is capable of flushing the batched XDP_REDIRECT
      frames, so unless we unlock/relock the TX queue around xdp_do_redirect(),
      there simply isn't any clean way to protect XDP_TX from concurrent
      network stack .ndo_start_xmit() on another CPU.
      
      So we need to take a different approach, and that is to reserve two
      rings for the sole use of XDP. We leave TX rings
      0..ndev->real_num_tx_queues-1 to be handled by the network stack, and we
      pick the XDP rings from the end of the priv->tx_ring array.
      
      We make an effort to keep the mapping done by enetc_alloc_msix() which
      decides which CPU handles the TX completions of which TX ring in its
      NAPI poll. So the XDP TX ring of CPU 0 is handled by TX ring 6, and the
      XDP TX ring of CPU 1 is handled by TX ring 7.
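
The ring split described above can be sketched as simple index arithmetic. The helper names below are hypothetical and the functions are not taken from the driver; they only restate the layout from this message (8 TX rings, the last two reserved for XDP, one per CPU).

```c
/* Hypothetical helper: index of the TX ring reserved for XDP on a CPU,
 * taking the reserved rings from the end of the ring array. */
int xdp_tx_ring_index(int num_tx_rings, int num_xdp_rings, int cpu)
{
	return num_tx_rings - num_xdp_rings + cpu;
}

/* Number of rings left to the network stack (rings 0..n-1). */
int stack_tx_ring_count(int num_tx_rings, int num_xdp_rings)
{
	return num_tx_rings - num_xdp_rings;
}
```

With 8 rings and 2 reserved, the stack keeps rings 0 to 5, CPU 0 gets ring 6 and CPU 1 gets ring 7, which agrees with the mapping stated above.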
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7eab503b
    • net: enetc: increase TX ring size · ee3e875f
      Vladimir Oltean authored
      Now that commit d6a2829e ("net: enetc: increase RX ring default
      size") has increased the RX ring size, it is quite easy to congest the
      TX rings when the traffic is predominantly XDP_TX, as the RX ring is
      quite a bit larger than the TX one.
      
      Since we bit the bullet and did the expensive thing already (larger RX
      rings consume more memory pages), it seems quite foolish to keep the TX
      rings small. So make them equally sized with RX.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee3e875f
    • net: enetc: remove unneeded xdp_do_flush_map() · a6369fe6
      Vladimir Oltean authored
      xdp_do_redirect already contains:
      -> dev_map_enqueue
         -> __xdp_enqueue
            -> bq_enqueue
               -> bq_xmit_all // if we have more than 16 frames
      
      So the logic from enetc will never be hit, because ENETC_DEFAULT_TX_WORK
      is 128.
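
To see why a 128-frame threshold is never reached, here is a toy model of a bulk queue that flushes itself at 16 frames. It is a simplified sketch under that assumption, not the kernel's bq_enqueue(); `struct bulk_queue`, `bq_enqueue_model()` and `max_pending_after()` are hypothetical names.

```c
#define BULK_SIZE 16 /* flush threshold named in the call chain above */

struct bulk_queue {
	int count;   /* frames currently batched */
	int flushes; /* how many times the batch was sent */
};

/* Toy enqueue: a full batch is flushed before the new frame is added. */
void bq_enqueue_model(struct bulk_queue *bq)
{
	if (bq->count == BULK_SIZE) {
		bq->flushes++;
		bq->count = 0;
	}
	bq->count++;
}

/* Largest batch ever pending while enqueueing n frames. */
int max_pending_after(int n)
{
	struct bulk_queue bq = { 0, 0 };
	int max = 0;

	for (int i = 0; i < n; i++) {
		bq_enqueue_model(&bq);
		if (bq.count > max)
			max = bq.count;
	}
	return max;
}
```

However many frames are enqueued, the pending batch in this model tops out at 16, so a caller-side check against 128 has nothing left to do.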
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a6369fe6
    • net: enetc: stop XDP NAPI processing when build_skb() fails · 8f50d8bb
      Vladimir Oltean authored
      When the code path below fails:
      
      enetc_clean_rx_ring_xdp // XDP_PASS
      -> enetc_build_skb
         -> enetc_map_rx_buff_to_skb
            -> build_skb
      
      enetc_clean_rx_ring_xdp will 'break', but that 'break' instruction isn't
      strong enough to actually break the NAPI poll loop, just the switch/case
      statement for XDP actions. So we increment rx_frm_cnt and go to the next
      frames minding our own business.
      
      Instead let's do what the skb NAPI poll function does, and break the
      loop now, waiting for the memory pressure to go away. Otherwise the next
      calls to build_skb() are likely to fail too.
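
The pitfall is plain C scoping of `break`. The two functions below are a stand-alone sketch with hypothetical names, not the driver's poll loop; a flag is used here to leave the loop, which is only one of several ways to do it.

```c
enum verdict { VERDICT_PASS, VERDICT_DROP };

/* 'fail_at' is the index of the frame whose simulated allocation
 * fails; -1 means no failure. Returns the number of frames counted. */
int poll_break_in_switch(enum verdict v, int budget, int fail_at)
{
	int frames = 0;

	for (int i = 0; i < budget; i++) {
		switch (v) {
		case VERDICT_PASS:
			if (i == fail_at)
				break; /* leaves the switch only */
			break;
		default:
			break;
		}
		frames++; /* still counted, the loop carries on */
	}
	return frames;
}

int poll_break_loop(enum verdict v, int budget, int fail_at)
{
	int frames = 0;

	for (int i = 0; i < budget; i++) {
		int failed = 0;

		switch (v) {
		case VERDICT_PASS:
			if (i == fail_at)
				failed = 1;
			break;
		default:
			break;
		}
		if (failed)
			break; /* leaves the for loop */
		frames++;
	}
	return frames;
}
```

With a budget of 8 and a failure on frame 3, the first variant still counts all 8 frames, while the second stops after 3.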
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8f50d8bb
    • net: enetc: recycle buffers for frames with RX errors · 672f9a21
      Vladimir Oltean authored
      When receiving a frame with errors, currently we do nothing with it (we
      don't construct an skb or an xdp_buff), we just exit the NAPI poll loop.
      
      Let's put the buffer back into the RX ring (similar to XDP_DROP).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      672f9a21
    • net: enetc: rename the buffer reuse helpers · 6b04830d
      Vladimir Oltean authored
      enetc_put_xdp_buff has nothing to do with XDP, frankly, it is just a
      helper to populate the recycle end of the shadow RX BD ring
      (next_to_alloc) with a given buffer.
      
      On the other hand, enetc_put_rx_buff plays more tricks than its name
      would suggest.
      
      So let's rename enetc_put_rx_buff into enetc_flip_rx_buff to reflect the
      half-page buffer reuse tricks that it employs, and enetc_put_xdp_buff
      into enetc_put_rx_buff which suggests a more garden-variety operation.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b04830d
    • net: enetc: remove redundant clearing of skb/xdp_frame pointer in TX conf path · e9e49ae8
      Vladimir Oltean authored
      Later in enetc_clean_tx_ring we have:
      
      		/* Scrub the swbd here so we don't have to do that
      		 * when we reuse it during xmit
      		 */
      		memset(tx_swbd, 0, sizeof(*tx_swbd));
      
      So these assignments are unnecessary.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9e49ae8
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · bc45f524
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      1GbE Intel Wired LAN Driver Updates 2021-04-16
      
      This series contains updates to igb and igc drivers.
      
      Ederson adjusts Tx buffer distributions in Qav mode to improve
      TSN-aware traffic for igb. He also enables PPS support and auxiliary PHC
      functions for igc.
      
      Grzegorz checks that the MTA register was properly written and
      retries if not for igb.
      
      Sasha adds reporting of EEE low power idle counters to ethtool and fixes
      a return value being overwritten through looping for igc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc45f524
    • flow_dissector: Fix out-of-bounds warning in __skb_flow_bpf_to_target() · 1e3d976d
      Gustavo A. R. Silva authored
      Fix the following out-of-bounds warning:
      
      net/core/flow_dissector.c:835:3: warning: 'memcpy' offset [33, 48] from the object at 'flow_keys' is out of the bounds of referenced subobject 'ipv6_src' with type '__u32[4]' {aka 'unsigned int[4]'} at offset 16 [-Warray-bounds]
      
      The problem is that the original code is trying to copy data into a
      couple of struct members adjacent to each other in a single call to
      memcpy().  So, the compiler legitimately complains about it. As these
      are just a couple of members, fix this by copying each one of them in
      separate calls to memcpy().
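
The pattern can be shown on a stand-in structure. `struct addr_pair` and `copy_addrs()` below are hypothetical and only mirror the shape of the problem (two adjacent array members); the flagged form would be a single memcpy() that starts at the first member and spans both.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a struct with two adjacent address members. */
struct addr_pair {
	uint32_t src[4];
	uint32_t dst[4];
};

/* Copy each member with its own memcpy() so that no call writes past
 * the bounds of the member it names. */
void copy_addrs(struct addr_pair *to, const struct addr_pair *from)
{
	memcpy(to->src, from->src, sizeof(to->src));
	memcpy(to->dst, from->dst, sizeof(to->dst));
}
```

The result is the same bytes in the destination, but each call now stays within a single subobject, which is what -Warray-bounds checks.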
      
      This helps with the ongoing efforts to globally enable -Warray-bounds
      and get us closer to being able to tighten the FORTIFY_SOURCE routines
      on memcpy().
      
      Link: https://github.com/KSPP/linux/issues/109
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1e3d976d
    • netlink: don't call ->netlink_bind with table lock held · f2764bd4
      Florian Westphal authored
      When I added support to allow generic netlink multicast groups to be
      restricted to subscribers with CAP_NET_ADMIN I was unaware that a
      genl_bind implementation already existed in the past.
      
      It was reverted due to ABBA deadlock:
      
      1. ->netlink_bind gets called with the table lock held.
      2. genetlink bind callback is invoked, it grabs the genl lock.
      
      But when a new genl subsystem is (un)registered, these two locks are
      taken in reverse order.
      
      One solution would be to revert again and add a comment in genl
      referring to 1e82a62f ("genetlink: remove genl_bind").
      
      This would need a second change in mptcp to not expose the raw token
      value anymore, e.g.  by hashing the token with a secret key so userspace
      can still associate subflow events with the correct mptcp connection.
      
      However, Paolo Abeni reminded me to double-check why the netlink table is
      locked in the first place.
      
      I can't find one.  netlink_bind() is already called without this lock
      when userspace joins a group via NETLINK_ADD_MEMBERSHIP setsockopt.
      Same holds for the netlink_unbind operation.
      
      Digging through the history, commit f7736080
      ("netlink: access nlk groups safely in netlink bind and getname")
      expanded the lock scope.
      
      commit 3a20773b ("net: netlink: cap max groups which will be considered in netlink_bind()")
      ... removed the nlk->ngroups access that the lock scope
      extension was all about.
      
      Reduce the lock scope again and always call ->netlink_bind without
      the table lock.
      
      The Fixes tag should be vs. the patch mentioned in the link below,
      but that one got squash-merged into the patch that came earlier in the
      series.
      
      Fixes: 4d54cc32 ("mptcp: avoid lock_fast usage in accept path")
      Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/T/#u
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Cc: Sean Tranchetti <stranche@codeaurora.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f2764bd4
    • Merge branch 'ethtool-stats' · 1c86514d
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      ethtool: add uAPI for reading standard stats
      
      Continuing the effort of providing a unified access method
      to standard stats, and explicitly tying the definitions to
      the standards, this series adds an API for general stats
      which do not fit into more targeted control APIs.
      
      There is nothing clever here, just a netlink API for dumping
      statistics defined by standards and RFCs which today end up
      in ethtool -S under infinite variations of names.
      
      This series adds basic IEEE stats (for PHY, MAC, Ctrl frames)
      and RMON stats. AFAICT other RFCs only duplicate the IEEE
      stats.
      
      This series does _not_ add a netlink API to read driver-defined
      stats. There seems to be little to gain from moving that part
      to netlink.
      
      The netlink message format is very simple, and aims to allow
      adding stats and groups with no changes to user tooling (which
      IIUC is expected for ethtool).
      
      On user space side we can re-use -S, and make it dump
      standard stats if --groups are defined.
      
      $ ethtool -S eth0 --groups eth-phy eth-mac eth-ctrl rmon
      Stats for eth0:
      eth-phy-SymbolErrorDuringCarrier: 0
      eth-mac-FramesTransmittedOK: 0
      eth-mac-FrameTooLongErrors: 0
      eth-ctrl-MACControlFramesTransmitted: 0
      eth-ctrl-MACControlFramesReceived: 1
      eth-ctrl-UnsupportedOpcodesReceived: 0
      rmon-etherStatsUndersizePkts: 0
      rmon-etherStatsJabbers: 0
      rmon-rx-etherStatsPkts64Octets: 1
      rmon-rx-etherStatsPkts128to255Octets: 0
      rmon-rx-etherStatsPkts1024toMaxOctets: 1
      rmon-tx-etherStatsPkts64Octets: 1
      rmon-tx-etherStatsPkts128to255Octets: 0
      rmon-tx-etherStatsPkts1024toMaxOctets: 1
      
      v1:
      
      Driver support for mlxsw, mlx5 and bnxt included.
      
      Compared to the RFC I went ahead with wrapping the stats into
      a 1:1 nest. Now IDs of stats can start from 0, at a cost of
      slightly "careful" u64 alignment handling.
      
      v2:
      
      Add missing kdoc in patch 5.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1c86514d
  2. 16 Apr, 2021 9 commits