1. 14 Aug, 2024 7 commits
  2. 13 Aug, 2024 33 commits
    • dt-bindings: net: fsl,qoriq-mc-dpmac: using unevaluatedProperties · be034ee6
      Frank Li authored
      Replace additionalProperties with unevaluatedProperties because the
      schema has allOf: $ref: ethernet-controller.yaml#.
      
      Remove all properties that are already defined in ethernet-controller.yaml.
      
      Fix the following CHECK_DTBS warnings:
      arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dtb:
         fsl-mc@80c000000: dpmacs:ethernet@11: 'fixed-link' does not match any of the regexes: 'pinctrl-[0-9]+'
              from schema $id: http://devicetree.org/schemas/misc/fsl,qoriq-mc.yaml#
      Signed-off-by: Frank Li <Frank.Li@nxp.com>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://patch.msgid.link/20240811184049.3759195-1-Frank.Li@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Documentation: networking: correct spelling · baae8b0b
      Jing-Ping Jan authored
      Correct spelling problems for Documentation/networking/ as reported
      by ispell.
      Signed-off-by: Jing-Ping Jan <zoo868e@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/20240812170910.5760-1-zoo868e@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: hinic: use ethtool_sprintf/puts · dd1bf9f9
      Rosen Penev authored
      Simpler and avoids manual pointer addition.
      Signed-off-by: Rosen Penev <rosenp@gmail.com>
      Link: https://patch.msgid.link/20240809044957.4534-1-rosenp@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-netconsole-fix-netconsole-unsafe-locking' · 2c9c2a3d
      Paolo Abeni authored
      Breno Leitao says:
      
      ====================
      net: netconsole: Fix netconsole unsafe locking
      
      Problem:
      =======
      
      The current locking mechanism in netconsole is unsafe and suboptimal due
      to the following issues:
      
      1) Lock Release and Reacquisition Mid-Loop:
      
      In netconsole_netdev_event(), the target_list_lock is released and
      reacquired within a loop, potentially causing collisions and cleaning up
      targets that are being enabled.
      
      	int netconsole_netdev_event()
      	{
      	...
      		spin_lock_irqsave(&target_list_lock, flags);
      		list_for_each_entry(nt, &target_list, list) {
      			spin_unlock_irqrestore(&target_list_lock, flags);
      			__netpoll_cleanup(&nt->np);
      			spin_lock_irqsave(&target_list_lock, flags);
      		}
      		spin_unlock_irqrestore(&target_list_lock, flags);
      	...
      	}
      
      2) Non-Atomic Cleanup Operations:
      
      In enabled_store(), the cleanup of structures is not atomic, risking
      cleanup of structures that are in the process of being enabled.
      
      	size_t enabled_store()
      	{
      	...
      		spin_lock_irqsave(&target_list_lock, flags);
      		nt->enabled = false;
      		spin_unlock_irqrestore(&target_list_lock, flags);
      		netpoll_cleanup(&nt->np);
      	...
      	}
      
      These issues stem from the following limitations in netconsole's locking
      design:
      
      1) write_{ext_}msg() functions:
      
      	a) Cannot sleep
      	b) Must iterate through targets and send messages to all enabled entries.
      	c) List iteration is protected by target_list_lock spinlock.
      
      2) Network event handling in netconsole_netdev_event():
      
      	a) Needs to sleep
      	b) Requires iteration over the target list (holding
      	   target_list_lock spinlock).
      	c) Some events necessitate netpoll struct cleanup, which *needs*
      	   to sleep.
      
      The target_list_lock needs to be used by non-sleepable functions while
      also protecting operations that may sleep, leading to the current unsafe
      design.
      
      Solution:
      ========
      
      1) Dual Locking Mechanism:
      	- Retain current target_list_lock for non-sleepable use cases.
      	- Introduce target_cleanup_list_lock (mutex) for sleepable
      	  operations.
      
      2) Deferred Cleanup:
      	- Implement atomic, deferred cleanup of structures using the new
      	  mutex (target_cleanup_list_lock).
      	- Avoid the `goto` in the middle of the list_for_each_entry
      
      3) Separate Cleanup List:
      	- Create target_cleanup_list for deferred cleanup, protected by
      	  target_cleanup_list_lock.
      	- This allows cleanup() to sleep without affecting message
      	  transmission.
      	- When iterating over targets, move devices needing cleanup to
      	  target_cleanup_list.
      	- Handle cleanup under the target_cleanup_list_lock mutex.
      
      4) Make a clear locking hierarchy
      
      	- The target_cleanup_list_lock takes precedence over target_list_lock.
      
      	- Major Workflow Locking Sequences:
      		a) Network Event Affecting Netpoll (netconsole_netdev_event):
      			rtnl -> target_cleanup_list_lock -> target_list_lock
      
      		b) Message Writing (write_msg()):
      			console_lock -> target_list_lock
      
      		c) Configfs Target Enable/Disable (enabled_store()):
      			dynamic_netconsole_mutex -> target_cleanup_list_lock -> target_list_lock
      
      This hierarchy ensures consistent lock acquisition order across
      different operations, preventing deadlocks and maintaining proper
      synchronization. The target_cleanup_list_lock's higher priority allows
      for safe deferred cleanup operations without interfering with regular
      message transmission protected by target_list_lock.  Each workflow
      follows a specific locking sequence, ensuring that operations like
      network event handling, message writing, and target management are
      properly synchronized and do not conflict with each other.
      
      Changelog:
      
      v3:
        * Move the netconsole_process_cleanups() function inside the
          CONFIG_NETCONSOLE_DYNAMIC block, avoiding Werror=unused-function
          (Jakub)
      
      v2:
        * The selftest has been removed from the patchset because veth is now
          IFF_DISABLE_NETPOLL. A new test will be sent separately.
        * https://lore.kernel.org/all/20240807091657.4191542-1-leitao@debian.org/
      
      v1:
        * https://lore.kernel.org/all/20240801161213.2707132-1-leitao@debian.org/
      ====================
      
      Link: https://patch.msgid.link/20240808122518.498166-1-leitao@debian.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: netconsole: Defer netpoll cleanup to avoid lock release during list traversal · 97714695
      Breno Leitao authored
      Current issue:
      - The `target_list_lock` spinlock is held while iterating over
        target_list entries.
      - Mid-loop, the lock is released to call __netpoll_cleanup(), then
        reacquired.
      - This practice compromises the protection provided by
        `target_list_lock`.
      
      Reason for current design:
      1. __netpoll_cleanup() may sleep, incompatible with holding a spinlock.
      2. target_list_lock must be a spinlock because write_msg() cannot sleep.
         (See commit b5427c27 ("[NET] netconsole: Support multiple logging
          targets"))
      
      Defer the cleanup of the netpoll structure until after the
      target_list_lock protected section. Create another list
      (target_cleanup_list) to hold the entries that need to be cleaned up,
      and clean them up while holding a mutex (target_cleanup_list_lock).
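
The deferred-cleanup idea described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the netconsole code: both locks are pthread mutexes (in the kernel, target_list_lock is a spinlock), and struct target, slow_cleanup(), add_target(), process_event() and the needs_cleanup flag are hypothetical names. Moving a cleaned entry back onto the main list is likewise an assumption of this sketch.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins: in the kernel, target_list_lock is a spinlock and
 * target_cleanup_list_lock is a mutex. Both are pthread mutexes here. */
static pthread_mutex_t target_list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t target_cleanup_list_lock = PTHREAD_MUTEX_INITIALIZER;

struct target {
	struct target *next;
	bool enabled;
	bool needs_cleanup;	/* hypothetical flag set by an event */
	bool cleaned;		/* set by the (possibly sleeping) cleanup */
};

static struct target *target_list;
static struct target *target_cleanup_list;

/* Stand-in for a cleanup routine that is allowed to sleep. */
static void slow_cleanup(struct target *t)
{
	t->cleaned = true;
}

static void add_target(struct target *t)
{
	pthread_mutex_lock(&target_list_lock);
	t->next = target_list;
	target_list = t;
	pthread_mutex_unlock(&target_list_lock);
}

/* Returns the number of targets cleaned up. */
static int process_event(void)
{
	struct target **pp, *t;
	int cleaned = 0;

	/* Lock order: cleanup-list lock first, then the list lock. */
	pthread_mutex_lock(&target_cleanup_list_lock);

	/* Phase 1: traverse without ever dropping target_list_lock;
	 * entries needing cleanup are only unlinked and queued. */
	pthread_mutex_lock(&target_list_lock);
	pp = &target_list;
	while ((t = *pp) != NULL) {
		if (t->needs_cleanup) {
			*pp = t->next;
			t->enabled = false;
			t->next = target_cleanup_list;
			target_cleanup_list = t;
		} else {
			pp = &t->next;
		}
	}
	pthread_mutex_unlock(&target_list_lock);

	/* Phase 2: the sleepable cleanup runs with only the mutex held. */
	while ((t = target_cleanup_list) != NULL) {
		target_cleanup_list = t->next;
		slow_cleanup(t);
		t->needs_cleanup = false;

		/* Put the (now disabled) entry back on the main list. */
		pthread_mutex_lock(&target_list_lock);
		t->next = target_list;
		target_list = t;
		pthread_mutex_unlock(&target_list_lock);
		cleaned++;
	}

	pthread_mutex_unlock(&target_cleanup_list_lock);
	return cleaned;
}
```

The point of the two phases is that the list traversal never releases its lock mid-loop, while the slow work happens where sleeping is permitted.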
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: netconsole: Unify Function Return Paths · f2ab4c1a
      Breno Leitao authored
      The return flow in netconsole's dynamic functions is currently
      inconsistent. This patch aims to streamline and standardize the process
      by ensuring that the mutex is unlocked before returning the ret value.
      
      Additionally, this update includes a minor functional change: certain
      strnlen() operations are now performed with dynamic_netconsole_mutex
      held. This is not expected to cause any issues, but it is documented
      here for clarity.
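
A unified return path of the kind described above might look like the following userspace sketch. It is not the netconsole code: example_enabled_store() and target_enabled are hypothetical, and a pthread mutex stands in for dynamic_netconsole_mutex. Every exit goes through one label that unlocks the mutex and returns ret, which is an ssize_t because it carries either a negative errno or a byte count.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the kernel mutex of the same name. */
static pthread_mutex_t dynamic_netconsole_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool target_enabled;

/* Hypothetical store-style handler: accepts "0" or "1" as first byte. */
static ssize_t example_enabled_store(const char *buf, size_t count)
{
	ssize_t ret;

	pthread_mutex_lock(&dynamic_netconsole_mutex);

	if (count == 0 || (buf[0] != '0' && buf[0] != '1')) {
		ret = -EINVAL;
		goto out_unlock;
	}

	target_enabled = (buf[0] == '1');
	ret = (ssize_t)count;	/* success: a non-error return value */

out_unlock:
	/* Single exit: the mutex is always released before returning. */
	pthread_mutex_unlock(&dynamic_netconsole_mutex);
	return ret;
}
```

Naming the variable ret rather than err fits this shape, since on success it holds the number of bytes consumed rather than an error code.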
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: netconsole: Standardize variable naming · 5c4a39e8
      Breno Leitao authored
      Update variable names from err to ret in cases where the variable may
      return non-error values.
      
      This change facilitates a forthcoming patch that relies on ret being
      used consistently to handle return values, regardless of whether they
      indicate an error or not.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: netconsole: Correct mismatched return types · e0a2b7e4
      Breno Leitao authored
      netconsole incorrectly mixes int and ssize_t types by using int for
      return variables in functions that should return ssize_t.
      
      This is fixed by updating the return variables to the appropriate
      ssize_t type, ensuring consistency across the function definitions.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: netpoll: extract core of netpoll_cleanup · 1ef33652
      Breno Leitao authored
      Extract the core part of netpoll_cleanup() so that it can be called by
      a caller that already holds the rtnl lock.
      
      Netconsole uses this in a weird way right now:
      
      	__netpoll_cleanup(&nt->np);
      	spin_lock_irqsave(&target_list_lock, flags);
      	netdev_put(nt->np.dev, &nt->np.dev_tracker);
      	nt->np.dev = NULL;
      	nt->enabled = false;
      
      This will be replaced by do_netpoll_cleanup() as the locking situation
      is overhauled.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Reviewed-by: Rik van Riel <riel@surriel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'stmmac-add-loongson-platform-support' · 2bbf1aed
      Paolo Abeni authored
      Yanteng Si says:
      
      ====================
      stmmac: Add Loongson platform support
      
      v17:
      * As Serge's comments:
          Add return 0 for _dt_config().
          Get back the conditional MSI-clear method execution.
      
      v16:
      * As Serge's comments:
         Move the of_node_put(plat->mdio_node) call to the DT-config/clear methods.
         Drop 'else if'.
      * Modify the commit message of 7/14. (LS2K CPU -> LS2K SOC)
      
      V15:
      * Drop return that will not be executed.
      * Move pdev from patch 12 to patch 13 to pass W=1 builds.
      
      RFC v15:
      * As Serge's comments:
         Extend the commit message.(patch 7 and patch 11)
         Add fixes tag for patch 8.
         Add loongson_dwmac_dt_clear() patch.
         Modify loongson_dwmac_msi_config().
         ...
      * Pick Huacai's Acked-by tag.
      * Pick Serge's Reviewed-by tag.
      * I have already contacted the author(ZhangQing) of the module,
        so I copied her valid email: diasyzhang@tencent.com.
      
      Note:
      I replied to the comments on v14 last Sunday, but all of Loongson's
      email servers failed to deliver. The network administrator told me
      today that he has fixed the problem and re-delivered all the failed
      emails, but I did not see them on the mailing list. I hope they will
      not suddenly appear in everyone's mailbox one day. I apologize for
      this. (The email content mainly agrees with Serge's suggestion.)
      
      v14:
      
      Because Loongson GMAC can be also found with the 8-channels AV feature
      enabled, we'll need to reconsider the patches logic and thus the
      commit logs too. As Serge's comments and Russell's comments:
      [PATCH net-next v14 01/15] net: stmmac: Move the atds flag to the stmmac_dma_cfg structure
      [PATCH net-next v14 02/15] net: stmmac: Add multi-channel support
      [PATCH net-next v14 03/15] net: stmmac: Export dwmac1000_dma_ops
      [PATCH net-next v14 04/15] net: stmmac: dwmac-loongson: Drop duplicated hash-based filter size init
      [PATCH net-next v14 05/15] net: stmmac: dwmac-loongson: Drop pci_enable/disable_msi calls
      [PATCH net-next v14 06/15] net: stmmac: dwmac-loongson: Use PCI_DEVICE_DATA() macro for device identification
      [PATCH net-next v14 07/15] net: stmmac: dwmac-loongson: Detach GMAC-specific platform data init
      +-> Init the plat_stmmacenet_data::{tx_queues_to_use,rx_queues_to_use}
          in the loongson_gmac_data() method.
      [PATCH net-next v14 08/15] net: stmmac: dwmac-loongson: Init ref and PTP clocks rate
      [PATCH net-next v14 09/15] net: stmmac: dwmac-loongson: Add phy_interface for Loongson GMAC
      [PATCH net-next v14 10/15] net: stmmac: dwmac-loongson: Introduce PCI device info data
      +-> Make sure the setup() method is called after the pci_enable_device()
          invocation.
      [PATCH net-next v14 11/15] net: stmmac: dwmac-loongson: Add DT-less GMAC PCI-device support
      +-> Introduce the loongson_dwmac_dt_config() method here instead of
          doing that in a separate patch.
      +-> Add loongson_dwmac_acpi_config() which would just get the IRQ from
          the pdev->irq field and make sure it is valid.
      [PATCH net-next v14 12/15] net: stmmac: Fixed failure to set network speed to 1000.
      +-> Drop the patch per Russell's comments. He also suggested a better
          fix, which I will send separately after this patch set is merged.
          See:
          <https://lore.kernel.org/netdev/ZoW1fNqV3PxEobFx@shell.armlinux.org.uk/>
      [PATCH net-next v14 13/15] net: stmmac: dwmac-loongson: Add Loongson Multi-channels GMAC support
      +-> This is former "net: stmmac: dwmac-loongson: Add Loongson GNET
          support" patch, but which adds the support of the Loongson GMAC with the
          8-channels AV-feature available.
      +-> loongson_dwmac_intx_config() shall be dropped due to the
          loongson_dwmac_acpi_config() method added in the PATCH 11/15.
      +-> Make sure loongson_data::loongson_id is initialized before the
          stmmac_pci_info::setup() is called.
      +-> Move the rx_queues_to_use/tx_queues_to_use and coe_unsupported
          fields initialization to the loongson_gmac_data() method.
      +-> As before, call the loongson_dwmac_msi_config() method if the multi-channels
          Loongson MAC has been detected.
      +-> Move everything GNET-specific to the next patch.
      [PATCH net-next v14 14/15] net: stmmac: dwmac-loongson: Add Loongson GNET support
      +-> Everything Loongson GNET-specific is supposed to be added in the
          framework of this patch:
          + PCI_DEVICE_ID_LOONGSON_GNET macro
          + loongson_gnet_fix_speed() method
          + loongson_gnet_data() method
          + loongson_gnet_pci_info data
          + The GNET-specific part of the loongson_dwmac_setup() method.
          + ...
      [PATCH net-next v14 15/15] net: stmmac: dwmac-loongson: Add loongson module author
      
      Others:
      Pick Serge's Reviewed-by tag.
      
      v13:
      
      * Sorry, we have clarified some things in the past 10 days. I did not
       give you a clear reply to the following questions in v12, so I need
       to reply again:
      
       1. The current LS2K2000 also has a GMAC (and two GNETs) that supports 8
          channels, so we have to reconsider the initialization of
          tx/rx_queues_to_use into probe();
      
       2. In v12, we disagreed on the loongson_dwmac_msi_config method, but I changed
          it based on Serge's comments(If I understand correctly):
      	if (dev_of_node(&pdev->dev)) {
      		ret = loongson_dwmac_dt_config(pdev, plat, &res);
      	}
      
      	if (ld->loongson_id == DWMAC_CORE_LS2K2000) {
      		ret = loongson_dwmac_msi_config(pdev, plat, &res);
      	} else {
      		ret = loongson_dwmac_intx_config(pdev, plat, &res);
      	}
      
       3. Our priv->dma_cap.pcs is false, so let's use PHY_INTERFACE_MODE_NA;
      
       4. Our GMAC does not support Delay, so let's use PHY_INTERFACE_MODE_RGMII_ID,
          the current dts is wrong, a fix patch will be sent to the LoongArch list
          later.
      
      Others:
      * Re-split a part of the patch (it seems we do this with every version);
      * Copied Serge's comments into the commit message of patch;
      * Fixed the stmmac_dma_operation_mode() method;
      * Changed some code comments.
      
      v12:
      * The biggest change is the re-splitting of patches.
      * Add a "gmac_version" in loongson_data, then we only
        read it once in the _probe().
      * Drop Serge's patch.
      * Rebase to the latest code state.
      * Fixed the gnet commit message.
      
      v11:
      * Break loongson_phylink_get_caps(), fix bad logic.
      * Remove an unnecessary ";".
      * Remove some unnecessary "{}".
      * add a blank.
      * Move the code of fix _force_1000 to patch 6/6.
      
      The main changes occur in these two functions:
      loongson_dwmac_probe();
      loongson_dwmac_setup();
      
      v10:
      As Andrew's comment:
      * Add a #define for the 0x37.
      * Add a #define for Port Select.
      
      others:
      * Pick Serge's patch, This patch resulted from the process
        of reviewing our patch set.
      * Based on Serge's patch, modify our loongson_phylink_get_caps().
      * Drop patch 3/6, we need mac_interface.
      * Adjusted the code layout of gnet patch.
      * Corrected several errata in commit message.
      * Move DISABLE_FORCE flag to loongson_gnet_data().
      
      v9:
      We have not provided a detailed list of equipment for a long time,
      and I apologize for this. During this period, I have collected some
      information and now present it to you, hoping to alleviate the pressure
      of review.
      
      1. IP core
      We now have two types of IP cores, one is 0x37, similar to dwmac1000;
      The other is 0x10.  Compared to 0x37, we split several DMA registers
      from one to two, and it is not worth adding a new entry for this.
      According to Serge's comment, we made these devices work by overwriting
      priv->synopsys_id = 0x37 and mac->dma = <LS_dma_ops>.
      
      1.1.  Some more detailed information
      The number of DMA channels for 0x37 is 1; the number of DMA channels
      for 0x10 is 8.  Except for channel 0, other channels do not support
      sending hardware checksums. Supported AV features are Qav, Qat, and Qas,
      and the rest are consistent with 3.73.
      
      2. DEVICE
      We have two types of devices,
      one is GMAC, which only has a MAC chip inside and needs an external PHY
      chip;
      the other is GNET, which integrates both MAC and PHY chips inside.
      
      2.1.  Some more detailed information
      GMAC device: LS7A1000, LS2K1000; these devices do not support any pause
                   mode.
      GNET device: LS7A2000, LS2K2000; the chip connection between the MAC and
                   PHY of these devices is not normal and requires two rounds of
                   negotiation; LS7A2000 does not support half-duplex and
                   multi-channel; to enable multi-channel on LS2K2000, you need
                   to turn off hardware checksum.
      **Note**: Only the LS2K2000's IP core is 0x10, while the IP cores of other
      devices are 0x37.
      
      3. TABLE
      
      device    type    pci_id    ip_core
      ls7a1000  gmac    7a03      0x35/0x37
      ls2k1000  gmac    7a03      0x35/0x37
      ls7a2000  gnet    7a13      0x37
      ls2k2000  gnet    7a13      0x10
      -----------------------------------------------
      Changes:
      
      * passed the CI
        <https://github.com/linux-netdev/nipa/blob/main/tests/patch/checkpatch/checkpatch.sh>
      * reverse xmas tree order.
      * Silence build warning.
      * Re-split the patch.
      * Add more detailed commit message.
      * Add more code comment.
      * Reduce modification of generic code.
      * using the GNET-specific prefix.
      * define a new macro for the GNET MAC.
      * Use an easier way to overwrite mac.
      * Removed some useless printk.
      
      v8:
      * The biggest change is according to Serge's comment in the previous
        edition:
         Seeing the patch in the current state would overcomplicate the generic
         code and the only functions you need to update are
         dwmac_dma_interrupt()
         dwmac1000_dma_init_channel()
         you can have these methods re-defined with all the Loongson GNET
         specifics in the low-level platform driver (dwmac-loongson.c). After
         that you can just override the mac_device_info.dma pointer with a
         fixed stmmac_dma_ops descriptor. Here is what should be done for that:
      
         1. Keep the Patch 4/9 with my comments fixed. First it will be partly
         useful for your GNET device. Second in general it's a correct
         implementation of the normal DW GMAC v3.x multi-channels feature and
         will be useful for the DW GMACs with that feature enabled.
      
         2. Create the Loongson GNET-specific
         stmmac_dma_ops.dma_interrupt()
         stmmac_dma_ops.init_chan()
         methods in the dwmac-loongson.c driver. Don't forget to move all the
         Loongson-specific macros from dwmac_dma.h to dwmac-loongson.c.
      
         3. Create a Loongson GNET-specific platform setup method with the next
         semantics:
            + allocate stmmac_dma_ops instance and initialize it with
              dwmac1000_dma_ops.
            + override the stmmac_dma_ops.{dma_interrupt, init_chan} with
              the pointers to the methods defined in 2.
            + allocate mac_device_info instance and initialize the
              mac_device_info.dma field with a pointer to the new
              stmmac_dma_ops instance.
            + call dwmac1000_setup() or initialize mac_device_info in a way
              it's done in dwmac1000_setup() (the latter might be better so you
              wouldn't need to export the dwmac1000_setup() function).
            + override stmmac_priv.synopsys_id with a correct value.
      
         4. Initialize plat_stmmacenet_data.setup() with the pointer to the
         method created in 3.
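
The copy-then-override approach outlined in steps 2 and 3 can be illustrated with a small standalone sketch. It makes no claim about the real stmmac structures: struct dma_ops is a reduced, hypothetical ops table, and all function names below are invented for the example. The idea shown is only that a private copy of a generic ops table has two callbacks replaced while the rest are inherited.

```c
/* Hypothetical, reduced stand-in for a DMA ops table. */
struct dma_ops {
	int (*dma_interrupt)(unsigned int chan);
	int (*init_chan)(unsigned int chan);
	int (*start_tx)(unsigned int chan);
};

/* Return codes identify which implementation actually ran. */
enum { GENERIC_IMPL = 0, VENDOR_IMPL = 1 };

static int generic_dma_interrupt(unsigned int chan) { (void)chan; return GENERIC_IMPL; }
static int generic_init_chan(unsigned int chan)     { (void)chan; return GENERIC_IMPL; }
static int generic_start_tx(unsigned int chan)      { (void)chan; return GENERIC_IMPL; }

/* Shared, read-only template (plays the role of a generic ops instance). */
static const struct dma_ops generic_dma_ops = {
	.dma_interrupt = generic_dma_interrupt,
	.init_chan     = generic_init_chan,
	.start_tx      = generic_start_tx,
};

/* Vendor-specific replacements for the two callbacks that differ. */
static int vendor_dma_interrupt(unsigned int chan) { (void)chan; return VENDOR_IMPL; }
static int vendor_init_chan(unsigned int chan)     { (void)chan; return VENDOR_IMPL; }

/* Platform setup: start from the generic table, override what differs. */
static void vendor_setup_dma_ops(struct dma_ops *ops)
{
	*ops = generic_dma_ops;			/* copy, never modify the template */
	ops->dma_interrupt = vendor_dma_interrupt;
	ops->init_chan = vendor_init_chan;
}
```

Because the template stays const and each device gets its own copy, the vendor quirks remain local to the platform driver and the generic code is left untouched.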
      
      * Others:
        Re-split the patch.
        Passed checkpatch.pl test.
      
      v7:
      * Refer to andrew's suggestion:
        - Add DMA_INTR_ENA_NIE_RX and DMA_INTR_ENA_NIE_TX #define's, etc.
      
      * Others:
        - Using --subject-prefix="PATCH net-next vN" to indicate that the
          patches are for the networking tree.
        - Rebase to the latest networking tree:
          <git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git>
      
      v6:
      
      * Refer to Serge's suggestion:
        - Add new platform feature flag:
          include/linux/stmmac.h:
          +#define STMMAC_FLAG_HAS_LGMAC			BIT(13)
      
        - Add the IRQs macros specific to the Loongson Multi-channels GMAC:
           drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h:
           +#define DMA_INTR_ENA_NIE_LOONGSON 0x00060000      /* ...*/
           #define DMA_INTR_ENA_NIE 0x00010000	/* Normal Summary */
           ...
      
        - Drop all of redundant changes that don't require the
          prototypes being converted to accepting the stmmac_priv
          pointer.
      
      * Refer to andrew's suggestion:
        - Drop white space changes.
        - break patch up into lots of smaller parts.
           Some small patches have been put into another series as a preparation
           see <https://lore.kernel.org/loongarch/cover.1702289232.git.siyanteng@loongson.cn/T/#t>
      
           *note* : This series of patches relies on the three small patches above.
      * others
        - Drop irq_flags changes.
        - Changed patch order.
      
      v4 -> v5:
      
      * Remove an ugly and useless patch (fix channel number).
      * Remove the non-standard dma64 driver code, and also remove
        the HWIF entries, since the associated custom callbacks no
        longer exist.
      * Refer to Serge's suggestion: Update the dwmac1000_dma.c to
        support the multi-DMA-channels controller setup.
      
      See:
      v4: <https://lore.kernel.org/loongarch/cover.1692696115.git.chenfeiyang@loongson.cn/>
      v3: <https://lore.kernel.org/loongarch/cover.1691047285.git.chenfeiyang@loongson.cn/>
      v2: <https://lore.kernel.org/loongarch/cover.1690439335.git.chenfeiyang@loongson.cn/>
      v1: <https://lore.kernel.org/loongarch/cover.1689215889.git.chenfeiyang@loongson.cn/>
      ====================
      
      Link: https://patch.msgid.link/cover.1723014611.git.siyanteng@loongson.cn
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: dwmac-loongson: Add loongson module author · 930df099
      Yanteng Si authored
      Add Yanteng Si as MODULE_AUTHOR of Loongson DWMAC PCI driver.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: dwmac-loongson: Add Loongson GNET support · 56dbe2c2
      Yanteng Si authored
      The new generation Loongson LS2K2000 SoC and LS7A2000 chipset are
      equipped with the network controllers called Loongson GNET. It's the
      single and multi DMA-channels Loongson GMAC but with a PHY attached.
      Here is the summary of the DW GMAC features the controller has:
      
         DW GMAC IP-core: v3.73a
         Speeds: 10/100/1000Mbps
         Duplex: Full (both versions), Half (LS2K2000 GNET only)
         DMA-descriptors type: enhanced
         L3/L4 filters availability: Y
         VLAN hash table filter: Y
         PHY-interface: GMII (PHY is integrated into the chips)
         Remote Wake-up support: Y
         Mac Management Counters (MMC): Y
         Number of additional MAC addresses: 5
         MAC Hash-based filter: Y
         Hash Table Size: 256
         AV feature: Y (LS2K2000 GNET only)
         DMA channels: 8 (LS2K2000 GNET), 1 (LS7A2000 GNET)
      
      Let's update the Loongson DWMAC driver to support the new Loongson
      GNET controller. The change is mostly trivial: the driver shall be
      bound to the PCIe device with DID 0x7a13, and the device-specific
      setup() method shall be called for it. The only peculiarity concerns
      the integrated PHY speed change procedure. The PHY has a weird problem
      with switching from the low speeds to 1000Mbps mode. The speedup
      procedure requires the PHY-link re-negotiation. So the suggested
      change provides the device-specific fix_mac_speed() method to overcome
      the problem.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: dwmac-loongson: Add Loongson Multi-channels GMAC support · 803fc61d
      Yanteng Si authored
      The Loongson DWMAC driver currently supports the Loongson GMAC
      devices (based on the DW GMAC v3.50a/v3.73a IP-core) installed to the
      LS2K1000 SoC and LS7A1000 chipset. But recently a new generation
      LS2K2000 SoC was released with the new version of the Loongson GMAC
      synthesized in. The new controller is based on the DW GMAC v3.73a
      IP-core with the AV-feature enabled, which implies the multi
      DMA-channels support. The multi DMA-channels feature has the following
      vendor-specific peculiarities:
      
      1. Split up Tx and Rx DMA IRQ status/mask bits:
             Name              Tx          Rx
        DMA_INTR_ENA_NIE = 0x00040000 | 0x00020000;
        DMA_INTR_ENA_AIE = 0x00010000 | 0x00008000;
        DMA_STATUS_NIS   = 0x00040000 | 0x00020000;
        DMA_STATUS_AIS   = 0x00010000 | 0x00008000;
        DMA_STATUS_FBI   = 0x00002000 | 0x00001000;
      2. Custom Synopsys ID hardwired into the GMAC_VERSION.SNPSVER register
      field. It's 0x10 while it should have been 0x37 in accordance with
      the actual DW GMAC IP-core version.
      3. There are eight DMA-channels available, whereas the Synopsys DW
      GMAC IP-core supports up to three DMA-channels.
      4. It's possible to have each DMA-channel IRQ independently delivered.
      The MSI IRQs must be utilized for that.
      
      Thus in order to have the multi-channels Loongson GMAC controllers
      supported let's modify the Loongson DWMAC driver in accordance with
      all the peculiarities described above:
      
      1. Create the multi-channels Loongson GMAC-specific
         stmmac_dma_ops::dma_interrupt()
         stmmac_dma_ops::init_chan()
         callbacks due to the non-standard DMA IRQ CSR flags layout.
      2. Create the Loongson DWMAC-specific platform setup() method
      which gets to initialize the DMA-ops with the dwmac1000_dma_ops
      instance and overrides the callbacks described in 1. The method also
      overrides the custom Synopsys ID with the real one in order to have
      the rest of the HW-specific callbacks correctly detected by the driver
      core.
      3. Make sure the platform setup() method enables the flow control and
      duplex modes supported by the controller.
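
The split Tx/Rx interrupt bits from item 1 of the peculiarities list can be written out as a small standalone sketch. The bit values come from the table in this commit message; the LS_ macro names, the event enum and both helper functions are hypothetical and exist only for this illustration, not in the driver.

```c
#include <stdint.h>

/* Bit values are copied from the table in the commit message.
 * The LS_ macro names are hypothetical, chosen for this sketch. */
#define LS_DMA_INTR_ENA_NIE_TX	0x00040000u
#define LS_DMA_INTR_ENA_NIE_RX	0x00020000u
#define LS_DMA_INTR_ENA_AIE_TX	0x00010000u
#define LS_DMA_INTR_ENA_AIE_RX	0x00008000u
#define LS_DMA_STATUS_NIS_TX	0x00040000u
#define LS_DMA_STATUS_NIS_RX	0x00020000u
#define LS_DMA_STATUS_AIS_TX	0x00010000u
#define LS_DMA_STATUS_AIS_RX	0x00008000u
#define LS_DMA_STATUS_FBI_TX	0x00002000u
#define LS_DMA_STATUS_FBI_RX	0x00001000u

/* Hypothetical event flags returned by the decoder below. */
enum ls_dma_event {
	LS_EV_TX_NORMAL   = 1 << 0,
	LS_EV_RX_NORMAL   = 1 << 1,
	LS_EV_TX_ABNORMAL = 1 << 2,
	LS_EV_RX_ABNORMAL = 1 << 3,
	LS_EV_TX_FATAL    = 1 << 4,
	LS_EV_RX_FATAL    = 1 << 5,
};

/* Enable mask covering normal and abnormal summaries in both directions. */
static uint32_t ls_dma_intr_enable_mask(void)
{
	return LS_DMA_INTR_ENA_NIE_TX | LS_DMA_INTR_ENA_NIE_RX |
	       LS_DMA_INTR_ENA_AIE_TX | LS_DMA_INTR_ENA_AIE_RX;
}

/* Translate a raw status word into per-direction event flags. */
static unsigned int ls_dma_decode_status(uint32_t status)
{
	unsigned int ev = 0;

	if (status & LS_DMA_STATUS_NIS_TX)
		ev |= LS_EV_TX_NORMAL;
	if (status & LS_DMA_STATUS_NIS_RX)
		ev |= LS_EV_RX_NORMAL;
	if (status & LS_DMA_STATUS_AIS_TX)
		ev |= LS_EV_TX_ABNORMAL;
	if (status & LS_DMA_STATUS_AIS_RX)
		ev |= LS_EV_RX_ABNORMAL;
	if (status & LS_DMA_STATUS_FBI_TX)
		ev |= LS_EV_TX_FATAL;
	if (status & LS_DMA_STATUS_FBI_RX)
		ev |= LS_EV_RX_FATAL;
	return ev;
}
```

Since each summary is reported separately per direction, a handler has to test the Tx and Rx bits independently instead of checking one combined summary bit.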
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: stmmac: dwmac-loongson: Add DT-less GMAC PCI-device support · 126f4f96
      Yanteng Si authored
      The Loongson GMAC driver currently supports the network controllers
      installed on the LS2K1000 SoC and LS7A1000 chipset, for which the GMAC
      devices are required to be defined in the platform device tree source.
      But Loongson machines may have UEFI (implies ACPI) or PMON/UBOOT
      (implies FDT) as the system bootloaders. In order to have both system
      configurations supported, let's extend the driver functionality with
      the case of having the Loongson GMAC probed on the PCI bus with no
      device tree node defined for it. That requires making the device
      DT-node optional, relying on the IRQ line detected by the PCI core and
      having the MDIO bus ID calculated using the PCIe Domain+BDF numbers.
      
      In order to make the device probe() and remove() methods less
      complicated, let's move the DT- and ACPI-specific code to the
      respective sub-functions.
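The idea of deriving a unique bus ID from the PCIe Domain+BDF numbers can be sketched as follows. This is only an illustration, not the driver's code: the helper names `sketch_pci_devfn()` and `sketch_mdio_bus_id()` and the choice to place the domain in the upper 16 bits are assumptions. Only the devfn layout (5-bit device number, 3-bit function number) and the bus/devfn packing follow the standard PCI convention.

```c
#include <stdint.h>

/* Standard PCI encoding: 5-bit device (slot) number, 3-bit function number. */
static inline uint8_t sketch_pci_devfn(uint8_t slot, uint8_t func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/*
 * Hypothetical helper: pack the PCI domain and the bus/device/function
 * triple into a single integer. Placing the domain above the 16-bit BDF
 * is an assumption made for this sketch.
 */
static inline uint32_t sketch_mdio_bus_id(uint16_t domain, uint8_t bus,
					  uint8_t slot, uint8_t func)
{
	uint32_t bdf = ((uint32_t)bus << 8) | sketch_pci_devfn(slot, func);

	return ((uint32_t)domain << 16) | bdf;
}
```

Two devices that differ in any of domain, bus, slot or function get different IDs under this packing, which is the property an MDIO bus ID needs when no device tree alias is available.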
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      126f4f96
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Introduce PCI device info data · 0ec04d32
      Yanteng Si authored
      The Loongson GNET device support is about to be added in one of the
      next commits. As another preparation for that introduce the PCI device
      info data with a setup() callback performing the device-specific
      platform data initializations. Currently it is utilized for the
      already supported Loongson GMAC device only.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      0ec04d32
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Add phy_interface for Loongson GMAC · 849dc734
      Yanteng Si authored
      PHY-interface of the Loongson GMAC device is RGMII with no internal
      delays added to the data lines signal. So to comply with that let's
      pre-initialize the platform-data field with the respective enum
      constant.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      849dc734
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Init ref and PTP clocks rate · c70f3163
      Yanteng Si authored
      The reference and PTP clock rate of the Loongson GMAC devices is 125MHz.
      (The same holds for the GNET devices, support for which is about to be
      added.) Set the respective plat_stmmacenet_data field up accordingly
      so that the coalesce command and timestamping work correctly.
      
      Fixes: 30bba69d ("stmmac: pci: Add dwmac support for Loongson")
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c70f3163
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Detach GMAC-specific platform data init · 79afc700
      Yanteng Si authored
      Loongson delivers two types of the network devices: Loongson GMAC and
      Loongson GNET in the framework of four SOC/Chipsets revisions:
      
      Chip               Network   PCI Dev ID   Synopsys Version   DMA-channels
      LS2K1000 SOC       GMAC      0x7a03       v3.50a/v3.73a      1
      LS7A1000 Chipset   GMAC      0x7a03       v3.50a/v3.73a      1
      LS2K2000 SOC       GMAC      0x7a03       v3.73a             8
      LS2K2000 SOC       GNET      0x7a13       v3.73a             8
      LS7A2000 Chipset   GNET      0x7a13       v3.73a             1
      
      The driver currently supports the chips with the Loongson GMAC network
      device synthesized with a single DMA-channel available. As a
      preparation before adding the Loongson GNET support detach the
      Loongson GMAC-specific platform data initializations to the
      loongson_gmac_data() method and preserve the common settings in the
      loongson_default_data().
      
      While at it drop the return value statement from the
      loongson_default_data() method as redundant.
      
      Note there is no intermediate vendor-specific PCS in between the MAC
      and PHY on Loongson GMAC and GNET. So the plat->mac_interface field
      can be freely initialized with the PHY_INTERFACE_MODE_NA value.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      79afc700
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Use PCI_DEVICE_DATA() macro for device identification · 324d96b4
      Yanteng Si authored
      For the sake of readability convert the hard-coded Loongson GMAC PCI ID
      to the respective macro and use the PCI_DEVICE_DATA() macro-function to
      create the pci_device_id array entry. The latter change will be
      specifically useful for assigning the device-specific data for the
      currently supported device and for the soon-to-be-added Loongson GNET
      controller.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      324d96b4
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Drop pci_enable/disable_msi calls · 0c979e6b
      Yanteng Si authored
      The Loongson GMAC driver currently doesn't utilize the MSI IRQs, but
      retrieves the IRQs specified in the device DT-node. Let's drop the
      direct pci_enable_msi()/pci_disable_msi() calls then as redundant.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      0c979e6b
    • Yanteng Si's avatar
      net: stmmac: dwmac-loongson: Drop duplicated hash-based filter size init · 393ea68b
      Yanteng Si authored
      The plat_stmmacenet_data::multicast_filter_bins field is initialized
      twice in the loongson_default_data() method. Drop the redundant
      initialization, but for the sake of readability keep the filters init
      statements defined in the same place of the method.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      393ea68b
    • Yanteng Si's avatar
      net: stmmac: Export dwmac1000_dma_ops · 005c0f07
      Yanteng Si authored
      Export the DW GMAC DMA-ops descriptor so that it is available in
      the low-level platform drivers. It will be utilized to override some
      callbacks in order to handle the LS2K2000 GNET device specifics. The
      GNET controller support is being added in one of the follow-up
      commits.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      005c0f07
    • Yanteng Si's avatar
      net: stmmac: Add multi-channel support · ad72f783
      Yanteng Si authored
      DW GMAC v3.73 can be equipped with the Audio Video (AV) feature which
      enables transmission of time-sensitive traffic over bridged local area
      networks (DWC Ethernet QoS Product). In that case there can be up to two
      additional DMA-channels available with no Tx COE support (unless there are
      vendor-specific IP-core alterations). Each channel is implemented as a
      separate Control and Status register (CSR) for managing the transmit and
      receive functions, descriptor handling, and interrupt handling.
      
      Add support for the multi-channel DW GMAC controllers just by making sure
      the already implemented DMA-configs are performed on a per-channel basis.
      
      Note the only currently known instance of the multi-channel DW GMAC
      IP-core is the LS2K2000 GNET controller, which has been released with the
      vendor-specific feature extension of having eight DMA-channels. The device
      support will be added in one of the follow-up commits.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      ad72f783
    • Yanteng Si's avatar
      net: stmmac: Move the atds flag to the stmmac_dma_cfg structure · 12dbc67c
      Yanteng Si authored
      ATDS (Alternate Descriptor Size) is a part of the DMA Bus Mode configs
      (together with PBL, ALL, EME, etc) of the DW GMAC controllers. Seeing
      it's not changed at runtime but is activated as long as the IP-core
      has it supported (at least due to the Type 2 Full Checksum Offload
      Engine feature), move the respective parameter from the
      stmmac_dma_ops::init() callback argument to the stmmac_dma_cfg
      structure, which already has the rest of the DMA-related configs
      defined.
      
      Besides, the DW GMAC multi-channel support being added in the next
      commit will require adding the stmmac_dma_ops::init_chan() callback
      and having the ATDS flag set/cleared for each channel in there. Having
      the atds-flag in the stmmac_dma_cfg structure will make the parameter
      accessible from the stmmac_dma_ops::init_chan() callback too.
      Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
      Signed-off-by: Yinggang Gu <guyinggang@loongson.cn>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Acked-by: Huacai Chen <chenhuacai@loongson.cn>
      Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      12dbc67c
    • Gustavo A. R. Silva's avatar
      net/smc: Use static_assert() to check struct sizes · 0a3e6939
      Gustavo A. R. Silva authored
      Commit 9748dbc9 ("net/smc: Avoid -Wflex-array-member-not-at-end
      warnings") introduced tagged `struct smc_clc_v2_extension_fixed` and
      `struct smc_clc_smcd_v2_extension_fixed`. We want to ensure that when
      new members need to be added to the flexible structures, they are
      always included within these tagged structs.
      
      So, we use `static_assert()` to ensure that the memory layout for
      both the flexible structure and the tagged struct is the same after
      any changes.
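A minimal sketch of this kind of layout check, in plain C11. The structure names `msg_hdr_fixed` and `msg_hdr` are hypothetical stand-ins, not the SMC structures, and the fixed members are repeated by hand here, whereas the kernel code groups them into a tagged struct inside the flexible structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size part, standing in for the tagged struct. */
struct msg_hdr_fixed {
	uint32_t type;
	uint32_t length;
};

/* Hypothetical flexible structure: the same fixed members plus a
 * trailing flexible array member. */
struct msg_hdr {
	uint32_t type;
	uint32_t length;
	uint8_t  data[];
};

/*
 * Compile-time check: the flexible array must start exactly where the
 * fixed part ends. Adding a member to struct msg_hdr without adding it
 * to struct msg_hdr_fixed makes the build fail here.
 */
static_assert(offsetof(struct msg_hdr, data) == sizeof(struct msg_hdr_fixed),
	      "member added outside of the fixed part");
```

The check costs nothing at run time, and a layout mismatch shows up as a build error rather than as silent memory corruption.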
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Jan Karcher <jaka@linux.ibm.com>
      Link: https://patch.msgid.link/ZrVBuiqFHAORpFxE@cute
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0a3e6939
    • Gustavo A. R. Silva's avatar
      nfp: Use static_assert() to check struct sizes · 46dd90fe
      Gustavo A. R. Silva authored
      Commit d88cabfd ("nfp: Avoid -Wflex-array-member-not-at-end
      warnings") introduced tagged `struct nfp_dump_tl_hdr`. We want
      to ensure that when new members need to be added to the flexible
      structure, they are always included within this tagged struct.
      
      So, we use `static_assert()` to ensure that the memory layout for
      both the flexible structure and the tagged struct is the same after
      any changes.
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://patch.msgid.link/ZrVB43Hen0H5WQFP@cute
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      46dd90fe
    • Gustavo A. R. Silva's avatar
      sched: act_ct: avoid -Wflex-array-member-not-at-end warning · e2d0fadd
      Gustavo A. R. Silva authored
      -Wflex-array-member-not-at-end was introduced in GCC-14, and we are
      getting ready to enable it, globally.
      
      Remove unnecessary flex-array member `pad[]` and refactor the related
      code a bit.
      
      Fix the following warning:
      net/sched/act_ct.c:57:29: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://patch.msgid.link/ZrY0JMVsImbDbx6r@cute
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e2d0fadd
    • Jakub Kicinski's avatar
      Merge branch 'net-nexthop-increase-weight-to-u16' · e96f6fd3
      Jakub Kicinski authored
      Petr Machata says:
      
      ====================
      net: nexthop: Increase weight to u16
      
      In CLOS networks, as link failures occur at various points in the network,
      ECMP weights of the involved nodes are adjusted to compensate. With high
      fan-out of the involved nodes, and overall high number of nodes,
      a (non-)ECMP weight ratio that we would like to configure does not fit into
      8 bits. Instead of, say, 255:254, we might like to configure something like
      1000:999. For these deployments, the 8-bit weight may not be enough.
      
      To that end, in this patchset increase the next hop weight from u8 to u16.
      
      Patch #1 adds a flag that indicates whether the reserved fields are zeroed.
      This is a follow-up to a new fix merged in commit 6d745cd0 ("net:
      nexthop: Initialize all fields in dumped nexthops"). The theory behind this
      patch is that there is a strict ordering between the fields actually being
      zeroed, the kernel declaring that they are, and the kernel repurposing the
      fields. Thus clients can use the flag to tell if it is safe to interpret
      the reserved fields in any way.
      
      Patch #2 contains the substantial code and the commit message covers the
      details of the changes.
      
      Patches #3 to #6 add selftests.
      ====================
      
      Link: https://patch.msgid.link/cover.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      e96f6fd3
    • Petr Machata's avatar
      selftests: fib_nexthops: Test 16-bit next hop weights · 4b808f44
      Petr Machata authored
      Add tests that attempt to create NH groups that use full 16 bits of NH
      weight.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://patch.msgid.link/101cdd3f2bfd9511c9bec95f909d20ff56f70ba5.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4b808f44
    • Petr Machata's avatar
      selftests: router_mpath_nh_res: Test 16-bit next hop weights · dce0765c
      Petr Machata authored
      Add tests that exercise full 16 bits of NH weight.
      
      Like in the previous patch, omit the 255:65535 test when KSFT_MACHINE_SLOW.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://patch.msgid.link/a91d6ead9d1b1b4b7e276ca58a71ef814f42b7dd.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      dce0765c
    • Petr Machata's avatar
      selftests: router_mpath_nh: Test 16-bit next hop weights · bb89fdac
      Petr Machata authored
      Add tests that exercise full 16 bits of NH weight.
      
      To test the 255:65535, it is necessary to run more packets than for the
      other tests. On a debug kernel, the test can take up to a minute, therefore
      avoid the test when KSFT_MACHINE_SLOW.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://patch.msgid.link/c0c257c00ad30b07afc3fa5e2afd135925405544.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bb89fdac
    • Petr Machata's avatar
      selftests: router_mpath: Sleep after MZ · 110d3ffe
      Petr Machata authored
      In the context of an offloaded datapath, it may take a while for the ip
      link stats to be updated. This causes the test to fail when MZ_DELAY is too
      low. Sleep after the packets are sent for the link stats to get up to date.
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://patch.msgid.link/8b1971d948273afd7de2da3d6a2ba35200540e55.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      110d3ffe
    • Petr Machata's avatar
      net: nexthop: Increase weight to u16 · b72a6a7a
      Petr Machata authored
      In CLOS networks, as link failures occur at various points in the network,
      ECMP weights of the involved nodes are adjusted to compensate. With high
      fan-out of the involved nodes, and overall high number of nodes,
      a (non-)ECMP weight ratio that we would like to configure does not fit into
      8 bits. Instead of, say, 255:254, we might like to configure something like
      1000:999. For these deployments, the 8-bit weight may not be enough.
      
      To that end, in this patch increase the next hop weight from u8 to u16.
      
      Increasing the width of an integral type can be tricky, because while the
      code still compiles, the types may not check out anymore, and numerical
      errors come up. To prevent this, the conversion was done in two steps.
      First the type was changed from u8 to a single-member structure, which
      invalidated all uses of the field. This allowed going through them one by
      one and auditing each for type correctness. Then the structure was replaced
      with a vanilla u16 again. This should ensure that no place was missed.
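The intermediate step of that two-step conversion can be sketched as below. All names (`nh_weight_sketch`, `nh_entry_sketch` and the two accessors) are hypothetical and chosen for illustration; the actual kernel structures are not reproduced here.

```c
#include <stdint.h>

/* Temporary wrapper: a one-member struct around the widened value. */
struct nh_weight_sketch {
	uint16_t v;
};

/* Hypothetical group entry; the field used to be a plain uint8_t. */
struct nh_entry_sketch {
	uint32_t id;
	struct nh_weight_sketch weight;
};

/*
 * While the wrapper is in place, statements such as
 *     uint8_t w = entry->weight;   (struct assigned to an integer)
 *     entry->weight = 1000;        (integer assigned to a struct)
 * no longer compile, so every use has to be visited and rewritten
 * through an audited accessor like the ones below.
 */
static inline uint16_t nh_weight_get(const struct nh_entry_sketch *e)
{
	return e->weight.v;
}

static inline void nh_weight_set(struct nh_entry_sketch *e, uint16_t w)
{
	e->weight.v = w;
}
```

Once every use site compiles again, the wrapper can be swapped back for a plain 16-bit integer, knowing that each access has been reviewed.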
      
      The UAPI for configuring nexthop group members is that an attribute
      NHA_GROUP carries an array of struct nexthop_grp entries:
      
      	struct nexthop_grp {
      		__u32	id;	  /* nexthop id - must exist */
      		__u8	weight;   /* weight of this nexthop */
      		__u8	resvd1;
      		__u16	resvd2;
      	};
      
      The field resvd1 is currently validated and required to be zero. We can
      lift this requirement and carry high-order bits of the weight in the
      reserved field:
      
      	struct nexthop_grp {
      		__u32	id;	  /* nexthop id - must exist */
      		__u8	weight;   /* weight of this nexthop */
      		__u8	weight_high;
      		__u16	resvd2;
      	};
      
      Keeping the fields split this way was chosen in case an existing userspace
      makes assumptions about the width of the weight field, and to sidestep any
      endianness issues.
      
      The weight field is currently encoded as the weight value minus one,
      because weight of 0 is invalid. This same trick is impossible for the new
      weight_high field, because zero must mean actual zero. With this in place:
      
      - Old userspace is guaranteed to carry weight_high of 0, therefore
        configuring 8-bit weights as appropriate. When dumping nexthops with
        16-bit weight, it would only show the lower 8 bits. But configuring such
        nexthops implies existence of userspace aware of the extension in the
        first place.
      
      - New userspace talking to an old kernel will work as long as it only
        attempts to configure 8-bit weights, where the high-order bits are zero.
        Old kernel will bounce attempts at configuring >8-bit weights.
      
      Renaming reserved fields as they are allocated for some purpose is commonly
      done in Linux. Whoever touches a reserved field is doing so at their own
      risk. nexthop_grp::resvd1 in particular is currently used by at least
      strace, however it carries its own copy of the UAPI headers, and the
      conversion should be trivial. A helper is provided for decoding the weight
      out of the two fields. Forcing a conversion seems preferable to bending
      over backwards and introducing anonymous unions or whatever.
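The split encoding described above can be sketched in userspace C as follows. The struct name `nexthop_grp_sketch` and both helper names are hypothetical, and the helper added by the patch may differ in name and return type. The sketch assumes that `weight_high` holds the upper 8 bits and `weight` the lower 8 bits of the value (weight - 1), which is how this message reads.

```c
#include <stdint.h>

/* Mirrors the extended layout quoted above, using fixed-width C types. */
struct nexthop_grp_sketch {
	uint32_t id;
	uint8_t  weight;       /* low 8 bits of (weight - 1) */
	uint8_t  weight_high;  /* high 8 bits; zero when set by old userspace */
	uint16_t resvd2;
};

/* Hypothetical encoder; assumes weight is at least 1. */
static inline void sketch_set_weight(struct nexthop_grp_sketch *e,
				     uint32_t weight)
{
	uint32_t raw = weight - 1;

	e->weight = (uint8_t)(raw & 0xff);
	e->weight_high = (uint8_t)((raw >> 8) & 0xff);
}

/* Hypothetical decoder: join the two fields and undo the minus-one. */
static inline uint32_t sketch_get_weight(const struct nexthop_grp_sketch *e)
{
	return (((uint32_t)e->weight_high << 8) | e->weight) + 1;
}
```

With this layout a weight of 1000 is stored as 999 (0x03e7), so `weight_high` is 0x03 and `weight` is 0xe7. An entry written by old userspace has `weight_high` equal to zero and decodes to the same 8-bit weight as before.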
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Link: https://patch.msgid.link/483e2fcf4beb0d9135d62e7d27b46fa2685479d4.1723036486.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b72a6a7a