1. 25 Aug, 2021 40 commits
    • octeontx2-af: Remove channel verification while installing MCAM rules · 18603683
      Sunil Goutham authored
      New use cases are emerging in which the user wants to install common MCAM
      filters for all interfaces. Channel verification would force such filters
      to be duplicated for each ingress interface. Hence remove channel
      verification.
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Add PTP device id for CN10K and 95O silicons · a8b90c9d
      Subbaraya Sundeep authored
      CN10K silicon has a different device ID for the PTP device.
      Hence this patch updates the driver with the new ID.
      Although the PTP driver is a separate driver, the AF manages
      configuration of the PTP block on behalf of all PFs. To manage
      PTP, the AF driver checks in its probe whether
      1. the PTP hardware device is found on the silicon
      2. a driver is bound to the PTP device
      3. the PTP driver probe is successful
      
      If case 1 or 3 fails, the AF proceeds without PTP; if case 2
      fails, it defers the probe. This patch also refactors the
      code to check, for case 1, all the PTP device IDs given in
      the PTP device ID table.
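The three probe checks above can be sketched as a small decision function. This is an illustrative model only, not the driver's code: the enum, the names `af_ptp_decide` and `ptp_id_in_table`, and any ID values passed in by a caller are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes of the AF probe with respect to PTP. */
enum ptp_probe_action {
	PTP_PROCEED_WITH,    /* all three checks passed */
	PTP_PROCEED_WITHOUT, /* case 1 or case 3 failed */
	PTP_DEFER_PROBE,     /* case 2 failed: no driver bound yet */
};

/* Case 1 helper: is this device ID listed in the PTP device ID table? */
bool ptp_id_in_table(uint16_t id, const uint16_t *table, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (table[i] == id)
			return true;
	return false;
}

/* Map the three checks from the commit message onto one decision. */
enum ptp_probe_action af_ptp_decide(bool dev_found, bool drv_bound,
				    bool drv_probe_ok)
{
	if (!dev_found)
		return PTP_PROCEED_WITHOUT; /* case 1 failed */
	if (!drv_bound)
		return PTP_DEFER_PROBE;     /* case 2 failed */
	if (!drv_probe_ok)
		return PTP_PROCEED_WITHOUT; /* case 3 failed */
	return PTP_PROCEED_WITH;
}
```

Only the missing-driver case defers, since that is the one condition that may resolve itself once the PTP driver loads.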
      
      Also added PTP device ID for 95O silicon
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Add free rsrc count mbox msg · 275e5d17
      George Cherian authored
      Upon receiving MBOX_MSG_FREE_RSRC_CNT, the AF determines the
      current number of free resources and reports it back to the requester.
      No guarantee is given about the future state of the free resources.
      If another requester sends MBOX_MSG_ATTACH_RESOURCES after this call,
      the number of available resources might change.
      Signed-off-by: George Cherian <george.cherian@marvell.com>
      Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Add SDP interface support · fe1939bb
      Radha Mohan Chintakuntla authored
      Added support for packet IO via SDP links, which are used when
      Octeon is connected as an end-point. Traffic from host to end-point
      and vice versa flows through SDP links. This patch also supports
      the dual SDP blocks present in 98xx silicon.
      Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
      Signed-off-by: Nalla Pradeep <pnalla@marvell.com>
      Signed-off-by: Subrahmanyam Nilla <snilla@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: nix and lbk in loop mode in 98xx · aefaa8c7
      Harman Kalra authored
      In 98xx, there are 2 NIX blocks and 4 LBK blocks present. The way
      these NIX-LBK should be configured depends on the use case. By
      default, loopback functionality is supported in AF VF pairs, which
      are attached to NIX0 and NIX1 LFs alternately to ensure load
      balancing. NIX0 transmits a packet to LBK1, which will be received
      by NIX1, and a packet transmitted by NIX1 will be received by NIX0 via
      LBK2.
      
      There are some requirements where only one AF VF is used and the respective
      NIX is expected to operate in a mode where it can receive its own packets
      back. This can be achieved if NIX0 sends packets to LBK0 and not LBK1.
      Add a flag to the LF alloc request mailbox which sets up NIX0 to use
      LBK0 and NIX1 to use LBK3.
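The NIX-to-LBK pairing described above can be summarised as a lookup. This is a minimal sketch under stated assumptions: the function name `nix_tx_lbk` and the boolean `self_loop` parameter are hypothetical stand-ins for the mailbox flag, and the block numbers are taken directly from the text.

```c
#include <stdbool.h>

/*
 * Return the LBK block that the given NIX block transmits to,
 * or -1 if the NIX index is not 0 or 1.
 *
 * Default mode:   NIX0 -> LBK1 (received by NIX1), NIX1 -> LBK2 (received by NIX0)
 * Self-loop mode: NIX0 -> LBK0, NIX1 -> LBK3 (each NIX receives its own packets)
 */
int nix_tx_lbk(int nix, bool self_loop)
{
	if (nix != 0 && nix != 1)
		return -1;
	if (self_loop)
		return nix == 0 ? 0 : 3;
	return nix == 0 ? 1 : 2;
}
```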
      Signed-off-by: Harman Kalra <hkalra@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: cleanup transmit link deriving logic · 039190bb
      Subbaraya Sundeep authored
      Unlike OcteonTx2, the channel numbers used by CGX/RPM
      and LBK on CN10K silicons aren't fixed in HW. They are
      SW programmable, hence we cannot derive transmit link
      from static channel numbers anymore. Get the same from
      admin function via mailbox.
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Allow to configure flow tag LSB byte as RSS adder · 72e192a1
      Jerin Jacob authored
      Before the C0 HW revision, the RSS adder was computed using the
      following static formula.
      
      rss_adder<7:0> = flow_tag<7:0> ^ flow_tag<15:8> ^
      flow_tag<23:16> ^ flow_tag<31:24>
      
      The above scheme has the following drawbacks:
      1) It is not in line with other standard NIC behavior.
      2) There can be a SW use case where SW computes the hash
      upfront using the Toeplitz function and predicts the queue selection
      to optimize some packet lookup function. The nonstandard
      XOR prevents the consumer from predicting the queue selection.
      
      From the C0 HW revision onwards, the HW can configure
      rss_adder<7:0> as flow_tag<7:0> to align with standard NICs.
      
      This patch adds an option to select legacy RSS adder mode
      vs standard NIC behavior by setting the NIX_LF_RSS_TAG_LSB_AS_ADDER flag.
      
      Since this bit field is reserved in old HW revisions,
      no additional HW version check is needed.
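The two adder modes can be compared with a short software model. This is only a sketch of the formulas quoted above, not hardware or driver code; the function names and the boolean parameter standing in for the NIX_LF_RSS_TAG_LSB_AS_ADDER flag are assumptions for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Legacy mode: XOR-fold the four bytes of the 32-bit flow tag. */
uint8_t rss_adder_legacy(uint32_t flow_tag)
{
	return (uint8_t)((flow_tag ^ (flow_tag >> 8) ^
			  (flow_tag >> 16) ^ (flow_tag >> 24)) & 0xff);
}

/* Standard NIC mode (C0 onwards): use flow_tag<7:0> directly. */
uint8_t rss_adder_lsb(uint32_t flow_tag)
{
	return (uint8_t)(flow_tag & 0xff);
}

/* Select the mode the way the flag would (hypothetical wrapper). */
uint8_t rss_adder(uint32_t flow_tag, bool tag_lsb_as_adder)
{
	return tag_lsb_as_adder ? rss_adder_lsb(flow_tag)
				: rss_adder_legacy(flow_tag);
}
```

For a flow tag of 0x12345678 the legacy fold gives 0x12 ^ 0x34 ^ 0x56 ^ 0x78 = 0x08, while the LSB mode gives 0x78, which is what software that precomputes the Toeplitz hash would expect.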
      Signed-off-by: Jerin Jacob <jerinj@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: enable tx shaping feature for 96xx C0 · d0641163
      Nithin Dabilpuram authored
      From 96xx C0 onwards, all silicons support traffic shaping.
      This patch enables that feature along with other changes:
      - When PIR/CIR shaping config is modified, toggle SW_XOFF
        for the config to take effect
      - Before SMQ flush, clear SW_XOFF at all parent schedulers
      - Support reading the current transmit scheduler configuration via mbox
      Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: actions: Add helper dependency on COMPILE_TEST · fbcf8a34
      Cai Huoqing authored
      It's helpful for compile testing on other platforms (e.g. x86).
      Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Wait for TX link idle for credits change · 1c74b891
      Nithin Dabilpuram authored
      NIX_AF_TX_LINKX_NORM_CREDIT holds a running counter of the
      tx credits available per link. But tx credits should be
      configured based on the MTU config, so an MTU change needs a
      tx credit count update.
      
      An issue exists whereby, when both PF & VF are enabled and
      PF traffic is flowing, if a VF requests an MTU update,
      updating the NORM_CREDIT register will lead to corruption
      of the credit count and subsequent deadlock of the tx link, as
      the NORM_CREDIT register holds a running count.
      
      This patch provides a workaround by pausing link traffic
      using NIX_AF_TL1X_SW_XOFF, then waiting for existing packets to
      drain and used credits to be returned before updating the
      credit count.
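The pause, drain, update sequence of the workaround can be modelled in plain C. This is a rough sketch under stated assumptions, not the driver's register access code: `struct tx_link_model`, its fields, the poll limit, and the choice to leave the credits untouched when the link fails to go idle are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical software model of one TX link. */
struct tx_link_model {
	bool sw_xoff;           /* traffic paused (stands in for SW_XOFF) */
	unsigned int in_flight; /* packets still holding credits */
	unsigned int credits;   /* configured credit count */
};

/*
 * Pause the link, wait for in-flight packets to drain, then write the
 * new credit count. Returns 0 on success, -1 if the link did not go
 * idle within max_polls polls (credits are then left unchanged).
 */
int tx_link_set_credits(struct tx_link_model *link,
			unsigned int new_credits, unsigned int max_polls)
{
	unsigned int polls;

	link->sw_xoff = true; /* stop new traffic entering the link */

	/* Model draining: one packet completes per poll. */
	for (polls = 0; link->in_flight > 0 && polls < max_polls; polls++)
		link->in_flight--;

	if (link->in_flight > 0) {
		link->sw_xoff = false;
		return -1; /* not idle: updating now could corrupt the count */
	}

	link->credits = new_credits; /* safe: no credits are outstanding */
	link->sw_xoff = false;       /* resume traffic */
	return 0;
}
```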
      Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Change the order of queue work and interrupt disable · 906999c9
      Nithin Dabilpuram authored
      Clear and disable the interrupt before queueing work. Otherwise
      the work may complete on another core so quickly that the
      interrupt enable done as part of the work happens before the
      interrupt disable in interrupt context. This leaves the
      interrupt permanently disabled.
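The race can be illustrated with a deterministic model of the worst case, in which the queued work finishes on another core before the handler executes its next statement. This is a sketch only: the struct and function names are hypothetical, and the "other core" is modelled by calling the work function synchronously at the point where it would be queued.

```c
#include <stdbool.h>

/* Hypothetical model of one interrupt line. */
struct irq_model {
	bool enabled;
};

/* Deferred work: handle the event, then re-enable the interrupt. */
static void work_fn(struct irq_model *irq)
{
	irq->enabled = true;
}

/* Old order: queue work first, disable afterwards. */
bool handler_queue_then_disable(void)
{
	struct irq_model irq = { .enabled = true };

	work_fn(&irq);       /* work is queued and completes immediately */
	irq.enabled = false; /* handler disables after the re-enable */
	return irq.enabled;  /* interrupt is left disabled */
}

/* New order: disable first, then queue work. */
bool handler_disable_then_queue(void)
{
	struct irq_model irq = { .enabled = true };

	irq.enabled = false; /* handler disables first */
	work_fn(&irq);       /* work re-enables when it finishes */
	return irq.enabled;  /* interrupt ends up enabled again */
}
```

With the new order, the enable in the work can only ever run after the disable, regardless of how quickly the work is scheduled.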
      Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: cn10k: Set cache lines for NPA batch alloc · ae2c341e
      Geetha sowjanya authored
      Set NPA batch allocation engine to process 35 cache lines
      per turn on CN10k platform.
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mctp: Remove the repeated declaration · 87e5ef4b
      Shaokun Zhang authored
      Function 'mctp_dev_get_rtnl' is declared twice, so remove the
      repeated declaration.
      
      Cc: Jeremy Kerr <jk@codeconstruct.com.au>
      Cc: Matt Johnston <matt@codeconstruct.com.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'linux-can-next-for-5.15-20210825' of... · 45bc6125
      David S. Miller authored
      Merge tag 'linux-can-next-for-5.15-20210825' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      Marc Kleine says:
      
      ====================
      pull-request: can-next 2021-08-25
      
      this is a pull request of 4 patches for net-next/master.
      
      The first patch is by Cai Huoqing, and enables COMPILE_TEST for the
      rcar CAN drivers.
      
      Lad Prabhakar contributes a patch for the rcar_canfd driver, fixing a
      redundant assignment.
      
      The last 2 patches are by Tang Bin, target the mscan driver, and clean
      up the driver by converting it to of_device_get_match_data() and
      removing a useless BUG_ON.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ravb-gbit-refactor' · b87a542c
      David S. Miller authored
      Biju Das says:
      
      ====================
      Add Factorisation code to support Gigabit Ethernet driver
      
      The DMAC and EMAC blocks of Gigabit Ethernet IP found on RZ/G2L SoC are
      similar to the R-Car Ethernet AVB IP.
      
      The Gigabit Ethernet IP consists of Ethernet controller (E-MAC), Internal
      TCP/IP Offload Engine (TOE)  and Dedicated Direct memory access controller
      (DMAC).
      
      With a few changes in the driver we can support both IPs.
      
      This patch series aims to add factorisation code to support RZ/G2L SoC,
      hardware feature bits for gPTP feature, Multiple irq feature and
      optional reset support.
      
      Ref:-
       * https://lore.kernel.org/linux-renesas-soc/TYCPR01MB59334319695607A2683C1A5E86E59@TYCPR01MB5933.jpnprd01.prod.outlook.com/T/#t
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Add reset support · 0d13a1a4
      Biju Das authored
      Reset support is present on R-Car. Let's support it, if it is
      available.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_emac_init function · 511d74d9
      Biju Das authored
      The E-MAC IP on the R-Car AVB module has different initialization
      parameters for RX frame size and duplex settings, a different offset
      for the transfer speed setting, and magic packet detection support
      compared to the E-MAC on the RZ/G2L Gigabit Ethernet module. Factorise
      the ravb_emac_init function to support the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_dmac_init function · eb4fd127
      Biju Das authored
      The DMAC IP on the R-Car AVB module has different initialization
      parameters for RCR, TGC, TCCR, RIC0, RIC2, and TIC compared to the
      DMAC IP on the RZ/G2L Gigabit Ethernet module. Factorise the
      ravb_dmac_init function to support the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_set_features · 80f35a0d
      Biju Das authored
      RZ/G2L supports HW checksum on RX and TX, whereas R-Car supports it
      only on RX. Factorise ravb_set_features to support this feature.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_adjust_link function · cb21104f
      Biju Das authored
      R-Car supports 100 and 1000 Mbps transfer speeds, whereas RZ/G2L
      additionally supports 10 Mbps. Factorise the ravb_adjust_link function
      in order to support the 10 Mbps speed.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_rx function · d5d95c11
      Biju Das authored
      R-Car uses an extended descriptor in RX, whereas RZ/G2L uses a
      normal descriptor in RX. Factorise the ravb_rx function to
      support the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_ring_init function · 7870a418
      Biju Das authored
      The ravb_ring_init function uses an extended descriptor in RX for
      R-Car and a normal descriptor for RZ/G2L. Add a helper function
      for RX ring buffer allocation to support the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_ring_format function · 1ae22c19
      Biju Das authored
      The ravb_ring_format function uses an extended descriptor in RX
      for R-Car compared to the normal descriptor for RZ/G2L. Factorise
      RX ring buffer buildup to extend support to the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Factorise ravb_ring_free function · bf46b757
      Biju Das authored
      R-Car uses an extended descriptor in RX, whereas RZ/G2L uses a normal
      descriptor. Factorise the ravb_ring_free function so that it can
      support the latter SoC.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Add ptp_cfg_active to struct ravb_hw_info · a69a3d09
      Biju Das authored
      There are some H/W differences for the gPTP feature between
      R-Car Gen3, R-Car Gen2, and RZ/G2L as below.
      
      1) On R-Car Gen3, gPTP support is active in config mode.
      2) On R-Car Gen2, gPTP support is not active in config mode.
      3) RZ/G2L does not support the gPTP feature.
      
      Add a ptp_cfg_active hw feature bit to struct ravb_hw_info for
      supporting gPTP active in config mode for R-Car Gen3.
      This patch also removes enum ravb_chip_id, chip_id from both
      struct ravb_hw_info and struct ravb_private, as it is unused.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Add no_ptp_cfg_active to struct ravb_hw_info · 8f27219a
      Biju Das authored
      There are some H/W differences for the gPTP feature between
      R-Car Gen3, R-Car Gen2, and RZ/G2L as below.
      
      1) On R-Car Gen2, gPTP support is not active in config mode.
      2) On R-Car Gen3, gPTP support is active in config mode.
      3) RZ/G2L does not support the gPTP feature.
      
      Add a no_ptp_cfg_active hw feature bit to struct ravb_hw_info for
      handling gPTP for R-Car Gen2.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Add multi_irq to struct ravb_hw_info · 6de19fa0
      Biju Das authored
      R-Car Gen3 supports separate interrupts for E-MAC and DMA queues,
      whereas R-Car Gen2 and RZ/G2L have a single interrupt instead.
      
      Add a multi_irq hw feature bit to struct ravb_hw_info to enable
      this only for R-Car Gen3.
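The per-SoC feature bits named in this series (multi_irq, ptp_cfg_active, no_ptp_cfg_active) can be pictured as one-bit fields in a per-SoC info table. The sketch below is an illustration, not the driver's definition: `struct hw_info_sketch`, the three table entries, and the helper `sketch_has_gptp` are hypothetical, and only the bit values stated in the commit messages are encoded.

```c
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for struct ravb_hw_info. */
struct hw_info_sketch {
	const char *name;
	unsigned int multi_irq:1;         /* separate E-MAC and DMA queue IRQs */
	unsigned int ptp_cfg_active:1;    /* gPTP active in config mode */
	unsigned int no_ptp_cfg_active:1; /* gPTP not active in config mode */
};

const struct hw_info_sketch sketch_gen3 = {
	.name = "R-Car Gen3",
	.multi_irq = 1,
	.ptp_cfg_active = 1,
};

const struct hw_info_sketch sketch_gen2 = {
	.name = "R-Car Gen2",
	.no_ptp_cfg_active = 1, /* single IRQ, so multi_irq stays 0 */
};

const struct hw_info_sketch sketch_rzg2l = {
	.name = "RZ/G2L", /* single IRQ, no gPTP: all bits 0 */
};

/* gPTP is available if either gPTP mode bit is set. */
bool sketch_has_gptp(const struct hw_info_sketch *info)
{
	return info->ptp_cfg_active || info->no_ptp_cfg_active;
}
```

Testing a feature bit rather than a chip ID lets common code branch on capability, which is what allows one driver to serve both IPs.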
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Remove the macros NUM_TX_DESC_GEN[23] · c81d8942
      Biju Das authored
      To address the 4-byte alignment restriction on the transmission
      buffer for R-Car Gen2, we use 2 descriptors, whereas a single
      descriptor is used in other cases.
      Replace the macros NUM_TX_DESC_GEN[23] with a magic number and
      add a comment to explain it.
      Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
      Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dsa-sja1105-vlan-tags' · 6956fa39
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Make sja1105 treat tag_8021q VLANs more like real DSA tags
      
      This series solves a nuisance with the sja1105 driver, which is that
      non-DSA tagged packets sent directly by the DSA master would still exit
      the switch just fine.
      
      We also had an issue for packets coming from the outside world with a
      crafted DSA tag, the switch would not reject that tag but think it was
      valid.
      ====================
    • net: dsa: tag_sja1105: stop asking the sja1105 driver in sja1105_xmit_tpid · 8ded9160
      Vladimir Oltean authored
      Introduced in commit 38b5beea ("net: dsa: sja1105: prepare tagger
      for handling DSA tags and VLAN simultaneously"), the sja1105_xmit_tpid
      function solved quite a different problem than our needs are now.
      
      Then, we used best-effort VLAN filtering and we were using the xmit_tpid
      to tunnel packets coming from an 8021q upper through the TX VLAN allocated
      by tag_8021q to that egress port. The need for a different VLAN protocol
      depending on switch revision came from the fact that this in itself was
      more of a hack to trick the hardware into accepting tunneled VLANs in
      the first place.
      
      Right now, we deny 8021q uppers (see sja1105_prechangeupper). Even if we
      supported them again, we would not do that using the same method of
      {tunneling the VLAN on egress, retagging the VLAN on ingress} that we
      had in the best-effort VLAN filtering mode. It seems rather simpler that
      we just allocate a VLAN in the VLAN table that is simply not used by the
      bridge at all, or by any other port.
      
      Anyway, I have 2 gripes with the current sja1105_xmit_tpid:
      
      1. When sending packets on behalf of a VLAN-aware bridge (with the new
         TX forwarding offload framework) plus untagged (with the tag_8021q
         VLAN added by the tagger) packets, we can see that on SJA1105P/Q/R/S
         and later (which have a qinq_tpid of ETH_P_8021AD), some packets sent
         through the DSA master have a VLAN protocol of 0x8100 and others of
         0x88a8. This is strange and there is no reason for it now. If we have
         a bridge and are therefore forced to send using that bridge's TPID,
         we can as well blend with that bridge's VLAN protocol for all packets.
      
      2. The sja1105_xmit_tpid introduces a dependency on the sja1105 driver,
         because it looks inside dp->priv. It is desirable to keep as much
         separation between taggers and switch drivers as possible. Now it
         doesn't do that anymore.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: drop untagged packets on the CPU and DSA ports · b0b8c67e
      Vladimir Oltean authored
      The sja1105 driver is a bit special in its use of VLAN headers as DSA
      tags. This is because in VLAN-aware mode, the VLAN headers use an actual
      TPID of 0x8100, which is understood even by the DSA master as an actual
      VLAN header.
      
      Furthermore, control packets such as PTP and STP are transmitted with no
      VLAN header as a DSA tag, because, depending on switch generation, there
      are ways to steer these control packets towards a precise egress port
      other than VLAN tags. Transmitting control packets as untagged means
      leaving a door open for traffic in general to be transmitted as untagged
      from the DSA master, and for it to traverse the switch and exit a random
      switch port according to the FDB lookup.
      
      This behavior is a bit out of line with other DSA drivers which have
      native support for DSA tagging. There, it is to be expected that the
      switch only accepts DSA-tagged packets on its CPU port, dropping
      everything that does not match this pattern.
      
      We perhaps rely a bit too much on the switches' hardware dropping on the
      CPU port, and place no other restrictions in the kernel data path to
      avoid that. For example, sja1105 is also a bit special in that STP/PTP
      packets are transmitted using "management routes"
      (sja1105_port_deferred_xmit): when sending a link-local packet from the
      CPU, we must first write a SPI message to the switch to tell it to
      expect a packet towards multicast MAC DA 01-80-c2-00-00-0e, and to route
      it towards port 3 when it gets it. This entry expires as soon as it
      matches a packet received by the switch, and it needs to be reinstalled
      for the next packet etc. All in all quite a ghetto mechanism, but it is
      all that the sja1105 switches offer for injecting a control packet.
      The driver takes a mutex for serializing control packets and making the
      pairs of SPI writes of a management route and its associated skb atomic,
      but to be honest, a mutex is only relevant as long as all parties agree
      to take it. With the DSA design, it is possible to open an AF_PACKET
      socket on the DSA master net device, and blast packets towards
      01-80-c2-00-00-0e, and whatever locking the DSA switch driver might use,
      it all goes kaput because management routes installed by the driver will
      match skbs sent by the DSA master, and not skbs generated by the driver
      itself. So they will end up being routed on the wrong port.
      
      So through the lens of that, maybe it would make sense to avoid that
      from happening by doing something in the network stack, like: introduce
      a new bit in struct sk_buff, like xmit_from_dsa. Then, somewhere around
      dev_hard_start_xmit(), introduce the following check:
      
      	if (netdev_uses_dsa(dev) && !skb->xmit_from_dsa)
      		kfree_skb(skb);
      
      Ok, maybe that is a bit drastic, but that would at least prevent a bunch
      of problems. For example, right now, even though the majority of DSA
      switches drop packets without DSA tags sent by the DSA master (and
      therefore the majority of garbage that user space daemons like avahi and
      udhcpcd and friends create), it is still conceivable that an aggressive
      user space program can open an AF_PACKET socket and inject a spoofed DSA
      tag directly on the DSA master. We have no protection against that; the
      packet will be understood by the switch and be routed wherever user
      space says. Furthermore: there are some DSA switches where we even have
      register access over Ethernet, using DSA tags. So even user space
      drivers are possible in this way. This is a huge hole.
      
      However, the biggest thing that bothers me is that udhcpcd attempts to
      ask for an IP address on all interfaces by default, and with sja1105, it
      will attempt to get a valid IP address on both the DSA master as well as
      on sja1105 switch ports themselves. So with IP addresses in the same
      subnet on multiple interfaces, the routing table will be messed up and
      the system will be unusable for traffic until it is configured manually
      to not ask for an IP address on the DSA master itself.
      
      It turns out that it is possible to avoid that in the sja1105 driver, at
      least very superficially, by requesting the switch to drop VLAN-untagged
      packets on the CPU port. With the exception of control packets, all
      traffic originated from tag_sja1105.c is already VLAN-tagged, so only
      STP and PTP packets need to be converted. For that, we need to uphold
      the equivalence between an untagged and a pvid-tagged packet, and to
      remember that the CPU port of sja1105 uses a pvid of 4095.
      
      Now that we drop untagged traffic on the CPU port, non-aggressive user
      space applications like udhcpcd stop bothering us, and sja1105 effectively
      becomes just as vulnerable to the aggressive kind of user space programs
      as other DSA switches are (ok, users can also create 8021q uppers on top
      of the DSA master in the case of sja1105, but in future patches we can
      easily deny that, but it still doesn't change the fact that VLAN-tagged
      packets can still be injected over raw sockets).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: prevent tag_8021q VLANs from being received on user ports · 73ceab83
      Vladimir Oltean authored
      Currently it is possible for an attacker to craft packets with a fake
      DSA tag and send them to us, and our user ports will accept them and
      preserve that VLAN when transmitting towards the CPU. Then the tagger
      will be misled into thinking that the packets came on a different port
      than they really came on.
      
      Up until recently there wasn't a good option to prevent this from
      happening. In SJA1105P and later, the MAC Configuration Table introduced
      two options called:
      - DRPSITAG: Drop Single Inner Tagged Frames
      - DRPSOTAG: Drop Single Outer Tagged Frames
      
      Because the sja1105 driver classifies all VLANs as "outer VLANs" (S-Tags),
      it would be in principle possible to enable the DRPSOTAG bit on ports
      using tag_8021q, and drop on ingress all packets which have a VLAN tag.
      When the switch is VLAN-unaware, this works, because it uses a custom
      TPID of 0xdadb, so any "tagged" packets received on a user port are
      probably a spoofing attempt. But when the switch overall is VLAN-aware,
      and some ports are standalone (therefore they use tag_8021q), the TPID
      is 0x8100, and the port can receive a mix of untagged and VLAN-tagged
      packets. The untagged ones will be classified to the tag_8021q pvid, and
      the tagged ones to the VLAN ID from the packet header. Yes, it is true
      that since commit 4fbc08bd ("net: dsa: sja1105: deny 8021q uppers on
      ports") we no longer support this mixed mode, but that is a temporary
      limitation which will eventually be lifted. It would be nice to not
      introduce one more restriction via DRPSOTAG, which would make the
      standalone ports of a VLAN-aware switch drop genuinely VLAN-tagged
      packets.
      
      Also, the DRPSOTAG bit is not available on the first generation of
      switches (SJA1105E, SJA1105T). So since one of the key features of this
      driver is compatibility across switch generations, this makes it an even
      less desirable approach.
      
      The breakthrough comes from commit bef0746c ("net: dsa: sja1105:
      make sure untagged packets are dropped on ingress ports with no pvid"),
      where it became obvious that untagged packets are not dropped even if
      the ingress port is not in the VMEMB_PORT vector of that port's pvid.
      However, VLAN-tagged packets are subject to VLAN ingress
      checking/dropping. This means that instead of using the catch-all
      DRPSOTAG bit introduced in SJA1105P, we can drop tagged packets on a
      per-VLAN basis, and this is already compatible with SJA1105E/T.
      
      This patch adds an "allowed_ingress" argument to sja1105_vlan_add(), and
      we call it with "false" for tag_8021q VLANs on user ports. The tag_8021q
      VLANs still need to be allowed, of course, on ingress to DSA ports and
      CPU ports.
      
      We also need to refine the drop_untagged check in sja1105_commit_pvid to
      make it not freak out about this new configuration. Currently it will
      try to keep the configuration consistent between untagged and pvid-tagged
      packets, so if the pvid of a port is 1 but VLAN 1 is not in VMEMB_PORT,
      packets tagged with VID 1 will behave the same as untagged packets, and
      be dropped. This behavior is what we want for ports under a VLAN-aware
      bridge, but for the ports with a tag_8021q pvid, we want untagged
      packets to be accepted, but packets tagged with a header recognized by
      the switch as a tag_8021q VLAN to be dropped. So only restrict the
      drop_untagged check to apply to the bridge_pvid, not to the tag_8021q_pvid.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
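The ingress behavior described in the sja1105 commit above can be sketched as a toy model. This is only an illustration under stated assumptions: `model_switch`, `model_vlan_add()` and `model_ingress_accepts()` are hypothetical names, not the driver's functions, and the per-VLAN bitmask merely plays the role that VMEMB_PORT has in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: names and layout are illustrative, not the sja1105 driver's. */
#define MODEL_NUM_VLANS 4096

struct model_switch {
	/* Per-VLAN bitmask of ports allowed to receive frames tagged with that
	 * VID (stands in for VMEMB_PORT from the commit message). */
	uint32_t vmemb_port[MODEL_NUM_VLANS];
};

/* Install a VLAN on a port; ingress membership is granted only when
 * allowed_ingress is true. */
static void model_vlan_add(struct model_switch *sw, int port, uint16_t vid,
			   bool allowed_ingress)
{
	assert(vid < MODEL_NUM_VLANS && port >= 0 && port < 32);
	if (allowed_ingress)
		sw->vmemb_port[vid] |= UINT32_C(1) << port;
	else
		sw->vmemb_port[vid] &= ~(UINT32_C(1) << port);
}

/* Untagged frames bypass the membership check and are dropped only when
 * drop_untagged is set; tagged frames are checked against the membership
 * vector of their VID. */
static bool model_ingress_accepts(const struct model_switch *sw, int port,
				  bool tagged, uint16_t vid, bool drop_untagged)
{
	assert(vid < MODEL_NUM_VLANS && port >= 0 && port < 32);
	if (!tagged)
		return !drop_untagged;
	return (sw->vmemb_port[vid] >> port) & 1u;
}
```

In this model, adding a tag_8021q VLAN to a user port with `allowed_ingress = false` makes frames carrying that VID drop on the user port while untagged frames are still accepted, and the same VLAN added with `true` on a CPU or DSA port keeps passing.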
    • net: dsa: mt7530: manually set up VLAN ID 0 · 1ca8a193
      DENG Qingfang authored
      The driver was relying on dsa_slave_vlan_rx_add_vid to add VLAN ID 0. After
      the blamed commit, VLAN ID 0 won't be set up anymore, breaking software
      bridging fallback on VLAN-unaware bridges.
      
      Manually set up VLAN ID 0 to fix this.
      
      Fixes: 06cfb2df ("net: dsa: don't advertise 'rx-vlan-filter' when not needed")
      Signed-off-by: DENG Qingfang <dqfext@gmail.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mana-EQ-sharing' · e93826d3
      David S. Miller authored
      Haiyang Zhang says:
      
      ====================
      net: mana: Add support for EQ sharing
      
      The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed
      the device limit.
      
      Support EQ sharing, so that multiple vPorts can share the same set of
      MSIXs.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mana: Add WARN_ON_ONCE in case of CQE read overflow · c1a3e9f9
      Haiyang Zhang authored
      This is not an expected case normally.
      Add WARN_ON_ONCE in case of CQE read overflow, instead of failing
      silently.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mana: Add support for EQ sharing · 1e2d0824
      Haiyang Zhang authored
      The existing code uses (1 + #vPorts * #Queues) MSIXs, which may exceed
      the device limit.
      
      Support EQ sharing, so that multiple vPorts (NICs) can share the same
      set of MSIXs.
      
      And, report the EQ-sharing capability bit to the host, which means the
      host can potentially offer more vPorts and queues to the VM.
      
      Also update the resource limit checking and error handling for better
      robustness.
      
      Now, we support up to 256 virtual ports per VF (it was 16/VF), and
      support up to 64 queues per vPort (it was 16).
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
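The MSI-X arithmetic in the EQ sharing commit above can be worked through with a small sketch. The first function is the formula quoted in the commit message. The second is an assumption made for illustration only: it supposes that, once EQs are shared, every vPort reuses a single set of per-queue vectors. The function names are hypothetical and are not part of the mana driver.

```c
#include <assert.h>

/* Vectors needed without EQ sharing, using the formula from the commit
 * message: one extra vector plus one per queue of every vPort. */
static unsigned int msix_without_sharing(unsigned int vports,
					 unsigned int queues)
{
	return 1 + vports * queues;
}

/* ASSUMPTION (not stated in the commit message): with EQ sharing, all vPorts
 * reuse the same per-queue vectors, so the count no longer grows with the
 * number of vPorts. */
static unsigned int msix_with_sharing(unsigned int vports, unsigned int queues)
{
	assert(vports > 0);
	(void)vports; /* intentionally unused beyond the sanity check */
	return 1 + queues;
}
```

With the old limits of 16 vPorts and 16 queues, the unshared formula gives 1 + 16 * 16 = 257 vectors. With the new limits of 256 vPorts and 64 queues it would give 1 + 256 * 64 = 16385, whereas the assumed shared scheme needs 65.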
    • net: mana: Move NAPI from EQ to CQ · e1b5683f
      Haiyang Zhang authored
      The existing code has NAPI threads polling on EQ directly. To prepare
      for EQ sharing among vPorts, move NAPI from EQ to CQ so that one EQ
      can serve multiple CQs from different vPorts.
      
      The "arm bit" is only set when CQ processing is completed to reduce
      the number of EQ entries, which in turn reduces the number of
      interrupts on EQ.
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netxen_nic: Remove the repeated declaration · 807d1032
      Shaokun Zhang authored
      Function 'netxen_rom_fast_read' is declared twice, so remove the
      repeated declaration.
      
      Cc: Manish Chopra <manishc@marvell.com>
      Cc: Rahul Verma <rahulv@marvell.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Properly revert VPD changes · bc4f128d
      Nathan Chancellor authored
      Clang warns:
      
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'kw_offset' is uninitialized when used here [-Werror,-Wuninitialized]
              FIND_VPD_KW(i, "RV");
              ^~~~~~~~~~~~~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:39: note: expanded from macro 'FIND_VPD_KW'
              var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \
                                                   ^~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:34: note: initialize the variable 'kw_offset' to silence this warning
              unsigned int vpdr_len, kw_offset, id_len;
                                              ^
                                               = 0
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2785:2: error: variable 'vpdr_len' is uninitialized when used here [-Werror,-Wuninitialized]
              FIND_VPD_KW(i, "RV");
              ^~~~~~~~~~~~~~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2776:50: note: expanded from macro 'FIND_VPD_KW'
              var = pci_vpd_find_info_keyword(vpd, kw_offset, vpdr_len, name); \
                                                              ^~~~~~~~
      drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2748:23: note: initialize the variable 'vpdr_len' to silence this warning
              unsigned int vpdr_len, kw_offset, id_len;
                                   ^
                                    = 0
      2 errors generated.
      
      The series "PCI/VPD: Convert more users to the new VPD API functions"
      was applied to net-next when it should have been applied to the PCI tree
      because of build errors. However, commit 82e34c8a ("Revert "Revert
      "cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()""") reapplied a
      change, resulting in the warning above.
      
      Properly revert commit 8d63ee60 ("cxgb4: Search VPD with
      pci_vpd_find_ro_info_keyword()") to fix the warning and restore proper
      functionality. This also reverts commit 3a93bede ("cxgb4: Remove
      unused vpd_param member ec") to avoid future merge conflicts, as that
      change has been applied to the PCI tree.
      
      Link: https://lore.kernel.org/r/20210823120929.7c6f7a4f@canb.auug.org.au/
      Link: https://lore.kernel.org/r/1ca29408-7bc7-4da5-59c7-87893c9e0442@gmail.com/
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
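The Clang diagnostics in the cxgb4 commit above stem from a macro that expands to a use of locals in the enclosing function. A minimal sketch of that pattern follows, with the locals assigned before the macro is invoked. All names here (`find_kw()`, `FIND_KW`, `locate_rv()`, `kw_len`) are hypothetical stand-ins and do not reproduce the cxgb4 code or the PCI VPD API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: look for a 2-character keyword in
 * buf[off .. off + len) and return its index, or -1 if absent. */
static int find_kw(const char *buf, unsigned int off, unsigned int len,
		   const char *kw)
{
	for (unsigned int i = off; i + 2 <= off + len; i++)
		if (memcmp(buf + i, kw, 2) == 0)
			return (int)i;
	return -1;
}

/* Like FIND_VPD_KW in the diagnostics, this macro silently relies on the
 * caller's locals `buf`, `kw_offset` and `kw_len`. */
#define FIND_KW(var, name) ((var) = find_kw(buf, kw_offset, kw_len, (name)))

static int locate_rv(const char *buf, unsigned int buf_len)
{
	unsigned int kw_offset, kw_len;
	int i;

	/* Both locals must be assigned before the macro expands to a use of
	 * them; omitting this step is what -Wuninitialized reports. */
	kw_offset = 0;
	kw_len = buf_len;

	FIND_KW(i, "RV");
	return i;
}
```

Because the macro hides which variables it reads, removing the code that used to assign them (as a partial revert can do) leaves the call site looking unchanged while the locals are no longer initialized.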
    • Merge branch 'mptcp-next' · cb0f8b03
      David S. Miller authored
      Mat Martineau says:
      
      ====================
      mptcp: Optimize output options and add MP_FAIL
      
      This patch set contains two groups of changes that we've been testing in
      the MPTCP tree.
      
      The first optimizes the code path and data structure for populating
      MPTCP option headers when transmitting.
      
      Patch 1 reorganizes code to reduce the number of conditionals that need
      to be evaluated in common cases.
      
      Patch 2 rearranges struct mptcp_out_options to save 80 bytes (on x86_64).
      
      The next five patches add partial support for the MP_FAIL option as
      defined in RFC 8684. MP_FAIL is an option header used to cleanly handle
      MPTCP checksum failures. When the MPTCP checksum detects an error in the
      MPTCP DSS header or the data mapped by that header, the receiver uses a
      TCP RST with MP_FAIL to close the subflow that experienced the error and
      provide associated MPTCP sequence number information to the peer. RFC
      8684 also describes how a single-subflow connection can discard corrupt
      data and remain connected under certain conditions using MP_FAIL, but
      that feature is not implemented here.
      
      Patches 3-5 implement MP_FAIL transmit and receive, and integrate them
      with checksum validation.
      
      Patches 6 & 7 add MP_FAIL selftests and the MIBs required for those
      tests.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
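For readers unfamiliar with the MP_FAIL option mentioned in the mptcp-next cover letter above, the following sketch encodes one as laid out in RFC 8684: a 12-byte TCP option with the MPTCP option kind (30), subtype 0x6, reserved bits set to zero, and an 8-byte data sequence number. It is a standalone illustration, not the kernel's implementation; `mp_fail_encode()` and the macro names are hypothetical, and the layout should be checked against the RFC before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_MPTCP_KIND 30  /* TCP option kind assigned to MPTCP */
#define MPTCP_SUB_MP_FAIL 0x6 /* MP_FAIL subtype in RFC 8684 */
#define MP_FAIL_OPT_LEN   12  /* kind + length + subtype/reserved + 8-byte DSN */

/* Write an MP_FAIL option carrying data sequence number `dsn` into `out`.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t mp_fail_encode(uint8_t *out, size_t out_len, uint64_t dsn)
{
	assert(out != NULL);
	if (out_len < MP_FAIL_OPT_LEN)
		return 0;

	out[0] = TCPOPT_MPTCP_KIND;
	out[1] = MP_FAIL_OPT_LEN;
	out[2] = MPTCP_SUB_MP_FAIL << 4; /* subtype in the high nibble */
	out[3] = 0;                      /* remaining reserved bits */

	/* Data sequence number in network byte order (most significant first). */
	for (int i = 0; i < 8; i++)
		out[4 + i] = (uint8_t)(dsn >> (56 - 8 * i));

	return MP_FAIL_OPT_LEN;
}
```

The data sequence number is the "associated MPTCP sequence number information" that the cover letter says is provided to the peer when a checksum failure closes a subflow.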