1. 08 May, 2020 25 commits
    • Merge branch 'bonding-report-transmit-status-to-callers' · 738fea32
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      bonding: report transmit status to callers
      
      The first patches clean up netpoll and make sure it provides tx status to its users.
      
      The last patch changes bonding so that it no longer pretends packets were sent without error.
      
      With more accurate status, the TCP stack can avoid adding more
      packets if the slave qdisc is already full.
      
      This came up while testing the latest horizon feature in sch_fq, with
      very low pacing rate flows, but should benefit hosts under stress.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding: propagate transmit status · ae46f184
      Eric Dumazet authored
      Currently, bonding always returns NETDEV_TX_OK to its caller.
      
      It is worth being more accurate: TCP, for instance, can use
      different recovery strategies when it gets a more precise status,
      such as a packet having been dropped by the slave qdisc.
      
      This is especially important when the host is under stress.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
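The effect described in this commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name, the status values and the toy queue are hypothetical and are not the kernel's bonding code or its return codes. It shows a sender that backs off once a drop is reported, compared with one that is always told the packet was sent.

```c
#include <stddef.h>

/* Hypothetical status codes for this model; not the kernel's values. */
enum model_tx_status {
	MODEL_TX_OK = 0,
	MODEL_TX_DROP = 1,
};

/* A toy lower-device queue with a fixed capacity. */
struct model_queue {
	size_t len;
	size_t cap;
};

/* Lower device: reports a drop when its queue is already full. */
static enum model_tx_status model_lower_xmit(struct model_queue *q)
{
	if (q->len >= q->cap)
		return MODEL_TX_DROP;
	q->len++;
	return MODEL_TX_OK;
}

/* Masked behaviour: the aggregating device hides the lower result. */
static enum model_tx_status model_upper_xmit_masked(struct model_queue *q)
{
	(void)model_lower_xmit(q);
	return MODEL_TX_OK;
}

/* Propagating behaviour: the lower result is passed through unchanged. */
static enum model_tx_status model_upper_xmit_propagated(struct model_queue *q)
{
	return model_lower_xmit(q);
}

/*
 * A sender that stops queueing as soon as it is told about a drop.
 * Returns the number of transmit attempts it made.
 */
static size_t model_send_burst(struct model_queue *q, size_t burst,
			       enum model_tx_status (*xmit)(struct model_queue *))
{
	size_t attempts = 0;

	while (attempts < burst) {
		attempts++;
		if (xmit(q) != MODEL_TX_OK)
			break;
	}
	return attempts;
}
```

With a queue capacity of 2 and a burst of 5, the masked variant keeps the sender pushing all 5 packets, while the propagating variant lets it stop after the third attempt.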
    • netpoll: accept NULL np argument in netpoll_send_skb() · f78ed220
      Eric Dumazet authored
      netpoll_send_skb() callers seem to leak skb if
      the np pointer is NULL. While this should not happen, we
      can make the code more robust.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: netpoll_send_skb() returns transmit status · 1ddabdfa
      Eric Dumazet authored
      Some callers want to know if the packet has been sent or
      dropped, to inform upper stacks.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: move netpoll_send_skb() out of line · fb1eee47
      Eric Dumazet authored
      There is no need to inline this helper, as we intend to add more
      code in this function.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: remove dev argument from netpoll_send_skb_on_dev() · 307f660d
      Eric Dumazet authored
      netpoll_send_skb_on_dev() can get the device pointer directly from np->dev.
      
      Rename it to __netpoll_send_skb().
      
      The following patch will move netpoll_send_skb() out of line.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix less than zero comparison with unsigned variable val · 3a13f98b
      Colin Ian King authored
      The unsigned variable val is being checked for an error by checking
      if it is less than zero. This can never occur because val is unsigned.
      Fix this by making val a plain int.
      
      Addresses-Coverity: ("Unsigned compared against zero")
      Fixes: bdbdac76 ("ethtool: provide UAPI for PHY master/slave configuration.")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
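The class of bug fixed here can be reproduced in a few lines of standalone C. The sketch below is illustrative only: `model_read_reg` and the two checker functions are hypothetical names, not the PHY code touched by the commit. It assumes a reader that reports failure with a negative value, as many kernel helpers do.

```c
#include <stdbool.h>

/* Hypothetical reader: returns a negative errno-style value on failure. */
static int model_read_reg(bool fail)
{
	return fail ? -5 : 0x1f;
}

/*
 * Buggy pattern: the negative return value is converted to a large
 * unsigned number, so the "less than zero" test can never be true.
 */
static bool model_error_seen_unsigned(bool fail)
{
	unsigned int val = model_read_reg(fail);

	return val < 0;
}

/* Fixed pattern: a plain int keeps the sign, so the error is detected. */
static bool model_error_seen_signed(bool fail)
{
	int val = model_read_reg(fail);

	return val < 0;
}
```

Compilers and static analyzers typically flag the unsigned comparison, which is how issues of this kind are usually found.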
    • net/smc: remove set but not used variables 'del_llc, del_llc_resp' · ca7e3edc
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      net/smc/smc_llc.c: In function 'smc_llc_cli_conf_link':
      net/smc/smc_llc.c:753:31: warning:
       variable 'del_llc' set but not used [-Wunused-but-set-variable]
        struct smc_llc_msg_del_link *del_llc;
                                     ^
      net/smc/smc_llc.c: In function 'smc_llc_process_srv_delete_link':
      net/smc/smc_llc.c:1311:33: warning:
       variable 'del_llc_resp' set but not used [-Wunused-but-set-variable]
          struct smc_llc_msg_del_link *del_llc_resp;
                                       ^
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: tcp_mark_head_lost is only valid for sack-tcp · 636ef28d
      zhang kai authored
      Since tcp_mark_head_lost is only valid for SACK TCP, the
      tcp_is_sack/reno checks are removed from it.
      Signed-off-by: zhang kai <zhangkaiheb@126.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove newlines in NL_SET_ERR_MSG_MOD · c75a33c8
      Jacob Keller authored
      The NL_SET_ERR_MSG_MOD macro is used to report a string describing an
      error message to userspace via the netlink extended ACK structure. It
      should not have a trailing newline.
      
      Add a cocci script which catches cases where the newline marker is
      present. Using this script, fix the handful of cases which accidentally
      included a trailing newline.
      
      I couldn't figure out a way to get a patch mode working, so this script
      only implements context, report, and org.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Andy Whitcroft <apw@canonical.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ti-am65x-cpts-follow-up-dt-bindings-update' · 57ea8506
      David S. Miller authored
      Grygorii Strashko says:
      
      ====================
      net: ethernet: ti: am65x-cpts: follow up dt bindings update
      
      This series is a follow-up update for the TI AM65x/J721E Common Platform Time Sync (CPTS)
      driver [1] that implements the DT bindings review comments from
      Rob Herring <robh@kernel.org> [2].
       - "reg" and "compatible" properties are made required for CPTS DT nodes, which
         also required changing the K3 CPSW driver to use of_platform_device_create()
         instead of of_platform_populate() for proper CPTS and MDIO initialization
       - minor DT bindings format changes
       - K3 CPTS example added to K3 MCU CPSW bindings
      
      [1] https://lwn.net/Articles/819313/
      [2] https://lwn.net/ml/linux-kernel/20200505040419.GA8509@bogus/
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: dts: ti: k3-am65/j721e-mcu: update cpts node · ef2d1363
      Grygorii Strashko authored
      Update CPTS node following DT binding update:
       - add reg and compatible properties
       - fix node name
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-binding: net: ti: am65x-cpts: make reg and compatible required · 4786f4a0
      Grygorii Strashko authored
      This patch follows K3 CPTS review comments from Rob Herring
      <robh@kernel.org>.
       - "reg" and "compatible" properties are required now
       - minor format changes
       - K3 CPTS example added to K3 MCU CPSW bindings
      
      Cc: Rob Herring <robh@kernel.org>
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: am65-cpsw-nuss: use of_platform_device_create() for mdio · a45cfcc6
      Grygorii Strashko authored
      The MCU CPSW is expected to populate only the MDIO device, but follow-up
      patches will add a "compatible" property to the MCU CPSW CPTS node, which
      will cause creation of a CPTS device and MCU CPSW init failure. Hence,
      switch to of_platform_device_create() instead of of_platform_populate()
      for MDIO device population.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hsr-hsr-code-refactoring' · a8c9baf2
      David S. Miller authored
      Taehee Yoo says:
      
      ====================
      hsr: hsr code refactoring
      
      There are some unnecessary routines in the hsr module.
      This series removes them.
      
      The first patch removes an incorrect comment.
      The second patch removes an unnecessary WARN_ONCE() macro.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dpaa2-eth: create a function to flush the XDP fds · 38c440b2
      Ioana Ciornei authored
      Create an independent function that takes a particular frame queue and
      an array of frame descriptors and tries to enqueue them until it hits
      the maximum number of retries. The same function will also be used in
      the next patch on the XDP_TX path.
      
      Also, create the dpaa2_eth_xdp_fds structure to incorporate the array of
      FDs as well as the number of FDs already populated.
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hsr: remove WARN_ONCE() in hsr_fill_frame_info() · f96e8717
      Taehee Yoo authored
      When a VLAN frame is being sent, hsr calls WARN_ONCE() because hsr doesn't
      support VLAN. But using WARN_ONCE() is overkill.
      Using netdev_warn_once() is enough.
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soc: fsl: dpio: properly compute the consumer index · 7596ac9d
      Ioana Ciornei authored
      Mask the consumer index before using it. Without this, we would be
      writing frame descriptors beyond the ring size supported by the QBMAN
      block.
      
      Fixes: 3b2abda7 ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Acked-by: Li Yang <leoyang.li@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
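Masking an index before using it as a ring slot is a general technique, sketched below in standalone C. The ring size, the structure and the `model_*` names are assumptions made for illustration and do not reflect the QBMAN driver's data structures. The sketch assumes a power-of-two ring size, so that the mask is simply the size minus one.

```c
#include <stdint.h>

/* Hypothetical ring size for this model; it must be a power of two. */
#define MODEL_RING_SIZE 8u

struct model_ring {
	uint32_t slots[MODEL_RING_SIZE];
};

/* Map a free-running index onto a valid slot inside the ring. */
static uint32_t model_ring_slot(uint32_t index)
{
	return index & (MODEL_RING_SIZE - 1u);
}

/* Store a value; the masked index can never point past the array. */
static void model_ring_store(struct model_ring *ring, uint32_t index,
			     uint32_t value)
{
	ring->slots[model_ring_slot(index)] = value;
}
```

Without the mask, an index of 9 on an 8-entry ring would address memory beyond the last slot; with it, the write wraps around to slot 1.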
    • Merge branch 'tc-gate-offload-for-SJA1105-DSA-switch' · eb55d7b6
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      tc-gate offload for SJA1105 DSA switch
      
      Expose the TTEthernet hardware features of the switch using standard
      tc-flower actions: trap, drop, redirect and gate.
      
      v1 was submitted at:
      https://patchwork.ozlabs.org/project/netdev/cover/20200503211035.19363-1-olteanv@gmail.com/
      
      v2 was submitted at:
      https://patchwork.ozlabs.org/project/netdev/cover/20200503211035.19363-1-olteanv@gmail.com/
      
      Changes in v3:
      Made sure there are no compilation warnings when
      CONFIG_NET_DSA_SJA1105_TAS or CONFIG_NET_DSA_SJA1105_VL are disabled.
      
      Changes in v2:
      Using a newly introduced dsa_port_from_netdev public helper.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • docs: net: dsa: sja1105: document intended usage of virtual links · 47cfa3af
      Vladimir Oltean authored
      Add some verbiage describing how the hardware features of the switch are
      exposed to users through tc-flower.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: implement tc-gate using time-triggered virtual links · 834f8933
      Vladimir Oltean authored
      Restrict the TTEthernet hardware support on this switch to operate as
      closely as possible to IEEE 802.1Qci. This means that it can
      perform PTP-time-based ingress admission control on streams identified
      by {DMAC, VID, PCP}, which is useful when trying to ensure the
      determinism of traffic scheduled via IEEE 802.1Qbv.
      
      The oddity comes from the fact that in hardware (and in TTEthernet at
      large), virtual links always need a full-blown action, including not
      only the type of policing, but also the list of destination ports. So in
      practice, a single tc-gate action will result in all packets getting
      dropped. Additional actions (either "trap" or "redirect") need to be
      specified in the same filter rule such that the conforming packets are
      actually forwarded somewhere.
      
      Apart from the VL Lookup, Policing and Forwarding tables which need to
      be programmed for each flow (virtual link), the Schedule engine also
      needs to be told to open/close the admission gates for each individual
      virtual link. A fairly accurate (and detailed) description of how that
      works is already present in sja1105_tas.c, since it is already used to
      trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). The
      key point to remember is that the schedule engine supports 8
      "subschedules" (execution threads that iterate through the global
      schedule in parallel, with the constraint that no 2 hardware threads
      may execute a schedule entry at the same time). For tc-taprio, each
      egress port uses one of these 8 subschedules, leaving a total of 4
      subschedules unused.
      In principle we could have allocated 1 subschedule for the tc-gate
      offload of each ingress port, but actually the schedules of all virtual
      links installed on each ingress port would have needed to be merged
      together, before they could have been programmed to hardware. So
      simplify our life and just merge the entire tc-gate configuration, for
      all virtual links on all ingress ports, into a single subschedule. Be
      sure to check that against the usual hardware scheduling conflicts, and
      program it to hardware alongside any tc-taprio subschedule that may be
      present.
      
      The following scenarios were tested:
      
      1. Quantitative testing:
      
         tc qdisc add dev swp2 clsact
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate index 1 base-time 0 \
                 sched-entry OPEN 1200 -1 -1 \
                 sched-entry CLOSE 1200 -1 -1 \
                 action trap
      
         ping 192.168.1.2 -f
         PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
         .............................
         --- 192.168.1.2 ping statistics ---
         948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
      
      2. Qualitative testing (with a phase-aligned schedule - the clocks are
         synchronized by ptp4l, not shown here):
      
         Receiver (sja1105):
      
         tc qdisc add dev swp2 clsact
         now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc filter add dev swp2 ingress flower skip_sw \
                 dst_mac 42:be:24:9b:76:20 \
                 action gate base-time ${base_time} \
                 sched-entry OPEN  60000 -1 -1 \
                 sched-entry CLOSE 40000 -1 -1 \
                 action trap
      
         Sender (enetc):
         now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
                 sec=$(echo $now | awk -F. '{print $1}') && \
                 base_time="$(((sec + 2) * 1000000000))" && \
                 echo "base time ${base_time}"
         tc qdisc add dev eno0 parent root taprio \
                 num_tc 8 \
                 map 0 1 2 3 4 5 6 7 \
                 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
                 base-time ${base_time} \
                 sched-entry S 01  50000 \
                 sched-entry S 00  50000 \
                 flags 2
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         1425 packets transmitted, 1424 packets received, 0% packet loss
         round-trip min/avg/max = 0.322/0.361/0.990 ms
      
         And just for comparison, with the tc-taprio schedule deleted:
      
         ping -A 192.168.1.1
         PING 192.168.1.1 (192.168.1.1): 56 data bytes
         ...
         ^C
         --- 192.168.1.1 ping statistics ---
         33 packets transmitted, 19 packets received, 42% packet loss
         round-trip min/avg/max = 0.336/0.464/0.597 ms
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
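The arithmetic behind the two test scenarios in this commit message can be checked with a short standalone C sketch. The helper names are hypothetical and only restate the numbers quoted in the message: the quantitative test uses equal OPEN and CLOSE intervals (1200 each), and the qualitative test derives its base time from the current PHC second plus two, expressed in nanoseconds.

```c
#include <stdint.h>

/* Share of each gate cycle, in percent, during which the gate is closed. */
static double model_closed_percent(uint64_t open_interval,
				   uint64_t close_interval)
{
	return 100.0 * (double)close_interval /
	       (double)(open_interval + close_interval);
}

/* Packet loss in percent, computed from ping's transmit/receive counters. */
static double model_loss_percent(uint64_t transmitted, uint64_t received)
{
	return 100.0 * (double)(transmitted - received) / (double)transmitted;
}

/*
 * Base time two whole seconds after the given PHC second, in nanoseconds.
 * This mirrors the shell expression $(((sec + 2) * 1000000000)).
 */
static uint64_t model_base_time_ns(uint64_t now_sec)
{
	return (now_sec + 2u) * UINT64_C(1000000000);
}
```

Under these assumptions, a gate that is closed for half of every cycle gives an expected loss of 50%, which is roughly in line with the 50.7384% reported by the flood ping (481 of 948 packets lost).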
    • net: dsa: sja1105: support flow-based redirection via virtual links · dfacc5a2
      Vladimir Oltean authored
      Implement tc-flower offloads for redirect, trap and drop using
      non-critical virtual links.
      
      Commands which were tested to work are:
      
        # Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
        # CPU and to swp3. This type of key (DA only) is used when the port's
        # VLAN awareness state is off.
        tc qdisc add dev swp2 clsact
        tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
                action mirred egress redirect dev swp3 \
                action trap
      
        # Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
        # of 100 and a PCP of 0.
        tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
                dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
      
      Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
      filtering is disabled, those are set internally by the driver to the
      port-based defaults. Because we would be put in an awkward situation if
      the user were to change the VLAN filtering state while there are active
      rules (packets would no longer match on the specified keys), we simply
      deny changing vlan_filtering unless the list of flows offloaded via
      virtual links is empty. Then the user can re-add new rules.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: make room for virtual link parsing in flower offload · b70bb8d4
      Vladimir Oltean authored
      Virtual links are a sja1105 hardware concept of executing various flow
      actions based on a key extracted from the frame's DMAC, VID and PCP.
      
      Currently the tc-flower offload code supports only parsing the DMAC if
      that is the broadcast MAC address, and the VLAN PCP. Extract the key
      parsing logic from the L2 policers functionality and move it into its
      own function, after adding extra logic for matching on any DMAC and VID.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: add static tables for virtual links · 94f94d4a
      Vladimir Oltean authored
      This patch adds the register definitions for the:
      - VL Lookup Table
      - VL Policing Table
      - VL Forwarding Table
      - VL Forwarding Parameters Table
      
      These are needed in order to perform TTEthernet operations: QoS
      classification, flow-based policing and/or frame redirecting with the
      switch.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: introduce a dsa_port_from_netdev public helper · e1eea811
      Vladimir Oltean authored
      As its implementation shows, this is synonymous with calling
      dsa_slave_dev_check followed by dsa_slave_to_port, so it is quite simple
      already and provides functionality which is already there.
      
      However there is now a need for these functions outside dsa_priv.h, for
      example in drivers that perform mirroring and redirection through
      tc-flower offloads (they are given raw access to the flow_cls_offload
      structure), where they need to call this function on act->dev.
      
      But simply exporting dsa_slave_to_port would make it non-inline and
      would result in an extra function call in the hotpath, as can be seen
      for example in sja1105:
      
      Before:
      
      000006dc <sja1105_xmit>:
      {
       6dc:	e92d4ff0 	push	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
       6e0:	e1a04000 	mov	r4, r0
       6e4:	e591958c 	ldr	r9, [r1, #1420]	; 0x58c <- Inline dsa_slave_to_port
       6e8:	e1a05001 	mov	r5, r1
       6ec:	e24dd004 	sub	sp, sp, #4
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       6f0:	e1c901d8 	ldrd	r0, [r9, #24]
       6f4:	ebfffffe 	bl	0 <dsa_8021q_tx_vid>
      			6f4: R_ARM_CALL	dsa_8021q_tx_vid
      	u8 pcp = netdev_txq_to_tc(netdev, queue_mapping);
       6f8:	e1d416b0 	ldrh	r1, [r4, #96]	; 0x60
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       6fc:	e1a08000 	mov	r8, r0
      
      After:
      
      000006e4 <sja1105_xmit>:
      {
       6e4:	e92d4ff0 	push	{r4, r5, r6, r7, r8, r9, sl, fp, lr}
       6e8:	e1a04000 	mov	r4, r0
       6ec:	e24dd004 	sub	sp, sp, #4
      	struct dsa_port *dp = dsa_slave_to_port(netdev);
       6f0:	e1a00001 	mov	r0, r1
      {
       6f4:	e1a05001 	mov	r5, r1
      	struct dsa_port *dp = dsa_slave_to_port(netdev);
       6f8:	ebfffffe 	bl	0 <dsa_slave_to_port>
      			6f8: R_ARM_CALL	dsa_slave_to_port
       6fc:	e1a09000 	mov	r9, r0
      	u16 tx_vid = dsa_8021q_tx_vid(dp->ds, dp->index);
       700:	e1c001d8 	ldrd	r0, [r0, #24]
       704:	ebfffffe 	bl	0 <dsa_8021q_tx_vid>
      			704: R_ARM_CALL	dsa_8021q_tx_vid
      
      Because we want to avoid possible performance regressions, introduce
      this new function which is designed to be public.
      Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 07 May, 2020 15 commits