1. 15 Nov, 2019 1 commit
    • net: openvswitch: add hash info to upcall · bd1903b7
      Tonghao Zhang authored
      When using the kernel datapath, the upcall does not include
      the skb hash information. This causes problems, because the
      skb hash is important in the kernel stack. For example, the
      VXLAN module uses it to select the UDP src port, and tx queue
      selection in the stack may also use it.
      
      The hash is computed in different ways: it is random for a
      TCP socket, and it may otherwise be computed in hardware or
      in the software stack. Recalculating the hash is not easy.
      
      The hash of a TCP socket is computed as follows:
      tcp_v4_connect
          -> sk_set_txhash (is random)
      
      __tcp_transmit_skb
          -> skb_set_hash_from_sk
      
      For the first packet of a TCP session, there is one upcall
      to ovs-vswitchd that carries no skb hash information. The
      remaining packets are processed in the Open vSwitch kernel
      module with their hash kept. If this TCP session is forwarded
      to the VXLAN module, the UDP src port of the first TCP packet
      differs from that of the remaining packets.
      
      TCP packets may come to Open vSwitch from the host or from
      Docker containers. To fix this, we store the hash info in the
      upcall and restore the hash when the packet is sent back.
      
      +---------------+          +-------------------------+
      |   Docker/VMs  |          |     ovs-vswitchd        |
      +----+----------+          +-+--------------------+--+
           |                       ^                    |
           |                       |                    |
           |                       |  upcall            v restore packet hash (not recalculate)
           |                     +-+--------------------+--+
           |  tap netdev         |                         |   vxlan module
           +--------------->     +-->  Open vSwitch ko     +-->
             or internal type    |                         |
                                 +-------------------------+
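      
      The effect of a changed hash on the outer UDP src port can be
      sketched in a few lines of bash. This is only an illustration:
      flow_src_port is a hypothetical helper, the port range
      32768-60999 and both hash values are assumed samples, and the
      multiply-and-shift scaling is a simplified model rather than
      the kernel's exact code.
      
      ```shell
      # Hypothetical helper (not a kernel API): scale a 32-bit flow
      # hash into the port range [min, max). Defaults are assumptions.
      flow_src_port() {
      	local hash="$1"
      	local min="${2:-32768}"
      	local max="${3:-60999}"
      
      	# Mix the low 16 bits into the high 16 bits, keep 32 bits.
      	hash=$(( (hash ^ (hash << 16)) & 0xffffffff ))
      
      	# Multiply-and-shift scaling: result falls in [min, max).
      	echo $(( ((hash * (max - min)) >> 32) + min ))
      }
      
      # First packet: hash recomputed after the upcall (sample value).
      first_port=$(flow_src_port $(( 0x12345678 )))
      # Later packets: hash kept by the datapath (sample value).
      later_port=$(flow_src_port $(( 0x80000000 )))
      
      echo "first packet src port:  ${first_port}"
      echo "later packets src port: ${later_port}"
      ```
      
      With two different hash values the two ports differ, which is
      the symptom described above. If the hash is restored after the
      upcall, both calls see the same input and yield the same port.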
      
      Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-October/364062.html
      Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Nov, 2019 8 commits
    • Merge branch 'Rework-mt762x-GDM-setup-flow' · 839554b7
      David S. Miller authored
      MarkLee says:
      
      ====================
      Rework mt762x GDM setup flow
      
      The mt762x GDM block is mainly used to set up the HW-internal
      rx path from the GMAC to the RX DMA engine (PDMA); the packet
      switching engine (PSE) is responsible for forwarding data
      according to the GDM configuration.
      
      This patch set has three goals:
      
      1. Integrate GDM/PSE setup operations into a single function,
         "mtk_gdm_config"
      
      2. Refine the timing of GDM/PSE setup: move it from mtk_hw_init
         to mtk_open
      
      3. Enable GDM GDMA_DROP_ALL mode to drop all packets during the
         stop operation
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: Enable GDM GDMA_DROP_ALL mode · 8d66a818
      MarkLee authored
      Enable GDM GDMA_DROP_ALL mode to drop all packets during the
      stop operation. The mt762x HW design recommends dropping all
      packets from the GMAC before stopping PDMA.
      Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: Refine the timing of GDM/PSE setup · 5ac9eda0
      MarkLee authored
      Refine the timing of GDM/PSE setup by moving it from mtk_hw_init
      to mtk_open. The mt762x HW design recommends doing GDM/PSE setup
      only after PDMA has been started.
      
      mt7628 is excluded in the mtk_gdm_config function, since it is an
      old IP with no GDM/PSE block.
      Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mediatek: Integrate GDM/PSE setup operations · 8d3f4a95
      MarkLee authored
      Integrate GDM/PSE setup operations into a single function,
      "mtk_gdm_config".
      Signed-off-by: MarkLee <Mark-MC.Lee@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Simplify reset handling · abfb228a
      Vladimir Oltean authored
      We don't really need 10k species of reset. Remove everything except cold
      reset which is what is actually used. Too bad the hardware designers
      couldn't agree to use the same bit field for rev 1 and rev 2, so the
      (*reset_cmd) function pointer is there to stay.
      
      However let's simplify the prototype and give it a struct dsa_switch (we
      want to avoid forward-declarations of structures, in this case struct
      sja1105_private, wherever we can).
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'PTP-clock-source-for-SJA1105-tc-taprio-offload' · ccb68993
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      PTP clock source for SJA1105 tc-taprio offload
      
      This series makes the IEEE 802.1Qbv egress scheduler of the sja1105
      switch use a time reference that is synchronized to the network. This
      enables quite a few real Time Sensitive Networking use cases, since in
      this mode the switch can offer its clients a TDMA sort of access to the
      network, and guaranteed latency for frames that are properly scheduled
      based on the common PTP time.
      
      The driver needs to do a 2-part activity:
      - Program the gate control list into the static config and upload it
        over SPI to the switch (already supported)
      - Write the activation time of the scheduler (base-time) into the
        PTPSCHTM register, and set the PTPSTRTSCH bit.
      - Monitor the activation of the scheduler at the planned time and its
        health.
      
      Ok, 3 parts.
      
      The time-aware scheduler cannot be programmed to activate at a time in
      the past, and there is some logic to avoid that.
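      
      One way such a guard could work is to push a stale base-time
      forward by whole cycles until it is no longer in the past. The
      bash sketch below only illustrates that idea: the helper name
      future_base_time and all sample values are assumptions, and it
      is not claimed to be the driver's actual code.
      
      ```shell
      # Hypothetical helper: print the first base_time + n * cycle_time
      # (n >= 0) that is not earlier than now. Values are nanoseconds.
      future_base_time() {
      	local base_time="$1" cycle_time="$2" now="$3"
      	local n
      
      	if (( base_time >= now )); then
      		echo "${base_time}"
      		return
      	fi
      
      	# Round the elapsed time up to a whole number of cycles.
      	n=$(( (now - base_time + cycle_time - 1) / cycle_time ))
      	echo $(( base_time + n * cycle_time ))
      }
      
      # Base time already 700 us in the past, 500 us cycle (samples).
      future_base_time 1000000000 500000 1000700000
      # Base time still in the future: returned unchanged.
      future_base_time 2000000000 500000 1000700000
      ```
      
      Advancing by whole cycles keeps the schedule phase-aligned with
      the base-time that was originally requested.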
      
      PTPCLKCORP is one of those "black magic" registers that just need to be
      written to the length of the cycle. There is a 40-line long comment in
      the second patch which explains why.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Implement state machine for TAS with PTP clock source · 86db36a3
      Vladimir Oltean authored
      Tested using the following bash script and the tc from iproute2-next:
      
      	#!/bin/bash
      
      	set -e -u -o pipefail
      
      	NSEC_PER_SEC="1000000000"
      
      	gatemask() {
      		local tc_list="$1"
      		local mask=0
      
      		for tc in ${tc_list}; do
      			mask=$((${mask} | (1 << ${tc})))
      		done
      
      		printf "%02x" ${mask}
      	}
      
      	if ! systemctl is-active --quiet ptp4l; then
      		echo "Please start the ptp4l service"
      		exit 1
      	fi
      
      	now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
      	# Phase-align the base time to the start of the next second.
      	sec=$(echo "${now}" | gawk -F. '{ print $1; }')
      	base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"
      
      	tc qdisc add dev swp5 parent root handle 100 taprio \
      		num_tc 8 \
      		map 0 1 2 3 4 5 6 7 \
      		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
      		base-time ${base_time} \
      		sched-entry S $(gatemask 7) 100000 \
      		sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
      		clockid CLOCK_TAI flags 2
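      
      The two computed inputs of the tc command above can be checked
      without any hardware. The sketch below reuses the gatemask
      helper from the script and replaces the phc_ctl reading with a
      made-up sample value for "now", which is an assumption for
      illustration only.
      
      ```shell
      # Same helper as in the script: OR one bit per traffic class.
      gatemask() {
      	local tc_list="$1"
      	local mask=0
      	local tc
      
      	for tc in ${tc_list}; do
      		mask=$(( mask | (1 << tc) ))
      	done
      
      	printf "%02x" "${mask}"
      }
      
      # Hypothetical PHC reading in seconds.nanoseconds form.
      now="1573731234.567890123"
      # Strip the fractional part, then align to the next full second.
      sec="${now%%.*}"
      base_time=$(( (sec + 1) * 1000000000 ))
      
      echo "gate mask for TC 7:    $(gatemask 7)"
      echo "gate mask for TC 0-6:  $(gatemask "0 1 2 3 4 5 6")"
      echo "base-time (ns):        ${base_time}"
      ```
      
      The two masks are complementary, so traffic class 7 gets an
      exclusive 100 us window and classes 0-6 share the following
      400 us window in each cycle.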
      
      The "state machine" is a workqueue invoked after each manipulation
      command on the PTP clock (reset, adjust time, set time, adjust
      frequency) which checks over the state of the time-aware scheduler.
      So it is not monitored periodically, only in reaction to a PTP command
      typically triggered from a userspace daemon (linuxptp). Otherwise there
      is no reason for things to go wrong.
      
      Now that the timecounter/cyclecounter has been replaced with hardware
      operations on the PTP clock, the TAS Kconfig now depends upon PTP and
      the standalone clocksource operating mode has been removed.
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Make the PTP command read-write · 41603d78
      Vladimir Oltean authored
      The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate
      whether the time-aware scheduler is running or not. We will be using
      that for monitoring the scheduler in the next patch, so refactor the PTP
      command API in order to allow that.
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 13 Nov, 2019 31 commits