Commit db539cae authored by David S. Miller's avatar David S. Miller

Merge branch 'tc-taprio-offload-for-SJA1105-DSA'

Vladimir Oltean says:

====================
tc-taprio offload for SJA1105 DSA

This is the third attempt to submit the tc-taprio offload model for
inclusion in the networking tree. The sja1105 switch driver will provide
the first implementation of the offload. Only the bare minimum is added:

- The offload model and a DSA pass-through
- The hardware implementation
- The interaction with the netdev queues in the tagger code
- Documentation

What has been removed from previous attempts is support for
PTP-as-clocksource in sja1105, as well as configuring the traffic class
for management traffic.  These will be added as soon as the offload
model is settled.
====================
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents 67e80b99 7c95afa4
......@@ -146,6 +146,96 @@ enslaves eth0 and eth1 (the DSA master of the switch ports). This is because in
this mode, the switch ports beneath br0 are not capable of regular traffic, and
are only used as a conduit for switchdev operations.
Offloads
========
Time-aware scheduling
---------------------
The switch supports a variation of the enhancements for scheduled traffic
specified in IEEE 802.1Q-2018 (formerly 802.1Qbv). This means it can be used to
ensure deterministic latency for priority traffic that is sent in-band with its
gate-open event in the network schedule.
This capability can be managed through the tc-taprio offload ('flags 2'). The
difference compared to the software implementation of taprio is that the latter
would only be able to shape traffic originated from the CPU, but not
autonomously forwarded flows.
The device has 8 traffic classes, and maps incoming frames to one of them based
on the VLAN PCP bits (if no VLAN is present, the port-based default is used).
As described in the previous sections, depending on the value of
``vlan_filtering``, the EtherType recognized by the switch as being VLAN can
either be the typical 0x8100 or a custom value used internally by the driver
for tagging. Therefore, the switch ignores the VLAN PCP if used in standalone
or bridge mode with ``vlan_filtering=0``, as it will not recognize the 0x8100
EtherType. In these modes, injecting into a particular TX queue can only be
done by the DSA net devices, which populate the PCP field of the tagging header
on egress. Using ``vlan_filtering=1``, the behavior is the other way around:
offloaded flows can be steered to TX queues based on the VLAN PCP, but the DSA
net devices are no longer able to do that. To inject frames into a hardware TX
queue with VLAN awareness active, it is necessary to create a VLAN
sub-interface on the DSA master port, and send normal (0x8100) VLAN-tagged
towards the switch, with the VLAN PCP bits set appropriately.
Management traffic (having DMAC 01-80-C2-xx-xx-xx or 01-19-1B-xx-xx-xx) is the
notable exception: the switch always treats it with a fixed priority and
disregards any VLAN PCP bits even if present. The traffic class for management
traffic has a value of 7 (highest priority) at the moment, which is not
configurable in the driver.
Below is an example of configuring a 500 us cyclic schedule on egress port
``swp5``. The traffic class gate for management traffic (7) is open for 100 us,
and the gates for all other traffic classes are open for 400 us::
#!/bin/bash
set -e -u -o pipefail
NSEC_PER_SEC="1000000000"
gatemask() {
local tc_list="$1"
local mask=0
for tc in ${tc_list}; do
mask=$((${mask} | (1 << ${tc})))
done
printf "%02x" ${mask}
}
if ! systemctl is-active --quiet ptp4l; then
echo "Please start the ptp4l service"
exit
fi
now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
# Phase-align the base time to the start of the next second.
sec=$(echo "${now}" | gawk -F. '{ print $1; }')
base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"
tc qdisc add dev swp5 parent root handle 100 taprio \
num_tc 8 \
map 0 1 2 3 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S $(gatemask 7) 100000 \
sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
flags 2
It is possible to apply the tc-taprio offload on multiple egress ports. There
are hardware restrictions related to the fact that no gate event may trigger
simultaneously on two ports. The driver checks the consistency of the schedules
against this restriction and errors out when appropriate. Schedule analysis is
needed to avoid this, which is outside the scope of the document.
At the moment, the time-aware scheduler can only be triggered based on a
standalone clock and not based on PTP time. This means the base-time argument
from tc-taprio is ignored and the schedule starts right away. It also means it
is more difficult to phase-align the scheduler with the other devices in the
network.
Device Tree bindings and board design
=====================================
......
......@@ -23,3 +23,11 @@ config NET_DSA_SJA1105_PTP
help
This enables support for timestamping and PTP clock manipulations in
the SJA1105 DSA driver.
config NET_DSA_SJA1105_TAS
bool "Support for the Time-Aware Scheduler on NXP SJA1105"
depends on NET_DSA_SJA1105
help
This enables support for the TTEthernet-based egress scheduling
engine in the SJA1105 DSA driver, which is controlled using a
hardware offload of the tc-tqprio qdisc.
......@@ -12,3 +12,7 @@ sja1105-objs := \
ifdef CONFIG_NET_DSA_SJA1105_PTP
sja1105-objs += sja1105_ptp.o
endif
ifdef CONFIG_NET_DSA_SJA1105_TAS
sja1105-objs += sja1105_tas.o
endif
......@@ -20,6 +20,8 @@
*/
#define SJA1105_AGEING_TIME_MS(ms) ((ms) / 10)
#include "sja1105_tas.h"
/* Keeps the different addresses between E/T and P/Q/R/S */
struct sja1105_regs {
u64 device_id;
......@@ -104,6 +106,7 @@ struct sja1105_private {
*/
struct mutex mgmt_lock;
struct sja1105_tagger_data tagger_data;
struct sja1105_tas_data tas_data;
};
#include "sja1105_dynamic_config.h"
......@@ -120,6 +123,9 @@ typedef enum {
SPI_WRITE = 1,
} sja1105_spi_rw_mode_t;
/* From sja1105_main.c */
int sja1105_static_config_reload(struct sja1105_private *priv);
/* From sja1105_spi.c */
int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
sja1105_spi_rw_mode_t rw, u64 reg_addr,
......
......@@ -488,6 +488,8 @@ sja1105et_general_params_entry_packing(void *buf, void *entry_ptr,
/* SJA1105E/T: First generation */
struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_SCHEDULE] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS] = {0},
[BLK_IDX_L2_LOOKUP] = {
.entry_packing = sja1105et_dyn_l2_lookup_entry_packing,
.cmd_packing = sja1105et_l2_lookup_cmd_packing,
......@@ -529,6 +531,8 @@ struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105ET_SIZE_MAC_CONFIG_DYN_CMD,
.addr = 0x36,
},
[BLK_IDX_SCHEDULE_PARAMS] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS_PARAMS] = {0},
[BLK_IDX_L2_LOOKUP_PARAMS] = {
.entry_packing = sja1105et_l2_lookup_params_entry_packing,
.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
......@@ -552,6 +556,8 @@ struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
/* SJA1105P/Q/R/S: Second generation */
struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_SCHEDULE] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS] = {0},
[BLK_IDX_L2_LOOKUP] = {
.entry_packing = sja1105pqrs_dyn_l2_lookup_entry_packing,
.cmd_packing = sja1105pqrs_l2_lookup_cmd_packing,
......@@ -593,6 +599,8 @@ struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
.packed_size = SJA1105PQRS_SIZE_MAC_CONFIG_DYN_CMD,
.addr = 0x4B,
},
[BLK_IDX_SCHEDULE_PARAMS] = {0},
[BLK_IDX_SCHEDULE_ENTRY_POINTS_PARAMS] = {0},
[BLK_IDX_L2_LOOKUP_PARAMS] = {
.entry_packing = sja1105et_l2_lookup_params_entry_packing,
.cmd_packing = sja1105et_l2_lookup_params_cmd_packing,
......
......@@ -22,6 +22,7 @@
#include <linux/if_ether.h>
#include <linux/dsa/8021q.h>
#include "sja1105.h"
#include "sja1105_tas.h"
static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len,
unsigned int startup_delay)
......@@ -384,7 +385,9 @@ static int sja1105_init_general_params(struct sja1105_private *priv)
/* Disallow dynamic changing of the mirror port */
.mirr_ptacu = 0,
.switchid = priv->ds->index,
/* Priority queue for link-local frames trapped to CPU */
/* Priority queue for link-local management frames
* (both ingress to and egress from CPU - PTP, STP etc)
*/
.hostprio = 7,
.mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A,
.mac_flt1 = SJA1105_LINKLOCAL_FILTER_A_MASK,
......@@ -1380,7 +1383,7 @@ static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
* modify at runtime (currently only MAC) and restore them after uploading,
* such that this operation is relatively seamless.
*/
static int sja1105_static_config_reload(struct sja1105_private *priv)
int sja1105_static_config_reload(struct sja1105_private *priv)
{
struct sja1105_mac_config_entry *mac;
int speed_mbps[SJA1105_NUM_PORTS];
......@@ -1711,6 +1714,9 @@ static int sja1105_setup(struct dsa_switch *ds)
*/
ds->vlan_filtering_is_global = true;
/* Advertise the 8 egress queues */
ds->num_tx_queues = SJA1105_NUM_TC;
/* The DSA/switchdev model brings up switch ports in standalone mode by
* default, and that means vlan_filtering is 0 since they're not under
* a bridge, so it's safe to set up switch tagging at this time.
......@@ -1722,6 +1728,7 @@ static void sja1105_teardown(struct dsa_switch *ds)
{
struct sja1105_private *priv = ds->priv;
sja1105_tas_teardown(ds);
cancel_work_sync(&priv->tagger_data.rxtstamp_work);
skb_queue_purge(&priv->tagger_data.skb_rxtstamp_queue);
sja1105_ptp_clock_unregister(priv);
......@@ -2051,6 +2058,18 @@ static bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
return true;
}
static int sja1105_port_setup_tc(struct dsa_switch *ds, int port,
enum tc_setup_type type,
void *type_data)
{
switch (type) {
case TC_SETUP_QDISC_TAPRIO:
return sja1105_setup_tc_taprio(ds, port, type_data);
default:
return -EOPNOTSUPP;
}
}
static const struct dsa_switch_ops sja1105_switch_ops = {
.get_tag_protocol = sja1105_get_tag_protocol,
.setup = sja1105_setup,
......@@ -2083,6 +2102,7 @@ static const struct dsa_switch_ops sja1105_switch_ops = {
.port_hwtstamp_set = sja1105_hwtstamp_set,
.port_rxtstamp = sja1105_port_rxtstamp,
.port_txtstamp = sja1105_port_txtstamp,
.port_setup_tc = sja1105_port_setup_tc,
};
static int sja1105_check_device_id(struct sja1105_private *priv)
......@@ -2192,6 +2212,8 @@ static int sja1105_probe(struct spi_device *spi)
}
mutex_init(&priv->mgmt_lock);
sja1105_tas_setup(ds);
return dsa_register_switch(priv->ds);
}
......
......@@ -11,11 +11,15 @@
#define SJA1105_SIZE_DEVICE_ID 4
#define SJA1105_SIZE_TABLE_HEADER 12
#define SJA1105_SIZE_SCHEDULE_ENTRY 8
#define SJA1105_SIZE_SCHEDULE_ENTRY_POINTS_ENTRY 4
#define SJA1105_SIZE_L2_POLICING_ENTRY 8
#define SJA1105_SIZE_VLAN_LOOKUP_ENTRY 8
#define SJA1105_SIZE_L2_FORWARDING_ENTRY 8
#define SJA1105_SIZE_L2_FORWARDING_PARAMS_ENTRY 12
#define SJA1105_SIZE_XMII_PARAMS_ENTRY 4
#define SJA1105_SIZE_SCHEDULE_PARAMS_ENTRY 12
#define SJA1105_SIZE_SCHEDULE_ENTRY_POINTS_PARAMS_ENTRY 4
#define SJA1105ET_SIZE_L2_LOOKUP_ENTRY 12
#define SJA1105ET_SIZE_MAC_CONFIG_ENTRY 28
#define SJA1105ET_SIZE_L2_LOOKUP_PARAMS_ENTRY 4
......@@ -29,11 +33,15 @@
/* UM10944.pdf Page 11, Table 2. Configuration Blocks */
enum {
BLKID_SCHEDULE = 0x00,
BLKID_SCHEDULE_ENTRY_POINTS = 0x01,
BLKID_L2_LOOKUP = 0x05,
BLKID_L2_POLICING = 0x06,
BLKID_VLAN_LOOKUP = 0x07,
BLKID_L2_FORWARDING = 0x08,
BLKID_MAC_CONFIG = 0x09,
BLKID_SCHEDULE_PARAMS = 0x0A,
BLKID_SCHEDULE_ENTRY_POINTS_PARAMS = 0x0B,
BLKID_L2_LOOKUP_PARAMS = 0x0D,
BLKID_L2_FORWARDING_PARAMS = 0x0E,
BLKID_AVB_PARAMS = 0x10,
......@@ -42,11 +50,15 @@ enum {
};
enum sja1105_blk_idx {
BLK_IDX_L2_LOOKUP = 0,
BLK_IDX_SCHEDULE = 0,
BLK_IDX_SCHEDULE_ENTRY_POINTS,
BLK_IDX_L2_LOOKUP,
BLK_IDX_L2_POLICING,
BLK_IDX_VLAN_LOOKUP,
BLK_IDX_L2_FORWARDING,
BLK_IDX_MAC_CONFIG,
BLK_IDX_SCHEDULE_PARAMS,
BLK_IDX_SCHEDULE_ENTRY_POINTS_PARAMS,
BLK_IDX_L2_LOOKUP_PARAMS,
BLK_IDX_L2_FORWARDING_PARAMS,
BLK_IDX_AVB_PARAMS,
......@@ -59,11 +71,15 @@ enum sja1105_blk_idx {
BLK_IDX_INVAL = -1,
};
#define SJA1105_MAX_SCHEDULE_COUNT 1024
#define SJA1105_MAX_SCHEDULE_ENTRY_POINTS_COUNT 2048
#define SJA1105_MAX_L2_LOOKUP_COUNT 1024
#define SJA1105_MAX_L2_POLICING_COUNT 45
#define SJA1105_MAX_VLAN_LOOKUP_COUNT 4096
#define SJA1105_MAX_L2_FORWARDING_COUNT 13
#define SJA1105_MAX_MAC_CONFIG_COUNT 5
#define SJA1105_MAX_SCHEDULE_PARAMS_COUNT 1
#define SJA1105_MAX_SCHEDULE_ENTRY_POINTS_PARAMS_COUNT 1
#define SJA1105_MAX_L2_LOOKUP_PARAMS_COUNT 1
#define SJA1105_MAX_L2_FORWARDING_PARAMS_COUNT 1
#define SJA1105_MAX_GENERAL_PARAMS_COUNT 1
......@@ -83,6 +99,23 @@ enum sja1105_blk_idx {
#define SJA1105R_PART_NO 0x9A86
#define SJA1105S_PART_NO 0x9A87
struct sja1105_schedule_entry {
u64 winstindex;
u64 winend;
u64 winst;
u64 destports;
u64 setvalid;
u64 txen;
u64 resmedia_en;
u64 resmedia;
u64 vlindex;
u64 delta;
};
struct sja1105_schedule_params_entry {
u64 subscheind[8];
};
struct sja1105_general_params_entry {
u64 vllupformat;
u64 mirr_ptacu;
......@@ -112,6 +145,17 @@ struct sja1105_general_params_entry {
u64 replay_port;
};
struct sja1105_schedule_entry_points_entry {
u64 subschindx;
u64 delta;
u64 address;
};
struct sja1105_schedule_entry_points_params_entry {
u64 clksrc;
u64 actsubsch;
};
struct sja1105_vlan_lookup_entry {
u64 ving_mirr;
u64 vegr_mirr;
......@@ -256,6 +300,8 @@ sja1105_static_config_get_length(const struct sja1105_static_config *config);
typedef enum {
SJA1105_CONFIG_OK = 0,
SJA1105_TTETHERNET_NOT_SUPPORTED,
SJA1105_INCORRECT_TTETHERNET_CONFIGURATION,
SJA1105_MISSING_L2_POLICING_TABLE,
SJA1105_MISSING_L2_FORWARDING_TABLE,
SJA1105_MISSING_L2_FORWARDING_PARAMS_TABLE,
......
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0
* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
*/
#ifndef _SJA1105_TAS_H
#define _SJA1105_TAS_H
#include <net/pkt_sched.h>
#if IS_ENABLED(CONFIG_NET_DSA_SJA1105_TAS)
struct sja1105_tas_data {
struct tc_taprio_qopt_offload *offload[SJA1105_NUM_PORTS];
};
int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
struct tc_taprio_qopt_offload *admin);
void sja1105_tas_setup(struct dsa_switch *ds);
void sja1105_tas_teardown(struct dsa_switch *ds);
#else
/* C doesn't allow empty structures, bah! */
struct sja1105_tas_data {
u8 dummy;
};
static inline int sja1105_setup_tc_taprio(struct dsa_switch *ds, int port,
struct tc_taprio_qopt_offload *admin)
{
return -EOPNOTSUPP;
}
static inline void sja1105_tas_setup(struct dsa_switch *ds) { }
static inline void sja1105_tas_teardown(struct dsa_switch *ds) { }
#endif /* IS_ENABLED(CONFIG_NET_DSA_SJA1105_TAS) */
#endif /* _SJA1105_TAS_H */
......@@ -847,6 +847,7 @@ enum tc_setup_type {
TC_SETUP_QDISC_ETF,
TC_SETUP_ROOT_QDISC,
TC_SETUP_QDISC_GRED,
TC_SETUP_QDISC_TAPRIO,
};
/* These structures hold the attributes of bpf state that are being passed
......
......@@ -515,6 +515,8 @@ struct dsa_switch_ops {
bool ingress);
void (*port_mirror_del)(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror);
int (*port_setup_tc)(struct dsa_switch *ds, int port,
enum tc_setup_type type, void *type_data);
/*
* Cross-chip operations
......
......@@ -161,4 +161,27 @@ struct tc_etf_qopt_offload {
s32 queue;
};
struct tc_taprio_sched_entry {
u8 command; /* TC_TAPRIO_CMD_* */
/* The gate_mask in the offloading side refers to traffic classes */
u32 gate_mask;
u32 interval;
};
struct tc_taprio_qopt_offload {
u8 enable;
ktime_t base_time;
u64 cycle_time;
u64 cycle_time_extension;
size_t num_entries;
struct tc_taprio_sched_entry entries[0];
};
/* Reference counting */
struct tc_taprio_qopt_offload *taprio_offload_get(struct tc_taprio_qopt_offload
*offload);
void taprio_offload_free(struct tc_taprio_qopt_offload *offload);
#endif
......@@ -1160,7 +1160,8 @@ enum {
* [TCA_TAPRIO_ATTR_SCHED_ENTRY_INTERVAL]
*/
#define TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST 0x1
#define TCA_TAPRIO_ATTR_FLAG_TXTIME_ASSIST BIT(0)
#define TCA_TAPRIO_ATTR_FLAG_FULL_OFFLOAD BIT(1)
enum {
TCA_TAPRIO_ATTR_UNSPEC,
......
......@@ -1035,12 +1035,16 @@ static int dsa_slave_setup_tc_block(struct net_device *dev,
static int dsa_slave_setup_tc(struct net_device *dev, enum tc_setup_type type,
void *type_data)
{
switch (type) {
case TC_SETUP_BLOCK:
struct dsa_port *dp = dsa_slave_to_port(dev);
struct dsa_switch *ds = dp->ds;
if (type == TC_SETUP_BLOCK)
return dsa_slave_setup_tc_block(dev, type_data);
default:
if (!ds->ops->port_setup_tc)
return -EOPNOTSUPP;
}
return ds->ops->port_setup_tc(ds, dp->index, type, type_data);
}
static void dsa_slave_get_stats64(struct net_device *dev,
......
......@@ -89,7 +89,8 @@ static struct sk_buff *sja1105_xmit(struct sk_buff *skb,
struct dsa_port *dp = dsa_slave_to_port(netdev);
struct dsa_switch *ds = dp->ds;
u16 tx_vid = dsa_8021q_tx_vid(ds, dp->index);
u8 pcp = skb->priority;
u16 queue_mapping = skb_get_queue_mapping(skb);
u8 pcp = netdev_txq_to_tc(netdev, queue_mapping);
/* Transmitting management traffic does not rely upon switch tagging,
* but instead SPI-installed management routes. Part 2 of this
......
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment