Commit 4ee9e6e0 authored by David S. Miller

Merge branch 'mlxsw-Add-tunnel-devlink-trap-support'

Ido Schimmel says:

====================
mlxsw: Add tunnel devlink-trap support

This patch set from Amit adds support in mlxsw for tunnel traps and a
few additional layer 3 traps that can report drops and exceptions via
devlink-trap.

These traps allow the user to more quickly diagnose problems relating to
tunnel decapsulation errors, such as a packet being too short to
decapsulate or a packet containing the wrong GRE key in its GRE header.

Patch set overview:

Patches #1-#4 add three additional layer 3 traps, two of which are
mlxsw-specific as they relate to hardware-specific errors. The patches
include documentation of each trap and selftests.

Patches #5-#8 are preparations. They ensure that the correct ECN bits
are set in the outer header during IPinIP encapsulation and that packets
with an invalid ECN combination in underlay and overlay are trapped to
the kernel and not decapsulated in hardware.

Patches #9-#15 add support for two tunnel related traps. Each trap is
documented and selftested using both VXLAN and IPinIP tunnels, if
applicable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents 95ae2d1d b3073dfb
......@@ -223,6 +223,21 @@ be added to the following table:
* - ``ipv6_lpm_miss``
- ``exception``
- Traps unicast IPv6 packets that did not match any route
* - ``non_routable_packet``
- ``drop``
- Traps packets that the device decided to drop because they are not
supposed to be routed. For example, IGMP queries can be flooded by the
device in layer 2 and reach the router. Such packets should not be
routed and instead dropped
* - ``decap_error``
- ``exception``
- Traps NVE and IPinIP packets that the device decided to drop because of
failure during decapsulation (e.g., packet being too short, reserved
bits set in VXLAN header)
* - ``overlay_smac_is_mc``
- ``drop``
- Traps NVE packets that the device decided to drop because their overlay
source MAC is multicast
Driver-specific Packet Traps
============================
......@@ -234,6 +249,7 @@ links to the description of driver-specific traps registered by various device
drivers:
* :doc:`netdevsim`
* :doc:`mlxsw`
Generic Packet Trap Groups
==========================
......@@ -258,6 +274,9 @@ narrow. The description of these groups must be added to the following table:
* - ``buffer_drops``
- Contains packet traps for packets that were dropped by the device due to
an enqueue decision
* - ``tunnel_drops``
- Contains packet traps for packets that were dropped by the device during
tunnel encapsulation / decapsulation
Testing
=======
......
......@@ -57,3 +57,25 @@ The ``mlxsw`` driver reports the following versions
* - ``fw.version``
- running
- Three digit firmware version
Driver-specific Traps
=====================
.. list-table:: List of Driver-specific Traps Registered by ``mlxsw``
:widths: 5 5 90
* - Name
- Type
- Description
* - ``irif_disabled``
- ``drop``
- Traps packets that the device decided to drop because they need to be
routed from a disabled router interface (RIF). This can happen during
RIF dismantle, when the RIF is first disabled before being removed
completely
* - ``erif_disabled``
- ``drop``
- Traps packets that the device decided to drop because they need to be
routed through a disabled router interface (RIF). This can happen during
RIF dismantle, when the RIF is first disabled before being removed
completely
......@@ -5513,6 +5513,7 @@ enum mlxsw_reg_htgt_discard_trap_group {
MLXSW_REG_HTGT_DISCARD_TRAP_GROUP_BASE = MLXSW_REG_HTGT_TRAP_GROUP_MAX,
MLXSW_REG_HTGT_TRAP_GROUP_SP_L2_DISCARDS,
MLXSW_REG_HTGT_TRAP_GROUP_SP_L3_DISCARDS,
MLXSW_REG_HTGT_TRAP_GROUP_SP_TUNNEL_DISCARDS,
};
/* reg_htgt_trap_group
......@@ -10140,6 +10141,92 @@ static inline void mlxsw_reg_tigcr_pack(char *payload, bool ttlc, u8 ttl_uc)
mlxsw_reg_tigcr_ttl_uc_set(payload, ttl_uc);
}
/* TIEEM - Tunneling IPinIP Encapsulation ECN Mapping Register
* -----------------------------------------------------------
* The TIEEM register maps ECN of the IP header at the ingress to the
* encapsulation to the ECN of the underlay network.
*/
#define MLXSW_REG_TIEEM_ID 0xA812
#define MLXSW_REG_TIEEM_LEN 0x0C
MLXSW_REG_DEFINE(tieem, MLXSW_REG_TIEEM_ID, MLXSW_REG_TIEEM_LEN);
/* reg_tieem_overlay_ecn
* ECN of the IP header in the overlay network.
* Access: Index
*/
MLXSW_ITEM32(reg, tieem, overlay_ecn, 0x04, 24, 2);
/* reg_tieem_underlay_ecn
* ECN of the IP header in the underlay network.
* Access: RW
*/
MLXSW_ITEM32(reg, tieem, underlay_ecn, 0x04, 16, 2);
static inline void mlxsw_reg_tieem_pack(char *payload, u8 overlay_ecn,
u8 underlay_ecn)
{
MLXSW_REG_ZERO(tieem, payload);
mlxsw_reg_tieem_overlay_ecn_set(payload, overlay_ecn);
mlxsw_reg_tieem_underlay_ecn_set(payload, underlay_ecn);
}
/* TIDEM - Tunneling IPinIP Decapsulation ECN Mapping Register
* -----------------------------------------------------------
* The TIDEM register configures the actions that are done in the
* decapsulation.
*/
#define MLXSW_REG_TIDEM_ID 0xA813
#define MLXSW_REG_TIDEM_LEN 0x0C
MLXSW_REG_DEFINE(tidem, MLXSW_REG_TIDEM_ID, MLXSW_REG_TIDEM_LEN);
/* reg_tidem_underlay_ecn
* ECN field of the IP header in the underlay network.
* Access: Index
*/
MLXSW_ITEM32(reg, tidem, underlay_ecn, 0x04, 24, 2);
/* reg_tidem_overlay_ecn
* ECN field of the IP header in the overlay network.
* Access: Index
*/
MLXSW_ITEM32(reg, tidem, overlay_ecn, 0x04, 16, 2);
/* reg_tidem_eip_ecn
* Egress IP ECN. ECN field of the IP header of the packet which goes out
* from the decapsulation.
* Access: RW
*/
MLXSW_ITEM32(reg, tidem, eip_ecn, 0x04, 8, 2);
/* reg_tidem_trap_en
* Trap enable:
* 0 - No trap due to decap ECN
* 1 - Trap enable with trap_id
* Access: RW
*/
MLXSW_ITEM32(reg, tidem, trap_en, 0x08, 28, 4);
/* reg_tidem_trap_id
* Trap ID. Either DECAP_ECN0 or DECAP_ECN1.
* Reserved when trap_en is '0'.
* Access: RW
*/
MLXSW_ITEM32(reg, tidem, trap_id, 0x08, 0, 9);
static inline void mlxsw_reg_tidem_pack(char *payload, u8 underlay_ecn,
u8 overlay_ecn, u8 eip_ecn,
bool trap_en, u16 trap_id)
{
MLXSW_REG_ZERO(tidem, payload);
mlxsw_reg_tidem_underlay_ecn_set(payload, underlay_ecn);
mlxsw_reg_tidem_overlay_ecn_set(payload, overlay_ecn);
mlxsw_reg_tidem_eip_ecn_set(payload, eip_ecn);
mlxsw_reg_tidem_trap_en_set(payload, trap_en);
mlxsw_reg_tidem_trap_id_set(payload, trap_id);
}
/* SBPR - Shared Buffer Pools Register
* -----------------------------------
* The SBPR configures and retrieves the shared buffer pools and configuration.
......@@ -10684,6 +10771,8 @@ static const struct mlxsw_reg_info *mlxsw_reg_infos[] = {
MLXSW_REG(tndem),
MLXSW_REG(tnpc),
MLXSW_REG(tigcr),
MLXSW_REG(tieem),
MLXSW_REG(tidem),
MLXSW_REG(sbpr),
MLXSW_REG(sbcm),
MLXSW_REG(sbpm),
......
......@@ -4538,8 +4538,6 @@ static const struct mlxsw_listener mlxsw_sp_listener[] = {
false),
MLXSW_SP_RXL_MARK(ROUTER_ALERT_IPV4, TRAP_TO_CPU, ROUTER_EXP, false),
MLXSW_SP_RXL_MARK(ROUTER_ALERT_IPV6, TRAP_TO_CPU, ROUTER_EXP, false),
MLXSW_SP_RXL_MARK(IPIP_DECAP_ERROR, TRAP_TO_CPU, ROUTER_EXP, false),
MLXSW_SP_RXL_MARK(DECAP_ECN0, TRAP_TO_CPU, ROUTER_EXP, false),
MLXSW_SP_RXL_MARK(IPV4_VRRP, TRAP_TO_CPU, VRRP, false),
MLXSW_SP_RXL_MARK(IPV6_VRRP, TRAP_TO_CPU, VRRP, false),
MLXSW_SP_RXL_NO_MARK(DISCARD_ING_ROUTER_SIP_CLASS_E, FORWARD,
......
......@@ -3,8 +3,10 @@
#include <net/ip_tunnels.h>
#include <net/ip6_tunnel.h>
#include <net/inet_ecn.h>
#include "spectrum_ipip.h"
#include "reg.h"
struct ip_tunnel_parm
mlxsw_sp_ipip_netdev_parms4(const struct net_device *ol_dev)
......@@ -338,3 +340,61 @@ static const struct mlxsw_sp_ipip_ops mlxsw_sp_ipip_gre4_ops = {
const struct mlxsw_sp_ipip_ops *mlxsw_sp_ipip_ops_arr[] = {
[MLXSW_SP_IPIP_TYPE_GRE4] = &mlxsw_sp_ipip_gre4_ops,
};
static int mlxsw_sp_ipip_ecn_encap_init_one(struct mlxsw_sp *mlxsw_sp,
u8 inner_ecn, u8 outer_ecn)
{
char tieem_pl[MLXSW_REG_TIEEM_LEN];
mlxsw_reg_tieem_pack(tieem_pl, inner_ecn, outer_ecn);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(tieem), tieem_pl);
}
int mlxsw_sp_ipip_ecn_encap_init(struct mlxsw_sp *mlxsw_sp)
{
int i;
/* Iterate over inner ECN values */
for (i = INET_ECN_NOT_ECT; i <= INET_ECN_CE; i++) {
u8 outer_ecn = INET_ECN_encapsulate(0, i);
int err;
err = mlxsw_sp_ipip_ecn_encap_init_one(mlxsw_sp, i, outer_ecn);
if (err)
return err;
}
return 0;
}
static int mlxsw_sp_ipip_ecn_decap_init_one(struct mlxsw_sp *mlxsw_sp,
u8 inner_ecn, u8 outer_ecn)
{
char tidem_pl[MLXSW_REG_TIDEM_LEN];
bool trap_en, set_ce = false;
u8 new_inner_ecn;
trap_en = __INET_ECN_decapsulate(outer_ecn, inner_ecn, &set_ce);
new_inner_ecn = set_ce ? INET_ECN_CE : inner_ecn;
mlxsw_reg_tidem_pack(tidem_pl, outer_ecn, inner_ecn, new_inner_ecn,
trap_en, trap_en ? MLXSW_TRAP_ID_DECAP_ECN0 : 0);
return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(tidem), tidem_pl);
}
int mlxsw_sp_ipip_ecn_decap_init(struct mlxsw_sp *mlxsw_sp)
{
int i, j, err;
/* Iterate over inner ECN values */
for (i = INET_ECN_NOT_ECT; i <= INET_ECN_CE; i++) {
/* Iterate over outer ECN values */
for (j = INET_ECN_NOT_ECT; j <= INET_ECN_CE; j++) {
err = mlxsw_sp_ipip_ecn_decap_init_one(mlxsw_sp, i, j);
if (err)
return err;
}
}
return 0;
}
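The two init functions above fill the TIEEM and TIDEM tables by walking every ECN value (encapsulation) and every inner/outer ECN pair (decapsulation). The following is a minimal bash sketch of the resulting tables, not the driver code. It assumes the kernel helpers behave as the selftests in this merge imply: an inner CE is sent as ECT(0) in the outer header, a Not-ECT inner header combined with any ECT or CE outer header is trapped, and an outer CE is otherwise propagated to the inner header. The function names `ecn_encap` and `ecn_decap` are hypothetical.

```bash
# ECN codepoints (two low bits of the IP TOS byte).
NOT_ECT=0
ECT_1=1
ECT_0=2
CE=3

# Outer ECN chosen at encapsulation: copy the inner value, except that an
# inner CE is sent as ECT(0) in the outer header (assumed helper behaviour).
ecn_encap()
{
	local inner=$1

	if [ "$inner" -eq "$CE" ]; then
		echo "$ECT_0"
	else
		echo "$inner"
	fi
}

# Decapsulation decision for one (inner, outer) pair.
# Prints "<trap> <egress_ecn>"; trap is 1 when the combination is invalid.
ecn_decap()
{
	local inner=$1 outer=$2

	if [ "$inner" -eq "$NOT_ECT" ] && [ "$outer" -ne "$NOT_ECT" ]; then
		echo "1 $inner"
	elif [ "$outer" -eq "$CE" ]; then
		echo "0 $CE"
	else
		echo "0 $inner"
	fi
}

# Walk the same 4x4 space as the nested loops in the driver.
for inner in 0 1 2 3; do
	for outer in 0 1 2 3; do
		echo "inner=$inner outer=$outer -> $(ecn_decap "$inner" "$outer")"
	done
done
```

The three trapped pairs (inner Not-ECT with outer ECT(1), ECT(0) or CE) are the ones the `ecn_decap_test` selftests send with outer TOS values 01, 02 and 03.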
......@@ -7791,8 +7791,18 @@ mlxsw_sp_ipip_config_tigcr(struct mlxsw_sp *mlxsw_sp)
static int mlxsw_sp_ipips_init(struct mlxsw_sp *mlxsw_sp)
{
int err;
mlxsw_sp->router->ipip_ops_arr = mlxsw_sp_ipip_ops_arr;
INIT_LIST_HEAD(&mlxsw_sp->router->ipip_list);
err = mlxsw_sp_ipip_ecn_encap_init(mlxsw_sp);
if (err)
return err;
err = mlxsw_sp_ipip_ecn_decap_init(mlxsw_sp);
if (err)
return err;
return mlxsw_sp_ipip_config_tigcr(mlxsw_sp);
}
......
......@@ -104,4 +104,7 @@ static inline bool mlxsw_sp_l3addr_eq(const union mlxsw_sp_l3addr *addr1,
return !memcmp(addr1, addr2, sizeof(*addr1));
}
int mlxsw_sp_ipip_ecn_encap_init(struct mlxsw_sp *mlxsw_sp);
int mlxsw_sp_ipip_ecn_decap_init(struct mlxsw_sp *mlxsw_sp);
#endif /* _MLXSW_ROUTER_H_*/
......@@ -9,6 +9,20 @@
#include "reg.h"
#include "spectrum.h"
/* All driver-specific traps must be documented in
* Documentation/networking/devlink/mlxsw.rst
*/
enum {
DEVLINK_MLXSW_TRAP_ID_BASE = DEVLINK_TRAP_GENERIC_ID_MAX,
DEVLINK_MLXSW_TRAP_ID_IRIF_DISABLED,
DEVLINK_MLXSW_TRAP_ID_ERIF_DISABLED,
};
#define DEVLINK_MLXSW_TRAP_NAME_IRIF_DISABLED \
"irif_disabled"
#define DEVLINK_MLXSW_TRAP_NAME_ERIF_DISABLED \
"erif_disabled"
#define MLXSW_SP_TRAP_METADATA DEVLINK_TRAP_METADATA_TYPE_F_IN_PORT
static void mlxsw_sp_rx_drop_listener(struct sk_buff *skb, u8 local_port,
......@@ -21,6 +35,12 @@ static void mlxsw_sp_rx_exception_listener(struct sk_buff *skb, u8 local_port,
DEVLINK_TRAP_GROUP_GENERIC(_group_id), \
MLXSW_SP_TRAP_METADATA)
#define MLXSW_SP_TRAP_DRIVER_DROP(_id, _group_id) \
DEVLINK_TRAP_DRIVER(DROP, DROP, DEVLINK_MLXSW_TRAP_ID_##_id, \
DEVLINK_MLXSW_TRAP_NAME_##_id, \
DEVLINK_TRAP_GROUP_GENERIC(_group_id), \
MLXSW_SP_TRAP_METADATA)
#define MLXSW_SP_TRAP_EXCEPTION(_id, _group_id) \
DEVLINK_TRAP_GENERIC(EXCEPTION, TRAP, _id, \
DEVLINK_TRAP_GROUP_GENERIC(_group_id), \
......@@ -58,6 +78,11 @@ static struct devlink_trap mlxsw_sp_traps_arr[] = {
MLXSW_SP_TRAP_EXCEPTION(UNRESOLVED_NEIGH, L3_DROPS),
MLXSW_SP_TRAP_EXCEPTION(IPV4_LPM_UNICAST_MISS, L3_DROPS),
MLXSW_SP_TRAP_EXCEPTION(IPV6_LPM_UNICAST_MISS, L3_DROPS),
MLXSW_SP_TRAP_DRIVER_DROP(IRIF_DISABLED, L3_DROPS),
MLXSW_SP_TRAP_DRIVER_DROP(ERIF_DISABLED, L3_DROPS),
MLXSW_SP_TRAP_DROP(NON_ROUTABLE, L3_DROPS),
MLXSW_SP_TRAP_EXCEPTION(DECAP_ERROR, TUNNEL_DROPS),
MLXSW_SP_TRAP_DROP(OVERLAY_SMAC_MC, TUNNEL_DROPS),
};
static struct mlxsw_listener mlxsw_sp_listeners_arr[] = {
......@@ -90,6 +115,15 @@ static struct mlxsw_listener mlxsw_sp_listeners_arr[] = {
TRAP_EXCEPTION_TO_CPU),
MLXSW_SP_RXL_EXCEPTION(DISCARD_ROUTER_LPM6, ROUTER_EXP,
TRAP_EXCEPTION_TO_CPU),
MLXSW_SP_RXL_DISCARD(ROUTER_IRIF_EN, L3_DISCARDS),
MLXSW_SP_RXL_DISCARD(ROUTER_ERIF_EN, L3_DISCARDS),
MLXSW_SP_RXL_DISCARD(NON_ROUTABLE, L3_DISCARDS),
MLXSW_SP_RXL_EXCEPTION(DECAP_ECN0, ROUTER_EXP, TRAP_EXCEPTION_TO_CPU),
MLXSW_SP_RXL_EXCEPTION(IPIP_DECAP_ERROR, ROUTER_EXP,
TRAP_EXCEPTION_TO_CPU),
MLXSW_SP_RXL_EXCEPTION(DISCARD_DEC_PKT, TUNNEL_DISCARDS,
TRAP_EXCEPTION_TO_CPU),
MLXSW_SP_RXL_DISCARD(OVERLAY_SMAC_MC, TUNNEL_DISCARDS),
};
/* Mapping between hardware trap and devlink trap. Multiple hardware traps can
......@@ -123,6 +157,13 @@ static u16 mlxsw_sp_listener_devlink_map[] = {
DEVLINK_TRAP_GENERIC_ID_UNRESOLVED_NEIGH,
DEVLINK_TRAP_GENERIC_ID_IPV4_LPM_UNICAST_MISS,
DEVLINK_TRAP_GENERIC_ID_IPV6_LPM_UNICAST_MISS,
DEVLINK_MLXSW_TRAP_ID_IRIF_DISABLED,
DEVLINK_MLXSW_TRAP_ID_ERIF_DISABLED,
DEVLINK_TRAP_GENERIC_ID_NON_ROUTABLE,
DEVLINK_TRAP_GENERIC_ID_DECAP_ERROR,
DEVLINK_TRAP_GENERIC_ID_DECAP_ERROR,
DEVLINK_TRAP_GENERIC_ID_DECAP_ERROR,
DEVLINK_TRAP_GENERIC_ID_OVERLAY_SMAC_MC,
};
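The listener array and the devlink map are parallel: the entry at index i of the map names the devlink trap reported for the listener at index i, which is how three hardware traps end up reported as the single `decap_error` trap. A small bash sketch of that lookup, covering only the entries added in this hunk (the array and function names here are hypothetical, chosen for illustration):

```bash
# Hardware listeners added by this patch set, in array order.
hw_traps=(ROUTER_IRIF_EN ROUTER_ERIF_EN NON_ROUTABLE DECAP_ECN0
	  IPIP_DECAP_ERROR DISCARD_DEC_PKT OVERLAY_SMAC_MC)
# devlink trap name reported for the listener at the same index.
devlink_traps=(irif_disabled erif_disabled non_routable_packet decap_error
	       decap_error decap_error overlay_smac_is_mc)

# Print the devlink trap name for a hardware trap, or fail if it is unknown.
hw_to_devlink()
{
	local i

	for i in "${!hw_traps[@]}"; do
		if [ "${hw_traps[i]}" = "$1" ]; then
			echo "${devlink_traps[i]}"
			return 0
		fi
	done
	return 1
}

hw_to_devlink DISCARD_DEC_PKT
```

Because the lookup is purely positional, a new listener and its map entry have to be added at the same index in both arrays.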
static int mlxsw_sp_rx_listener(struct mlxsw_sp *mlxsw_sp, struct sk_buff *skb,
......@@ -304,8 +345,9 @@ mlxsw_sp_trap_group_policer_init(struct mlxsw_sp *mlxsw_sp,
u32 rate;
switch (group->id) {
case DEVLINK_TRAP_GROUP_GENERIC_ID_L3_DROPS:/* fall through */
case DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS:
case DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS: /* fall through */
case DEVLINK_TRAP_GROUP_GENERIC_ID_L3_DROPS: /* fall through */
case DEVLINK_TRAP_GROUP_GENERIC_ID_TUNNEL_DROPS:
policer_id = MLXSW_SP_DISCARD_POLICER_ID;
ir_units = MLXSW_REG_QPCR_IR_UNITS_M;
is_bytes = false;
......@@ -342,6 +384,12 @@ __mlxsw_sp_trap_group_init(struct mlxsw_sp *mlxsw_sp,
priority = 0;
tc = 1;
break;
case DEVLINK_TRAP_GROUP_GENERIC_ID_TUNNEL_DROPS:
group_id = MLXSW_REG_HTGT_TRAP_GROUP_SP_TUNNEL_DISCARDS;
policer_id = MLXSW_SP_DISCARD_POLICER_ID;
priority = 0;
tc = 1;
break;
default:
return -EINVAL;
}
......
......@@ -67,6 +67,7 @@ enum {
MLXSW_TRAP_ID_NVE_ENCAP_ARP = 0xBD,
MLXSW_TRAP_ID_ROUTER_ALERT_IPV4 = 0xD6,
MLXSW_TRAP_ID_ROUTER_ALERT_IPV6 = 0xD7,
MLXSW_TRAP_ID_DISCARD_NON_ROUTABLE = 0x11A,
MLXSW_TRAP_ID_DISCARD_ROUTER2 = 0x130,
MLXSW_TRAP_ID_DISCARD_ROUTER3 = 0x131,
MLXSW_TRAP_ID_DISCARD_ING_PACKET_SMAC_MC = 0x140,
......@@ -88,8 +89,12 @@ enum {
MLXSW_TRAP_ID_DISCARD_ING_ROUTER_IPV4_SIP_BC = 0x16A,
MLXSW_TRAP_ID_DISCARD_ING_ROUTER_IPV4_DIP_LOCAL_NET = 0x16B,
MLXSW_TRAP_ID_DISCARD_ING_ROUTER_DIP_LINK_LOCAL = 0x16C,
MLXSW_TRAP_ID_DISCARD_ROUTER_IRIF_EN = 0x178,
MLXSW_TRAP_ID_DISCARD_ROUTER_ERIF_EN = 0x179,
MLXSW_TRAP_ID_DISCARD_ROUTER_LPM4 = 0x17B,
MLXSW_TRAP_ID_DISCARD_ROUTER_LPM6 = 0x17C,
MLXSW_TRAP_ID_DISCARD_DEC_PKT = 0x188,
MLXSW_TRAP_ID_DISCARD_OVERLAY_SMAC_MC = 0x190,
MLXSW_TRAP_ID_DISCARD_IPV6_MC_DIP_RESERVED_SCOPE = 0x1B0,
MLXSW_TRAP_ID_DISCARD_IPV6_MC_DIP_INTERFACE_LOCAL_SCOPE = 0x1B1,
MLXSW_TRAP_ID_ACL0 = 0x1C0,
......
......@@ -591,6 +591,9 @@ enum devlink_trap_generic_id {
DEVLINK_TRAP_GENERIC_ID_REJECT_ROUTE,
DEVLINK_TRAP_GENERIC_ID_IPV4_LPM_UNICAST_MISS,
DEVLINK_TRAP_GENERIC_ID_IPV6_LPM_UNICAST_MISS,
DEVLINK_TRAP_GENERIC_ID_NON_ROUTABLE,
DEVLINK_TRAP_GENERIC_ID_DECAP_ERROR,
DEVLINK_TRAP_GENERIC_ID_OVERLAY_SMAC_MC,
/* Add new generic trap IDs above */
__DEVLINK_TRAP_GENERIC_ID_MAX,
......@@ -604,6 +607,7 @@ enum devlink_trap_group_generic_id {
DEVLINK_TRAP_GROUP_GENERIC_ID_L2_DROPS,
DEVLINK_TRAP_GROUP_GENERIC_ID_L3_DROPS,
DEVLINK_TRAP_GROUP_GENERIC_ID_BUFFER_DROPS,
DEVLINK_TRAP_GROUP_GENERIC_ID_TUNNEL_DROPS,
/* Add new generic trap group IDs above */
__DEVLINK_TRAP_GROUP_GENERIC_ID_MAX,
......@@ -659,6 +663,12 @@ enum devlink_trap_group_generic_id {
"ipv4_lpm_miss"
#define DEVLINK_TRAP_GENERIC_NAME_IPV6_LPM_UNICAST_MISS \
"ipv6_lpm_miss"
#define DEVLINK_TRAP_GENERIC_NAME_NON_ROUTABLE \
"non_routable_packet"
#define DEVLINK_TRAP_GENERIC_NAME_DECAP_ERROR \
"decap_error"
#define DEVLINK_TRAP_GENERIC_NAME_OVERLAY_SMAC_MC \
"overlay_smac_is_mc"
#define DEVLINK_TRAP_GROUP_GENERIC_NAME_L2_DROPS \
"l2_drops"
......@@ -666,6 +676,8 @@ enum devlink_trap_group_generic_id {
"l3_drops"
#define DEVLINK_TRAP_GROUP_GENERIC_NAME_BUFFER_DROPS \
"buffer_drops"
#define DEVLINK_TRAP_GROUP_GENERIC_NAME_TUNNEL_DROPS \
"tunnel_drops"
#define DEVLINK_TRAP_GENERIC(_type, _init_action, _id, _group, _metadata_cap) \
{ \
......
......@@ -7706,6 +7706,9 @@ static const struct devlink_trap devlink_trap_generic[] = {
DEVLINK_TRAP(REJECT_ROUTE, EXCEPTION),
DEVLINK_TRAP(IPV4_LPM_UNICAST_MISS, EXCEPTION),
DEVLINK_TRAP(IPV6_LPM_UNICAST_MISS, EXCEPTION),
DEVLINK_TRAP(NON_ROUTABLE, DROP),
DEVLINK_TRAP(DECAP_ERROR, EXCEPTION),
DEVLINK_TRAP(OVERLAY_SMAC_MC, DROP),
};
#define DEVLINK_TRAP_GROUP(_id) \
......@@ -7718,6 +7721,7 @@ static const struct devlink_trap_group devlink_trap_group_generic[] = {
DEVLINK_TRAP_GROUP(L2_DROPS),
DEVLINK_TRAP_GROUP(L3_DROPS),
DEVLINK_TRAP_GROUP(BUFFER_DROPS),
DEVLINK_TRAP_GROUP(TUNNEL_DROPS),
};
static int devlink_trap_generic_verify(const struct devlink_trap *trap)
......
......@@ -50,6 +50,8 @@ ALL_TESTS="
ipv6_mc_dip_reserved_scope_test
ipv6_mc_dip_interface_local_scope_test
blackhole_route_test
irif_disabled_test
erif_disabled_test
"
NUM_NETIFS=4
......@@ -553,6 +555,116 @@ blackhole_route_test()
__blackhole_route_test "6" "2001:db8:2::/120" "ipv6" $h2_ipv6 "icmpv6"
}
irif_disabled_test()
{
local trap_name="irif_disabled"
local group_name="l3_drops"
local t0_packets t0_bytes
local t1_packets t1_bytes
local mz_pid
RET=0
ping_check $trap_name
devlink_trap_action_set $trap_name "trap"
# When RIF of a physical port ("Sub-port RIF") is destroyed, we first
# block the STP of the {Port, VLAN} so packets cannot get into the RIF.
# Using bridge enables us to see this trap because when bridge is
# destroyed, there is a small time window that packets can go into the
# RIF, while it is disabled.
ip link add dev br0 type bridge
ip link set dev $rp1 master br0
ip address flush dev $rp1
__addr_add_del br0 add 192.0.2.2/24
ip li set dev br0 up
t0_packets=$(devlink_trap_rx_packets_get $trap_name)
t0_bytes=$(devlink_trap_rx_bytes_get $trap_name)
# Generate packets to h2 through br0 RIF that will be removed later
$MZ $h1 -t udp "sp=54321,dp=12345" -c 0 -p 100 -a own -b $rp1mac \
-B $h2_ipv4 -q &
mz_pid=$!
# Wait before removing br0 RIF to allow packets to go into the bridge.
sleep 1
# Flushing address will dismantle the RIF
ip address flush dev br0
t1_packets=$(devlink_trap_rx_packets_get $trap_name)
t1_bytes=$(devlink_trap_rx_bytes_get $trap_name)
if [[ $t0_packets -eq $t1_packets && $t0_bytes -eq $t1_bytes ]]; then
check_err 1 "Trap stats idle when packets should be trapped"
fi
log_test "Ingress RIF disabled"
kill $mz_pid && wait $mz_pid &> /dev/null
ip link set dev $rp1 nomaster
__addr_add_del $rp1 add 192.0.2.2/24 2001:db8:1::2/64
ip link del dev br0 type bridge
devlink_trap_action_set $trap_name "drop"
}
erif_disabled_test()
{
local trap_name="erif_disabled"
local group_name="l3_drops"
local t0_packets t0_bytes
local t1_packets t1_bytes
local mz_pid
RET=0
ping_check $trap_name
devlink_trap_action_set $trap_name "trap"
ip link add dev br0 type bridge
ip add flush dev $rp1
ip link set dev $rp1 master br0
__addr_add_del br0 add 192.0.2.2/24
ip link set dev br0 up
t0_packets=$(devlink_trap_rx_packets_get $trap_name)
t0_bytes=$(devlink_trap_rx_bytes_get $trap_name)
rp2mac=$(mac_get $rp2)
# Generate packets that should go out through br0 RIF that will be
# removed later
$MZ $h2 -t udp "sp=54321,dp=12345" -c 0 -p 100 -a own -b $rp2mac \
-B 192.0.2.1 -q &
mz_pid=$!
sleep 5
# In order to see this trap we need a route that points to a disabled RIF.
# When the IPv6 address is flushed, there is a delay and the routes are
# deleted before the RIF, so we never reach a state with a route pointing
# to a disabled RIF.
# Delete the IPv6 address first and then check this trap by flushing IPv4.
ip -6 add flush dev br0
ip -4 add flush dev br0
t1_packets=$(devlink_trap_rx_packets_get $trap_name)
t1_bytes=$(devlink_trap_rx_bytes_get $trap_name)
if [[ $t0_packets -eq $t1_packets && $t0_bytes -eq $t1_bytes ]]; then
check_err 1 "Trap stats idle when packets should be trapped"
fi
log_test "Egress RIF disabled"
kill $mz_pid && wait $mz_pid &> /dev/null
ip link set dev $rp1 nomaster
__addr_add_del $rp1 add 192.0.2.2/24 2001:db8:1::2/64
ip link del dev br0 type bridge
devlink_trap_action_set $trap_name "drop"
}
trap cleanup EXIT
setup_prepare
......
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# Test devlink-trap tunnel exceptions functionality over mlxsw.
# Check all exception traps to make sure they are triggered under the right
# conditions.
# +-------------------------+
# | H1 |
# | $h1 + |
# | 192.0.2.1/28 | |
# +-------------------|-----+
# |
# +-------------------|-----+
# | SW1 | |
# | $swp1 + |
# | 192.0.2.2/28 |
# | |
# | + g1a (gre) |
# | loc=192.0.2.65 |
# | rem=192.0.2.66 |
# | tos=inherit |
# | |
# | + $rp1 |
# | | 198.51.100.1/28 |
# +--|----------------------+
# |
# +--|----------------------+
# | | VRF2 |
# | + $rp2 |
# | 198.51.100.2/28 |
# +-------------------------+
lib_dir=$(dirname $0)/../../../net/forwarding
ALL_TESTS="
decap_error_test
"
NUM_NETIFS=4
source $lib_dir/lib.sh
source $lib_dir/tc_common.sh
source $lib_dir/devlink_lib.sh
h1_create()
{
simple_if_init $h1 192.0.2.1/28
}
h1_destroy()
{
simple_if_fini $h1 192.0.2.1/28
}
vrf2_create()
{
simple_if_init $rp2 198.51.100.2/28
}
vrf2_destroy()
{
simple_if_fini $rp2 198.51.100.2/28
}
switch_create()
{
__addr_add_del $swp1 add 192.0.2.2/28
tc qdisc add dev $swp1 clsact
ip link set dev $swp1 up
tunnel_create g1 gre 192.0.2.65 192.0.2.66 tos inherit
__addr_add_del g1 add 192.0.2.65/32
ip link set dev g1 up
__addr_add_del $rp1 add 198.51.100.1/28
ip link set dev $rp1 up
}
switch_destroy()
{
ip link set dev $rp1 down
__addr_add_del $rp1 del 198.51.100.1/28
ip link set dev g1 down
__addr_add_del g1 del 192.0.2.65/32
tunnel_destroy g1
ip link set dev $swp1 down
tc qdisc del dev $swp1 clsact
__addr_add_del $swp1 del 192.0.2.2/28
}
setup_prepare()
{
h1=${NETIFS[p1]}
swp1=${NETIFS[p2]}
rp1=${NETIFS[p3]}
rp2=${NETIFS[p4]}
forwarding_enable
vrf_prepare
h1_create
switch_create
vrf2_create
}
cleanup()
{
pre_cleanup
vrf2_destroy
switch_destroy
h1_destroy
vrf_cleanup
forwarding_restore
}
ecn_payload_get()
{
p=$(:
)"0"$( : GRE flags
)"0:00:"$( : Reserved + version
)"08:00:"$( : ETH protocol type
)"4"$( : IP version
)"5:"$( : IHL
)"00:"$( : IP TOS
)"00:14:"$( : IP total length
)"00:00:"$( : IP identification
)"20:00:"$( : IP flags + frag off
)"30:"$( : IP TTL
)"01:"$( : IP proto
)"E7:E6:"$( : IP header csum
)"C0:00:01:01:"$( : IP saddr : 192.0.1.1
)"C0:00:02:01:"$( : IP daddr : 192.0.2.1
)
echo $p
}
ecn_decap_test()
{
local trap_name="decap_error"
local group_name="tunnel_drops"
local desc=$1; shift
local ecn_desc=$1; shift
local outer_tos=$1; shift
local mz_pid
RET=0
tc filter add dev $swp1 egress protocol ip pref 1 handle 101 \
flower src_ip 192.0.1.1 dst_ip 192.0.2.1 action pass
rp1_mac=$(mac_get $rp1)
rp2_mac=$(mac_get $rp2)
payload=$(ecn_payload_get)
ip vrf exec v$rp2 $MZ $rp2 -c 0 -d 1msec -a $rp2_mac -b $rp1_mac \
-A 192.0.2.66 -B 192.0.2.65 -t ip \
len=48,tos=$outer_tos,proto=47,p=$payload -q &
mz_pid=$!
devlink_trap_exception_test $trap_name $group_name
tc_check_packets "dev $swp1 egress" 101 0
check_err $? "Packets were not dropped"
log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc"
kill $mz_pid && wait $mz_pid &> /dev/null
tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower
}
ipip_payload_get()
{
local flags=$1; shift
local key=$1; shift
p=$(:
)"$flags"$( : GRE flags
)"0:00:"$( : Reserved + version
)"08:00:"$( : ETH protocol type
)"$key"$( : Key
)"4"$( : IP version
)"5:"$( : IHL
)"00:"$( : IP TOS
)"00:14:"$( : IP total length
)"00:00:"$( : IP identification
)"20:00:"$( : IP flags + frag off
)"30:"$( : IP TTL
)"01:"$( : IP proto
)"E7:E6:"$( : IP header csum
)"C0:00:01:01:"$( : IP saddr : 192.0.1.1
)"C0:00:02:01:"$( : IP daddr : 192.0.2.1
)
echo $p
}
no_matching_tunnel_test()
{
local trap_name="decap_error"
local group_name="tunnel_drops"
local desc=$1; shift
local sip=$1; shift
local mz_pid
RET=0
tc filter add dev $swp1 egress protocol ip pref 1 handle 101 \
flower src_ip 192.0.1.1 dst_ip 192.0.2.1 action pass
rp1_mac=$(mac_get $rp1)
rp2_mac=$(mac_get $rp2)
payload=$(ipip_payload_get "$@")
ip vrf exec v$rp2 $MZ $rp2 -c 0 -d 1msec -a $rp2_mac -b $rp1_mac \
-A $sip -B 192.0.2.65 -t ip len=48,proto=47,p=$payload -q &
mz_pid=$!
devlink_trap_exception_test $trap_name $group_name
tc_check_packets "dev $swp1 egress" 101 0
check_err $? "Packets were not dropped"
log_test "$desc"
kill $mz_pid && wait $mz_pid &> /dev/null
tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower
}
decap_error_test()
{
# Correct source IP - the remote address
local sip=192.0.2.66
ecn_decap_test "Decap error" "ECT(1)" 01
ecn_decap_test "Decap error" "ECT(0)" 02
ecn_decap_test "Decap error" "CE" 03
no_matching_tunnel_test "Decap error: Source IP check failed" \
192.0.2.68 "0"
no_matching_tunnel_test \
"Decap error: Key exists but was not expected" $sip "2" ":E9:"
# Destroy the tunnel and create new one with key
__addr_add_del g1 del 192.0.2.65/32
tunnel_destroy g1
tunnel_create g1 gre 192.0.2.65 192.0.2.66 tos inherit key 233
__addr_add_del g1 add 192.0.2.65/32
no_matching_tunnel_test \
"Decap error: Key does not exist but was expected" $sip "0"
no_matching_tunnel_test \
"Decap error: Packet has a wrong key field" $sip "2" "E8:"
}
trap cleanup EXIT
setup_prepare
setup_wait
tests_run
exit $EXIT_STATUS
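The payload helpers in the script above hard-code the inner IPv4 header, including its checksum bytes (`E7:E6`). As a way to check or regenerate such a value when the hand-built header changes, here is a minimal sketch of the standard IPv4 header checksum (16-bit one's complement sum) over a colon-separated hex string. The function name `ipv4_hdr_csum` is hypothetical and is not part of the selftest library.

```bash
# Compute the IPv4 header checksum of a colon-separated hex header whose
# checksum field (bytes 10 and 11) is set to zero. Prints it as "HH:LL".
ipv4_hdr_csum()
{
	local old_ifs="$IFS" sum=0

	# Split the argument on ':' into positional parameters, one per byte.
	IFS=:
	set -- $1
	IFS="$old_ifs"

	# Sum the header as big-endian 16-bit words.
	while [ $# -ge 2 ]; do
		sum=$((sum + (0x$1 << 8) + 0x$2))
		shift 2
	done

	# Fold the carries back into 16 bits, then take the one's complement.
	while [ $((sum >> 16)) -ne 0 ]; do
		sum=$(((sum & 0xFFFF) + (sum >> 16)))
	done
	sum=$((~sum & 0xFFFF))

	printf '%02X:%02X\n' "$((sum >> 8))" "$((sum & 0xFF))"
}

# Inner header from the GRE payload helpers, with the checksum field zeroed.
ipv4_hdr_csum "45:00:00:14:00:00:20:00:30:01:00:00:C0:00:01:01:C0:00:02:01"
```

For the GRE inner header this prints `E7:E6`, matching the value in the payload. The same function yields `D6:E5` for the inner header (TTL 0x40, protocol 0, source 192.0.2.3) used by `ecn_payload_get` in the VXLAN selftest.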
#!/bin/bash
# SPDX-License-Identifier: GPL-2.0
#
# Test devlink-trap tunnel drops and exceptions functionality over mlxsw.
# Check all traps to make sure they are triggered under the right
# conditions.
# +--------------------+
# | H1 (vrf) |
# | + $h1 |
# | | 192.0.2.1/28 |
# +----|---------------+
# |
# +----|----------------------------------------------------------------------+
# | SW | |
# | +--|--------------------------------------------------------------------+ |
# | | + $swp1 BR1 (802.1d) | |
# | | | |
# | | + vx1 (vxlan) | |
# | | local 192.0.2.17 | |
# | | id 1000 dstport $VXPORT | |
# | +-----------------------------------------------------------------------+ |
# | |
# | + $rp1 |
# | | 192.0.2.17/28 |
# +----|----------------------------------------------------------------------+
# |
# +----|--------------------------------------------------------+
# | | VRF2 |
# | + $rp2 |
# | 192.0.2.18/28 |
# | |
# +-------------------------------------------------------------+
lib_dir=$(dirname $0)/../../../net/forwarding
ALL_TESTS="
decap_error_test
overlay_smac_is_mc_test
"
NUM_NETIFS=4
source $lib_dir/lib.sh
source $lib_dir/tc_common.sh
source $lib_dir/devlink_lib.sh
: ${VXPORT:=4789}
export VXPORT
h1_create()
{
simple_if_init $h1 192.0.2.1/28
}
h1_destroy()
{
simple_if_fini $h1 192.0.2.1/28
}
switch_create()
{
ip link add name br1 type bridge vlan_filtering 0 mcast_snooping 0
# Make sure the bridge uses the MAC address of the local port and not
# that of the VxLAN's device.
ip link set dev br1 address $(mac_get $swp1)
ip link set dev br1 up
tc qdisc add dev $swp1 clsact
ip link set dev $swp1 master br1
ip link set dev $swp1 up
ip link add name vx1 type vxlan id 1000 local 192.0.2.17 \
dstport "$VXPORT" nolearning noudpcsum tos inherit ttl 100
ip link set dev vx1 master br1
ip link set dev vx1 up
ip address add dev $rp1 192.0.2.17/28
ip link set dev $rp1 up
}
switch_destroy()
{
ip link set dev $rp1 down
ip address del dev $rp1 192.0.2.17/28
ip link set dev vx1 down
ip link set dev vx1 nomaster
ip link del dev vx1
ip link set dev $swp1 down
ip link set dev $swp1 nomaster
tc qdisc del dev $swp1 clsact
ip link set dev br1 down
ip link del dev br1
}
vrf2_create()
{
simple_if_init $rp2 192.0.2.18/28
}
vrf2_destroy()
{
simple_if_fini $rp2 192.0.2.18/28
}
setup_prepare()
{
h1=${NETIFS[p1]}
swp1=${NETIFS[p2]}
rp1=${NETIFS[p3]}
rp2=${NETIFS[p4]}
vrf_prepare
forwarding_enable
h1_create
switch_create
vrf2_create
}
cleanup()
{
pre_cleanup
vrf2_destroy
switch_destroy
h1_destroy
forwarding_restore
vrf_cleanup
}
ecn_payload_get()
{
dest_mac=$(mac_get $h1)
p=$(:
)"08:"$( : VXLAN flags
)"00:00:00:"$( : VXLAN reserved
)"00:03:e8:"$( : VXLAN VNI : 1000
)"00:"$( : VXLAN reserved
)"$dest_mac:"$( : ETH daddr
)"00:00:00:00:00:00:"$( : ETH saddr
)"08:00:"$( : ETH type
)"45:"$( : IP version + IHL
)"00:"$( : IP TOS
)"00:14:"$( : IP total length
)"00:00:"$( : IP identification
)"20:00:"$( : IP flags + frag off
)"40:"$( : IP TTL
)"00:"$( : IP proto
)"D6:E5:"$( : IP header csum
)"c0:00:02:03:"$( : IP saddr: 192.0.2.3
)"c0:00:02:01:"$( : IP daddr: 192.0.2.1
)
echo $p
}
ecn_decap_test()
{
local trap_name="decap_error"
local group_name="tunnel_drops"
local desc=$1; shift
local ecn_desc=$1; shift
local outer_tos=$1; shift
local mz_pid
RET=0
tc filter add dev $swp1 egress protocol ip pref 1 handle 101 \
flower src_ip 192.0.2.3 dst_ip 192.0.2.1 action pass
rp1_mac=$(mac_get $rp1)
payload=$(ecn_payload_get)
ip vrf exec v$rp2 $MZ $rp2 -c 0 -d 1msec -b $rp1_mac -B 192.0.2.17 \
-t udp sp=12345,dp=$VXPORT,tos=$outer_tos,p=$payload -q &
mz_pid=$!
devlink_trap_exception_test $trap_name $group_name
tc_check_packets "dev $swp1 egress" 101 0
check_err $? "Packets were not dropped"
log_test "$desc: Inner ECN is not ECT and outer is $ecn_desc"
kill $mz_pid && wait $mz_pid &> /dev/null
tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower
}
reserved_bits_payload_get()
{
dest_mac=$(mac_get $h1)
p=$(:
)"08:"$( : VXLAN flags
)"01:00:00:"$( : VXLAN reserved
)"00:03:e8:"$( : VXLAN VNI : 1000
)"00:"$( : VXLAN reserved
)"$dest_mac:"$( : ETH daddr
)"00:00:00:00:00:00:"$( : ETH saddr
)"08:00:"$( : ETH type
)"45:"$( : IP version + IHL
)"00:"$( : IP TOS
)"00:14:"$( : IP total length
)"00:00:"$( : IP identification
)"20:00:"$( : IP flags + frag off
)"40:"$( : IP TTL
)"00:"$( : IP proto
)"00:00:"$( : IP header csum
)"c0:00:02:03:"$( : IP saddr: 192.0.2.3
)"c0:00:02:01:"$( : IP daddr: 192.0.2.1
)
echo $p
}
short_payload_get()
{
dest_mac=$(mac_get $h1)
p=$(:
)"08:"$( : VXLAN flags
)"01:00:00:"$( : VXLAN reserved
)"00:03:e8:"$( : VXLAN VNI : 1000
)"00:"$( : VXLAN reserved
)
echo $p
}
corrupted_packet_test()
{
local trap_name="decap_error"
local group_name="tunnel_drops"
local desc=$1; shift
local payload_get=$1; shift
local mz_pid
RET=0
# If the packet is too short, there is no inner packet,
# so the matching will always succeed
tc filter add dev $swp1 egress protocol ip pref 1 handle 101 \
flower skip_hw src_ip 192.0.2.3 dst_ip 192.0.2.1 action pass
rp1_mac=$(mac_get $rp1)
payload=$($payload_get)
ip vrf exec v$rp2 $MZ $rp2 -c 0 -d 1msec -b $rp1_mac \
-B 192.0.2.17 -t udp sp=12345,dp=$VXPORT,p=$payload -q &
mz_pid=$!
devlink_trap_exception_test $trap_name $group_name
tc_check_packets "dev $swp1 egress" 101 0
check_err $? "Packets were not dropped"
log_test "$desc"
kill $mz_pid && wait $mz_pid &> /dev/null
tc filter del dev $swp1 egress protocol ip pref 1 handle 101 flower
}
decap_error_test()
{
ecn_decap_test "Decap error" "ECT(1)" 01
ecn_decap_test "Decap error" "ECT(0)" 02
ecn_decap_test "Decap error" "CE" 03
corrupted_packet_test "Decap error: Reserved bits in use" \
"reserved_bits_payload_get"
corrupted_packet_test "Decap error: No L2 header" "short_payload_get"
}
mc_smac_payload_get()
{
dest_mac=$(mac_get $h1)
source_mac=01:02:03:04:05:06
p=$(:
)"08:"$( : VXLAN flags
)"00:00:00:"$( : VXLAN reserved
)"00:03:e8:"$( : VXLAN VNI : 1000
)"00:"$( : VXLAN reserved
)"$dest_mac:"$( : ETH daddr
)"$source_mac:"$( : ETH saddr
)"08:00:"$( : ETH type
)"45:"$( : IP version + IHL
)"00:"$( : IP TOS
)"00:14:"$( : IP total length
)"00:00:"$( : IP identification
)"20:00:"$( : IP flags + frag off
)"40:"$( : IP TTL
)"00:"$( : IP proto
)"00:00:"$( : IP header csum
)"c0:00:02:03:"$( : IP saddr: 192.0.2.3
)"c0:00:02:01:"$( : IP daddr: 192.0.2.1
)
echo $p
}
overlay_smac_is_mc_test()
{
local trap_name="overlay_smac_is_mc"
local group_name="tunnel_drops"
local mz_pid
RET=0
# The matching will be checked on devlink_trap_drop_test()
# and the filter will be removed on devlink_trap_drop_cleanup()
tc filter add dev $swp1 egress protocol ip pref 1 handle 101 \
flower src_mac 01:02:03:04:05:06 action pass
rp1_mac=$(mac_get $rp1)
payload=$(mc_smac_payload_get)
ip vrf exec v$rp2 $MZ $rp2 -c 0 -d 1msec -b $rp1_mac \
-B 192.0.2.17 -t udp sp=12345,dp=$VXPORT,p=$payload -q &
mz_pid=$!
devlink_trap_drop_test $trap_name $group_name $swp1
log_test "Overlay source MAC is multicast"
devlink_trap_drop_cleanup $mz_pid $swp1 "ip"
}
trap cleanup EXIT
setup_prepare
setup_wait
tests_run
exit $EXIT_STATUS