Commit db473df2 authored by Alexei Starovoitov

Merge branch 'selftests/xsk: speed-ups, fixes, and new XDP programs'

Magnus Karlsson says:

====================

This is a patch set of various performance improvements, fixes, and
the introduction of more than one XDP program to the xsk selftests
framework so we can test more things in the future such as upcoming
multi-buffer and metadata support for AF_XDP. The new programs just
reuse the framework that all the other eBPF selftests use. The new
feature is used to implement one new test that does XDP_DROP on every
other packet. More tests using this will be added in future commits.

Contents:

* The run-time of the test suite is cut by 10x when executing the
  tests on a real NIC, by only attaching the XDP program once per mode
  tested, instead of once per test program.

* Over 700 lines of code have been removed. The xsk.c control file was
  moved straight over from libbpf when the xsk support was deprecated
  there. As it is now not used as library code that has to work with
  all kinds of versions of Linux, a lot of code could be dropped or
  simplified.

* Add a new command line option "-d" that can be used when a test
  fails and you want to debug it with gdb or some other debugger. The
  option creates the two veth netdevs and prints them to the screen
  without deleting them afterwards. This way these veth netdevs can be
  used when running xskxceiver in a debugger.

* Implemented the possibility to load external XDP programs so we can
  have more than the default one. This feature is used to implement a
  test where every other packet is dropped. This is a good exercise for
  the recycling mechanism of the xsk buffer pool used in zero-copy mode.

* Various clean-ups and small fixes in patches 1 to 5. None of these
  fixes has any impact on the correct execution of the tests when they
  pass, though they can be irritating when a test fails. IMHO, they do
  not need to go to bpf as they will not fix anything there. The first
  versions of patches 1, 2, and 4 were previously sent to bpf, but have
  now been included here.
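
To make the "every other packet is dropped" behaviour concrete, here is a
minimal user-space sketch of the counter logic used by the xsk_xdp_drop
program in this series. It is only a model: the names drop_half_state,
drop_half_should_drop() and drop_half_count_passed() are hypothetical and
do not exist in the selftests, and no BPF helpers are involved. It assumes
the counter starts at zero, as a zero-initialized static variable would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the per-program packet counter. */
struct drop_half_state {
	unsigned int idx; /* mirrors "static unsigned int idx" in the XDP program */
};

/* Returns true when the packet would get XDP_DROP, false when it would
 * be redirected to the XSKMAP. Same test as "if (idx++ % 2)".
 */
static bool drop_half_should_drop(struct drop_half_state *st)
{
	return (st->idx++ % 2) != 0;
}

/* Counts how many out of nb_pkts packets would reach the socket. */
static size_t drop_half_count_passed(size_t nb_pkts)
{
	struct drop_half_state st = { .idx = 0 };
	size_t passed = 0;

	for (size_t i = 0; i < nb_pkts; i++) {
		if (!drop_half_should_drop(&st))
			passed++;
	}

	/* Sanity check: we can never pass more packets than we saw. */
	assert(passed <= nb_pkts);
	return passed;
}
```

With the counter starting at zero, the first packet is redirected and the
second is dropped, so a stream of N packets delivers N/2 rounded up. A
test that expects a fixed number of received packets has to account for
that rounding.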

v2 -> v3:
* Fixed compilation error for llvm [David]
* Made the function xsk_is_in_drv_mode(ifobj) more generic by changing
  it to xsk_is_in_mode(ifobj, xdp_mode) [Maciej]
* Added Maciej's acks to all the patches

v1 -> v2:
* Fixed spelling error in commit message of patch #6 [Björn]
* Added explanation on why it is safe to use C11 atomics in patch #7
  [Daniel]
* Put all XDP programs in the same file so that adding more XDP
  programs to xskxceiver.c becomes more scalable in patches #11 and
  #12 [Maciej]
* Removed more dead code in patch #8 [Maciej]
* Removed stale %s specifier in error print, patch #9 [Maciej]
* Changed name of XDP_CONSUMES_SOME_PACKETS to XDP_DROP_HALF to
  hopefully make it clearer [Maciej]
* ifobj_rx and ifobj_tx name changes in patch #13 [Maciej]
* Simplified XDP attachment code in patch #15 [Maciej]
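
As an illustration of the acquire/release pattern that the C11-style
atomics in patch #7 rely on, the following is a small sketch of a
single-producer/single-consumer ring built on the GCC/Clang __atomic
builtins. It is not the xsk.h implementation: struct demo_ring,
DEMO_RING_SIZE, demo_ring_push() and demo_ring_pop() are hypothetical
names, and the real rings also cache the remote index, which is left out
here for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8 /* hypothetical size, must be a power of two */

static_assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0,
	      "DEMO_RING_SIZE must be a power of two");

struct demo_ring {
	uint32_t producer; /* written only by the producer side */
	uint32_t consumer; /* written only by the consumer side */
	uint64_t desc[DEMO_RING_SIZE];
};

/* Producer side: fill the slot first, then publish it with a release
 * store so the consumer never sees the index before the data.
 * Returns 0 on success, -1 if the ring is full.
 */
static int demo_ring_push(struct demo_ring *r, uint64_t val)
{
	uint32_t prod = r->producer;
	uint32_t cons = __atomic_load_n(&r->consumer, __ATOMIC_ACQUIRE);

	if (prod - cons == DEMO_RING_SIZE)
		return -1;

	r->desc[prod & (DEMO_RING_SIZE - 1)] = val;
	__atomic_store_n(&r->producer, prod + 1, __ATOMIC_RELEASE);
	return 0;
}

/* Consumer side: the acquire load of the producer index pairs with the
 * release store above. The slot is read before it is handed back.
 * Returns 0 on success, -1 if the ring is empty.
 */
static int demo_ring_pop(struct demo_ring *r, uint64_t *val)
{
	uint32_t cons = r->consumer;
	uint32_t prod = __atomic_load_n(&r->producer, __ATOMIC_ACQUIRE);

	if (prod == cons)
		return -1;

	*val = r->desc[cons & (DEMO_RING_SIZE - 1)];
	__atomic_store_n(&r->consumer, cons + 1, __ATOMIC_RELEASE);
	return 0;
}
```

The indices are free-running 32-bit counters, so "prod - cons" stays
correct across wrap-around as long as the ring size is a power of two.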

Patches:
1-5:   Small fixes and clean-ups
6:     New convenient debug option when using a debugger such as gdb
7-8:   Removal of unnecessary code
9:     Add the ability to load external XDP programs
10-11: Removal of more unnecessary code
12:    Implement a new test where every other packet is XDP_DROP:ed
13:    Unify the thread dispatching code
14-15: Simplify the way tests are written when using custom packet_streams
       or custom XDP programs

Thanks: Magnus
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
parents 5fbea423 7d8319a7
......@@ -240,7 +240,6 @@ $(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS)
$(OUTPUT)/test_maps: $(TESTING_HELPERS)
$(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS)
$(OUTPUT)/xsk.o: $(BPFOBJ)
$(OUTPUT)/xskxceiver: $(OUTPUT)/xsk.o
BPFTOOL ?= $(DEFAULT_BPFTOOL)
$(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
......@@ -383,6 +382,7 @@ linked_maps.skel.h-deps := linked_maps1.bpf.o linked_maps2.bpf.o
test_subskeleton.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o test_subskeleton.bpf.o
test_subskeleton_lib.skel.h-deps := test_subskeleton_lib2.bpf.o test_subskeleton_lib.bpf.o
test_usdt.skel.h-deps := test_usdt.bpf.o test_usdt_multispec.bpf.o
xsk_xdp_progs.skel.h-deps := xsk_xdp_progs.bpf.o
LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps)))
......@@ -576,6 +576,10 @@ $(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT)
$(call msg,BINARY,,$@)
$(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@
$(OUTPUT)/xskxceiver: xskxceiver.c $(OUTPUT)/xsk.o $(OUTPUT)/xsk_xdp_progs.skel.h $(BPFOBJ) | $(OUTPUT)
$(call msg,BINARY,,$@)
$(Q)$(CC) $(CFLAGS) $(filter %.a %.o %.c,$^) $(LDLIBS) -o $@
# Make sure we are able to include and link libbpf against c++.
$(OUTPUT)/test_cpp: test_cpp.cpp $(OUTPUT)/test_core_extern.skel.h $(BPFOBJ)
$(call msg,CXX,,$@)
......
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2022 Intel */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_XSKMAP);
__uint(max_entries, 1);
__uint(key_size, sizeof(int));
__uint(value_size, sizeof(int));
} xsk SEC(".maps");
static unsigned int idx;
SEC("xdp") int xsk_def_prog(struct xdp_md *xdp)
{
return bpf_redirect_map(&xsk, 0, XDP_DROP);
}
SEC("xdp") int xsk_xdp_drop(struct xdp_md *xdp)
{
/* Drop every other packet */
if (idx++ % 2)
return XDP_DROP;
return bpf_redirect_map(&xsk, 0, XDP_DROP);
}
char _license[] SEC("license") = "GPL";
......@@ -24,8 +24,6 @@
# ----------- | ----------
# | vethX | --------- | vethY |
# ----------- peer ----------
# | | |
# namespaceX | namespaceY
#
# AF_XDP is an address family optimized for high performance packet processing,
# it is XDP’s user-space interface.
......@@ -39,10 +37,9 @@
# Prerequisites setup by script:
#
# Set up veth interfaces as per the topology shown ^^:
# * setup two veth interfaces and one namespace
# ** veth<xxxx> in root namespace
# ** veth<yyyy> in af_xdp<xxxx> namespace
# ** namespace af_xdp<xxxx>
# * setup two veth interfaces
# ** veth<xxxx>
# ** veth<yyyy>
# *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid
# conflict with any existing interface
# * tests the veth and xsk layers of the topology
......@@ -74,6 +71,9 @@
# Run and dump packet contents:
# sudo ./test_xsk.sh -D
#
# Set up veth interfaces and leave them up so xskxceiver can be launched in a debugger:
# sudo ./test_xsk.sh -d
#
# Run test suite for physical device in loopback mode
# sudo ./test_xsk.sh -i IFACE
......@@ -81,11 +81,12 @@
ETH=""
while getopts "vDi:" flag
while getopts "vDi:d" flag
do
case "${flag}" in
v) verbose=1;;
D) dump_pkts=1;;
d) debug=1;;
i) ETH=${OPTARG};;
esac
done
......@@ -99,28 +100,25 @@ VETH0_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head -
VETH0=ve${VETH0_POSTFIX}
VETH1_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head --bytes 4)
VETH1=ve${VETH1_POSTFIX}
NS0=root
NS1=af_xdp${VETH1_POSTFIX}
MTU=1500
trap ctrl_c INT
function ctrl_c() {
cleanup_exit ${VETH0} ${VETH1} ${NS1}
cleanup_exit ${VETH0} ${VETH1}
exit 1
}
setup_vethPairs() {
if [[ $verbose -eq 1 ]]; then
echo "setting up ${VETH0}: namespace: ${NS0}"
echo "setting up ${VETH0}"
fi
ip netns add ${NS1}
ip link add ${VETH0} numtxqueues 4 numrxqueues 4 type veth peer name ${VETH1} numtxqueues 4 numrxqueues 4
if [ -f /proc/net/if_inet6 ]; then
echo 1 > /proc/sys/net/ipv6/conf/${VETH0}/disable_ipv6
fi
if [[ $verbose -eq 1 ]]; then
echo "setting up ${VETH1}: namespace: ${NS1}"
echo "setting up ${VETH1}"
fi
if [[ $busy_poll -eq 1 ]]; then
......@@ -130,18 +128,15 @@ setup_vethPairs() {
echo 200000 > /sys/class/net/${VETH1}/gro_flush_timeout
fi
ip link set ${VETH1} netns ${NS1}
ip netns exec ${NS1} ip link set ${VETH1} mtu ${MTU}
ip link set ${VETH1} mtu ${MTU}
ip link set ${VETH0} mtu ${MTU}
ip netns exec ${NS1} ip link set ${VETH1} up
ip netns exec ${NS1} ip link set dev lo up
ip link set ${VETH1} up
ip link set ${VETH0} up
}
if [ ! -z $ETH ]; then
VETH0=${ETH}
VETH1=${ETH}
NS1=""
else
validate_root_exec
validate_veth_support ${VETH0}
......@@ -151,7 +146,7 @@ else
retval=$?
if [ $retval -ne 0 ]; then
test_status $retval "${TEST_NAME}"
cleanup_exit ${VETH0} ${VETH1} ${NS1}
cleanup_exit ${VETH0} ${VETH1}
exit $retval
fi
fi
......@@ -174,10 +169,15 @@ statusList=()
TEST_NAME="XSK_SELFTESTS_${VETH0}_SOFTIRQ"
if [[ $debug -eq 1 ]]; then
echo "-i" ${VETH0} "-i" ${VETH1}
exit
fi
exec_xskxceiver
if [ -z $ETH ]; then
cleanup_exit ${VETH0} ${VETH1} ${NS1}
cleanup_exit ${VETH0} ${VETH1}
fi
TEST_NAME="XSK_SELFTESTS_${VETH0}_BUSY_POLL"
busy_poll=1
......@@ -190,7 +190,7 @@ exec_xskxceiver
## END TESTS
if [ -z $ETH ]; then
cleanup_exit ${VETH0} ${VETH1} ${NS1}
cleanup_exit ${VETH0} ${VETH1}
fi
failures=0
......
This diff is collapsed.
......@@ -23,77 +23,6 @@
extern "C" {
#endif
/* This whole API has been deprecated and moved to libxdp that can be found at
* https://github.com/xdp-project/xdp-tools. The APIs are exactly the same so
* it should just be linking with libxdp instead of libbpf for this set of
* functionality. If not, please submit a bug report on the aforementioned page.
*/
/* Load-Acquire Store-Release barriers used by the XDP socket
* library. The following macros should *NOT* be considered part of
* the xsk.h API, and is subject to change anytime.
*
* LIBRARY INTERNAL
*/
#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x)
#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v)
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile("" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile("" : : : "memory"); \
___p1; \
})
#elif defined(__aarch64__)
# define libbpf_smp_store_release(p, v) \
asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1; \
asm volatile ("ldar %w0, %1" \
: "=r" (___p1) : "Q" (*p) : "memory"); \
___p1; \
})
#elif defined(__riscv)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile ("fence rw,w" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile ("fence r,rw" : : : "memory"); \
___p1; \
})
#endif
#ifndef libbpf_smp_store_release
#define libbpf_smp_store_release(p, v) \
do { \
__sync_synchronize(); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
#endif
#ifndef libbpf_smp_load_acquire
#define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
__sync_synchronize(); \
___p1; \
})
#endif
/* LIBRARY INTERNAL -- END */
/* Do not access these members directly. Use the functions below. */
#define DEFINE_XSK_RING(name) \
struct name { \
......@@ -168,7 +97,7 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
* this function. Without this optimization it whould have been
* free_entries = r->cached_prod - r->cached_cons + r->size.
*/
r->cached_cons = libbpf_smp_load_acquire(r->consumer);
r->cached_cons = __atomic_load_n(r->consumer, __ATOMIC_ACQUIRE);
r->cached_cons += r->size;
return r->cached_cons - r->cached_prod;
......@@ -179,7 +108,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
__u32 entries = r->cached_prod - r->cached_cons;
if (entries == 0) {
r->cached_prod = libbpf_smp_load_acquire(r->producer);
r->cached_prod = __atomic_load_n(r->producer, __ATOMIC_ACQUIRE);
entries = r->cached_prod - r->cached_cons;
}
......@@ -202,7 +131,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
/* Make sure everything has been written to the ring before indicating
* this to the kernel by writing the producer pointer.
*/
libbpf_smp_store_release(prod->producer, *prod->producer + nb);
__atomic_store_n(prod->producer, *prod->producer + nb, __ATOMIC_RELEASE);
}
static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
......@@ -227,8 +156,7 @@ static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
/* Make sure data has been read before indicating we are done
* with the entries by updating the consumer pointer.
*/
libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
__atomic_store_n(cons->consumer, *cons->consumer + nb, __ATOMIC_RELEASE);
}
static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
......@@ -269,18 +197,15 @@ struct xsk_umem_config {
__u32 flags;
};
int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd);
int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
/* Flags for the libbpf_flags field. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
int xsk_attach_xdp_program(struct bpf_program *prog, int ifindex, u32 xdp_flags);
void xsk_detach_xdp_program(int ifindex, u32 xdp_flags);
int xsk_update_xskmap(struct bpf_map *map, struct xsk_socket *xsk);
void xsk_clear_xskmap(struct bpf_map *map);
bool xsk_is_in_mode(u32 ifindex, int mode);
struct xsk_socket_config {
__u32 rx_size;
__u32 tx_size;
__u32 libbpf_flags;
__u32 xdp_flags;
__u16 bind_flags;
};
......@@ -291,13 +216,13 @@ int xsk_umem__create(struct xsk_umem **umem,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
int xsk_socket__create(struct xsk_socket **xsk,
const char *ifname, __u32 queue_id,
int ifindex, __u32 queue_id,
struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
const struct xsk_socket_config *config);
int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
int ifindex,
__u32 queue_id, struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
......
......@@ -55,21 +55,13 @@ test_exit()
clear_configs()
{
if [ $(ip netns show | grep $3 &>/dev/null; echo $?;) == 0 ]; then
[ $(ip netns exec $3 ip link show $2 &>/dev/null; echo $?;) == 0 ] &&
{ ip netns exec $3 ip link del $2; }
ip netns del $3
fi
#Once we delete a veth pair node, the entire veth pair is removed,
#this is just to be cautious just incase the NS does not exist then
#veth node inside NS won't get removed so we explicitly remove it
[ $(ip link show $1 &>/dev/null; echo $?;) == 0 ] &&
{ ip link del $1; }
}
cleanup_exit()
{
clear_configs $1 $2 $3
clear_configs $1 $2
}
validate_ip_utility()
......@@ -83,7 +75,7 @@ exec_xskxceiver()
ARGS+="-b "
fi
./${XSKOBJ} -i ${VETH0} -i ${VETH1},${NS1} ${ARGS}
./${XSKOBJ} -i ${VETH0} -i ${VETH1} ${ARGS}
retval=$?
test_status $retval "${TEST_NAME}"
......
This diff is collapsed.
......@@ -5,6 +5,8 @@
#ifndef XSKXCEIVER_H_
#define XSKXCEIVER_H_
#include "xsk_xdp_progs.skel.h"
#ifndef SOL_XDP
#define SOL_XDP 283
#endif
......@@ -30,7 +32,6 @@
#define TEST_CONTINUE 1
#define MAX_INTERFACES 2
#define MAX_INTERFACE_NAME_CHARS 16
#define MAX_INTERFACES_NAMESPACE_CHARS 16
#define MAX_SOCKETS 2
#define MAX_TEST_NAME_SIZE 32
#define MAX_TEARDOWN_ITER 10
......@@ -86,6 +87,7 @@ enum test_type {
TEST_TYPE_STATS_RX_FULL,
TEST_TYPE_STATS_FILL_EMPTY,
TEST_TYPE_BPF_RES,
TEST_TYPE_XDP_DROP_HALF,
TEST_TYPE_MAX
};
......@@ -133,18 +135,19 @@ typedef void *(*thread_func_t)(void *arg);
struct ifobject {
char ifname[MAX_INTERFACE_NAME_CHARS];
char nsname[MAX_INTERFACES_NAMESPACE_CHARS];
struct xsk_socket_info *xsk;
struct xsk_socket_info *xsk_arr;
struct xsk_umem_info *umem;
thread_func_t func_ptr;
validation_func_t validation_func;
struct pkt_stream *pkt_stream;
int ns_fd;
int xsk_map_fd;
struct xsk_xdp_progs *xdp_progs;
struct bpf_map *xskmap;
struct bpf_program *xdp_prog;
enum test_mode mode;
int ifindex;
u32 dst_ip;
u32 src_ip;
u32 xdp_flags;
u32 bind_flags;
u16 src_port;
u16 dst_port;
......@@ -164,6 +167,10 @@ struct test_spec {
struct ifobject *ifobj_rx;
struct pkt_stream *tx_pkt_stream_default;
struct pkt_stream *rx_pkt_stream_default;
struct bpf_program *xdp_prog_rx;
struct bpf_program *xdp_prog_tx;
struct bpf_map *xskmap_rx;
struct bpf_map *xskmap_tx;
u16 total_steps;
u16 current_step;
u16 nb_sockets;
......