Commit 7312cb64 authored by yonghong-song's avatar yonghong-song Committed by GitHub

Merge pull request #1810 from lcp/xdp_redirect-v2

[V2] Add two map types for bpf_redirect_map()
parents 3fc78a4d 87a873f7
......@@ -41,15 +41,18 @@ This guide is incomplete. If something feels missing, check the bcc and kernel s
- [7. BPF_PERCPU_ARRAY](#7-bpf_percpu_array)
- [8. BPF_LPM_TRIE](#8-bpf_lpm_trie)
- [9. BPF_PROG_ARRAY](#9-bpf_prog_array)
- [10. map.lookup()](#10-maplookup)
- [11. map.lookup_or_init()](#11-maplookup_or_init)
- [12. map.delete()](#12-mapdelete)
- [13. map.update()](#13-mapupdate)
- [14. map.insert()](#14-mapinsert)
- [15. map.increment()](#15-mapincrement)
- [16. map.get_stackid()](#16-mapget_stackid)
- [17. map.perf_read()](#17-mapperf_read)
- [18. map.call()](#18-mapcall)
- [10. BPF_DEVMAP](#10-bpf_devmap)
- [11. BPF_CPUMAP](#11-bpf_cpumap)
- [12. map.lookup()](#12-maplookup)
- [13. map.lookup_or_init()](#13-maplookup_or_init)
- [14. map.delete()](#14-mapdelete)
- [15. map.update()](#15-mapupdate)
- [16. map.insert()](#16-mapinsert)
- [17. map.increment()](#17-mapincrement)
- [18. map.get_stackid()](#18-mapget_stackid)
- [19. map.perf_read()](#19-mapperf_read)
- [20. map.call()](#20-mapcall)
- [21. map.redirect_map()](#21-mapredirect_map)
- [bcc Python](#bcc-python)
- [Initialization](#initialization)
......@@ -664,7 +667,39 @@ Examples in situ:
[search /tests](https://github.com/iovisor/bcc/search?q=BPF_PROG_ARRAY+path%3Atests&type=Code),
[assign fd](https://github.com/iovisor/bcc/blob/master/examples/networking/tunnel_monitor/monitor.py#L24-L26)
### 10. map.lookup()
### 10. BPF_DEVMAP
Syntax: ```BPF_DEVMAP(name, size)```
This creates a device map named ```name``` with ```size``` entries. Each entry of the map is an `ifindex` to a network interface. This map is only used in XDP.
For example:
```C
BPF_DEVMAP(devmap, 10);
```
Methods (covered later): map.redirect_map().
Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=BPF_DEVMAP+path%3Aexamples&type=Code),
### 11. BPF_CPUMAP
Syntax: ```BPF_CPUMAP(name, size)```
This creates a cpu map named ```name``` with ```size``` entries. The index of the map represents the CPU id and each entry is the size of the ring buffer allocated for the CPU. This map is only used in XDP.
For example:
```C
BPF_CPUMAP(cpumap, 16);
```
Methods (covered later): map.redirect_map().
Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=BPF_CPUMAP+path%3Aexamples&type=Code),
### 12. map.lookup()
Syntax: ```*val map.lookup(&key)```
......@@ -674,7 +709,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=lookup+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=lookup+path%3Atools&type=Code)
### 11. map.lookup_or_init()
### 13. map.lookup_or_init()
Syntax: ```*val map.lookup_or_init(&key, &zero)```
......@@ -684,7 +719,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=lookup_or_init+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=lookup_or_init+path%3Atools&type=Code)
### 12. map.delete()
### 14. map.delete()
Syntax: ```map.delete(&key)```
......@@ -694,7 +729,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=delete+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=delete+path%3Atools&type=Code)
### 13. map.update()
### 15. map.update()
Syntax: ```map.update(&key, &val)```
......@@ -704,7 +739,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=update+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=update+path%3Atools&type=Code)
### 14. map.insert()
### 16. map.insert()
Syntax: ```map.insert(&key, &val)```
......@@ -713,7 +748,7 @@ Associate the value in the second argument to the key, only if there was no prev
Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=insert+path%3Aexamples&type=Code)
### 15. map.increment()
### 17. map.increment()
Syntax: ```map.increment(key)```
......@@ -723,7 +758,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=increment+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=increment+path%3Atools&type=Code)
### 16. map.get_stackid()
### 18. map.get_stackid()
Syntax: ```int map.get_stackid(void *ctx, u64 flags)```
......@@ -733,7 +768,7 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?q=get_stackid+path%3Aexamples&type=Code),
[search /tools](https://github.com/iovisor/bcc/search?q=get_stackid+path%3Atools&type=Code)
### 17. map.perf_read()
### 19. map.perf_read()
Syntax: ```u64 map.perf_read(u32 cpu)```
......@@ -742,7 +777,7 @@ This returns the hardware performance counter as configured in [5. BPF_PERF_ARRA
Examples in situ:
[search /tests](https://github.com/iovisor/bcc/search?q=perf_read+path%3Atests&type=Code)
### 18. map.call()
### 20. map.call()
Syntax: ```void map.call(void *ctx, int index)```
......@@ -781,6 +816,44 @@ Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?l=C&q=call+path%3Aexamples&type=Code),
[search /tests](https://github.com/iovisor/bcc/search?l=C&q=call+path%3Atests&type=Code)
### 21. map.redirect_map()
Syntax: ```int map.redirect_map(int index, int flags)```
This redirects the incoming packets based on the ```index``` entry. If the map is [10. BPF_DEVMAP](#10-bpf_devmap), the packet will be sent to the transmit queue of the network interface that the entry points to. If the map is [11. BPF_CPUMAP](#11-bpf_cpumap), the packet will be sent to the ring buffer of the ```index``` CPU and be processed by the CPU later.
If the packet is redirected successfully, the function will return XDP_REDIRECT. Otherwise, it will return XDP_ABORTED to discard the packet.
For example:
```C
BPF_DEVMAP(devmap, 1);
int redirect_example(struct xdp_md *ctx) {
return devmap.redirect_map(0, 0);
}
int xdp_dummy(struct xdp_md *ctx) {
return XDP_PASS;
}
```
```Python
ip = pyroute2.IPRoute()
idx = ip.link_lookup(ifname="eth1")[0]
b = bcc.BPF(src_file="example.c")
devmap = b.get_table("devmap")
devmap[c_uint32(0)] = c_int(idx)
in_fn = b.load_func("redirect_example", BPF.XDP)
out_fn = b.load_func("xdp_dummy", BPF.XDP)
b.attach_xdp("eth0", in_fn, 0)
b.attach_xdp("eth1", out_fn, 0)
```
Examples in situ:
[search /examples](https://github.com/iovisor/bcc/search?l=C&q=redirect_map+path%3Aexamples&type=Code),
# bcc Python
## Initialization
......
#!/usr/bin/env python
#
# xdp_redirect_cpu.py Redirect the incoming packet to the specific CPU
#
# Copyright (c) 2018 Gary Lin
# Licensed under the Apache License, Version 2.0 (the "License")
from bcc import BPF
import time
import sys
from multiprocessing import cpu_count
import ctypes as ct
flags = 0
def usage():
print("Usage: {0} <in ifdev> <CPU id>".format(sys.argv[0]))
print("e.g.: {0} eth0 2\n".format(sys.argv[0]))
exit(1)
if len(sys.argv) != 3:
usage()
in_if = sys.argv[1]
cpu_id = int(sys.argv[2])
max_cpu = cpu_count()
if (cpu_id > max_cpu):
print("Invalid CPU id")
exit(1)
# load BPF program
b = BPF(text = """
#define KBUILD_MODNAME "foo"
#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
BPF_CPUMAP(cpumap, __MAX_CPU__);
BPF_ARRAY(dest, uint32_t, 1);
BPF_PERCPU_ARRAY(rxcnt, long, 1);
int xdp_redirect_cpu(struct xdp_md *ctx) {
void* data_end = (void*)(long)ctx->data_end;
void* data = (void*)(long)ctx->data;
struct ethhdr *eth = data;
uint32_t key = 0;
long *value;
uint32_t *cpu;
uint64_t nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
cpu = dest.lookup(&key);
if (!cpu)
return XDP_PASS;
value = rxcnt.lookup(&key);
if (value)
*value += 1;
return cpumap.redirect_map(*cpu, 0);
}
int xdp_dummy(struct xdp_md *ctx) {
return XDP_PASS;
}
""", cflags=["-w", "-D__MAX_CPU__=%u" % max_cpu], debug=0)
dest = b.get_table("dest");
dest[0] = ct.c_uint32(cpu_id)
cpumap = b.get_table("cpumap");
cpumap[cpu_id] = ct.c_uint32(192)
in_fn = b.load_func("xdp_redirect_cpu", BPF.XDP)
b.attach_xdp(in_if, in_fn, flags)
rxcnt = b.get_table("rxcnt")
prev = 0
print("Printing redirected packets, hit CTRL+C to stop")
while 1:
try:
val = rxcnt.sum(0).value
if val:
delta = val - prev
prev = val
print("{} pkt/s to CPU {}".format(delta, cpu_id))
time.sleep(1)
except KeyboardInterrupt:
print("Removing filter from device")
break;
b.remove_xdp(in_if, flags)
#!/usr/bin/env python
#
# xdp_redirect_map.py Redirect the incoming packet to another interface
# with the helper: bpf_redirect_map()
#
# Copyright (c) 2018 Gary Lin
# Licensed under the Apache License, Version 2.0 (the "License")
from bcc import BPF
import pyroute2
import time
import sys
import ctypes as ct
flags = 0
def usage():
print("Usage: {0} <in ifdev> <out ifdev>".format(sys.argv[0]))
print("e.g.: {0} eth0 eth1\n".format(sys.argv[0]))
exit(1)
if len(sys.argv) != 3:
usage()
in_if = sys.argv[1]
out_if = sys.argv[2]
ip = pyroute2.IPRoute()
out_idx = ip.link_lookup(ifname=out_if)[0]
# load BPF program
b = BPF(text = """
#define KBUILD_MODNAME "foo"
#include <uapi/linux/bpf.h>
#include <linux/in.h>
#include <linux/if_ether.h>
BPF_DEVMAP(tx_port, 1);
BPF_PERCPU_ARRAY(rxcnt, long, 1);
static inline void swap_src_dst_mac(void *data)
{
unsigned short *p = data;
unsigned short dst[3];
dst[0] = p[0];
dst[1] = p[1];
dst[2] = p[2];
p[0] = p[3];
p[1] = p[4];
p[2] = p[5];
p[3] = dst[0];
p[4] = dst[1];
p[5] = dst[2];
}
int xdp_redirect_map(struct xdp_md *ctx) {
void* data_end = (void*)(long)ctx->data_end;
void* data = (void*)(long)ctx->data;
struct ethhdr *eth = data;
uint32_t key = 0;
long *value;
uint64_t nh_off;
nh_off = sizeof(*eth);
if (data + nh_off > data_end)
return XDP_DROP;
value = rxcnt.lookup(&key);
if (value)
*value += 1;
swap_src_dst_mac(data);
return tx_port.redirect_map(0, 0);
}
int xdp_dummy(struct xdp_md *ctx) {
return XDP_PASS;
}
""", cflags=["-w"])
tx_port = b.get_table("tx_port")
tx_port[0] = ct.c_int(out_idx)
in_fn = b.load_func("xdp_redirect_map", BPF.XDP)
out_fn = b.load_func("xdp_dummy", BPF.XDP)
b.attach_xdp(in_if, in_fn, flags)
b.attach_xdp(out_if, out_fn, flags)
rxcnt = b.get_table("rxcnt")
prev = 0
print("Printing redirected packets, hit CTRL+C to stop")
while 1:
try:
val = rxcnt.sum(0).value
if val:
delta = val - prev
prev = val
print("{} pkt/s".format(delta))
time.sleep(1)
except KeyboardInterrupt:
print("Removing filter from device")
break;
b.remove_xdp(in_if, flags)
b.remove_xdp(out_if, flags)
......@@ -201,6 +201,23 @@ struct bpf_stacktrace {
#define BPF_PROG_ARRAY(_name, _max_entries) \
BPF_TABLE("prog", u32, u32, _name, _max_entries)
#define BPF_XDP_REDIRECT_MAP(_table_type, _leaf_type, _name, _max_entries) \
struct _name##_table_t { \
u32 key; \
_leaf_type leaf; \
/* xdp_act = map.redirect_map(index, flag) */ \
u64 (*redirect_map) (int, int); \
u32 max_entries; \
}; \
__attribute__((section("maps/"_table_type))) \
struct _name##_table_t _name = { .max_entries = (_max_entries) }
#define BPF_DEVMAP(_name, _max_entries) \
BPF_XDP_REDIRECT_MAP("devmap", int, _name, _max_entries)
#define BPF_CPUMAP(_name, _max_entries) \
BPF_XDP_REDIRECT_MAP("cpumap", u32, _name, _max_entries)
// packet parsing state machine helpers
#define cursor_advance(_cursor, _len) \
({ void *_tmp = _cursor; _cursor += _len; _tmp; })
......
......@@ -609,6 +609,9 @@ bool BTypeVisitor::VisitCallExpr(CallExpr *Call) {
} else if (memb_name == "check_current_task") {
prefix = "bpf_current_task_under_cgroup";
suffix = ")";
} else if (memb_name == "redirect_map") {
prefix = "bpf_redirect_map";
suffix = ")";
} else {
error(Call->getLocStart(), "invalid bpf_table operation %0") << memb_name;
return false;
......@@ -920,6 +923,10 @@ bool BTypeVisitor::VisitVarDecl(VarDecl *Decl) {
map_type = BPF_MAP_TYPE_CGROUP_ARRAY;
} else if (A->getName() == "maps/stacktrace") {
map_type = BPF_MAP_TYPE_STACK_TRACE;
} else if (A->getName() == "maps/devmap") {
map_type = BPF_MAP_TYPE_DEVMAP;
} else if (A->getName() == "maps/cpumap") {
map_type = BPF_MAP_TYPE_CPUMAP;
} else if (A->getName() == "maps/extern") {
if (!fe_.table_storage().Find(global_path, table_it)) {
error(Decl->getLocStart(), "reference to undefined table");
......
......@@ -152,6 +152,7 @@ class BPF(object):
XDP_DROP = 1
XDP_PASS = 2
XDP_TX = 3
XDP_REDIRECT = 4
_probe_repl = re.compile(b"[^a-zA-Z0-9_]")
_sym_caches = {}
......
......@@ -36,6 +36,13 @@ BPF_MAP_TYPE_CGROUP_ARRAY = 8
BPF_MAP_TYPE_LRU_HASH = 9
BPF_MAP_TYPE_LRU_PERCPU_HASH = 10
BPF_MAP_TYPE_LPM_TRIE = 11
BPF_MAP_TYPE_ARRAY_OF_MAPS = 12
BPF_MAP_TYPE_HASH_OF_MAPS = 13
BPF_MAP_TYPE_DEVMAP = 14
BPF_MAP_TYPE_SOCKMAP = 15
BPF_MAP_TYPE_CPUMAP = 16
BPF_MAP_TYPE_XSKMAP = 17
BPF_MAP_TYPE_SOCKHASH = 18
stars_max = 40
log2_index_max = 65
......@@ -144,6 +151,10 @@ def Table(bpf, map_id, map_fd, keytype, leaftype, **kwargs):
t = LruPerCpuHash(bpf, map_id, map_fd, keytype, leaftype)
elif ttype == BPF_MAP_TYPE_CGROUP_ARRAY:
t = CgroupArray(bpf, map_id, map_fd, keytype, leaftype)
elif ttype == BPF_MAP_TYPE_DEVMAP:
t = DevMap(bpf, map_id, map_fd, keytype, leaftype)
elif ttype == BPF_MAP_TYPE_CPUMAP:
t = CpuMap(bpf, map_id, map_fd, keytype, leaftype)
if t == None:
raise Exception("Unknown table type %d" % ttype)
return t
......@@ -766,3 +777,11 @@ class StackTrace(TableBase):
def clear(self):
pass
class DevMap(ArrayBase):
def __init__(self, *args, **kwargs):
super(DevMap, self).__init__(*args, **kwargs)
class CpuMap(ArrayBase):
def __init__(self, *args, **kwargs):
super(CpuMap, self).__init__(*args, **kwargs)
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment