Commit d96f6ca1 authored by affansyed's avatar affansyed

Merge branch 'master' into master

parents 7c83a343 8ddb79d3
![BCC Logo](images/logo2.png)
# BPF Compiler Collection (BCC)
This directory contains source code for BCC, a toolkit for creating small
programs that can be dynamically loaded into a Linux kernel.
BCC is a toolkit for creating efficient kernel tracing and manipulation
programs, and includes several useful tools and examples. It makes use of eBPF
(Extended Berkeley Packet Filters), a new feature that was first added to
Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.
eBPF was [described by](https://lkml.org/lkml/2015/4/14/232) Ingo Molnár as:
> One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively.
BCC makes eBPF programs easier to write, with kernel instrumentation in C
and a front-end in Python. It is suited for many tasks, including performance
analysis and network traffic control.
## Screenshot
This example traces a disk I/O kernel function, and populates an in-kernel
power-of-2 histogram of the I/O size. For efficiency, only the histogram
summary is returned to user-level.
```Shell
# ./bitehist.py
Tracing... Hit Ctrl-C to end.
^C
value : count distribution
0 -> 1 : 3 | |
2 -> 3 : 0 | |
4 -> 7 : 211 |********** |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 1 | |
128 -> 255 : 800 |**************************************|
```
The above output shows a bimodal distribution, where the largest mode of
800 I/O was between 128 and 255 Kbytes in size.
The compiler relies upon eBPF (Extended Berkeley Packet Filters), which is a
feature in Linux kernels starting from 3.15. Currently, this compiler leverages
features which are mostly available in Linux 4.1 and above.
See the source: [bitehist.c](examples/bitehist.c) and
[bitehist.py](examples/bitehist.py). What this traces, what this stores, and how
the data is presented, can be entirely customized. This shows only some of
many possible capabilities.
## Installing
See [INSTALL.md](INSTALL.md) for installation steps on your platform.
## Contents
Some of these are single files that contain both C and Python, others have a
pair of .c and .py files, and some are directories of files.
### Tracing
Examples:
- examples/[bitehist.py](examples/bitehist.py) examples/[bitehist.c](examples/bitehist.c): Block I/O size histogram. [Examples](examples/bitehist_example.txt).
- examples/[disksnoop.py](examples/disksnoop.py) examples/[disksnoop.c](examples/disksnoop.c): Trace block device I/O latency. [Examples](examples/disksnoop_example.txt).
- examples/[hello_world.py](examples/hello_world.py): Prints "Hello, World!" for new processes.
- examples/[trace_fields.py](examples/trace_fields.py): Simple example of printing fields from traced events.
- examples/[vfsreadlat.py](examples/vfsreadlat.py) examples/[vfsreadlat.c](examples/vfsreadlat.c): VFS read latency distribution. [Examples](examples/vfsreadlat_example.txt).
Tools:
- tools/[funccount](tools/funccount): Count kernel function calls. [Examples](tools/funccount_example.txt).
- tools/[pidpersec](tools/pidpersec): Count new processes (via fork). [Examples](tools/pidpersec_example.txt).
- tools/[syncsnoop](tools/syncsnoop): Trace sync() syscall. [Examples](tools/syncsnoop_example.txt).
- tools/[vfscount](tools/vfscount) tools/[vfscount.c](tools/vfscount.c): Count VFS calls. [Examples](tools/vfscount_example.txt).
- tools/[vfsstat](tools/vfsstat) tools/[vfsstat.c](tools/vfsstat.c): Count some VFS calls, with column output. [Examples](tools/vfsstat_example.txt).
### Networking
Examples:
- examples/[distributed_bridge/](examples/distributed_bridge): Distributed bridge example.
- examples/[simple_tc.py](examples/simple_tc.py): Simple traffic control example.
- examples/[simulation.py](examples/simulation.py): Simulation helper.
- examples/[tc_neighbor_sharing.py](examples/tc_neighbor_sharing.py) examples/[tc_neighbor_sharing.c](examples/tc_neighbor_sharing.c): Per-IP classification and rate limiting.
- examples/[tunnel_monitor/](examples/tunnel_monitor): Efficiently monitor traffic flows. [Example video](https://www.youtube.com/watch?v=yYy3Cwce02k).
- examples/[vlan_learning.py](examples/vlan_learning.py) examples/[vlan_learning.c](examples/vlan_learning.c): Demux Ethernet traffic into worker veth+namespaces.
## Motivation
BPF guarantees that the programs loaded into the kernel cannot crash, and
......@@ -46,11 +115,11 @@ The features of this toolkit include:
In the future, more bindings besides python will likely be supported. Feel free
to add support for the language of your choice and send a pull request!
## Examples
## Tutorial
This toolchain is currently composed of two parts: a C wrapper around LLVM, and
a Python API to interact with the running program. Later, we will go into more
detail of how this all works.
The BCC toolchain is currently composed of two parts: a C wrapper around LLVM,
and a Python API to interact with the running program. Later, we will go into
more detail of how this all works.
### Hello, World
......@@ -65,32 +134,20 @@ The BPF program always takes at least one argument, which is a pointer to the
context for this type of program. Different program types have different calling
conventions, but for this one we don't care so `void *` is fine.
```python
prog = """
int hello(void *ctx) {
bpf_trace_printk("Hello, World!\\n");
return 0;
};
"""
b = BPF(text=prog)
BPF(text='void kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); }').trace_print()
```
For this example, we will call the program every time `fork()` is called by a
userspace process. Underneath the hood, fork translates to the `clone` syscall,
so we will attach our program to the kernel symbol `sys_clone`.
```python
b.attach_kprobe(event="sys_clone", fn_name="hello")
```
userspace process. Underneath the hood, fork translates to the `clone` syscall.
BCC recognizes prefix `kprobe__`, and will auto attach our program to the kernel symbol `sys_clone`.
The python process will then print the trace printk circular buffer until ctrl-c
is pressed. The BPF program is removed from the kernel when the userspace
process that loaded it closes the fd (or exits).
```python
b.trace_print()
```
Output:
```
bcc/examples$ sudo python hello_world.py
bcc/examples$ sudo python hello_world.py
python-7282 [002] d... 3757.488508: : Hello, World!
```
......
......@@ -38,7 +38,7 @@ static unsigned int log2l(unsigned long v)
return log2(v) + 1;
}
int do_request(struct pt_regs *ctx, struct request *req)
int kprobe__blk_start_request(struct pt_regs *ctx, struct request *req)
{
int index = log2l(req->__data_len / 1024);
u64 *leaf = dist.lookup(&index);
......
......@@ -5,8 +5,6 @@
#
# Written as a basic example of using a histogram to show a distribution.
#
# USAGE: bitehist.py [interval [count]]
#
# The default interval is 5 seconds. A Ctrl-C will print the partially
# gathered histogram then exit.
#
......@@ -16,90 +14,18 @@
# 15-Aug-2015 Brendan Gregg Created this.
from bcc import BPF
from ctypes import c_ushort, c_int, c_ulonglong
from time import sleep
from sys import argv
def usage():
print("USAGE: %s [interval [count]]" % argv[0])
exit()
# arguments
interval = 5
count = -1
if len(argv) > 1:
try:
interval = int(argv[1])
if interval == 0:
raise
if len(argv) > 2:
count = int(argv[2])
except: # also catches -h, --help
usage()
# load BPF program
b = BPF(src_file = "bitehist.c")
b.attach_kprobe(event="blk_start_request", fn_name="do_request")
dist_max = 64
# header
print("Tracing... Hit Ctrl-C to end.")
# functions
stars_max = 38
def stars(val, val_max, width):
i = 0
text = ""
while (1):
if (i > (width * val / val_max) - 1) or (i > width - 1):
break
text += "*"
i += 1
if val > val_max:
text = text[:-1] + "+"
return text
def print_log2_hist(dist, val_type):
idx_max = -1
val_max = 0
for i in range(1, dist_max + 1):
try:
val = dist[c_int(i)].value
if (val > 0):
idx_max = i
if (val > val_max):
val_max = val
except:
break
if idx_max > 0:
print(" %-15s : count distribution" % val_type);
for i in range(1, idx_max + 1):
low = (1 << i) >> 1
high = (1 << i) - 1
if (low == high):
low -= 1
try:
val = dist[c_int(i)].value
print("%8d -> %-8d : %-8d |%-*s|" % (low, high, val,
stars_max, stars(val, val_max, stars_max)))
except:
break
# output
loop = 0
do_exit = 0
while (1):
if count > 0:
loop += 1
if loop > count:
exit()
try:
sleep(interval)
except KeyboardInterrupt:
pass; do_exit = 1
try:
sleep(99999999)
except KeyboardInterrupt:
print
print_log2_hist(b["dist"], "kbytes")
b["dist"].clear()
if do_exit:
exit()
b["dist"].print_log2_hist()
Demonstrations of bitehist.py, the Linux eBPF/bcc version.
This prints a power-of-2 histogram to show the block I/O size distribution.
By default, a summary is printed every five seconds:
A summary is printed after Ctrl-C is hit.
# ./bitehist.py
# ./bitehist.py
Tracing... Hit Ctrl-C to end.
kbytes : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 26 |************* |
8 -> 15 : 3 |* |
16 -> 31 : 5 |** |
32 -> 63 : 6 |*** |
64 -> 127 : 7 |*** |
128 -> 255 : 75 |**************************************|
kbytes : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 83 |**************************************|
8 -> 15 : 2 | |
16 -> 31 : 6 |** |
32 -> 63 : 6 |** |
64 -> 127 : 5 |** |
128 -> 255 : 21 |********* |
^C
kbytes : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 8 |**************************************|
The first output shows a bimodal distribution. The largest mode of 75 I/O were
between 128 and 255 Kbytes in size, and another mode of 26 I/O were between 4
and 7 Kbytes in size.
The next output summary shows the workload is doing more 4 - 7 Kbyte I/O.
The final output is partial, showing what was measured until Ctrl-C was hit.
For an output interval of one second, and three summaries:
# ./bitehist.py 1 3
Tracing... Hit Ctrl-C to end.
kbytes : count distribution
0 -> 1 : 0 | |
value : count distribution
0 -> 1 : 3 | |
2 -> 3 : 0 | |
4 -> 7 : 4 |**************************************|
kbytes : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 5 |**************************************|
4 -> 7 : 211 |********** |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 1 |******* |
kbytes : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 4 |**************************************|
32 -> 63 : 0 | |
64 -> 127 : 1 | |
128 -> 255 : 800 |**************************************|
Full usage:
This output shows a bimodal distribution. The largest mod of 800 I/O were
between 128 and 255 Kbytes in size, and another mode of 211 I/O were between
4 and 7 Kbytes in size.
# ./bitehist.py -h
USAGE: ./bitehist.py [interval [count]]
Understanding this distribution is useful for characterizing workloads and
understanding performance. The existance of this distribution is not visible
from averages alone.
......@@ -13,36 +13,23 @@
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>
struct key_t {
struct request *req;
};
BPF_TABLE("hash", struct key_t, u64, start, 10240);
int do_request(struct pt_regs *ctx, struct request *req) {
struct key_t key = {};
u64 ts;
BPF_HASH(start, struct request *);
void kprobe__blk_start_request(struct pt_regs *ctx, struct request *req) {
// stash start timestamp by request ptr
ts = bpf_ktime_get_ns();
key.req = req;
start.update(&key, &ts);
u64 ts = bpf_ktime_get_ns();
return 0;
start.update(&req, &ts);
}
int do_completion(struct pt_regs *ctx, struct request *req) {
struct key_t key = {};
void kprobe__blk_update_request(struct pt_regs *ctx, struct request *req) {
u64 *tsp, delta;
key.req = req;
tsp = start.lookup(&key);
tsp = start.lookup(&req);
if (tsp != 0) {
delta = bpf_ktime_get_ns() - *tsp;
bpf_trace_printk("%d %x %d\n", req->__data_len,
req->cmd_flags, delta / 1000);
start.delete(&key);
start.delete(&req);
}
return 0;
}
......@@ -17,8 +17,6 @@ REQ_WRITE = 1 # from include/linux/blk_types.h
# load BPF program
b = BPF(src_file="disksnoop.c")
b.attach_kprobe(event="blk_start_request", fn_name="do_request")
b.attach_kprobe(event="blk_update_request", fn_name="do_completion")
# header
print("%-18s %-2s %-7s %8s" % ("TIME(s)", "T", "BYTES", "LAT(ms)"))
......
......@@ -4,15 +4,8 @@
# run in project examples directory with:
# sudo ./hello_world.py"
# see trace_fields.py for a longer example
from bcc import BPF
prog = """
int hello(void *ctx) {
bpf_trace_printk("Hello, World!\\n");
return 0;
}
"""
b = BPF(text=prog)
b.attach_kprobe(event="sys_clone", fn_name="hello")
b.trace_print(fmt="{1} {5}")
BPF(text='void kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); }').trace_print()
#!/usr/bin/env python
# Copyright (c) PLUMgrid, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")
# This is an example of tracing an event and printing custom fields.
# run in project examples directory with:
# sudo ./trace_fields.py"
from bcc import BPF
prog = """
int hello(void *ctx) {
bpf_trace_printk("Hello, World!\\n");
return 0;
}
"""
b = BPF(text=prog)
b.attach_kprobe(event="sys_clone", fn_name="hello")
print "PID MESSAGE"
b.trace_print(fmt="{1} {5}")
......@@ -13,11 +13,7 @@
#include <uapi/linux/ptrace.h>
struct key_t {
u32 pid;
};
BPF_TABLE("hash", struct key_t, u64, start, 10240);
BPF_HASH(start, u32);
BPF_TABLE("array", int, u64, dist, 64);
static unsigned int log2(unsigned int v)
......@@ -44,29 +40,29 @@ static unsigned int log2l(unsigned long v)
int do_entry(struct pt_regs *ctx)
{
struct key_t key = {};
u64 ts, *val, zero = 0;
u32 pid;
u64 ts, *val;
key.pid = bpf_get_current_pid_tgid();
pid = bpf_get_current_pid_tgid();
ts = bpf_ktime_get_ns();
start.update(&key, &ts);
start.update(&pid, &ts);
return 0;
}
int do_return(struct pt_regs *ctx)
{
struct key_t key = {};
u32 pid;
u64 *tsp, delta;
key.pid = bpf_get_current_pid_tgid();
tsp = start.lookup(&key);
pid = bpf_get_current_pid_tgid();
tsp = start.lookup(&pid);
if (tsp != 0) {
delta = bpf_ktime_get_ns() - *tsp;
int index = log2l(delta / 1000);
u64 *leaf = dist.lookup(&index);
if (leaf) (*leaf)++;
start.delete(&key);
start.delete(&pid);
}
return 0;
......
......@@ -41,51 +41,10 @@ if len(argv) > 1:
b = BPF(src_file = "vfsreadlat.c")
b.attach_kprobe(event="vfs_read", fn_name="do_entry")
b.attach_kretprobe(event="vfs_read", fn_name="do_return")
dist_max = 64
# header
print("Tracing... Hit Ctrl-C to end.")
# functions
stars_max = 38
def stars(val, val_max, width):
i = 0
text = ""
while (1):
if (i > (width * val / val_max) - 1) or (i > width - 1):
break
text += "*"
i += 1
if val > val_max:
text = text[:-1] + "+"
return text
def print_log2_hist(dist, val_type):
idx_max = -1
val_max = 0
for i in range(1, dist_max + 1):
try:
val = dist[c_int(i)].value
if (val > 0):
idx_max = i
if (val > val_max):
val_max = val
except:
break
if idx_max > 0:
print(" %-15s : count distribution" % val_type);
for i in range(1, idx_max + 1):
low = (1 << i) >> 1
high = (1 << i) - 1
if (low == high):
low -= 1
try:
val = dist[c_int(i)].value
print("%8d -> %-8d : %-8d |%-*s|" % (low, high, val,
stars_max, stars(val, val_max, stars_max)))
except:
break
# output
loop = 0
do_exit = 0
......@@ -100,7 +59,7 @@ while (1):
pass; do_exit = 1
print
print_log2_hist(b["dist"], "usecs")
b["dist"].print_log2_hist("usecs")
b["dist"].clear()
if do_exit:
exit()
.TH funccount 8 "2015-08-18" "USER COMMANDS"
.SH NAME
funccount \- Count kernel function calls matching a pattern. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B funccount [\-h] [\-p PID] [\-i INTERVAL] [\-T] [\-r] pattern
.SH DESCRIPTION
This tool is a quick way to determine which kernel functions are being called,
and at what rate. It uses in-kernel eBPF maps to count function calls.
WARNING: This uses dynamic tracing of (what can be many) kernel functions, an
activity that has had issues on some kernel versions (risk of panics or
freezes). Test, and know what you are doing, before use.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
pattern
Search pattern. Supports "*" wildcards. See EXAMPLES. You can also use \-r for regular expressions.
\-h
Print usage message.
.TP
\-p PID
Trace this process ID only.
.TP
\-i INTERVAL
Print output every interval seconds.
.TP
\-T
Include timestamps on output.
.TP
\-r
Use regular expressions for the search pattern.
.SH EXAMPLES
.TP
Count kernel functions beginning with "vfs_", until Ctrl-C is hit:
#
.B funccount 'vfs_*'
.TP
Count kernel functions beginning with "tcp_send", until Ctrl-C is hit:
#
.B funccount 'tcp_send*'
.TP
Print kernel functions beginning with "vfs_", every second:
#
.B funccount \-i 1 'vfs_*'
.TP
Match kernel functions beginning with "vfs_", using regular expressions:
#
.B funccount \-r '^vfs_.*'
.TP
Count vfs calls for process ID 181 only:
#
.B funccount \-p 181 'vfs_*'
.SH FIELDS
.TP
ADDR
Address of the instruction pointer that was traced (only useful if the FUNC column is suspicious and you would like to double check the translation).
.TP
FUNC
Kernel function name
.TP
COUNT
Number of calls while tracing
.SH OVERHEAD
This traces kernel functions and maintains in-kernel counts, which
are asynchronously copied to user-space. While the rate of VFS operations can
be very high (>1M/sec), this is a relatively efficient way to trace these
events, and so the overhead is expected to be small for normal workloads.
Measure in a test environment, and if overheads are an issue, edit the script
to reduce the types of vfs functions traced (currently all beginning with
"vfs_").
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
vfscount(8)
......@@ -36,7 +36,7 @@ endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DBCC_INSTALL_PREFIX='\"${CMAKE_INSTALL_PREFIX}\"'")
add_library(bcc SHARED bpf_common.cc bpf_module.cc libbpf.c)
set_target_properties(bcc PROPERTIES VERSION ${REVISION} SOVERSION 0)
set_target_properties(bcc PROPERTIES VERSION ${REVISION_LAST} SOVERSION 0)
# BPF is still experimental otherwise it should be available
#llvm_map_components_to_libnames(llvm_libs bpf mcjit irreader passes)
......
......@@ -295,7 +295,7 @@ unique_ptr<ExecutionEngine> BPFModule::finalize_rw(unique_ptr<Module> m) {
// load an entire c file as a module
int BPFModule::load_cfile(const string &file, bool in_memory) {
clang_loader_ = make_unique<ClangLoader>(&*ctx_);
clang_loader_ = make_unique<ClangLoader>(&*ctx_, flags_);
if (clang_loader_->parse(&mod_, &tables_, file, in_memory))
return -1;
return 0;
......@@ -307,7 +307,7 @@ int BPFModule::load_cfile(const string &file, bool in_memory) {
// Load in a pre-built list of functions into the initial Module object, then
// build an ExecutionEngine.
int BPFModule::load_includes(const string &tmpfile) {
clang_loader_ = make_unique<ClangLoader>(&*ctx_);
clang_loader_ = make_unique<ClangLoader>(&*ctx_, flags_);
if (clang_loader_->parse(&mod_, &tables_, tmpfile, false))
return -1;
return 0;
......@@ -652,7 +652,7 @@ int BPFModule::load_b(const string &filename, const string &proto_filename) {
if (int rc = load_includes(BCC_INSTALL_PREFIX "/share/bcc/include/bcc/helpers.h"))
return rc;
b_loader_.reset(new BLoader);
b_loader_.reset(new BLoader(flags_));
if (int rc = b_loader_->parse(&*mod_, filename, proto_filename, &tables_))
return rc;
if (int rc = annotate())
......
......@@ -41,6 +41,20 @@ struct _name##_table_t { \
__attribute__((section("maps/" _table_type))) \
struct _name##_table_t _name
#define BPF_HASH1(_name) \
BPF_TABLE("hash", u64, u64, _name, 10240)
#define BPF_HASH2(_name, _key_type) \
BPF_TABLE("hash", _key_type, u64, _name, 10240)
#define BPF_HASH3(_name, _key_type, _leaf_type) \
BPF_TABLE("hash", _key_type, _leaf_type, _name, 10240)
// helper for default-variable macro function
#define BPF_HASHX(_1, _2, _3, NAME, ...) NAME
// Define a hash function, some arguments optional
// BPF_HASH(name, key_type=u64, leaf_type=u64, size=10240)
#define BPF_HASH(...) \
BPF_HASHX(__VA_ARGS__, BPF_HASH3, BPF_HASH2, BPF_HASH1)(__VA_ARGS__)
// packet parsing state machine helpers
#define cursor_advance(_cursor, _len) \
({ void *_tmp = _cursor; _cursor += _len; _tmp; })
......
......@@ -27,7 +27,8 @@ using std::vector;
namespace ebpf {
BLoader::BLoader() {
BLoader::BLoader(unsigned flags) : flags_(flags) {
(void)flags_;
}
BLoader::~BLoader() {
......
......@@ -35,11 +35,12 @@ class CodegenLLVM;
class BLoader {
public:
BLoader();
explicit BLoader(unsigned flags);
~BLoader();
int parse(llvm::Module *mod, const std::string &filename, const std::string &proto_filename,
std::unique_ptr<std::vector<TableDesc>> *tables);
private:
unsigned flags_;
std::unique_ptr<cc::Parser> parser_;
std::unique_ptr<cc::Parser> proto_parser_;
std::unique_ptr<cc::CodegenLLVM> codegen_;
......
......@@ -102,6 +102,7 @@ bool BTypeVisitor::VisitFunctionDecl(FunctionDecl *D) {
rewriter_.InsertText(D->getLocStart(), attr);
// remember the arg names of the current function...first one is the ctx
fn_args_.clear();
string preamble = "{";
for (auto arg : D->params()) {
if (arg->getName() == "") {
C.getDiagnostics().Report(arg->getLocEnd(), diag::err_expected)
......@@ -109,7 +110,15 @@ bool BTypeVisitor::VisitFunctionDecl(FunctionDecl *D) {
return false;
}
fn_args_.push_back(arg);
if (fn_args_.size() > 1) {
size_t d = fn_args_.size() - 2;
const char *reg = calling_conv_regs[d];
preamble += arg->getName().str() + " = " + fn_args_[0]->getName().str() + "->" + string(reg) + ";";
}
}
// for each trace argument, convert the variable from ptregs to something on stack
if (CompoundStmt *S = dyn_cast<CompoundStmt>(D->getBody()))
rewriter_.ReplaceText(S->getLBracLoc(), 1, preamble);
}
return true;
}
......@@ -268,7 +277,7 @@ bool BTypeVisitor::VisitMemberExpr(MemberExpr *E) {
string base_type = Ref->getType()->getPointeeType().getAsString();
string pre, post;
pre = "({ " + E->getType().getAsString() + " _val; memset(&_val, 0, sizeof(_val));";
pre += " bpf_probe_read(&_val, sizeof(_val), ";
pre += " bpf_probe_read(&_val, sizeof(_val), (u64)";
post = " + offsetof(" + base_type + ", " + F->getName().str() + ")";
post += "); _val; })";
rewriter_.InsertText(E->getLocStart(), pre);
......@@ -278,23 +287,6 @@ bool BTypeVisitor::VisitMemberExpr(MemberExpr *E) {
return true;
}
bool BTypeVisitor::VisitDeclRefExpr(DeclRefExpr *E) {
auto it = std::find(fn_args_.begin() + 1, fn_args_.end(), E->getDecl());
if (it != fn_args_.end()) {
if (!rewriter_.isRewritable(E->getLocStart())) {
C.getDiagnostics().Report(E->getLocStart(), diag::err_expected)
<< "use of probe argument not in a macro";
return false;
}
size_t d = std::distance(fn_args_.begin() + 1, it);
const char *reg = calling_conv_regs[d];
string text = "((u64)" + fn_args_[0]->getName().str() + "->" + string(reg) + ")";
rewriter_.ReplaceText(SourceRange(E->getLocStart(), E->getLocEnd()), text);
return true;
}
return true;
}
bool BTypeVisitor::VisitBinaryOperator(BinaryOperator *E) {
if (!E->isAssignmentOp())
return true;
......@@ -442,13 +434,14 @@ bool BTypeConsumer::HandleTopLevelDecl(DeclGroupRef D) {
return true;
}
BFrontendAction::BFrontendAction(llvm::raw_ostream &os)
: rewriter_(new Rewriter), os_(os), tables_(new vector<TableDesc>) {
BFrontendAction::BFrontendAction(llvm::raw_ostream &os, unsigned flags)
: os_(os), flags_(flags), rewriter_(new Rewriter), tables_(new vector<TableDesc>) {
}
void BFrontendAction::EndSourceFileAction() {
// uncomment to see rewritten source
//rewriter_->getEditBuffer(rewriter_->getSourceMgr().getMainFileID()).write(llvm::errs());
if (flags_ & 0x4)
rewriter_->getEditBuffer(rewriter_->getSourceMgr().getMainFileID()).write(llvm::errs());
rewriter_->getEditBuffer(rewriter_->getSourceMgr().getMainFileID()).write(os_);
os_.flush();
}
......
......@@ -67,7 +67,6 @@ class BTypeVisitor : public clang::RecursiveASTVisitor<BTypeVisitor> {
bool VisitCallExpr(clang::CallExpr *Call);
bool VisitVarDecl(clang::VarDecl *Decl);
bool VisitMemberExpr(clang::MemberExpr *E);
bool VisitDeclRefExpr(clang::DeclRefExpr *E);
bool VisitBinaryOperator(clang::BinaryOperator *E);
bool VisitImplicitCastExpr(clang::ImplicitCastExpr *E);
......@@ -96,7 +95,7 @@ class BFrontendAction : public clang::ASTFrontendAction {
public:
// Initialize with the output stream where the new source file contents
// should be written.
explicit BFrontendAction(llvm::raw_ostream &os);
BFrontendAction(llvm::raw_ostream &os, unsigned flags);
// Called by clang when the AST has been completed, here the output stream
// will be flushed.
......@@ -108,8 +107,9 @@ class BFrontendAction : public clang::ASTFrontendAction {
// take ownership of the table-to-fd mapping data structure
std::unique_ptr<std::vector<TableDesc>> take_tables() { return move(tables_); }
private:
std::unique_ptr<clang::Rewriter> rewriter_;
llvm::raw_ostream &os_;
unsigned flags_;
std::unique_ptr<clang::Rewriter> rewriter_;
std::unique_ptr<std::vector<TableDesc>> tables_;
};
......
......@@ -58,8 +58,8 @@ using std::vector;
namespace ebpf {
ClangLoader::ClangLoader(llvm::LLVMContext *ctx)
: ctx_(ctx)
ClangLoader::ClangLoader(llvm::LLVMContext *ctx, unsigned flags)
: ctx_(ctx), flags_(flags)
{}
ClangLoader::~ClangLoader() {}
......@@ -138,6 +138,13 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
// Initialize a compiler invocation object from the clang (-cc1) arguments.
const driver::ArgStringList &ccargs = cmd.getArguments();
if (flags_ & 0x4) {
llvm::errs() << "clang";
for (auto arg : ccargs)
llvm::errs() << " " << arg;
llvm::errs() << "\n";
}
// first pass
auto invocation1 = make_unique<CompilerInvocation>();
if (!CompilerInvocation::CreateFromArgs(*invocation1, const_cast<const char **>(ccargs.data()),
......@@ -159,7 +166,7 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
// capture the rewritten c file
string out_str;
llvm::raw_string_ostream os(out_str);
BFrontendAction bact(os);
BFrontendAction bact(os, flags_);
if (!compiler1.ExecuteAction(bact))
return -1;
// this contains the open FDs
......
......@@ -36,12 +36,13 @@ class CodegenLLVM;
class ClangLoader {
public:
explicit ClangLoader(llvm::LLVMContext *ctx);
explicit ClangLoader(llvm::LLVMContext *ctx, unsigned flags);
~ClangLoader();
int parse(std::unique_ptr<llvm::Module> *mod, std::unique_ptr<std::vector<TableDesc>> *tables,
const std::string &file, bool in_memory);
private:
llvm::LLVMContext *ctx_;
unsigned flags_;
};
} // namespace ebpf
......@@ -239,7 +239,9 @@ int bpf_attach_kprobe(int progfd, const char *event,
}
if (write(kfd, event_desc, strlen(event_desc)) < 0) {
perror("write(kprobe_events)");
fprintf(stderr, "write of \"%s\" into kprobe_events failed: %s\n", event_desc, strerror(errno));
if (errno == EINVAL)
fprintf(stderr, "check dmesg output for possible cause\n");
goto cleanup;
}
......
This diff is collapsed.
......@@ -46,6 +46,28 @@ int count_foo(struct pt_regs *ctx, unsigned long a, unsigned long b) {
b = BPF(text=text, debug=0)
fn = b.load_func("count_foo", BPF.KPROBE)
def test_probe_read_keys(self):
text = """
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>
BPF_HASH(start, struct request *);
int do_request(struct pt_regs *ctx, struct request *req) {
u64 ts = bpf_ktime_get_ns();
start.update(&req, &ts);
return 0;
}
int do_completion(struct pt_regs *ctx, struct request *req) {
u64 *tsp = start.lookup(&req);
if (tsp != 0) {
start.delete(&req);
}
return 0;
}
"""
b = BPF(text=text, debug=0)
fns = b.load_funcs(BPF.KPROBE)
def test_sscanf(self):
text = """
BPF_TABLE("hash", int, struct { u64 a; u64 b; u64 c:36; u64 d:28; struct { u32 a; u32 b; } s; }, stats, 10);
......@@ -100,5 +122,13 @@ int do_request(struct pt_regs *ctx, int req) {
b = BPF(text=text, debug=0)
fn = b.load_func("do_request", BPF.KPROBE)
def test_bpf_hash(self):
text = """
BPF_HASH(table1);
BPF_HASH(table2, u32);
BPF_HASH(table3, u32, int);
"""
b = BPF(text=text, debug=0)
if __name__ == "__main__":
main()
......@@ -30,5 +30,12 @@ class TestKprobeRgx(TestCase):
k2 = self.b["stats"].Key(2)
self.assertEqual(self.b["stats"][k1].val, self.b["stats"][k2].val + 1)
class TestKprobeReplace(TestCase):
def setUp(self):
self.b = BPF(text="int empty(void *ctx) { return 0; }")
def test_periods(self):
self.b.attach_kprobe(event_re="^tcp_enter_cwr.*", fn_name="empty")
if __name__ == "__main__":
main()
#!/usr/bin/python
#
# funccount Count kernel function calls.
# For Linux, uses BCC, eBPF. See .c file.
#
# USAGE: funccount [-h] [-p PID] [-i INTERVAL] [-T] [-r] pattern
#
# The pattern is a string with optional '*' wildcards, similar to file globbing.
# If you'd prefer to use regular expressions, use the -r option.
#
# Copyright (c) 2015 Brendan Gregg.
# Licensed under the Apache License, Version 2.0 (the "License")
#
# 09-Sep-2015 Brendan Gregg Created this.
from __future__ import print_function
from bcc import BPF
from time import sleep, strftime
import argparse
import re
import signal
# arguments
examples = """examples:
./funccount 'vfs_*' # count kernel functions starting with "vfs"
./funccount 'tcp_send*' # count kernel funcs starting with "tcp_send"
./funccount -r '^vfs.*' # same as above, using regular expressions
./funccount -Ti 5 'vfs_*' # output every 5 seconds, with timestamps
./funccount -p 185 'vfs_*' # count vfs calls for PID 181 only
"""
parser = argparse.ArgumentParser(
description="Count kernel function calls",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=examples)
parser.add_argument("-p", "--pid",
help="trace this PID only")
parser.add_argument("-i", "--interval", default=99999999,
help="summary interval, seconds")
parser.add_argument("-T", "--timestamp", action="store_true",
help="include timestamp on output")
parser.add_argument("-r", "--regexp", action="store_true",
help="use regular expressions. Default is \"*\" wildcards only.")
parser.add_argument("pattern",
help="search expression for kernel functions")
args = parser.parse_args()
pattern = args.pattern
if not args.regexp:
pattern = pattern.replace('*', '.*')
pattern = '^' + pattern + '$'
debug = 0
# signal handler
def signal_ignore(signal, frame):
print()
# load BPF program
bpf_text = """
#include <uapi/linux/ptrace.h>
struct key_t {
u64 ip;
};
BPF_HASH(counts, struct key_t);
int trace_count(struct pt_regs *ctx) {
FILTER_START
struct key_t key = {};
u64 zero = 0, *val;
key.ip = ctx->ip;
val = counts.lookup_or_init(&key, &zero);
(*val)++;
FILTER_DONE
return 0;
}
"""
if args.pid:
bpf_text = bpf_text.replace('FILTER_START',
('u32 pid; pid = bpf_get_current_pid_tgid(); ' +
'if (pid == %s) {') % (args.pid))
bpf_text = bpf_text.replace('FILTER_DONE', '}')
else:
bpf_text = bpf_text.replace('FILTER_START', '')
bpf_text = bpf_text.replace('FILTER_DONE', '')
if debug:
print(bpf_text)
b = BPF(text=bpf_text)
b.attach_kprobe(event_re=pattern, fn_name="trace_count")
# header
print("Tracing... Ctrl-C to end.")
# output
exiting = 0 if args.interval else 1
while (1):
try:
sleep(int(args.interval))
except KeyboardInterrupt:
exiting=1
# as cleanup can take many seconds, trap Ctrl-C:
signal.signal(signal.SIGINT, signal_ignore)
print()
if args.timestamp:
print("%-8s\n" % strftime("%H:%M:%S"), end="")
print("%-16s %-26s %8s" % ("ADDR", "FUNC", "COUNT"))
counts = b.get_table("counts")
for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
print("%-16x %-26s %8d" % (k.ip, b.ksym(k.ip), v.value))
counts.clear()
if exiting:
print("Detaching...")
exit()
This diff is collapsed.
......@@ -16,12 +16,10 @@ from bcc import BPF
# load BPF program
b = BPF(text = """
int do_sync(void *ctx) {
void kprobe__sys_sync(void *ctx) {
bpf_trace_printk("sync()\\n");
return 0;
};
""")
b.attach_kprobe(event="sys_sync", fn_name="do_sync")
# header
print("%-18s %s" % ("TIME(s)", "CALL"))
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment