Commit c063fc4f authored by Brenden Blanco, committed by GitHub

Merge branch 'master' into patch

parents 4bb64cfb afb19da0
......@@ -2,6 +2,10 @@
*.swp
*.swo
*.pyc
.idea
# Build artefacts
# Build artifacts
/build/
cmake-build-debug
debian/**/*.log
obj-x86_64-linux-gnu
......@@ -7,6 +7,7 @@
- [Arch](#arch---aur)
- [Gentoo](#gentoo---portage)
* [Source](#source)
- [Debian](#debian---source)
- [Ubuntu](#ubuntu---source)
- [Fedora](#fedora---source)
* [Older Instructions](#older-instructions)
......@@ -163,6 +164,90 @@ The appropriate dependencies (e.g., ```clang```, ```llvm``` with BPF backend) wi
# Source
## Debian - Source
### Jessie
#### Repositories
The automated tests that run as part of the build process require `netperf`. Since netperf's license is not "certified"
as an open-source license, it is in Debian's `non-free` repository.
`/etc/apt/sources.list` should include the `non-free` repository and look something like this:
```
deb http://httpredir.debian.org/debian/ jessie main non-free
deb-src http://httpredir.debian.org/debian/ jessie main non-free
deb http://security.debian.org/ jessie/updates main non-free
deb-src http://security.debian.org/ jessie/updates main non-free
# jessie-updates, previously known as 'volatile'
deb http://ftp.us.debian.org/debian/ jessie-updates main non-free
deb-src http://ftp.us.debian.org/debian/ jessie-updates main non-free
```
BCC also requires kernel version 4.1 or above. Those kernels are available in the `jessie-backports` repository. To
add the `jessie-backports` repository to your system create the file `/etc/apt/sources.list.d/jessie-backports.list`
with the following contents:
```
deb http://httpredir.debian.org/debian jessie-backports main
deb-src http://httpredir.debian.org/debian jessie-backports main
```
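Since the 4.1+ kernel requirement is easy to miss, here is a quick self-contained check of the running kernel version (an illustrative Python sketch, not part of the official build instructions):

```python
import platform

def kernel_at_least(release, minimum=(4, 1)):
    """Return True if a kernel release string such as '4.8.0-0.bpo.2-amd64'
    meets the minimum (major, minor) version."""
    numeric = release.split("-")[0]                  # drop the '-0.bpo...' suffix
    parts = tuple(int(p) for p in numeric.split(".")[:2])
    return parts >= minimum

if __name__ == "__main__":
    rel = platform.release()
    print("kernel", rel, "ok for BCC:", kernel_at_least(rel))
```

If this prints `False`, install a backports kernel as shown in the next step before building.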
#### Install Build Dependencies
Note: check for the latest `linux-image-4.x` version in `jessie-backports` before proceeding. Also, have a look at the
`Build-Depends:` section of the `debian/control` file.
```
# Before you begin
apt-get update
# Update kernel and linux-base package
apt-get -t jessie-backports install linux-base linux-image-4.8.0-0.bpo.2-amd64
# BCC build dependencies:
apt-get install debhelper cmake libllvm3.8 llvm-3.8-dev libclang-3.8-dev \
libelf-dev bison flex libedit-dev clang-format-3.8 python python-netaddr \
python-pyroute2 luajit libluajit-5.1-dev arping iperf netperf ethtool \
devscripts
```
#### Sudo
Adding eBPF probes to the kernel and removing probes from it requires root privileges. For the build to complete
successfully, you must build from an account with `sudo` access. (You may also build as root, but it is bad style.)
`/etc/sudoers` or `/etc/sudoers.d/build-user` should contain
```
build-user ALL = (ALL) NOPASSWD: ALL
```
or
```
build-user ALL = (ALL) ALL
```
If using the latter sudoers configuration, please keep an eye out for sudo's password prompt while the build is running.
#### Build
```
cd <preferred development directory>
git clone https://github.com/iovisor/bcc.git
cd bcc
debuild -b -uc -us
```
#### Install
```
cd ..
sudo dpkg -i *bcc*.deb
```
## Ubuntu - Source
To build the toolchain from source, one needs:
......@@ -219,9 +304,12 @@ sudo pip install pyroute2
wget http://llvm.org/releases/3.7.1/clang+llvm-3.7.1-x86_64-fedora22.tar.xz
sudo tar xf clang+llvm-3.7.1-x86_64-fedora22.tar.xz -C /usr/local --strip 1
# FC23 and FC24
# FC23
wget http://llvm.org/releases/3.9.0/clang+llvm-3.9.0-x86_64-fedora23.tar.xz
sudo tar xf clang+llvm-3.9.0-x86_64-fedora23.tar.xz -C /usr/local --strip 1
# FC24 and FC25
sudo dnf install -y clang clang-devel llvm llvm-devel llvm-static ncurses-devel
```
### Install and compile BCC
......
......@@ -73,7 +73,7 @@ Examples:
- examples/tracing/[vfsreadlat.py](examples/tracing/vfsreadlat.py) examples/tracing/[vfsreadlat.c](examples/tracing/vfsreadlat.c): VFS read latency distribution. [Examples](examples/tracing/vfsreadlat_example.txt).
#### Tools:
<center><a href="images/bcc_tracing_tools_2016.png"><img src="images/bcc_tracing_tools_2016.png" border=0 width=700></a></center>
<center><a href="images/bcc_tracing_tools_2017.png"><img src="images/bcc_tracing_tools_2017.png" border=0 width=700></a></center>
- tools/[argdist](tools/argdist.py): Display function parameter values as a histogram or frequency count. [Examples](tools/argdist_example.txt).
- tools/[bashreadline](tools/bashreadline.py): Print entered bash commands system wide. [Examples](tools/bashreadline_example.txt).
- tools/[biolatency](tools/biolatency.py): Summarize block device I/O latency as a histogram. [Examples](tools/biolatency_example.txt).
......@@ -89,6 +89,7 @@ Examples:
- tools/[cpuunclaimed](tools/cpuunclaimed.py): Sample CPU run queues and calculate unclaimed idle CPU. [Examples](tools/cpuunclaimed_example.txt)
- tools/[dcsnoop](tools/dcsnoop.py): Trace directory entry cache (dcache) lookups. [Examples](tools/dcsnoop_example.txt).
- tools/[dcstat](tools/dcstat.py): Directory entry cache (dcache) stats. [Examples](tools/dcstat_example.txt).
- tools/[deadlock_detector](tools/deadlock_detector.py): Detect potential deadlocks on a running process. [Examples](tools/deadlock_detector_example.txt)
- tools/[execsnoop](tools/execsnoop.py): Trace new processes via exec() syscalls. [Examples](tools/execsnoop_example.txt).
- tools/[ext4dist](tools/ext4dist.py): Summarize ext4 operation latency distribution as a histogram. [Examples](tools/ext4dist_example.txt).
- tools/[ext4slower](tools/ext4slower.py): Trace slow ext4 operations. [Examples](tools/ext4slower_example.txt).
......
%bcond_with local_clang_static
# LuaJIT is not available for some architectures
%ifarch ppc64 aarch64 ppc64le
%{!?with_lua: %global with_lua 0}
%else
%{!?with_lua: %global with_lua 1}
%endif
%define debug_package %{nil}
Name: bcc
......@@ -11,10 +17,12 @@ License: ASL 2.0
URL: https://github.com/iovisor/bcc
Source0: bcc.tar.gz
ExclusiveArch: x86_64
ExclusiveArch: x86_64 ppc64 aarch64 ppc64le
BuildRequires: bison cmake >= 2.8.7 flex make
BuildRequires: gcc gcc-c++ python2-devel elfutils-libelf-devel-static
%if %{with_lua}
BuildRequires: luajit luajit-devel
%endif
%if %{without local_clang_static}
BuildRequires: llvm-devel llvm-static
BuildRequires: clang-devel
......@@ -25,6 +33,11 @@ BuildRequires: pkgconfig ncurses-devel
Python bindings for BPF Compiler Collection (BCC). Control a BPF program from
userspace.
%if %{with_lua}
%global lua_include `pkg-config --variable=includedir luajit`
%global lua_libs `pkg-config --variable=libdir luajit`/lib`pkg-config --variable=libname luajit`.so
%global lua_config -DLUAJIT_INCLUDE_DIR=%{lua_include} -DLUAJIT_LIBRARIES=%{lua_libs}
%endif
%prep
%setup -q -n bcc
......@@ -35,8 +48,7 @@ mkdir build
pushd build
cmake .. -DREVISION_LAST=%{version} -DREVISION=%{version} \
-DCMAKE_INSTALL_PREFIX=/usr \
-DLUAJIT_INCLUDE_DIR=`pkg-config --variable=includedir luajit` \
-DLUAJIT_LIBRARIES=`pkg-config --variable=libdir luajit`/lib`pkg-config --variable=libname luajit`.so
%{?lua_config}
make %{?_smp_mflags}
popd
......@@ -56,16 +68,20 @@ Requires: libbcc = %{version}-%{release}
%description -n python-bcc
Python bindings for BPF Compiler Collection (BCC)
%if %{with_lua}
%package -n bcc-lua
Summary: Standalone tool to run BCC tracers written in Lua
Requires: libbcc = %{version}-%{release}
%description -n bcc-lua
Standalone tool to run BCC tracers written in Lua
%endif
%package -n libbcc-examples
Summary: Examples for BPF Compiler Collection (BCC)
Requires: python-bcc = %{version}-%{release}
%if %{with_lua}
Requires: bcc-lua = %{version}-%{release}
%endif
%description -n libbcc-examples
Examples for BPF Compiler Collection (BCC)
......@@ -82,8 +98,10 @@ Command line tools for BPF Compiler Collection (BCC)
%files -n python-bcc
%{python_sitelib}/bcc*
%if %{with_lua}
%files -n bcc-lua
/usr/bin/bcc-lua
%endif
%files -n libbcc-examples
/usr/share/bcc/examples/*
......
......@@ -3,7 +3,12 @@ Maintainer: Brenden Blanco <bblanco@plumgrid.com>
Section: misc
Priority: optional
Standards-Version: 3.9.5
Build-Depends: debhelper (>= 9), cmake, libllvm3.7 | libllvm3.8, llvm-3.7-dev | llvm-3.8-dev, libclang-3.7-dev | libclang-3.8-dev, libelf-dev, bison, flex, libedit-dev, clang-format | clang-format-3.7, python-netaddr, python-pyroute2, luajit, libluajit-5.1-dev
Build-Depends: debhelper (>= 9), cmake, libllvm3.7 | libllvm3.8,
llvm-3.7-dev | llvm-3.8-dev, libclang-3.7-dev | libclang-3.8-dev,
libelf-dev, bison, flex, libedit-dev,
clang-format | clang-format-3.7 | clang-format-3.8, python (>= 2.7),
python-netaddr, python-pyroute2, luajit, libluajit-5.1-dev, arping,
inetutils-ping | iputils-ping, iperf, netperf, ethtool, devscripts
Homepage: https://github.com/iovisor/bcc
Package: libbcc
......
......@@ -15,6 +15,7 @@ ARM64 | 3.18 | [e54bcde3d69d](https://git.kernel.org/cgit/linux/kernel/git/torva
s390 | 4.1 | [054623105728](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=054623105728b06852f077299e2bf1bf3d5f2b0b)
Constant blinding for JIT machines | 4.7 | [4f3446bb809f](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4f3446bb809f20ad56cadf712e6006815ae7a8f9)
PowerPC64 | 4.8 | [156d0e290e96](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=156d0e290e969caba25f1851c52417c14d141b24)
Constant blinding - PowerPC64 | 4.9 | [b7b7013cac55](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b7b7013cac55d794940bd9cb7b7c55c9dececac4)
## Main features
......@@ -24,7 +25,7 @@ Feature | Kernel version | Commit
Kernel helpers | 3.15 | [bd4cf0ed331a](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=bd4cf0ed331a275e9bf5a49e6d0fd55dffc551b8)
`bpf()` syscall | 3.18 | [99c55f7d47c0](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=99c55f7d47c0dc6fc64729f37bf435abf43f4c60)
Tables (_a.k.a._ Maps; details below) | 3.18 | [99c55f7d47c0](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=99c55f7d47c0dc6fc64729f37bf435abf43f4c60)
BPF attached to sockets | 3.19 | [89aa075832b0](89aa075832b0da4402acebd698d0411dcc82d03e)
BPF attached to sockets | 3.19 | [89aa075832b0](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=89aa075832b0da4402acebd698d0411dcc82d03e)
BPF attached to `kprobes` | 4.1 | [2541517c32be](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2541517c32be2531e0da59dfd7efc1ce844644f5)
`cls_bpf` / `act_bpf` for `tc` | 4.1 | [e2e9b6541dd4](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e2e9b6541dd4b31848079da80fe2253daaafb549)
Tail calls | 4.2 | [04fd61ab36ec](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=04fd61ab36ec065e194ab5e74ae34a5240d992bb)
......@@ -53,9 +54,11 @@ Perf events | 4.3 | [ea317b267e9d](https://git.kernel.org/cgit/linux/kernel/git/
Per-CPU hash | 4.6 | [824bd0ce6c7c](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=824bd0ce6c7c43a9e1e210abf124958e54d88342)
Per-CPU array | 4.6 | [a10423b87a7e](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a10423b87a7eae75da79ce80a8d9475047a674ee)
Stack trace | 4.6 | [d5a3b1f69186](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d5a3b1f691865be576c2bffa708549b8cdccda19)
Pre-alloc maps memory | 4.6 | [6c9059817432](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6c90598174322b8888029e40dd84a4eb01f56afe)
cgroup array | 4.8 | [4ed8ec521ed5](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=4ed8ec521ed57c4e207ad464ca0388776de74d4b)
LRU hash | [4.10](https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=29ba732acbeece1e34c68483d1ec1f3720fa1bb3) | [](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=29ba732acbeece1e34c68483d1ec1f3720fa1bb3)
LRU per-CPU hash | [4.10](https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=8f8449384ec364ba2a654f11f94e754e4ff719e0) | [](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8f8449384ec364ba2a654f11f94e754e4ff719e0)
LRU hash | 4.10 | [29ba732acbee](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=29ba732acbeece1e34c68483d1ec1f3720fa1bb3)
LRU per-CPU hash | 4.10 | [8f8449384ec3](https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8f8449384ec364ba2a654f11f94e754e4ff719e0)
LPM trie | 4.11 | [b95a5c4db09b](https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=b95a5c4db09bc7c253636cb84dc9b12c577fd5a0)
Text string | _To be done?_ |
Variable-length maps | _To be done?_ |
......
.TH deadlock_detector 8 "2017-02-01" "USER COMMANDS"
.SH NAME
deadlock_detector \- Find potential deadlocks (lock order inversions)
in a running program.
.SH SYNOPSIS
.B deadlock_detector [\-h] [\--binary BINARY] [\--dump-graph DUMP_GRAPH]
.B [\--verbose] [\--lock-symbols LOCK_SYMBOLS]
.B [\--unlock-symbols UNLOCK_SYMBOLS]
.B pid
.SH DESCRIPTION
deadlock_detector finds potential deadlocks in a running process. The program
attaches uprobes on `pthread_mutex_lock` and `pthread_mutex_unlock` by default
to build a mutex wait directed graph, and then looks for a cycle in this graph.
This graph has the following properties:
- Nodes in the graph represent mutexes.
- An edge (A, B) exists if some thread T called lock(B) after lock(A) and
before unlock(A), i.e., acquired B while still holding A.
If there is a cycle in this graph, this indicates that there is a lock order
inversion (potential deadlock). If the program finds a lock order inversion, the
program will dump the cycle of mutexes, dump the stack traces where each mutex
was acquired, and then exit.
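The graph construction and cycle check described above can be sketched in a few lines of Python (a simplified model of the tool's logic, not the actual BPF-based implementation):

```python
def lock_order_edges(thread_events):
    """Build edges (A, B) meaning: a thread acquired B while still holding A.
    thread_events maps thread id -> list of ("lock"/"unlock", mutex)."""
    edges = set()
    for events in thread_events.values():
        held = []
        for op, m in events:
            if op == "lock":
                for h in held:
                    edges.add((h, m))
                held.append(m)
            elif op == "unlock" and m in held:
                held.remove(m)
    return edges

def has_cycle(edges):
    """DFS over the directed graph; a back edge means a lock order inversion."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, []):
            c = color.get(nxt, WHITE)
            if c == GRAY or (c == WHITE and visit(nxt)):
                return True          # back edge: cycle found
        color[node] = BLACK
        return False
    return any(visit(n) for n in graph if color.get(n, WHITE) == WHITE)
```

For example, with thread T doing lock(M0) then lock(M1), and thread S doing lock(M1) then lock(M0), the edge set contains both (M0, M1) and (M1, M0), and has_cycle reports the inversion.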
This program can only find potential deadlocks that occur while the program is
tracing the process. It cannot find deadlocks that may have occurred before the
program was attached to the process.
This tool does not work for shared mutexes or recursive mutexes.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc
.SH OPTIONS
.TP
\-h, --help
show this help message and exit
.TP
\--binary BINARY
If set, trace the mutexes from the binary at this path. For
statically-linked binaries, this argument is not required.
For dynamically-linked binaries, this argument is required and should be the
path of the pthread library the binary is using.
Example: /lib/x86_64-linux-gnu/libpthread.so.0
.TP
\--dump-graph DUMP_GRAPH
If set, this will dump the mutex graph to the specified file.
.TP
\--verbose
Print statistics about the mutex wait graph.
.TP
\--lock-symbols LOCK_SYMBOLS
Comma-separated list of lock symbols to trace. Default is pthread_mutex_lock.
These symbols cannot be inlined in the binary.
.TP
\--unlock-symbols UNLOCK_SYMBOLS
Comma-separated list of unlock symbols to trace. Default is
pthread_mutex_unlock. These symbols cannot be inlined in the binary.
.TP
pid
Pid to trace
.SH EXAMPLES
.TP
Find potential deadlocks in PID 181. The --binary argument is not needed for \
statically-linked binaries.
#
.B deadlock_detector 181
.TP
Find potential deadlocks in PID 181. If the process was created from a \
dynamically-linked executable, the --binary argument is required and must be \
the path of the pthread library:
#
.B deadlock_detector 181 --binary /lib/x86_64-linux-gnu/libpthread.so.0
.TP
Find potential deadlocks in PID 181. If the process was created from a \
statically-linked executable, optionally pass the location of the binary. \
On older kernels without https://lkml.org/lkml/2017/1/13/585, binaries that \
contain `:` in the path cannot be attached with uprobes. As a workaround, we \
can create a symlink to the binary, and provide the symlink name instead with \
the `--binary` option:
#
.B deadlock_detector 181 --binary /usr/local/bin/lockinversion
.TP
Find potential deadlocks in PID 181 and dump the mutex wait graph to a file:
#
.B deadlock_detector 181 --dump-graph graph.json
.TP
Find potential deadlocks in PID 181 and print mutex wait graph statistics:
#
.B deadlock_detector 181 --verbose
.TP
Find potential deadlocks in PID 181 with custom mutexes:
#
.B deadlock_detector 181
.B --lock-symbols custom_mutex1_lock,custom_mutex2_lock
.B --unlock-symbols custom_mutex1_unlock,custom_mutex2_unlock
.SH OUTPUT
This program does not output any fields. Rather, it will keep running until
it finds a potential deadlock, or the user hits Ctrl-C. If the program finds
a potential deadlock, it will output the stack traces and lock order inversion
in the following format and exit:
.TP
Potential Deadlock Detected!
.TP
Cycle in lock order graph: Mutex M0 => Mutex M1 => Mutex M0
.TP
Mutex M1 acquired here while holding Mutex M0 in Thread T:
.B [stack trace]
.TP
Mutex M0 previously acquired by the same Thread T here:
.B [stack trace]
.TP
Mutex M0 acquired here while holding Mutex M1 in Thread S:
.B [stack trace]
.TP
Mutex M1 previously acquired by the same Thread S here:
.B [stack trace]
.TP
Thread T created by Thread R here:
.B [stack trace]
.TP
Thread S created by Thread Q here:
.B [stack trace]
.SH OVERHEAD
This traces all mutex lock and unlock events and all thread creation events
on the traced process. The overhead of this can be high if the process has many
threads and mutexes. You should only run this on a process where the slowdown
is acceptable.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Kenny Yu
......@@ -21,6 +21,10 @@ and may need modifications to match your software and processor architecture.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-p PID
Trace this process ID only.
.SH EXAMPLES
.TP
Trace host lookups (getaddrinfo/gethostbyname[2]) system wide:
......
......@@ -3,7 +3,7 @@
memleak \- Print a summary of outstanding allocations and their call stacks to detect memory leaks. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B memleak [-h] [-p PID] [-t] [-a] [-o OLDER] [-c COMMAND] [-s SAMPLE_RATE]
[-T TOP] [-z MIN_SIZE] [-Z MAX_SIZE] [INTERVAL] [COUNT]
[-T TOP] [-z MIN_SIZE] [-Z MAX_SIZE] [-O OBJ] [INTERVAL] [COUNT]
.SH DESCRIPTION
memleak traces and matches memory allocation and deallocation requests, and
collects call stacks for each allocation. memleak can then print a summary
......@@ -53,6 +53,9 @@ Capture only allocations that are larger than or equal to MIN_SIZE bytes.
\-Z MAX_SIZE
Capture only allocations that are smaller than or equal to MAX_SIZE bytes.
.TP
\-O OBJ
Attach to malloc and free in specified object instead of resolving libc. Ignored when kernel allocations are profiled.
.TP
INTERVAL
Print a summary of outstanding allocations and their call stacks every INTERVAL seconds.
The default interval is 5 seconds.
......
......@@ -49,7 +49,7 @@ Show stacks from kernel space only (no user space stacks).
.TP
\-\-stack-storage-size COUNT
The maximum number of unique stack traces that the kernel will count (default
2048). If the sampled count exceeds this, a warning will be printed.
10240). If the sampled count exceeds this, a warning will be printed.
.TP
duration
Duration to trace, in seconds.
......
......@@ -25,7 +25,8 @@ CONFIG_BPF and bcc.
Print usage message.
.TP
\-t
Include a timestamp column.
Include a timestamp column, in seconds since the first event, with decimal
places.
.TP
\-x
Only print failed stats.
......
......@@ -62,7 +62,7 @@ information. See PROBE SYNTAX below.
.SH PROBE SYNTAX
The general probe syntax is as follows:
.B [{p,r}]:[library]:function [(predicate)] ["format string"[, arguments]]
.B [{p,r}]:[library]:function[(signature)] [(predicate)] ["format string"[, arguments]]
.B {t:category:event,u:library:probe} [(predicate)] ["format string"[, arguments]]
.TP
......@@ -84,6 +84,12 @@ The tracepoint category. For example, "sched" or "irq".
.B function
The function to probe.
.TP
.B signature
The optional signature of the function to probe. This can make it easier to
access the function's arguments, instead of using the "arg1", "arg2" etc.
argument specifiers. For example, "(struct timespec *ts)" in the signature
position lets you use "ts" in the filter or print expressions.
.TP
.B event
The tracepoint event. For example, "block_rq_complete".
.TP
......@@ -159,6 +165,10 @@ Trace the block:block_rq_complete tracepoint and print the number of sectors com
Trace the pthread_create USDT probe from the pthread library and print the address of the thread's start function:
#
.B trace 'u:pthread:pthread_create """start addr = %llx"", arg3'
.TP
Trace the nanosleep system call and print the sleep duration in nanoseconds:
#
.B trace 'p::SyS_nanosleep(struct timespec *ts) "sleep for %lld ns", ts->tv_nsec'
.SH SOURCE
This is from bcc.
.IP
......
......@@ -2,28 +2,29 @@
.SH NAME
ucalls \- Summarize method calls from high-level languages and Linux syscalls.
.SH SYNOPSIS
.B ucalls [-l {java,python,ruby}] [-h] [-T TOP] [-L] [-S] [-v] [-m] pid [interval]
.B ucalls [-l {java,python,ruby,php}] [-h] [-T TOP] [-L] [-S] [-v] [-m] pid [interval]
.SH DESCRIPTION
This tool summarizes method calls from high-level languages such as Python,
Java, and Ruby. It can also trace Linux system calls. Whenever a method is
Java, Ruby, and PHP. It can also trace Linux system calls. Whenever a method is
invoked, ucalls records the call count and optionally the method's execution
time (latency) and displays a summary.
This uses in-kernel eBPF maps to store per-process summaries for efficiency.
This tool relies on USDT probes embedded in many high-level languages, such as
Node, Java, Python, and Ruby. It requires a runtime instrumented with these
Java, Python, Ruby, and PHP. It requires a runtime instrumented with these
probes, which in some cases requires building from source with a USDT-specific
flag, such as "--enable-dtrace" or "--with-dtrace". For Java, method probes are
not enabled by default, and can be turned on by running the Java process with
the "-XX:+ExtendedDTraceProbes" flag.
the "-XX:+ExtendedDTraceProbes" flag. For PHP processes, the environment
variable USE_ZEND_DTRACE must be set to 1.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-l {java,python,ruby,node}
\-l {java,python,ruby,php}
The language to trace. If not provided, only syscalls are traced (when the \-S
option is used).
.TP
......
......@@ -2,16 +2,17 @@
.SH NAME
uflow \- Print a flow graph of method calls in high-level languages.
.SH SYNOPSIS
.B uflow [-h] [-M METHOD] [-C CLAZZ] [-v] {java,python,ruby} pid
.B uflow [-h] [-M METHOD] [-C CLAZZ] [-v] {java,python,ruby,php} pid
.SH DESCRIPTION
uflow traces method calls and prints them in a flow graph that can facilitate
debugging and diagnostics by following the program's execution (method flow).
This tool relies on USDT probes embedded in many high-level languages, such as
Node, Java, Python, and Ruby. It requires a runtime instrumented with these
Java, Python, Ruby, and PHP. It requires a runtime instrumented with these
probes, which in some cases requires building from source with a USDT-specific
flag, such as "--enable-dtrace" or "--with-dtrace". For Java processes, the
startup flag "-XX:+ExtendedDTraceProbes" is required.
startup flag "-XX:+ExtendedDTraceProbes" is required. For PHP processes, the
environment variable USE_ZEND_DTRACE must be set to 1.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
......@@ -29,7 +30,7 @@ name interpretation strongly depends on the language. For example, in Java use
\-v
Print the resulting BPF program, for debugging purposes.
.TP
{java,python,ruby}
{java,python,ruby,php}
The language to trace.
.TP
pid
......
......@@ -2,7 +2,7 @@
.SH NAME
ugc \- Trace garbage collection events in high-level languages.
.SH SYNOPSIS
.B ugc [-h] [-v] [-m] {java,python,ruby,node} pid
.B ugc [-h] [-v] [-m] [-M MINIMUM] [-F FILTER] {java,python,ruby,node} pid
.SH DESCRIPTION
This traces garbage collection events as they occur, including their duration
and any additional information (such as generation collected or type of GC)
......@@ -24,6 +24,18 @@ Print the resulting BPF program, for debugging purposes.
\-m
Print times in milliseconds. The default is microseconds.
.TP
\-M MINIMUM
Display only collections that are longer than this threshold. The value is
given in milliseconds. The default is to display all collections.
.TP
\-F FILTER
Display only collections whose textual description matches (contains) this
string. The default is to display all collections. Note that the filtering here
is performed in user-space, and not as part of the BPF program. This means that
if you have thousands of collection events, specifying this filter will not
reduce the amount of data that has to be transferred from the BPF program to
the user-space script.
.TP
{java,python,ruby,node}
The language to trace.
.TP
......@@ -39,16 +51,22 @@ Trace garbage collections in a specific Java process, and print GC times in
milliseconds:
#
.B ugc -m java 6004
.TP
Trace garbage collections in a specific Java process, and display them only if
they are longer than 10ms and have the string "Tenured" in their detailed
description:
#
.B ugc -M 10 -F Tenured java 6004
.SH FIELDS
.TP
START
The start time of the GC, in seconds from the beginning of the trace.
.TP
DESCRIPTION
The runtime-provided description of this garbage collection event.
.TP
TIME
The duration of the garbage collection event.
.TP
DESCRIPTION
The runtime-provided description of this garbage collection event.
.SH OVERHEAD
Garbage collection events, even if frequent, should not produce a considerable
overhead when traced because they are still not very common. Even hundreds of
......
......@@ -2,7 +2,7 @@
.SH NAME
ustat \- Activity stats from high-level languages.
.SH SYNOPSIS
.B ustat [-l {java,python,ruby,node}] [-C] [-S {cload,excp,gc,method,objnew,thread}] [-r MAXROWS] [-d] [interval [count]]
.B ustat [-l {java,python,ruby,node,php}] [-C] [-S {cload,excp,gc,method,objnew,thread}] [-r MAXROWS] [-d] [interval [count]]
.SH DESCRIPTION
This is "top" for high-level language events, such as garbage collections,
exceptions, thread creations, object allocations, method calls, and more. The
......@@ -12,11 +12,12 @@ can be sorted by various fields.
This uses in-kernel eBPF maps to store per-process summaries for efficiency.
This tool relies on USDT probes embedded in many high-level languages, such as
Node, Java, Python, and Ruby. It requires a runtime instrumented with these
Node, Java, Python, Ruby, and PHP. It requires a runtime instrumented with these
probes, which in some cases requires building from source with a USDT-specific
flag, such as "--enable-dtrace" or "--with-dtrace". For Java, some probes are
not enabled by default, and can be turned on by running the Java process with
the "-XX:+ExtendedDTraceProbes" flag.
the "-XX:+ExtendedDTraceProbes" flag. For PHP processes, the environment
variable USE_ZEND_DTRACE must be set to 1.
Newly-created processes will only be traced at the next interval. If you run
this tool with a short interval (say, 1-5 seconds), this should be virtually
......@@ -28,7 +29,7 @@ Since this uses BPF, only the root user can use this tool.
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-l {java,python,ruby,node}
\-l {java,python,ruby,node,php}
The language to trace. By default, all languages are traced.
.TP
\-C
......
......@@ -24,14 +24,20 @@ pushd $TMP
tar xf bcc_$revision.orig.tar.gz
cd bcc
debuild=debuild
if [[ "$buildtype" = "test" ]]; then
# when testing, use faster compression options
debuild+=" --preserve-envvar PATH"
echo -e '#!/bin/bash\nexec /usr/bin/dpkg-deb -z1 "$@"' \
| sudo tee /usr/local/bin/dpkg-deb
sudo chmod +x /usr/local/bin/dpkg-deb
dch -b -v $revision-$release "$git_subject"
fi
if [[ "$buildtype" = "nightly" ]]; then
dch -v $revision-$release "$git_subject"
fi
DEB_BUILD_OPTIONS="nocheck parallel=${PARALLEL}" debuild -us -uc
DEB_BUILD_OPTIONS="nocheck parallel=${PARALLEL}" $debuild -us -uc
popd
cp $TMP/*.deb .
......@@ -16,7 +16,7 @@
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#
name: bcc
version: 0.2.0-20161215-1402-7151673
version: 0.2.0-20170208-1555-3e77af5
summary: BPF Compiler Collection (BCC)
description: A toolkit for creating efficient kernel tracing and manipulation programs
confinement: strict
......@@ -46,12 +46,18 @@ apps:
command: wrapper cachestat
cachetop:
command: wrapper cachetop
capable:
command: wrapper capable
cpudist:
command: wrapper cpudist
cpuunclaimed:
command: wrapper cpuunclaimed
dcsnoop:
command: wrapper dcsnoop
dcstat:
command: wrapper dcstat
deadlock-detector:
command: wrapper deadlock_detector
execsnoop:
command: wrapper execsnoop
ext4dist:
......@@ -74,10 +80,14 @@ apps:
command: wrapper hardirqs
killsnoop:
command: wrapper killsnoop
llcstat:
command: wrapper llcstat
mdflush:
command: wrapper mdflush
memleak:
command: wrapper memleak
mountsnoop:
command: wrapper mountsnoop
offcputime:
command: wrapper offcputime
offwaketime:
......@@ -88,12 +98,18 @@ apps:
command: wrapper opensnoop
pidpersec:
command: wrapper pidpersec
profile:
command: wrapper profile
runqlat:
command: wrapper runqlat
runqlen:
command: wrapper runqlen
slabratetop:
command: wrapper slabratetop
softirqs:
command: wrapper softirqs
solisten:
command: wrapper solisten
sslsniff:
command: wrapper sslsniff
stackcount:
......@@ -116,10 +132,24 @@ apps:
command: wrapper tcpretrans
tcptop:
command: wrapper tcptop
ttysnoop:
command: wrapper ttysnop
tplist:
command: wrapper tplist
trace:
command: wrapper trace
ttysnoop:
command: wrapper ttysnoop
ucalls:
command: wrapper ucalls
uflow:
command: wrapper uflow
ugc:
command: wrapper ugc
uobjnew:
command: wrapper uobjnew
ustat:
command: wrapper ustat
uthreads:
command: wrapper uthreads
vfscount:
command: wrapper vfscount
vfsstat:
......
......@@ -30,6 +30,7 @@
#include "bpf_module.h"
#include "libbpf.h"
#include "perf_reader.h"
#include "common.h"
#include "usdt.h"
#include "BPF.h"
......@@ -146,7 +147,7 @@ StatusTuple BPF::detach_all() {
StatusTuple BPF::attach_kprobe(const std::string& kernel_func,
const std::string& probe_func,
bpf_attach_type attach_type,
bpf_probe_attach_type attach_type,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void* cb_cookie) {
std::string probe_event = get_kprobe_event(kernel_func, attach_type);
......@@ -156,11 +157,8 @@ StatusTuple BPF::attach_kprobe(const std::string& kernel_func,
int probe_fd;
TRY2(load_func(probe_func, BPF_PROG_TYPE_KPROBE, probe_fd));
std::string probe_event_desc = attach_type_prefix(attach_type);
probe_event_desc += ":kprobes/" + probe_event + " " + kernel_func;
void* res =
bpf_attach_kprobe(probe_fd, probe_event.c_str(), probe_event_desc.c_str(),
bpf_attach_kprobe(probe_fd, attach_type, probe_event.c_str(), kernel_func.c_str(),
pid, cpu, group_fd, cb, cb_cookie);
if (!res) {
......@@ -181,7 +179,7 @@ StatusTuple BPF::attach_uprobe(const std::string& binary_path,
const std::string& symbol,
const std::string& probe_func,
uint64_t symbol_addr,
bpf_attach_type attach_type,
bpf_probe_attach_type attach_type,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void* cb_cookie) {
bcc_symbol sym = bcc_symbol();
......@@ -195,13 +193,9 @@ StatusTuple BPF::attach_uprobe(const std::string& binary_path,
int probe_fd;
TRY2(load_func(probe_func, BPF_PROG_TYPE_KPROBE, probe_fd));
std::string probe_event_desc = attach_type_prefix(attach_type);
probe_event_desc += ":uprobes/" + probe_event + " ";
probe_event_desc += binary_path + ":0x" + uint_to_hex(sym.offset);
void* res =
bpf_attach_uprobe(probe_fd, probe_event.c_str(), probe_event_desc.c_str(),
pid, cpu, group_fd, cb, cb_cookie);
bpf_attach_uprobe(probe_fd, attach_type, probe_event.c_str(), binary_path.c_str(),
sym.offset, pid, cpu, group_fd, cb, cb_cookie);
if (!res) {
TRY2(unload_func(probe_func));
......@@ -297,11 +291,12 @@ StatusTuple BPF::attach_perf_event(uint32_t ev_type, uint32_t ev_config,
TRY2(load_func(probe_func, BPF_PROG_TYPE_PERF_EVENT, probe_fd));
auto fds = new std::map<int, int>();
int cpu_st = 0;
int cpu_en = sysconf(_SC_NPROCESSORS_ONLN) - 1;
std::vector<int> cpus;
if (cpu >= 0)
cpu_st = cpu_en = cpu;
for (int i = cpu_st; i <= cpu_en; i++) {
cpus.push_back(cpu);
else
cpus = get_online_cpus();
for (int i: cpus) {
int fd = bpf_attach_perf_event(probe_fd, ev_type, ev_config, sample_period,
sample_freq, pid, i, group_fd);
if (fd < 0) {
......@@ -323,7 +318,7 @@ StatusTuple BPF::attach_perf_event(uint32_t ev_type, uint32_t ev_config,
}
StatusTuple BPF::detach_kprobe(const std::string& kernel_func,
bpf_attach_type attach_type) {
bpf_probe_attach_type attach_type) {
std::string event = get_kprobe_event(kernel_func, attach_type);
auto it = kprobes_.find(event);
......@@ -339,7 +334,7 @@ StatusTuple BPF::detach_kprobe(const std::string& kernel_func,
StatusTuple BPF::detach_uprobe(const std::string& binary_path,
const std::string& symbol, uint64_t symbol_addr,
bpf_attach_type attach_type) {
bpf_probe_attach_type attach_type) {
bcc_symbol sym = bcc_symbol();
TRY2(check_binary_symbol(binary_path, symbol, symbol_addr, &sym));
......@@ -421,7 +416,7 @@ void BPF::poll_perf_buffer(const std::string& name, int timeout) {
}
StatusTuple BPF::load_func(const std::string& func_name,
enum bpf_prog_type type, int& fd) {
bpf_prog_type type, int& fd) {
if (funcs_.find(func_name) != funcs_.end()) {
fd = funcs_[func_name];
return StatusTuple(0);
......@@ -462,7 +457,7 @@ StatusTuple BPF::check_binary_symbol(const std::string& binary_path,
const std::string& symbol,
uint64_t symbol_addr, bcc_symbol* output) {
int res = bcc_resolve_symname(binary_path.c_str(), symbol.c_str(),
symbol_addr, output);
symbol_addr, 0, output);
if (res < 0)
return StatusTuple(
-1, "Unable to find offset for binary %s symbol %s address %lx",
......@@ -471,14 +466,14 @@ StatusTuple BPF::check_binary_symbol(const std::string& binary_path,
}
std::string BPF::get_kprobe_event(const std::string& kernel_func,
bpf_attach_type type) {
bpf_probe_attach_type type) {
std::string res = attach_type_prefix(type) + "_";
res += sanitize_str(kernel_func, &BPF::kprobe_event_validator);
return res;
}
std::string BPF::get_uprobe_event(const std::string& binary_path,
uint64_t offset, bpf_attach_type type) {
uint64_t offset, bpf_probe_attach_type type) {
std::string res = attach_type_prefix(type) + "_";
res += sanitize_str(binary_path, &BPF::uprobe_path_validator);
res += "_0x" + uint_to_hex(offset);
......@@ -492,8 +487,7 @@ StatusTuple BPF::detach_kprobe_event(const std::string& event,
attr.reader_ptr = nullptr;
}
TRY2(unload_func(attr.func));
std::string detach_event = "-:kprobes/" + event;
if (bpf_detach_kprobe(detach_event.c_str()) < 0)
if (bpf_detach_kprobe(event.c_str()) < 0)
return StatusTuple(-1, "Unable to detach kprobe %s", event.c_str());
return StatusTuple(0);
}
......@@ -505,8 +499,7 @@ StatusTuple BPF::detach_uprobe_event(const std::string& event,
attr.reader_ptr = nullptr;
}
TRY2(unload_func(attr.func));
std::string detach_event = "-:uprobes/" + event;
if (bpf_detach_uprobe(detach_event.c_str()) < 0)
if (bpf_detach_uprobe(event.c_str()) < 0)
return StatusTuple(-1, "Unable to detach uprobe %s", event.c_str());
return StatusTuple(0);
}
......
......@@ -29,11 +29,6 @@
namespace ebpf {
enum class bpf_attach_type {
probe_entry,
probe_return
};
struct open_probe_t {
void* reader_ptr;
std::string func;
......@@ -56,23 +51,23 @@ public:
StatusTuple attach_kprobe(
const std::string& kernel_func, const std::string& probe_func,
bpf_attach_type attach_type = bpf_attach_type::probe_entry,
bpf_probe_attach_type = BPF_PROBE_ENTRY,
pid_t pid = -1, int cpu = 0, int group_fd = -1,
perf_reader_cb cb = nullptr, void* cb_cookie = nullptr);
StatusTuple detach_kprobe(
const std::string& kernel_func,
bpf_attach_type attach_type = bpf_attach_type::probe_entry);
bpf_probe_attach_type attach_type = BPF_PROBE_ENTRY);
StatusTuple attach_uprobe(
const std::string& binary_path, const std::string& symbol,
const std::string& probe_func, uint64_t symbol_addr = 0,
bpf_attach_type attach_type = bpf_attach_type::probe_entry,
bpf_probe_attach_type attach_type = BPF_PROBE_ENTRY,
pid_t pid = -1, int cpu = 0, int group_fd = -1,
perf_reader_cb cb = nullptr, void* cb_cookie = nullptr);
StatusTuple detach_uprobe(
const std::string& binary_path, const std::string& symbol,
uint64_t symbol_addr = 0,
bpf_attach_type attach_type = bpf_attach_type::probe_entry);
bpf_probe_attach_type attach_type = BPF_PROBE_ENTRY);
StatusTuple attach_usdt(const USDT& usdt, pid_t pid = -1, int cpu = 0,
int group_fd = -1);
StatusTuple detach_usdt(const USDT& usdt);
......@@ -111,9 +106,9 @@ private:
StatusTuple unload_func(const std::string& func_name);
std::string get_kprobe_event(const std::string& kernel_func,
bpf_attach_type type);
bpf_probe_attach_type type);
std::string get_uprobe_event(const std::string& binary_path, uint64_t offset,
bpf_attach_type type);
bpf_probe_attach_type type);
StatusTuple detach_kprobe_event(const std::string& event, open_probe_t& attr);
StatusTuple detach_uprobe_event(const std::string& event, open_probe_t& attr);
......@@ -121,21 +116,21 @@ private:
open_probe_t& attr);
StatusTuple detach_perf_event_all_cpu(open_probe_t& attr);
std::string attach_type_debug(bpf_attach_type type) {
std::string attach_type_debug(bpf_probe_attach_type type) {
switch (type) {
case bpf_attach_type::probe_entry:
case BPF_PROBE_ENTRY:
return "";
case bpf_attach_type::probe_return:
case BPF_PROBE_RETURN:
return "return ";
}
return "ERROR";
}
std::string attach_type_prefix(bpf_attach_type type) {
std::string attach_type_prefix(bpf_probe_attach_type type) {
switch (type) {
case bpf_attach_type::probe_entry:
case BPF_PROBE_ENTRY:
return "p";
case bpf_attach_type::probe_return:
case BPF_PROBE_RETURN:
return "r";
}
return "ERROR";
......
......@@ -25,6 +25,7 @@
#include "bcc_syms.h"
#include "libbpf.h"
#include "perf_reader.h"
#include "common.h"
namespace ebpf {
......@@ -89,7 +90,7 @@ StatusTuple BPFPerfBuffer::open_all_cpu(perf_reader_raw_cb cb,
if (cpu_readers_.size() != 0 || readers_.size() != 0)
return StatusTuple(-1, "Previously opened perf buffer not cleaned");
for (int i = 0; i < sysconf(_SC_NPROCESSORS_ONLN); i++) {
for (int i: get_online_cpus()) {
auto res = open_on_cpu(cb, i, cb_cookie);
if (res.code() != 0) {
TRY2(close_all_cpu());
......@@ -113,7 +114,7 @@ StatusTuple BPFPerfBuffer::close_on_cpu(int cpu) {
StatusTuple BPFPerfBuffer::close_all_cpu() {
std::string errors;
bool has_error = false;
for (int i = 0; i < sysconf(_SC_NPROCESSORS_ONLN); i++) {
for (int i: get_online_cpus()) {
auto res = close_on_cpu(i);
if (res.code() != 0) {
errors += "Failed to close CPU" + std::to_string(i) + " perf buffer: ";
......
......@@ -35,12 +35,12 @@ if (CMAKE_COMPILER_IS_GNUCC AND LIBCLANG_ISSTATIC)
endif()
endif()
add_library(bcc-shared SHARED bpf_common.cc bpf_module.cc libbpf.c perf_reader.c shared_table.cc exported_files.cc bcc_elf.c bcc_perf_map.c bcc_proc.c bcc_syms.cc usdt_args.cc usdt.cc BPF.cc BPFTable.cc)
add_library(bcc-shared SHARED bpf_common.cc bpf_module.cc libbpf.c perf_reader.c shared_table.cc exported_files.cc bcc_elf.c bcc_perf_map.c bcc_proc.c bcc_syms.cc usdt_args.cc usdt.cc common.cc BPF.cc BPFTable.cc)
set_target_properties(bcc-shared PROPERTIES VERSION ${REVISION_LAST} SOVERSION 0)
set_target_properties(bcc-shared PROPERTIES OUTPUT_NAME bcc)
add_library(bcc-loader-static libbpf.c perf_reader.c bcc_elf.c bcc_perf_map.c bcc_proc.c)
add_library(bcc-static STATIC bpf_common.cc bpf_module.cc shared_table.cc exported_files.cc bcc_syms.cc usdt_args.cc usdt.cc BPF.cc BPFTable.cc)
add_library(bcc-static STATIC bpf_common.cc bpf_module.cc shared_table.cc exported_files.cc bcc_syms.cc usdt_args.cc usdt.cc common.cc BPF.cc BPFTable.cc)
set_target_properties(bcc-static PROPERTIES OUTPUT_NAME bcc)
set(llvm_raw_libs bitwriter bpfcodegen irreader linker
......
......@@ -24,6 +24,7 @@
#include <stdint.h>
#include <ctype.h>
#include <stdio.h>
#include <math.h>
#include "bcc_perf_map.h"
#include "bcc_proc.h"
......@@ -307,13 +308,57 @@ static bool match_so_flags(int flags) {
return true;
}
const char *bcc_procutils_which_so(const char *libname) {
static bool which_so_in_process(const char* libname, int pid, char* libpath) {
int ret, found = false;
char endline[4096], *mapname = NULL, *newline;
char mappings_file[128];
const size_t search_len = strlen(libname) + strlen("/lib.");
char search1[search_len + 1];
char search2[search_len + 1];
sprintf(mappings_file, "/proc/%ld/maps", (long)pid);
FILE *fp = fopen(mappings_file, "r");
if (!fp)
return NULL;
snprintf(search1, search_len + 1, "/lib%s.", libname);
snprintf(search2, search_len + 1, "/lib%s-", libname);
do {
ret = fscanf(fp, "%*x-%*x %*s %*x %*s %*d");
if (!fgets(endline, sizeof(endline), fp))
break;
mapname = endline;
newline = strchr(endline, '\n');
if (newline)
newline[0] = '\0';
while (isspace(mapname[0])) mapname++;
if (strstr(mapname, ".so") && (strstr(mapname, search1) ||
strstr(mapname, search2))) {
found = true;
memcpy(libpath, mapname, strlen(mapname) + 1);
break;
}
} while (ret != EOF);
fclose(fp);
return found;
}
char *bcc_procutils_which_so(const char *libname, int pid) {
const size_t soname_len = strlen(libname) + strlen("lib.so");
char soname[soname_len + 1];
char libpath[4096];
int i;
if (strchr(libname, '/'))
return libname;
return strdup(libname);
if (pid && which_so_in_process(libname, pid, libpath))
return strdup(libpath);
if (lib_cache_count < 0)
return NULL;
......@@ -327,8 +372,13 @@ const char *bcc_procutils_which_so(const char *libname) {
for (i = 0; i < lib_cache_count; ++i) {
if (!strncmp(lib_cache[i].libname, soname, soname_len) &&
match_so_flags(lib_cache[i].flags))
return lib_cache[i].path;
match_so_flags(lib_cache[i].flags)) {
return strdup(lib_cache[i].path);
}
}
return NULL;
}
void bcc_procutils_free(const char *ptr) {
free((void *)ptr);
}
......@@ -27,11 +27,12 @@ extern "C" {
typedef int (*bcc_procutils_modulecb)(const char *, uint64_t, uint64_t, void *);
typedef void (*bcc_procutils_ksymcb)(const char *, uint64_t, void *);
const char *bcc_procutils_which_so(const char *libname);
char *bcc_procutils_which_so(const char *libname, int pid);
char *bcc_procutils_which(const char *binpath);
int bcc_procutils_each_module(int pid, bcc_procutils_modulecb callback,
void *payload);
int bcc_procutils_each_ksym(bcc_procutils_ksymcb callback, void *payload);
void bcc_procutils_free(const char *ptr);
#ifdef __cplusplus
}
......
......@@ -304,7 +304,7 @@ int bcc_foreach_symbol(const char *module, SYM_CB cb) {
}
int bcc_resolve_symname(const char *module, const char *symname,
const uint64_t addr, struct bcc_symbol *sym) {
const uint64_t addr, int pid, struct bcc_symbol *sym) {
uint64_t load_addr;
sym->module = NULL;
......@@ -315,9 +315,9 @@ int bcc_resolve_symname(const char *module, const char *symname,
return -1;
if (strchr(module, '/')) {
sym->module = module;
sym->module = strdup(module);
} else {
sym->module = bcc_procutils_which_so(module);
sym->module = bcc_procutils_which_so(module, pid);
}
if (sym->module == NULL)
......
......@@ -43,7 +43,7 @@ int bcc_resolve_global_addr(int pid, const char *module, const uint64_t address,
int bcc_foreach_symbol(const char *module, SYM_CB cb);
int bcc_find_symbol_addr(struct bcc_symbol *sym);
int bcc_resolve_symname(const char *module, const char *symname,
const uint64_t addr, struct bcc_symbol *sym);
const uint64_t addr, int pid, struct bcc_symbol *sym);
#ifdef __cplusplus
}
#endif
......
......@@ -345,7 +345,8 @@ int BPFModule::load_includes(const string &text) {
int BPFModule::annotate() {
for (auto fn = mod_->getFunctionList().begin(); fn != mod_->getFunctionList().end(); ++fn)
fn->addFnAttr(Attribute::AlwaysInline);
if (!fn->hasFnAttribute(Attribute::NoInline))
fn->addFnAttr(Attribute::AlwaysInline);
// separate module to hold the reader functions
auto m = make_unique<Module>("sscanf", *ctx_);
......
/*
* Copyright (c) 2016 Catalysts GmbH
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <fstream>
#include <sstream>
#include "common.h"
namespace ebpf {
std::vector<int> read_cpu_range(std::string path) {
std::ifstream cpus_range_stream { path };
std::vector<int> cpus;
std::string cpu_range;
while (std::getline(cpus_range_stream, cpu_range, ',')) {
std::size_t rangeop = cpu_range.find('-');
if (rangeop == std::string::npos) {
cpus.push_back(std::stoi(cpu_range));
}
else {
int start = std::stoi(cpu_range.substr(0, rangeop));
int end = std::stoi(cpu_range.substr(rangeop + 1));
for (int i = start; i <= end; i++)
cpus.push_back(i);
}
}
return cpus;
}
std::vector<int> get_online_cpus() {
return read_cpu_range("/sys/devices/system/cpu/online");
}
std::vector<int> get_possible_cpus() {
return read_cpu_range("/sys/devices/system/cpu/possible");
}
} // namespace ebpf
......@@ -19,6 +19,7 @@
#include <memory>
#include <string>
#include <tuple>
#include <vector>
namespace ebpf {
......@@ -28,4 +29,8 @@ make_unique(Args &&... args) {
return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}
std::vector<int> get_online_cpus();
std::vector<int> get_possible_cpus();
} // namespace ebpf
......@@ -147,7 +147,7 @@ static u32 (*bpf_get_prandom_u32)(void) =
static int (*bpf_trace_printk_)(const char *fmt, u64 fmt_size, ...) =
(void *) BPF_FUNC_trace_printk;
int bpf_trace_printk(const char *fmt, ...) asm("llvm.bpf.extra");
static void bpf_tail_call_(u64 map_fd, void *ctx, int index) {
static inline void bpf_tail_call_(u64 map_fd, void *ctx, int index) {
((void (*)(void *, u64, int))BPF_FUNC_tail_call)(ctx, map_fd, index);
}
static int (*bpf_clone_redirect)(void *ctx, int ifindex, u32 flags) =
......@@ -180,8 +180,6 @@ static int (*bpf_perf_event_output)(void *ctx, void *map, u64 index, void *data,
(void *) BPF_FUNC_perf_event_output;
static int (*bpf_skb_load_bytes)(void *ctx, int offset, void *to, u32 len) =
(void *) BPF_FUNC_skb_load_bytes;
static u64 (*bpf_get_current_task)(void) =
(void *) BPF_FUNC_get_current_task;
/* bpf_get_stackid will return a negative value in the case of an error
*
......@@ -203,6 +201,34 @@ int bpf_get_stackid(uintptr_t map, void *ctx, u64 flags) {
static int (*bpf_csum_diff)(void *from, u64 from_size, void *to, u64 to_size, u64 seed) =
(void *) BPF_FUNC_csum_diff;
static int (*bpf_skb_get_tunnel_opt)(void *ctx, void *md, u32 size) =
(void *) BPF_FUNC_skb_get_tunnel_opt;
static int (*bpf_skb_set_tunnel_opt)(void *ctx, void *md, u32 size) =
(void *) BPF_FUNC_skb_set_tunnel_opt;
static int (*bpf_skb_change_proto)(void *ctx, u16 proto, u64 flags) =
(void *) BPF_FUNC_skb_change_proto;
static int (*bpf_skb_change_type)(void *ctx, u32 type) =
(void *) BPF_FUNC_skb_change_type;
static u32 (*bpf_get_hash_recalc)(void *ctx) =
(void *) BPF_FUNC_get_hash_recalc;
static u64 (*bpf_get_current_task)(void) =
(void *) BPF_FUNC_get_current_task;
static int (*bpf_probe_write_user)(void *dst, void *src, u32 size) =
(void *) BPF_FUNC_probe_write_user;
static int (*bpf_skb_change_tail)(void *ctx, u32 new_len, u64 flags) =
(void *) BPF_FUNC_skb_change_tail;
static int (*bpf_skb_pull_data)(void *ctx, u32 len) =
(void *) BPF_FUNC_skb_pull_data;
static int (*bpf_csum_update)(void *ctx, u16 csum) =
(void *) BPF_FUNC_csum_update;
static int (*bpf_set_hash_invalid)(void *ctx) =
(void *) BPF_FUNC_set_hash_invalid;
static int (*bpf_get_numa_node_id)(void) =
(void *) BPF_FUNC_get_numa_node_id;
static int (*bpf_skb_change_head)(void *ctx, u32 len, u64 flags) =
(void *) BPF_FUNC_skb_change_head;
static int (*bpf_xdp_adjust_head)(void *ctx, int offset) =
(void *) BPF_FUNC_xdp_adjust_head;
/* llvm builtin functions that eBPF C program may use to
* emit BPF_LD_ABS and BPF_LD_IND instructions
......@@ -284,7 +310,7 @@ static inline void bpf_store_dword(void *skb, u64 off, u64 val) {
#define MASK(_n) ((_n) < 64 ? (1ull << (_n)) - 1 : ((u64)-1LL))
#define MASK128(_n) ((_n) < 128 ? ((unsigned __int128)1 << (_n)) - 1 : ((unsigned __int128)-1))
static unsigned int bpf_log2(unsigned int v)
static inline unsigned int bpf_log2(unsigned int v)
{
unsigned int r;
unsigned int shift;
......@@ -297,7 +323,7 @@ static unsigned int bpf_log2(unsigned int v)
return r;
}
static unsigned int bpf_log2l(unsigned long v)
static inline unsigned int bpf_log2l(unsigned long v)
{
unsigned int hi = v >> 32;
if (hi)
......@@ -473,5 +499,18 @@ int bpf_usdt_readarg_p(int argc, struct pt_regs *ctx, void *buf, u64 len) asm("l
#define TRACEPOINT_PROBE(category, event) \
int tracepoint__##category##__##event(struct tracepoint__##category##__##event *args)
#define TP_DATA_LOC_READ_CONST(dst, field, length) \
do { \
short __offset = args->data_loc_##field & 0xFFFF; \
bpf_probe_read((void *)dst, length, (char *)args + __offset); \
} while (0);
#define TP_DATA_LOC_READ(dst, field) \
do { \
short __offset = args->data_loc_##field & 0xFFFF; \
short __length = args->data_loc_##field >> 16; \
bpf_probe_read((void *)dst, __length, (char *)args + __offset); \
} while (0);
#endif
)********"
......@@ -15,6 +15,9 @@ R"********(
* limitations under the License.
*/
#ifndef __BCC_PROTO_H
#define __BCC_PROTO_H
#include <uapi/linux/if_ether.h>
#define BPF_PACKET_HEADER __attribute__((packed)) __attribute__((deprecated("packet")))
......@@ -142,4 +145,6 @@ struct vxlan_gbp_t {
unsigned int key:24;
unsigned int rsv6:8;
} BPF_PACKET_HEADER;
#endif
)********"
......@@ -1163,6 +1163,8 @@ StatusTuple CodegenLLVM::visit_func_decl_stmt_node(FuncDeclStmtNode *n) {
Function *fn = mod_->getFunction(n->id_->name_);
if (fn) return mkstatus_(n, "Function %s already defined", n->id_->c_str());
fn = Function::Create(fn_type, GlobalValue::ExternalLinkage, n->id_->name_, mod_);
fn->setCallingConv(CallingConv::C);
fn->addFnAttr(Attribute::NoUnwind);
fn->setSection(BPF_FN_PREFIX + n->id_->name_);
BasicBlock *label_entry = BasicBlock::Create(ctx(), "entry", fn);
......
......@@ -2,4 +2,4 @@
# Licensed under the Apache License, Version 2.0 (the "License")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DKERNEL_MODULES_DIR='\"${BCC_KERNEL_MODULES_DIR}\"'")
add_library(clang_frontend loader.cc b_frontend_action.cc tp_frontend_action.cc kbuild_helper.cc)
add_library(clang_frontend loader.cc b_frontend_action.cc tp_frontend_action.cc kbuild_helper.cc ../../common.cc)
......@@ -27,6 +27,7 @@
#include "b_frontend_action.h"
#include "shared_table.h"
#include "common.h"
#include "libbpf.h"
......@@ -644,6 +645,8 @@ bool BTypeVisitor::VisitVarDecl(VarDecl *Decl) {
map_type = BPF_MAP_TYPE_LRU_HASH;
} else if (A->getName() == "maps/lru_percpu_hash") {
map_type = BPF_MAP_TYPE_LRU_PERCPU_HASH;
} else if (A->getName() == "maps/lpm_trie") {
map_type = BPF_MAP_TYPE_LPM_TRIE;
} else if (A->getName() == "maps/histogram") {
if (table.key_desc == "\"int\"")
map_type = BPF_MAP_TYPE_ARRAY;
......@@ -656,7 +659,7 @@ bool BTypeVisitor::VisitVarDecl(VarDecl *Decl) {
map_type = BPF_MAP_TYPE_PROG_ARRAY;
} else if (A->getName() == "maps/perf_output") {
map_type = BPF_MAP_TYPE_PERF_EVENT_ARRAY;
int numcpu = sysconf(_SC_NPROCESSORS_ONLN);
int numcpu = get_possible_cpus().size();
if (numcpu <= 0)
numcpu = 1;
table.max_entries = numcpu;
......
......@@ -89,6 +89,7 @@ int KBuildHelper::get_flags(const char *uname_machine, vector<string> *cflags) {
cflags->push_back("-D__HAVE_BUILTIN_BSWAP64__");
cflags->push_back("-Wno-unused-value");
cflags->push_back("-Wno-pointer-sign");
cflags->push_back("-fno-stack-protector");
return 0;
}
......
......@@ -137,6 +137,8 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
"-Wno-deprecated-declarations",
"-Wno-gnu-variable-sized-type-not-at-end",
"-fno-color-diagnostics",
"-fno-unwind-tables",
"-fno-asynchronous-unwind-tables",
"-x", "c", "-c", abs_file.c_str()});
KBuildHelper kbuild_helper(kdir, kernel_path_info.first);
......@@ -165,7 +167,11 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
// set up the command line argument wrapper
#if defined(__powerpc64__)
driver::Driver drv("", "ppc64le-unknown-linux-gnu", diags);
#if defined(_CALL_ELF) && _CALL_ELF == 2
driver::Driver drv("", "powerpc64le-unknown-linux-gnu", diags);
#else
driver::Driver drv("", "powerpc64-unknown-linux-gnu", diags);
#endif
#elif defined(__aarch64__)
driver::Driver drv("", "aarch64-unknown-linux-gnu", diags);
#else
......@@ -205,24 +211,25 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
}
// pre-compilation pass for generating tracepoint structures
auto invocation0 = make_unique<CompilerInvocation>();
if (!CompilerInvocation::CreateFromArgs(*invocation0, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
CompilerInstance compiler0;
CompilerInvocation &invocation0 = compiler0.getInvocation();
if (!CompilerInvocation::CreateFromArgs(
invocation0, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
return -1;
invocation0->getPreprocessorOpts().RetainRemappedFileBuffers = true;
invocation0.getPreprocessorOpts().RetainRemappedFileBuffers = true;
for (const auto &f : remapped_files_)
invocation0->getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
invocation0.getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
if (in_memory) {
invocation0->getPreprocessorOpts().addRemappedFile(main_path, &*main_buf);
invocation0->getFrontendOpts().Inputs.clear();
invocation0->getFrontendOpts().Inputs.push_back(FrontendInputFile(main_path, IK_C));
invocation0.getPreprocessorOpts().addRemappedFile(main_path, &*main_buf);
invocation0.getFrontendOpts().Inputs.clear();
invocation0.getFrontendOpts().Inputs.push_back(
FrontendInputFile(main_path, IK_C));
}
invocation0->getFrontendOpts().DisableFree = false;
invocation0.getFrontendOpts().DisableFree = false;
CompilerInstance compiler0;
compiler0.setInvocation(invocation0.release());
compiler0.createDiagnostics(new IgnoringDiagConsumer());
// capture the rewritten c file
......@@ -233,24 +240,25 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
unique_ptr<llvm::MemoryBuffer> out_buf = llvm::MemoryBuffer::getMemBuffer(out_str);
// first pass
auto invocation1 = make_unique<CompilerInvocation>();
if (!CompilerInvocation::CreateFromArgs(*invocation1, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
CompilerInstance compiler1;
CompilerInvocation &invocation1 = compiler1.getInvocation();
if (!CompilerInvocation::CreateFromArgs(
invocation1, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
return -1;
// This option instructs clang whether or not to free the file buffers that we
// give to it. Since the embedded header files should be copied fewer times
// and reused if possible, set this flag to true.
invocation1->getPreprocessorOpts().RetainRemappedFileBuffers = true;
invocation1.getPreprocessorOpts().RetainRemappedFileBuffers = true;
for (const auto &f : remapped_files_)
invocation1->getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
invocation1->getPreprocessorOpts().addRemappedFile(main_path, &*out_buf);
invocation1->getFrontendOpts().Inputs.clear();
invocation1->getFrontendOpts().Inputs.push_back(FrontendInputFile(main_path, IK_C));
invocation1->getFrontendOpts().DisableFree = false;
invocation1.getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
invocation1.getPreprocessorOpts().addRemappedFile(main_path, &*out_buf);
invocation1.getFrontendOpts().Inputs.clear();
invocation1.getFrontendOpts().Inputs.push_back(
FrontendInputFile(main_path, IK_C));
invocation1.getFrontendOpts().DisableFree = false;
CompilerInstance compiler1;
compiler1.setInvocation(invocation1.release());
compiler1.createDiagnostics();
// capture the rewritten c file
......@@ -264,21 +272,22 @@ int ClangLoader::parse(unique_ptr<llvm::Module> *mod, unique_ptr<vector<TableDes
*tables = bact.take_tables();
// second pass, clear input and take rewrite buffer
auto invocation2 = make_unique<CompilerInvocation>();
if (!CompilerInvocation::CreateFromArgs(*invocation2, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
return -1;
CompilerInstance compiler2;
invocation2->getPreprocessorOpts().RetainRemappedFileBuffers = true;
CompilerInvocation &invocation2 = compiler2.getInvocation();
if (!CompilerInvocation::CreateFromArgs(
invocation2, const_cast<const char **>(ccargs.data()),
const_cast<const char **>(ccargs.data()) + ccargs.size(), diags))
return -1;
invocation2.getPreprocessorOpts().RetainRemappedFileBuffers = true;
for (const auto &f : remapped_files_)
invocation2->getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
invocation2->getPreprocessorOpts().addRemappedFile(main_path, &*out_buf1);
invocation2->getFrontendOpts().Inputs.clear();
invocation2->getFrontendOpts().Inputs.push_back(FrontendInputFile(main_path, IK_C));
invocation2->getFrontendOpts().DisableFree = false;
invocation2.getPreprocessorOpts().addRemappedFile(f.first, &*f.second);
invocation2.getPreprocessorOpts().addRemappedFile(main_path, &*out_buf1);
invocation2.getFrontendOpts().Inputs.clear();
invocation2.getFrontendOpts().Inputs.push_back(
FrontendInputFile(main_path, IK_C));
invocation2.getFrontendOpts().DisableFree = false;
// suppress warnings in the 2nd pass, but bail out on errors (our fault)
invocation2->getDiagnosticOpts().IgnoreWarnings = true;
compiler2.setInvocation(invocation2.release());
invocation2.getDiagnosticOpts().IgnoreWarnings = true;
compiler2.createDiagnostics();
EmitLLVMOnlyAction ir_act(&*ctx_);
......
......@@ -48,35 +48,42 @@ TracepointTypeVisitor::TracepointTypeVisitor(ASTContext &C, Rewriter &rewriter)
: C(C), diag_(C.getDiagnostics()), rewriter_(rewriter), out_(llvm::errs()) {
}
static inline bool _is_valid_field(string const& line,
string& field_type,
string& field_name) {
enum class field_kind_t {
common,
data_loc,
regular,
invalid
};
static inline field_kind_t _get_field_kind(string const& line,
string& field_type,
string& field_name) {
auto field_pos = line.find("field:");
if (field_pos == string::npos)
return false;
return field_kind_t::invalid;
auto semi_pos = line.find(';', field_pos);
if (semi_pos == string::npos)
return false;
return field_kind_t::invalid;
auto size_pos = line.find("size:", semi_pos);
if (size_pos == string::npos)
return false;
return field_kind_t::invalid;
auto field = line.substr(field_pos + 6/*"field:"*/,
semi_pos - field_pos - 6);
auto pos = field.find_last_of("\t ");
if (pos == string::npos)
return false;
return field_kind_t::invalid;
field_type = field.substr(0, pos);
field_name = field.substr(pos + 1);
if (field_type.find("__data_loc") != string::npos)
return false;
return field_kind_t::data_loc;
if (field_name.find("common_") == 0)
return false;
return field_kind_t::common;
return true;
return field_kind_t::regular;
}
string TracepointTypeVisitor::GenerateTracepointStruct(
......@@ -91,9 +98,17 @@ string TracepointTypeVisitor::GenerateTracepointStruct(
tp_struct += "\tu64 __do_not_use__;\n";
for (string line; getline(input, line); ) {
string field_type, field_name;
if (!_is_valid_field(line, field_type, field_name))
continue;
tp_struct += "\t" + field_type + " " + field_name + ";\n";
switch (_get_field_kind(line, field_type, field_name)) {
case field_kind_t::invalid:
case field_kind_t::common:
continue;
case field_kind_t::data_loc:
tp_struct += "\tint data_loc_" + field_name + ";\n";
break;
case field_kind_t::regular:
tp_struct += "\t" + field_type + " " + field_name + ";\n";
break;
}
}
tp_struct += "};\n";
......
......@@ -35,6 +35,8 @@
#include <sys/resource.h>
#include <unistd.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include "libbpf.h"
#include "perf_reader.h"
......@@ -137,6 +139,37 @@ int bpf_get_next_key(int fd, void *key, void *next_key)
return syscall(__NR_bpf, BPF_MAP_GET_NEXT_KEY, &attr, sizeof(attr));
}
void bpf_print_hints(char *log)
{
if (log == NULL)
return;
// The following error strings will need maintenance to match LLVM.
// stack busting
if (strstr(log, "invalid stack off=-") != NULL) {
fprintf(stderr, "HINT: Looks like you exceeded the BPF stack limit. "
"This can happen if you allocate too much local variable storage. "
"For example, if you allocated a 1 Kbyte struct (maybe for "
"BPF_PERF_OUTPUT), busting a max stack of 512 bytes.\n\n");
}
// didn't check NULL on map lookup
if (strstr(log, "invalid mem access 'map_value_or_null'") != NULL) {
fprintf(stderr, "HINT: The 'map_value_or_null' error can happen if "
"you dereference a pointer value from a map lookup without first "
"checking if that pointer is NULL.\n\n");
}
// lacking a bpf_probe_read
if (strstr(log, "invalid mem access 'inv'") != NULL) {
fprintf(stderr, "HINT: The invalid mem access 'inv' error can happen "
"if you try to dereference memory without first using "
"bpf_probe_read() to copy it to the BPF stack. Sometimes the "
"bpf_probe_read is automatic by the bcc rewriter, other times "
"you'll need to be explicit.\n\n");
}
}
#define ROUND_UP(x, n) (((x) + (n) - 1u) & ~((n) - 1u))
int bpf_prog_load(enum bpf_prog_type prog_type,
......@@ -221,6 +254,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
}
fprintf(stderr, "bpf: %s\n%s\n", strerror(errno), bpf_log_buffer);
bpf_print_hints(bpf_log_buffer);
free(bpf_log_buffer);
}
......@@ -304,14 +338,19 @@ static int bpf_attach_tracing_event(int progfd, const char *event_path,
return 0;
}
static void * bpf_attach_probe(int progfd, const char *event,
const char *event_desc, const char *event_type,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie) {
void * bpf_attach_kprobe(int progfd, enum bpf_probe_attach_type attach_type, const char *ev_name,
const char *fn_name,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie)
{
int kfd;
char buf[256];
char new_name[128];
struct perf_reader *reader = NULL;
static char *event_type = "kprobe";
int n;
snprintf(new_name, sizeof(new_name), "%s_bcc_%d", ev_name, getpid());
reader = perf_reader_new(cb, NULL, cb_cookie);
if (!reader)
goto error;
......@@ -323,8 +362,9 @@ static void * bpf_attach_probe(int progfd, const char *event,
goto error;
}
if (write(kfd, event_desc, strlen(event_desc)) < 0) {
fprintf(stderr, "write(%s, \"%s\") failed: %s\n", buf, event_desc, strerror(errno));
snprintf(buf, sizeof(buf), "%c:%ss/%s %s", attach_type==BPF_PROBE_ENTRY ? 'p' : 'r',
event_type, new_name, fn_name);
if (write(kfd, buf, strlen(buf)) < 0) {
if (errno == EINVAL)
fprintf(stderr, "check dmesg output for possible cause\n");
close(kfd);
......@@ -332,34 +372,84 @@ static void * bpf_attach_probe(int progfd, const char *event,
}
close(kfd);
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s", event_type, event);
if (access("/sys/kernel/debug/tracing/instances", F_OK) != -1) {
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/instances/bcc_%d", getpid());
if (access(buf, F_OK) == -1) {
if (mkdir(buf, 0755) == -1)
goto retry;
}
n = snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/instances/bcc_%d/events/%ss/%s",
getpid(), event_type, new_name);
if (n < sizeof(buf) && bpf_attach_tracing_event(progfd, buf, reader, pid, cpu, group_fd) == 0)
goto out;
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/instances/bcc_%d", getpid());
rmdir(buf);
}
retry:
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s", event_type, new_name);
if (bpf_attach_tracing_event(progfd, buf, reader, pid, cpu, group_fd) < 0)
goto error;
out:
return reader;
error:
perf_reader_free(reader);
return NULL;
}
void * bpf_attach_kprobe(int progfd, const char *event,
const char *event_desc,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie) {
return bpf_attach_probe(progfd, event, event_desc, "kprobe", pid, cpu, group_fd, cb, cb_cookie);
}
void * bpf_attach_uprobe(int progfd, const char *event,
const char *event_desc,
void * bpf_attach_uprobe(int progfd, enum bpf_probe_attach_type attach_type, const char *ev_name,
const char *binary_path, uint64_t offset,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie) {
return bpf_attach_probe(progfd, event, event_desc, "uprobe", pid, cpu, group_fd, cb, cb_cookie);
perf_reader_cb cb, void *cb_cookie)
{
int kfd;
char buf[PATH_MAX];
char new_name[128];
struct perf_reader *reader = NULL;
static char *event_type = "uprobe";
int n;
snprintf(new_name, sizeof(new_name), "%s_bcc_%d", ev_name, getpid());
reader = perf_reader_new(cb, NULL, cb_cookie);
if (!reader)
goto error;
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/%s_events", event_type);
kfd = open(buf, O_WRONLY | O_APPEND, 0);
if (kfd < 0) {
fprintf(stderr, "open(%s): %s\n", buf, strerror(errno));
goto error;
}
n = snprintf(buf, sizeof(buf), "%c:%ss/%s %s:0x%lx", attach_type==BPF_PROBE_ENTRY ? 'p' : 'r',
event_type, new_name, binary_path, offset);
if (n >= sizeof(buf)) {
close(kfd);
goto error;
}
if (write(kfd, buf, strlen(buf)) < 0) {
if (errno == EINVAL)
fprintf(stderr, "check dmesg output for possible cause\n");
close(kfd);
goto error;
}
close(kfd);
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/events/%ss/%s", event_type, new_name);
if (bpf_attach_tracing_event(progfd, buf, reader, pid, cpu, group_fd) < 0)
goto error;
return reader;
error:
perf_reader_free(reader);
return NULL;
}
static int bpf_detach_probe(const char *event_desc, const char *event_type) {
static int bpf_detach_probe(const char *ev_name, const char *event_type)
{
int kfd;
char buf[256];
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/%s_events", event_type);
kfd = open(buf, O_WRONLY | O_APPEND, 0);
......@@ -368,7 +458,8 @@ static int bpf_detach_probe(const char *event_desc, const char *event_type) {
return -1;
}
if (write(kfd, event_desc, strlen(event_desc)) < 0) {
snprintf(buf, sizeof(buf), "-:%ss/%s_bcc_%d", event_type, ev_name, getpid());
if (write(kfd, buf, strlen(buf)) < 0) {
fprintf(stderr, "write(%s): %s\n", buf, strerror(errno));
close(kfd);
return -1;
......@@ -378,14 +469,24 @@ static int bpf_detach_probe(const char *event_desc, const char *event_type) {
return 0;
}
int bpf_detach_kprobe(const char *event_desc) {
return bpf_detach_probe(event_desc, "kprobe");
int bpf_detach_kprobe(const char *ev_name)
{
char buf[256];
int ret = bpf_detach_probe(ev_name, "kprobe");
snprintf(buf, sizeof(buf), "/sys/kernel/debug/tracing/instances/bcc_%d", getpid());
if (access(buf, F_OK) != -1) {
rmdir(buf);
}
return ret;
}
int bpf_detach_uprobe(const char *event_desc) {
return bpf_detach_probe(event_desc, "uprobe");
int bpf_detach_uprobe(const char *ev_name)
{
return bpf_detach_probe(ev_name, "uprobe");
}
void * bpf_attach_tracepoint(int progfd, const char *tp_category,
const char *tp_name, int pid, int cpu,
int group_fd, perf_reader_cb cb, void *cb_cookie) {
......
......@@ -24,6 +24,11 @@
extern "C" {
#endif
enum bpf_probe_attach_type {
BPF_PROBE_ENTRY,
BPF_PROBE_RETURN
};
int bpf_create_map(enum bpf_map_type map_type, int key_size, int value_size,
int max_entries, int map_flags);
int bpf_update_elem(int fd, void *key, void *value, unsigned long long flags);
......@@ -44,15 +49,19 @@ typedef void (*perf_reader_cb)(void *cb_cookie, int pid, uint64_t callchain_num,
void *callchain);
typedef void (*perf_reader_raw_cb)(void *cb_cookie, void *raw, int raw_size);
void * bpf_attach_kprobe(int progfd, const char *event, const char *event_desc,
int pid, int cpu, int group_fd, perf_reader_cb cb,
void *cb_cookie);
int bpf_detach_kprobe(const char *event_desc);
void * bpf_attach_kprobe(int progfd, enum bpf_probe_attach_type attach_type,
const char *ev_name, const char *fn_name,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie);
int bpf_detach_kprobe(const char *ev_name);
void * bpf_attach_uprobe(int progfd, enum bpf_probe_attach_type attach_type,
const char *ev_name, const char *binary_path, uint64_t offset,
pid_t pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie);
void * bpf_attach_uprobe(int progfd, const char *event, const char *event_desc,
int pid, int cpu, int group_fd, perf_reader_cb cb,
void *cb_cookie);
int bpf_detach_uprobe(const char *event_desc);
int bpf_detach_uprobe(const char *ev_name);
void * bpf_attach_tracepoint(int progfd, const char *tp_category,
const char *tp_name, int pid, int cpu,
......
......@@ -223,8 +223,9 @@ std::string Context::resolve_bin_path(const std::string &bin_path) {
if (char *which = bcc_procutils_which(bin_path.c_str())) {
result = which;
::free(which);
} else if (const char *which_so = bcc_procutils_which_so(bin_path.c_str())) {
} else if (char *which_so = bcc_procutils_which_so(bin_path.c_str(), 0)) {
result = which_so;
::free(which_so);
}
return result;
......
......@@ -43,13 +43,10 @@ function Bpf.static.cleanup()
libbcc.perf_reader_free(probe)
-- skip bcc-specific kprobes
if not key:starts("bcc:") then
local desc = string.format("-:%s/%s", probe_type, key)
log.info("detaching %s", desc)
if probe_type == "kprobes" then
libbcc.bpf_detach_kprobe(desc)
libbcc.bpf_detach_kprobe(key)
elseif probe_type == "uprobes" then
libbcc.bpf_detach_uprobe(desc)
libbcc.bpf_detach_uprobe(key)
end
end
all_probes[key] = nil
......@@ -183,15 +180,13 @@ end
function Bpf:attach_uprobe(args)
Bpf.check_probe_quota(1)
local path, addr = Sym.check_path_symbol(args.name, args.sym, args.addr)
local path, addr = Sym.check_path_symbol(args.name, args.sym, args.addr, args.pid)
local fn = self:load_func(args.fn_name, 'BPF_PROG_TYPE_KPROBE')
local ptype = args.retprobe and "r" or "p"
local ev_name = string.format("%s_%s_0x%p", ptype, path:gsub("[^%a%d]", "_"), addr)
local desc = string.format("%s:uprobes/%s %s:0x%p", ptype, ev_name, path, addr)
log.info(desc)
local retprobe = args.retprobe and 1 or 0
local res = libbcc.bpf_attach_uprobe(fn.fd, ev_name, desc,
local res = libbcc.bpf_attach_uprobe(fn.fd, retprobe, ev_name, path, addr,
args.pid or -1,
args.cpu or 0,
args.group_fd or -1, nil, nil) -- TODO: reader callback
......@@ -209,11 +204,9 @@ function Bpf:attach_kprobe(args)
local event = args.event or ""
local ptype = args.retprobe and "r" or "p"
local ev_name = string.format("%s_%s", ptype, event:gsub("[%+%.]", "_"))
local desc = string.format("%s:kprobes/%s %s", ptype, ev_name, event)
log.info(desc)
local retprobe = args.retprobe and 1 or 0
local res = libbcc.bpf_attach_kprobe(fn.fd, ev_name, desc,
local res = libbcc.bpf_attach_kprobe(fn.fd, retprobe, ev_name, event,
args.pid or -1,
args.cpu or 0,
args.group_fd or -1, nil, nil) -- TODO: reader callback
......
......@@ -40,13 +40,19 @@ int bpf_open_raw_sock(const char *name);
typedef void (*perf_reader_cb)(void *cb_cookie, int pid, uint64_t callchain_num, void *callchain);
typedef void (*perf_reader_raw_cb)(void *cb_cookie, void *raw, int raw_size);
void * bpf_attach_kprobe(int progfd, const char *event, const char *event_desc,
int pid, int cpu, int group_fd, perf_reader_cb cb, void *cb_cookie);
int bpf_detach_kprobe(const char *event_desc);
void * bpf_attach_kprobe(int progfd, int attach_type, const char *ev_name,
const char *fn_name,
int pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie);
void * bpf_attach_uprobe(int progfd, const char *event, const char *event_desc,
int pid, int cpu, int group_fd, perf_reader_cb cb, void *cb_cookie);
int bpf_detach_uprobe(const char *event_desc);
int bpf_detach_kprobe(const char *ev_name);
void * bpf_attach_uprobe(int progfd, int attach_type, const char *ev_name,
const char *binary_path, uint64_t offset,
int pid, int cpu, int group_fd,
perf_reader_cb cb, void *cb_cookie);
int bpf_detach_uprobe(const char *ev_name);
void * bpf_open_perf_buffer(perf_reader_raw_cb raw_cb, void *cb_cookie, int pid, int cpu);
]]
......@@ -109,7 +115,8 @@ struct bcc_symbol {
};
int bcc_resolve_symname(const char *module, const char *symname, const uint64_t addr,
struct bcc_symbol *sym);
int pid, struct bcc_symbol *sym);
void bcc_procutils_free(const char *ptr);
void *bcc_symcache_new(int pid);
int bcc_symcache_resolve(void *symcache, uint64_t addr, struct bcc_symbol *sym);
void bcc_symcache_refresh(void *resolver);
......
......@@ -30,17 +30,22 @@ local function create_cache(pid)
}
end
local function check_path_symbol(module, symname, addr)
local function check_path_symbol(module, symname, addr, pid)
local sym = SYM()
if libbcc.bcc_resolve_symname(module, symname, addr or 0x0, sym) < 0 then
local module_path
if libbcc.bcc_resolve_symname(module, symname, addr or 0x0, pid or 0, sym) < 0 then
if sym[0].module == nil then
error("could not find library '%s' in the library path" % module)
else
module_path = ffi.string(sym[0].module)
libbcc.bcc_procutils_free(sym[0].module)
error("failed to resolve symbol '%s' in '%s'" % {
symname, ffi.string(sym[0].module)})
symname, module_path})
end
end
return ffi.string(sym[0].module), sym[0].offset
module_path = ffi.string(sym[0].module)
libbcc.bcc_procutils_free(sym[0].module)
return module_path, sym[0].offset
end
return { create_cache=create_cache, check_path_symbol=check_path_symbol }
......@@ -29,6 +29,7 @@ BaseTable.static.BPF_MAP_TYPE_STACK_TRACE = 7
BaseTable.static.BPF_MAP_TYPE_CGROUP_ARRAY = 8
BaseTable.static.BPF_MAP_TYPE_LRU_HASH = 9
BaseTable.static.BPF_MAP_TYPE_LRU_PERCPU_HASH = 10
BaseTable.static.BPF_MAP_TYPE_LPM_TRIE = 11
function BaseTable:initialize(t_type, bpf, map_id, map_fd, key_type, leaf_type)
assert(t_type == libbcc.bpf_table_type_id(bpf.module, map_id))
......
......@@ -140,6 +140,7 @@ else
S.c.BPF_MAP.CGROUP_ARRAY = 8
S.c.BPF_MAP.LRU_HASH = 9
S.c.BPF_MAP.LRU_PERCPU_HASH = 10
S.c.BPF_MAP.LPM_TRIE = 11
end
if not S.c.BPF_PROG.TRACEPOINT then
S.c.BPF_PROG.TRACEPOINT = 5
......
......@@ -87,13 +87,13 @@ lib.bpf_attach_kprobe.restype = ct.c_void_p
_CB_TYPE = ct.CFUNCTYPE(None, ct.py_object, ct.c_int,
ct.c_ulonglong, ct.POINTER(ct.c_ulonglong))
_RAW_CB_TYPE = ct.CFUNCTYPE(None, ct.py_object, ct.c_void_p, ct.c_int)
lib.bpf_attach_kprobe.argtypes = [ct.c_int, ct.c_char_p, ct.c_char_p, ct.c_int,
lib.bpf_attach_kprobe.argtypes = [ct.c_int, ct.c_int, ct.c_char_p, ct.c_char_p, ct.c_int,
ct.c_int, ct.c_int, _CB_TYPE, ct.py_object]
lib.bpf_detach_kprobe.restype = ct.c_int
lib.bpf_detach_kprobe.argtypes = [ct.c_char_p]
lib.bpf_attach_uprobe.restype = ct.c_void_p
lib.bpf_attach_uprobe.argtypes = [ct.c_int, ct.c_char_p, ct.c_char_p, ct.c_int,
ct.c_int, ct.c_int, _CB_TYPE, ct.py_object]
lib.bpf_attach_uprobe.argtypes = [ct.c_int, ct.c_int, ct.c_char_p, ct.c_char_p,
ct.c_ulonglong, ct.c_int, ct.c_int, ct.c_int, _CB_TYPE, ct.py_object]
lib.bpf_detach_uprobe.restype = ct.c_int
lib.bpf_detach_uprobe.argtypes = [ct.c_char_p]
lib.bpf_attach_tracepoint.restype = ct.c_void_p
......@@ -126,16 +126,18 @@ class bcc_symbol(ct.Structure):
_fields_ = [
('name', ct.c_char_p),
('demangle_name', ct.c_char_p),
('module', ct.c_char_p),
('module', ct.POINTER(ct.c_char)),
('offset', ct.c_ulonglong),
]
lib.bcc_procutils_which_so.restype = ct.c_char_p
lib.bcc_procutils_which_so.argtypes = [ct.c_char_p]
lib.bcc_procutils_which_so.restype = ct.POINTER(ct.c_char)
lib.bcc_procutils_which_so.argtypes = [ct.c_char_p, ct.c_int]
lib.bcc_procutils_free.restype = None
lib.bcc_procutils_free.argtypes = [ct.c_void_p]
lib.bcc_resolve_symname.restype = ct.c_int
lib.bcc_resolve_symname.argtypes = [
ct.c_char_p, ct.c_char_p, ct.c_ulonglong, ct.POINTER(bcc_symbol)]
ct.c_char_p, ct.c_char_p, ct.c_ulonglong, ct.c_int, ct.POINTER(bcc_symbol)]
_SYM_CB_TYPE = ct.CFUNCTYPE(ct.c_int, ct.c_char_p, ct.c_ulonglong)
lib.bcc_foreach_symbol.restype = ct.c_int
......
......@@ -13,8 +13,8 @@
# limitations under the License.
import ctypes as ct
import multiprocessing
import os
from .utils import get_online_cpus
class Perf(object):
class perf_event_attr(ct.Structure):
......@@ -105,5 +105,5 @@ class Perf(object):
attr.sample_period = 1
attr.wakeup_events = 9999999 # don't wake up
for cpu in range(0, multiprocessing.cpu_count()):
for cpu in get_online_cpus():
Perf._open_for_cpu(cpu, attr)
......@@ -14,11 +14,14 @@
from collections import MutableMapping
import ctypes as ct
from functools import reduce
import multiprocessing
import os
from .libbcc import lib, _RAW_CB_TYPE
from .perf import Perf
from .utils import get_online_cpus
from .utils import get_possible_cpus
from subprocess import check_output
BPF_MAP_TYPE_HASH = 1
......@@ -31,6 +34,7 @@ BPF_MAP_TYPE_STACK_TRACE = 7
BPF_MAP_TYPE_CGROUP_ARRAY = 8
BPF_MAP_TYPE_LRU_HASH = 9
BPF_MAP_TYPE_LRU_PERCPU_HASH = 10
BPF_MAP_TYPE_LPM_TRIE = 11
stars_max = 40
log2_index_max = 65
......@@ -121,6 +125,8 @@ def Table(bpf, map_id, map_fd, keytype, leaftype, **kwargs):
t = PerCpuHash(bpf, map_id, map_fd, keytype, leaftype, **kwargs)
elif ttype == BPF_MAP_TYPE_PERCPU_ARRAY:
t = PerCpuArray(bpf, map_id, map_fd, keytype, leaftype, **kwargs)
elif ttype == BPF_MAP_TYPE_LPM_TRIE:
t = LpmTrie(bpf, map_id, map_fd, keytype, leaftype)
elif ttype == BPF_MAP_TYPE_STACK_TRACE:
t = StackTrace(bpf, map_id, map_fd, keytype, leaftype)
elif ttype == BPF_MAP_TYPE_LRU_HASH:
......@@ -509,7 +515,7 @@ class PerfEventArray(ArrayBase):
event submitted from the kernel, up to millions per second.
"""
for i in range(0, multiprocessing.cpu_count()):
for i in get_online_cpus():
self._open_perf_buffer(i, callback)
def _open_perf_buffer(self, cpu, callback):
......@@ -550,7 +556,7 @@ class PerfEventArray(ArrayBase):
if not isinstance(ev, self.Event):
raise Exception("argument must be an Event, got %s", type(ev))
for i in range(0, multiprocessing.cpu_count()):
for i in get_online_cpus():
self._open_perf_event(i, ev.typ, ev.config)
......@@ -559,7 +565,7 @@ class PerCpuHash(HashTable):
self.reducer = kwargs.pop("reducer", None)
super(PerCpuHash, self).__init__(*args, **kwargs)
self.sLeaf = self.Leaf
self.total_cpu = multiprocessing.cpu_count()
self.total_cpu = len(get_possible_cpus())
# This needs to be 8 as hard coded into the linux kernel.
self.alignment = ct.sizeof(self.sLeaf) % 8
if self.alignment is 0:
......@@ -595,7 +601,7 @@ class PerCpuHash(HashTable):
def sum(self, key):
if isinstance(self.Leaf(), ct.Structure):
raise IndexError("Leaf must be an integer type for default sum functions")
return self.sLeaf(reduce(lambda x,y: x+y, self.getvalue(key)))
return self.sLeaf(sum(self.getvalue(key)))
def max(self, key):
if isinstance(self.Leaf(), ct.Structure):
......@@ -604,8 +610,7 @@ class PerCpuHash(HashTable):
def average(self, key):
result = self.sum(key)
result.value/=self.total_cpu
return result
return result.value / self.total_cpu
class LruPerCpuHash(PerCpuHash):
def __init__(self, *args, **kwargs):
......@@ -616,7 +621,7 @@ class PerCpuArray(ArrayBase):
self.reducer = kwargs.pop("reducer", None)
super(PerCpuArray, self).__init__(*args, **kwargs)
self.sLeaf = self.Leaf
self.total_cpu = multiprocessing.cpu_count()
self.total_cpu = len(get_possible_cpus())
# This needs to be 8 as hard coded into the linux kernel.
self.alignment = ct.sizeof(self.sLeaf) % 8
if self.alignment is 0:
......@@ -652,7 +657,7 @@ class PerCpuArray(ArrayBase):
def sum(self, key):
if isinstance(self.Leaf(), ct.Structure):
raise IndexError("Leaf must be an integer type for default sum functions")
return self.sLeaf(reduce(lambda x,y: x+y, self.getvalue(key)))
return self.sLeaf(sum(self.getvalue(key)))
def max(self, key):
if isinstance(self.Leaf(), ct.Structure):
......@@ -661,8 +666,19 @@ class PerCpuArray(ArrayBase):
def average(self, key):
result = self.sum(key)
result.value/=self.total_cpu
return result
return result.value / self.total_cpu
class LpmTrie(TableBase):
def __init__(self, *args, **kwargs):
super(LpmTrie, self).__init__(*args, **kwargs)
def __len__(self):
raise NotImplementedError
def __delitem__(self, key):
# Not implemented for lpm trie as of kernel commit
# b95a5c4db09bc7c253636cb84dc9b12c577fd5a0
raise NotImplementedError
class StackTrace(TableBase):
MAX_DEPTH = 127
......
......@@ -13,10 +13,14 @@
# limitations under the License.
import ctypes as ct
import sys
from .libbcc import lib, _USDT_CB, _USDT_PROBE_CB, \
bcc_usdt_location, bcc_usdt_argument, \
BCC_USDT_ARGUMENT_FLAGS
class USDTException(Exception):
pass
class USDTProbeArgument(object):
def __init__(self, argument):
self.signed = argument.size < 0
......@@ -77,8 +81,9 @@ class USDTProbeLocation(object):
res = lib.bcc_usdt_get_argument(self.probe.context, self.probe.name,
self.index, index, ct.pointer(arg))
if res != 0:
raise Exception("error retrieving probe argument %d location %d" %
(index, self.index))
raise USDTException(
"error retrieving probe argument %d location %d" %
(index, self.index))
return USDTProbeArgument(arg)
class USDTProbe(object):
......@@ -103,7 +108,7 @@ class USDTProbe(object):
res = lib.bcc_usdt_get_location(self.context, self.name,
index, ct.pointer(loc))
if res != 0:
raise Exception("error retrieving probe location %d" % index)
raise USDTException("error retrieving probe location %d" % index)
return USDTProbeLocation(self, index, loc)
class USDT(object):
......@@ -112,23 +117,36 @@ class USDT(object):
self.pid = pid
self.context = lib.bcc_usdt_new_frompid(pid)
if self.context == None:
raise Exception("USDT failed to instrument PID %d" % pid)
raise USDTException("USDT failed to instrument PID %d" % pid)
elif path:
self.path = path
self.context = lib.bcc_usdt_new_frompath(path)
if self.context == None:
raise Exception("USDT failed to instrument path %s" % path)
raise USDTException("USDT failed to instrument path %s" % path)
else:
raise Exception("either a pid or a binary path must be specified")
raise USDTException(
"either a pid or a binary path must be specified")
def enable_probe(self, probe, fn_name):
if lib.bcc_usdt_enable_probe(self.context, probe, fn_name) != 0:
raise Exception(("failed to enable probe '%s'; a possible cause " +
"can be that the probe requires a pid to enable") %
probe)
raise USDTException(
("failed to enable probe '%s'; a possible cause " +
"can be that the probe requires a pid to enable") %
probe
)
def enable_probe_or_bail(self, probe, fn_name):
if lib.bcc_usdt_enable_probe(self.context, probe, fn_name) != 0:
print(
"""Error attaching USDT probes: the specified pid might not contain the
given language's runtime, or the runtime was not built with the required
USDT probes. Look for a configure flag similar to --with-dtrace or
--enable-dtrace. To check which probes are present in the process, use the
tplist tool.""")
sys.exit(1)
def get_text(self):
return lib.bcc_usdt_genargs(self.context)
return lib.bcc_usdt_genargs(self.context).decode()
def get_probe_arg_ctype(self, probe_name, arg_index):
return lib.bcc_usdt_get_probe_argctype(
......
# Copyright 2016 Catalysts GmbH
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
def _read_cpu_range(path):
cpus = []
with open(path, 'r') as f:
cpus_range_str = f.read()
for cpu_range in cpus_range_str.split(','):
rangeop = cpu_range.find('-')
if rangeop == -1:
cpus.append(int(cpu_range))
else:
start = int(cpu_range[:rangeop])
end = int(cpu_range[rangeop+1:])
cpus.extend(range(start, end+1))
return cpus
def get_online_cpus():
return _read_cpu_range('/sys/devices/system/cpu/online')
def get_possible_cpus():
return _read_cpu_range('/sys/devices/system/cpu/possible')
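The sysfs files read by `_read_cpu_range` above contain a comma-separated list of single CPUs and inclusive ranges (for example `0-3,5,7-8`). The helper below is a minimal standalone sketch of the same parsing logic, taking the string directly instead of a sysfs path so it can be tried without root or a specific machine:

```python
def read_cpu_range(cpus_range_str):
    """Parse a kernel CPU list string such as "0-3,5,7-8" into a list of ints."""
    cpus = []
    for cpu_range in cpus_range_str.strip().split(','):
        rangeop = cpu_range.find('-')
        if rangeop == -1:
            # single CPU, e.g. "5"
            cpus.append(int(cpu_range))
        else:
            # inclusive range, e.g. "0-3"
            start = int(cpu_range[:rangeop])
            end = int(cpu_range[rangeop + 1:])
            cpus.extend(range(start, end + 1))
    return cpus

print(read_cpu_range("0-3,5,7-8"))  # [0, 1, 2, 3, 5, 7, 8]
```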
......@@ -23,6 +23,7 @@
#include "bcc_perf_map.h"
#include "bcc_proc.h"
#include "bcc_syms.h"
#include "common.h"
#include "vendor/tinyformat.hpp"
#include "catch.hpp"
......@@ -30,10 +31,19 @@
using namespace std;
TEST_CASE("shared object resolution", "[c_api]") {
const char *libm = bcc_procutils_which_so("m");
char *libm = bcc_procutils_which_so("m", 0);
REQUIRE(libm);
REQUIRE(libm[0] == '/');
REQUIRE(string(libm).find("libm.so") != string::npos);
free(libm);
}
TEST_CASE("shared object resolution using loaded libraries", "[c_api]") {
char *libelf = bcc_procutils_which_so("elf", getpid());
REQUIRE(libelf);
REQUIRE(libelf[0] == '/');
REQUIRE(string(libelf).find("libelf") != string::npos);
free(libelf);
}
TEST_CASE("binary resolution with `which`", "[c_api]") {
......@@ -57,10 +67,21 @@ TEST_CASE("list all kernel symbols", "[c_api]") {
TEST_CASE("resolve symbol name in external library", "[c_api]") {
struct bcc_symbol sym;
REQUIRE(bcc_resolve_symname("c", "malloc", 0x0, &sym) == 0);
REQUIRE(bcc_resolve_symname("c", "malloc", 0x0, 0, &sym) == 0);
REQUIRE(string(sym.module).find("libc.so") != string::npos);
REQUIRE(sym.module[0] == '/');
REQUIRE(sym.offset != 0);
bcc_procutils_free(sym.module);
}
TEST_CASE("resolve symbol name in external library using loaded libraries", "[c_api]") {
struct bcc_symbol sym;
REQUIRE(bcc_resolve_symname("bcc", "bcc_procutils_which", 0x0, getpid(), &sym) == 0);
REQUIRE(string(sym.module).find("libbcc.so") != string::npos);
REQUIRE(sym.module[0] == '/');
REQUIRE(sym.offset != 0);
bcc_procutils_free(sym.module);
}
extern "C" int _a_test_function(const char *a_string) {
......@@ -196,3 +217,10 @@ TEST_CASE("resolve symbols using /tmp/perf-pid.map", "[c_api]") {
munmap(map_addr, map_sz);
}
TEST_CASE("get online CPUs", "[c_api]") {
std::vector<int> cpus = ebpf::get_online_cpus();
int num_cpus = sysconf(_SC_NPROCESSORS_ONLN);
REQUIRE(cpus.size() == num_cpus);
}
......@@ -19,23 +19,9 @@ if ldd bcc-lua | grep -q luajit; then
fail "bcc-lua depends on libluajit"
fi
rm -f libbcc.so probe.lua
rm -f probe.lua
echo "return function(BPF) print(\"Hello world\") end" > probe.lua
if ./bcc-lua "probe.lua"; then
fail "bcc-lua runs without libbcc.so"
fi
if ! env LIBBCC_SO_PATH=../cc/libbcc.so ./bcc-lua "probe.lua"; then
fail "bcc-lua cannot load libbcc.so through the environment"
fi
ln -s ../cc/libbcc.so
if ! ./bcc-lua "probe.lua"; then
fail "bcc-lua cannot find local libbcc.so"
fi
PROBE="../../../examples/lua/offcputime.lua"
if ! sudo ./bcc-lua "$PROBE" -d 1 >/dev/null 2>/dev/null; then
......
......@@ -27,8 +27,8 @@ int count(struct pt_regs *ctx) {
local text = text:gsub("PID", tostring(pid))
local b = BPF:new{text=text}
b:attach_uprobe{name="c", sym="malloc_stats", fn_name="count"}
b:attach_uprobe{name="c", sym="malloc_stats", fn_name="count", retprobe=true}
b:attach_uprobe{name="c", sym="malloc_stats", fn_name="count", pid=pid}
b:attach_uprobe{name="c", sym="malloc_stats", fn_name="count", pid=pid, retprobe=true}
assert_equals(BPF.num_open_uprobes(), 2)
......
......@@ -17,7 +17,7 @@ endif()
add_test(NAME py_test_stat1_b WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_stat1_b namespace ${CMAKE_CURRENT_SOURCE_DIR}/test_stat1.py test_stat1.b proto.b)
add_test(NAME py_test_bpf_log WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_bpf_prog namespace ${CMAKE_CURRENT_SOURCE_DIR}/test_bpf_log.py)
COMMAND ${TEST_WRAPPER} py_bpf_prog sudo ${CMAKE_CURRENT_SOURCE_DIR}/test_bpf_log.py)
add_test(NAME py_test_stat1_c WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_stat1_c namespace ${CMAKE_CURRENT_SOURCE_DIR}/test_stat1.py test_stat1.c)
#add_test(NAME py_test_xlate1_b WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
......@@ -56,6 +56,10 @@ add_test(NAME py_test_tracepoint WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_test_tracepoint sudo ${CMAKE_CURRENT_SOURCE_DIR}/test_tracepoint.py)
add_test(NAME py_test_perf_event WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_test_perf_event sudo ${CMAKE_CURRENT_SOURCE_DIR}/test_perf_event.py)
add_test(NAME py_test_utils WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_test_utils sudo ${CMAKE_CURRENT_SOURCE_DIR}/test_utils.py)
add_test(NAME py_test_percpu WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_test_percpu sudo ${CMAKE_CURRENT_SOURCE_DIR}/test_percpu.py)
add_test(NAME py_test_dump_func WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
COMMAND ${TEST_WRAPPER} py_dump_func simple ${CMAKE_CURRENT_SOURCE_DIR}/test_dump_func.py)
......@@ -6,6 +6,8 @@ from bcc import BPF
import ctypes as ct
import random
import time
import subprocess
from bcc.utils import get_online_cpus
from unittest import main, TestCase
class TestArray(TestCase):
......@@ -62,6 +64,37 @@ int kprobe__sys_nanosleep(void *ctx) {
time.sleep(0.1)
b.kprobe_poll()
self.assertGreater(self.counter, 0)
b.cleanup()
def test_perf_buffer_for_each_cpu(self):
self.events = []
class Data(ct.Structure):
_fields_ = [("cpu", ct.c_ulonglong)]
def cb(cpu, data, size):
self.assertGreater(size, ct.sizeof(Data))
event = ct.cast(data, ct.POINTER(Data)).contents
self.events.append(event)
text = """
BPF_PERF_OUTPUT(events);
int kprobe__sys_nanosleep(void *ctx) {
struct {
u64 cpu;
} data = {bpf_get_smp_processor_id()};
events.perf_submit(ctx, &data, sizeof(data));
return 0;
}
"""
b = BPF(text=text)
b["events"].open_perf_buffer(cb)
online_cpus = get_online_cpus()
for cpu in online_cpus:
subprocess.call(['taskset', '-c', str(cpu), 'sleep', '0.1'])
b.kprobe_poll()
b.cleanup()
self.assertGreaterEqual(len(self.events), len(online_cpus), 'Received only {}/{} events'.format(len(self.events), len(online_cpus)))
if __name__ == "__main__":
main()
......@@ -51,7 +51,7 @@ class TestBPFProgLoad(TestCase):
except Exception:
self.fp.flush()
self.fp.seek(0)
self.assertEqual(error_msg in self.fp.read(), True)
self.assertEqual(error_msg in self.fp.read().decode(), True)
def test_log_no_debug(self):
......@@ -61,7 +61,7 @@ class TestBPFProgLoad(TestCase):
except Exception:
self.fp.flush()
self.fp.seek(0)
self.assertEqual(error_msg in self.fp.read(), True)
self.assertEqual(error_msg in self.fp.read().decode(), True)
if __name__ == "__main__":
......
......@@ -68,6 +68,16 @@ ipr = IPRoute()
ipdb = IPDB(nl=ipr)
sim = Simulation(ipdb)
allocated_interfaces = set(ipdb.interfaces.keys())
def get_next_iface(prefix):
i = 0
while True:
iface = "{0}{1}".format(prefix, i)
if iface not in allocated_interfaces:
allocated_interfaces.add(iface)
return iface
i += 1
class TestBPFSocket(TestCase):
def setup_br(self, br, veth_rt_2_br, veth_pem_2_br, veth_br_2_pem):
......@@ -84,15 +94,15 @@ class TestBPFSocket(TestCase):
br1.add_port(ipdb.interfaces[veth_rt_2_br])
br1.up()
subprocess.call(["sysctl", "-q", "-w", "net.ipv6.conf." + br + ".disable_ipv6=1"])
def set_default_const(self):
self.ns1 = "ns1"
self.ns2 = "ns2"
self.ns_router = "ns_router"
self.br1 = "br1"
self.br1 = get_next_iface("br")
self.veth_pem_2_br1 = "v20"
self.veth_br1_2_pem = "v21"
self.br2 = "br2"
self.br2 = get_next_iface("br")
self.veth_pem_2_br2 = "v22"
self.veth_br2_2_pem = "v23"
......
#!/usr/bin/env python
# Copyright (c) 2017 Facebook, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")
import ctypes as ct
import unittest
from bcc import BPF
from netaddr import IPAddress
class KeyV4(ct.Structure):
_fields_ = [("prefixlen", ct.c_uint),
("data", ct.c_ubyte * 4)]
class KeyV6(ct.Structure):
_fields_ = [("prefixlen", ct.c_uint),
("data", ct.c_ushort * 8)]
class TestLpmTrie(unittest.TestCase):
def test_lpm_trie_v4(self):
test_prog1 = """
BPF_F_TABLE("lpm_trie", u64, int, trie, 16, BPF_F_NO_PREALLOC);
"""
b = BPF(text=test_prog1)
t = b["trie"]
k1 = KeyV4(24, (192, 168, 0, 0))
v1 = ct.c_int(24)
t[k1] = v1
k2 = KeyV4(28, (192, 168, 0, 0))
v2 = ct.c_int(28)
t[k2] = v2
k = KeyV4(32, (192, 168, 0, 15))
self.assertEqual(t[k].value, 28)
k = KeyV4(32, (192, 168, 0, 127))
self.assertEqual(t[k].value, 24)
with self.assertRaises(KeyError):
k = KeyV4(32, (172, 16, 1, 127))
v = t[k]
def test_lpm_trie_v6(self):
test_prog1 = """
struct key_v6 {
u32 prefixlen;
u32 data[4];
};
BPF_F_TABLE("lpm_trie", struct key_v6, int, trie, 16, BPF_F_NO_PREALLOC);
"""
b = BPF(text=test_prog1)
t = b["trie"]
k1 = KeyV6(64, IPAddress('2a00:1450:4001:814:200e::').words)
v1 = ct.c_int(64)
t[k1] = v1
k2 = KeyV6(96, IPAddress('2a00:1450:4001:814::200e').words)
v2 = ct.c_int(96)
t[k2] = v2
k = KeyV6(128, IPAddress('2a00:1450:4001:814::1024').words)
self.assertEqual(t[k].value, 96)
k = KeyV6(128, IPAddress('2a00:1450:4001:814:2046::').words)
self.assertEqual(t[k].value, 64)
with self.assertRaises(KeyError):
k = KeyV6(128, IPAddress('2a00:ffff::').words)
v = t[k]
if __name__ == "__main__":
unittest.main()
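The expected results in `test_lpm_trie_v4` above follow from longest-prefix-match semantics: among all prefixes that cover an address, the most specific (longest) one wins. A hypothetical pure-Python sketch of that matching rule over the same /24 and /28 entries (an illustration of the semantics only, not of how the kernel implements the trie):

```python
def ip_to_bits(ip):
    # dotted-quad IPv4 address -> 32-character bit string
    return ''.join(format(int(octet), '08b') for octet in ip.split('.'))

def lpm_lookup(trie, ip):
    """Return the value of the longest matching prefix, or None if no entry matches."""
    bits = ip_to_bits(ip)
    best_len, best_val = -1, None
    for (prefixlen, prefix), value in trie.items():
        if bits[:prefixlen] == ip_to_bits(prefix)[:prefixlen] and prefixlen > best_len:
            best_len, best_val = prefixlen, value
    return best_val

# the same two entries inserted by the test: 192.168.0.0/24 and 192.168.0.0/28
trie = {(24, '192.168.0.0'): 24, (28, '192.168.0.0'): 28}
print(lpm_lookup(trie, '192.168.0.15'))   # 28: the more specific /28 entry matches
print(lpm_lookup(trie, '192.168.0.127'))  # 24: only the /24 entry matches
print(lpm_lookup(trie, '172.16.1.127'))   # None: no entry matches
```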
......@@ -9,6 +9,12 @@ import multiprocessing
class TestPercpu(unittest.TestCase):
def setUp(self):
try:
b = BPF(text='BPF_TABLE("percpu_array", u32, u32, stub, 1);')
except:
raise unittest.SkipTest("PerCpu unsupported on this kernel")
def test_u64(self):
test_prog1 = """
BPF_TABLE("percpu_hash", u32, u64, stats, 1);
......@@ -34,8 +40,8 @@ class TestPercpu(unittest.TestCase):
sum = stats_map.sum(stats_map.Key(0))
avg = stats_map.average(stats_map.Key(0))
max = stats_map.max(stats_map.Key(0))
self.assertGreater(sum.value, 0L)
self.assertGreater(max.value, 0L)
self.assertGreater(sum.value, int(0))
self.assertGreater(max.value, int(0))
bpf_code.detach_kprobe("sys_clone")
def test_u32(self):
......@@ -63,8 +69,8 @@ class TestPercpu(unittest.TestCase):
sum = stats_map.sum(stats_map.Key(0))
avg = stats_map.average(stats_map.Key(0))
max = stats_map.max(stats_map.Key(0))
self.assertGreater(sum.value, 0L)
self.assertGreater(max.value, 0L)
self.assertGreater(sum.value, int(0))
self.assertGreater(max.value, int(0))
bpf_code.detach_kprobe("sys_clone")
def test_struct_custom_func(self):
......@@ -95,7 +101,7 @@ class TestPercpu(unittest.TestCase):
f.close()
self.assertEqual(len(stats_map),1)
k = stats_map[ stats_map.Key(0) ]
self.assertGreater(k.c1, 0L)
self.assertGreater(k.c1, int(0))
bpf_code.detach_kprobe("sys_clone")
......
......@@ -7,6 +7,7 @@ import unittest
from time import sleep
import distutils.version
import os
import subprocess
def kernel_version_ge(major, minor):
# True if running kernel is >= X.Y
......@@ -39,5 +40,29 @@ class TestTracepoint(unittest.TestCase):
total_switches += v.value
self.assertNotEqual(0, total_switches)
@unittest.skipUnless(kernel_version_ge(4,7), "requires kernel >= 4.7")
class TestTracepointDataLoc(unittest.TestCase):
def test_tracepoint_data_loc(self):
text = """
struct value_t {
char filename[64];
};
BPF_HASH(execs, u32, struct value_t);
TRACEPOINT_PROBE(sched, sched_process_exec) {
struct value_t val = {0};
char fn[64];
u32 pid = args->pid;
struct value_t *existing = execs.lookup_or_init(&pid, &val);
TP_DATA_LOC_READ_CONST(fn, filename, 64);
__builtin_memcpy(existing->filename, fn, 64);
return 0;
}
"""
b = bcc.BPF(text=text)
subprocess.check_output(["/bin/ls"])
sleep(1)
self.assertTrue("/bin/ls" in [v.filename.decode()
for v in b["execs"].values()])
if __name__ == "__main__":
unittest.main()
......@@ -18,22 +18,24 @@ static void incr(int idx) {
++(*ptr);
}
int count(struct pt_regs *ctx) {
bpf_trace_printk("count() uprobe fired");
u32 pid = bpf_get_current_pid_tgid();
if (pid == PID)
incr(0);
return 0;
}"""
text = text.replace("PID", "%d" % os.getpid())
test_pid = os.getpid()
text = text.replace("PID", "%d" % test_pid)
b = bcc.BPF(text=text)
b.attach_uprobe(name="c", sym="malloc_stats", fn_name="count")
b.attach_uretprobe(name="c", sym="malloc_stats", fn_name="count")
b.attach_uprobe(name="c", sym="malloc_stats", fn_name="count", pid=test_pid)
b.attach_uretprobe(name="c", sym="malloc_stats", fn_name="count", pid=test_pid)
libc = ctypes.CDLL("libc.so.6")
libc.malloc_stats.restype = None
libc.malloc_stats.argtypes = []
libc.malloc_stats()
self.assertEqual(b["stats"][ctypes.c_int(0)].value, 2)
b.detach_uretprobe(name="c", sym="malloc_stats")
b.detach_uprobe(name="c", sym="malloc_stats")
b.detach_uretprobe(name="c", sym="malloc_stats", pid=test_pid)
b.detach_uprobe(name="c", sym="malloc_stats", pid=test_pid)
def test_simple_binary(self):
text = """
#!/usr/bin/python
# Copyright (c) Catalysts GmbH
# Licensed under the Apache License, Version 2.0 (the "License")
from bcc.utils import get_online_cpus
import multiprocessing
import unittest
class TestUtils(unittest.TestCase):
def test_get_online_cpus(self):
online_cpus = get_online_cpus()
num_cores = multiprocessing.cpu_count()
self.assertEqual(len(online_cpus), num_cores)
if __name__ == "__main__":
unittest.main()
@@ -159,7 +159,7 @@ u64 __time = bpf_ktime_get_ns();
if parts[0] not in ["r", "p", "t", "u"]:
self._bail("probe type must be 'p', 'r', 't', or 'u'" +
" but got '%s'" % parts[0])
if re.match(r"\w+\(.*\)", parts[2]) is None:
if re.match(r"\S+\(.*\)", parts[2]) is None:
self._bail(("function signature '%s' has an invalid " +
"format") % parts[2])
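The `\w+` → `\S+` change is what lets signatures for Go and USDT symbols pass validation: names like `os.Open` contain a dot, which `\w` never matches. The difference in isolation:

```python
import re

sig = "os.Open()"  # Go-style symbol name; '.' falls outside \w
old = re.match(r"\w+\(.*\)", sig)  # old pattern: no match
new = re.match(r"\S+\(.*\)", sig)  # new pattern: matches
print(old, new)
```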
@@ -173,6 +173,9 @@ u64 __time = bpf_ktime_get_ns();
self._bail("no exprs specified")
self.exprs = exprs.split(',')
def _make_valid_identifier(self, ident):
return re.sub(r'[^A-Za-z0-9_]', '_', ident)
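`_make_valid_identifier` complements that relaxed validation: probe function names are generated from the probed symbol, and characters such as `.` or `*` would make the emitted C function name illegal. The substitution on its own:

```python
import re

def make_valid_identifier(ident):
    # Map every character that is not a letter, digit, or underscore to
    # '_' so the result is a valid C identifier.
    return re.sub(r'[^A-Za-z0-9_]', '_', ident)

print(make_valid_identifier("os.Open_probe0"))  # os_Open_probe0
```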
def __init__(self, tool, type, specifier):
self.usdt_ctx = None
self.streq_functions = ""
@@ -196,8 +199,9 @@ u64 __time = bpf_ktime_get_ns();
self.tp_event = self.function
elif self.probe_type == "u":
self.library = parts[1]
self.probe_func_name = "%s_probe%d" % \
(self.function, Probe.next_probe_index)
self.probe_func_name = self._make_valid_identifier(
"%s_probe%d" % \
(self.function, Probe.next_probe_index))
self._enable_usdt_probe()
else:
self.library = parts[1]
@@ -233,10 +237,12 @@ u64 __time = bpf_ktime_get_ns();
self.entry_probe_required = self.probe_type == "r" and \
(any(map(check, self.exprs)) or check(self.filter))
self.probe_func_name = "%s_probe%d" % \
(self.function, Probe.next_probe_index)
self.probe_hash_name = "%s_hash%d" % \
(self.function, Probe.next_probe_index)
self.probe_func_name = self._make_valid_identifier(
"%s_probe%d" % \
(self.function, Probe.next_probe_index))
self.probe_hash_name = self._make_valid_identifier(
"%s_hash%d" % \
(self.function, Probe.next_probe_index))
Probe.next_probe_index += 1
def _enable_usdt_probe(self):
@@ -252,7 +258,7 @@ static inline bool %s(char const *ignored, char const *str) {
char needle[] = %s;
char haystack[sizeof(needle)];
bpf_probe_read(&haystack, sizeof(haystack), (void *)str);
for (int i = 0; i < sizeof(needle); ++i) {
for (int i = 0; i < sizeof(needle) - 1; ++i) {
if (needle[i] != haystack[i]) {
return false;
}
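The loop bound change to `sizeof(needle) - 1` stops the comparison one byte early, so the needle's NUL terminator is no longer compared against payload data. A user-space model of the corrected loop (a sketch, not the generated BPF):

```python
def streq(needle: bytes, haystack: bytes) -> bool:
    # Compare sizeof(needle) - 1 bytes: the needle's characters without
    # the trailing NUL that the C array literal carries.
    for i in range(len(needle)):  # len() already excludes any NUL
        if i >= len(haystack) or needle[i] != haystack[i]:
            return False
    return True

print(streq(b"open", b"openat"))  # True: only the needle's bytes are compared
```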
@@ -613,7 +619,8 @@ argdist -p 2780 -z 120 \\
"(see examples below)")
parser.add_argument("-I", "--include", action="append",
metavar="header",
help="additional header files to include in the BPF program")
help="additional header files to include in the BPF program "
"as either full path, or relative to '/usr/include'")
self.args = parser.parse_args()
self.usdt_ctx = None
@@ -634,7 +641,12 @@ struct __string_t { char s[%d]; };
#include <uapi/linux/ptrace.h>
""" % self.args.string_size
for include in (self.args.include or []):
bpf_source += "#include <%s>\n" % include
if include.startswith((".", "/")):
include = os.path.abspath(include)
bpf_source += "#include \"%s\"\n" % include
else:
bpf_source += "#include <%s>\n" % include
bpf_source += BPF.generate_auto_includes(
map(lambda p: p.raw_spec, self.probes))
for probe in self.probes:
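The `-I` handling above now supports two forms: headers named with a leading `.` or `/` are resolved to absolute paths and emitted as quoted includes, while everything else stays an angle-bracket include resolved under `/usr/include`. The branch in isolation:

```python
import os

def include_directive(include):
    # Local headers become quoted absolute paths; anything else is left
    # as a system include resolved against the compiler search path.
    if include.startswith((".", "/")):
        return '#include "%s"' % os.path.abspath(include)
    return "#include <%s>" % include

print(include_directive("linux/blkdev.h"))  # #include <linux/blkdev.h>
```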
@@ -363,6 +363,7 @@ optional arguments:
below)
-I header, --include header
additional header files to include in the BPF program
as either full path, or relative to '/usr/include'
Probe specifier syntax:
{p,r,t,u}:{[library],category}:function(signature)[:type[,type...]:expr[,expr...][:filter]][#label]
@@ -106,14 +106,12 @@ int trace_req_completion(struct pt_regs *ctx, struct request *req)
* test, and maintenance burden.
*/
#ifdef REQ_WRITE
if (req->cmd_flags & REQ_WRITE) {
data.rwflag = !!(req->cmd_flags & REQ_WRITE);
#elif defined(REQ_OP_SHIFT)
data.rwflag = !!((req->cmd_flags >> REQ_OP_SHIFT) == REQ_OP_WRITE);
#else
if ((req->cmd_flags >> REQ_OP_SHIFT) == REQ_OP_WRITE) {
data.rwflag = !!((req->cmd_flags & REQ_OP_MASK) == REQ_OP_WRITE);
#endif
data.rwflag = 1;
} else {
data.rwflag = 0;
}
events.perf_submit(ctx, &data, sizeof(data));
start.delete(&req);
@@ -137,8 +137,10 @@ int trace_req_completion(struct pt_regs *ctx, struct request *req)
*/
#ifdef REQ_WRITE
info.rwflag = !!(req->cmd_flags & REQ_WRITE);
#else
#elif defined(REQ_OP_SHIFT)
info.rwflag = !!((req->cmd_flags >> REQ_OP_SHIFT) == REQ_OP_WRITE);
#else
info.rwflag = !!((req->cmd_flags & REQ_OP_MASK) == REQ_OP_WRITE);
#endif
whop = whobyreq.lookup(&req);
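Both tools gain a third branch because the kernel changed how the request operation is encoded: `REQ_WRITE` as a flag bit on older kernels, an op number extracted via `REQ_OP_SHIFT` around 4.8, and `REQ_OP_MASK` on newer kernels. A sketch of the newest branch, with illustrative constants standing in for the kernel's:

```python
REQ_OP_MASK = 0xff  # illustrative values, not the kernel's definitions
REQ_OP_WRITE = 1

def rwflag(cmd_flags):
    # Mask off the op bits and compare to the write op, collapsing the
    # result to 0/1 like the C double negation !!(...).
    return int((cmd_flags & REQ_OP_MASK) == REQ_OP_WRITE)

print(rwflag(0x8001), rwflag(0x8000))  # 1 0
```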
/*
* deadlock_detector.c Detects potential deadlocks in a running process.
* For Linux, uses BCC, eBPF. See .py file.
*
* Copyright 2017 Facebook, Inc.
* Licensed under the Apache License, Version 2.0 (the "License")
*
* 1-Feb-2016 Kenny Yu Created this.
*/
#include <linux/sched.h>
#include <uapi/linux/ptrace.h>
// Maximum number of mutexes a single thread can hold at once.
// If the number is too big, the unrolled loops will cause the stack
// to be too big, and the bpf verifier will fail.
#define MAX_HELD_MUTEXES 16
// Info about held mutexes. `mutex` will be 0 if not held.
struct held_mutex_t {
u64 mutex;
u64 stack_id;
};
// List of mutexes that a thread is holding. Whenever we loop over this array,
// we need to force the compiler to unroll the loop, otherwise the BPF verifier
// will fail because the loop will create a backward edge.
struct thread_to_held_mutex_leaf_t {
struct held_mutex_t held_mutexes[MAX_HELD_MUTEXES];
};
// Map of thread ID -> array of (mutex addresses, stack id)
BPF_TABLE("hash", u32, struct thread_to_held_mutex_leaf_t,
thread_to_held_mutexes, 2097152);
// Key type for edges. Represents an edge from mutex1 => mutex2.
struct edges_key_t {
u64 mutex1;
u64 mutex2;
};
// Leaf type for edges. Holds information about where each mutex was acquired.
struct edges_leaf_t {
u64 mutex1_stack_id;
u64 mutex2_stack_id;
u32 thread_pid;
char comm[TASK_COMM_LEN];
};
// Represents all edges currently in the mutex wait graph.
BPF_TABLE("hash", struct edges_key_t, struct edges_leaf_t, edges, 2097152);
// Info about parent thread when a child thread is created.
struct thread_created_leaf_t {
u64 stack_id;
u32 parent_pid;
char comm[TASK_COMM_LEN];
};
// Map of child thread pid -> info about parent thread.
BPF_TABLE("hash", u32, struct thread_created_leaf_t, thread_to_parent, 10240);
// Stack traces when threads are created and when mutexes are locked/unlocked.
BPF_STACK_TRACE(stack_traces, 655360);
// The first argument to the user space function we are tracing
// is a pointer to the mutex M held by thread T.
//
// For all mutexes N held by mutexes_held[T]
// add edge N => M (held by T)
// mutexes_held[T].add(M)
int trace_mutex_acquire(struct pt_regs *ctx, void *mutex_addr) {
// The upper 32 bits are the process ID; the lower 32 bits are the thread ID
u32 pid = bpf_get_current_pid_tgid();
u64 mutex = (u64)mutex_addr;
struct thread_to_held_mutex_leaf_t empty_leaf = {};
struct thread_to_held_mutex_leaf_t *leaf =
thread_to_held_mutexes.lookup_or_init(&pid, &empty_leaf);
if (!leaf) {
bpf_trace_printk(
"could not add thread_to_held_mutex key, thread: %d, mutex: %p\n", pid,
mutex);
return 1; // Could not insert, no more memory
}
// Recursive mutexes lock the same mutex multiple times. We cannot tell if
// the mutex is recursive after the mutex is already created. To avoid noisy
// reports, disallow self edges. Do one pass to check if we are already
// holding the mutex, and if we are, do nothing.
#pragma unroll
for (int i = 0; i < MAX_HELD_MUTEXES; ++i) {
if (leaf->held_mutexes[i].mutex == mutex) {
return 1; // Disallow self edges
}
}
u64 stack_id =
stack_traces.get_stackid(ctx, BPF_F_USER_STACK | BPF_F_REUSE_STACKID);
int added_mutex = 0;
#pragma unroll
for (int i = 0; i < MAX_HELD_MUTEXES; ++i) {
// If this is a free slot, see if we can insert.
if (!leaf->held_mutexes[i].mutex) {
if (!added_mutex) {
leaf->held_mutexes[i].mutex = mutex;
leaf->held_mutexes[i].stack_id = stack_id;
added_mutex = 1;
}
continue; // Nothing to do for a free slot
}
// Add edges from held mutex => current mutex
struct edges_key_t edge_key = {};
edge_key.mutex1 = leaf->held_mutexes[i].mutex;
edge_key.mutex2 = mutex;
struct edges_leaf_t edge_leaf = {};
edge_leaf.mutex1_stack_id = leaf->held_mutexes[i].stack_id;
edge_leaf.mutex2_stack_id = stack_id;
edge_leaf.thread_pid = pid;
bpf_get_current_comm(&edge_leaf.comm, sizeof(edge_leaf.comm));
// Returns non-zero on error
int result = edges.update(&edge_key, &edge_leaf);
if (result) {
bpf_trace_printk("could not add edge key %p, %p, error: %d\n",
edge_key.mutex1, edge_key.mutex2, result);
continue; // Could not insert, no more memory
}
}
// There were no free slots for this mutex.
if (!added_mutex) {
bpf_trace_printk("could not add mutex %p, added_mutex: %d\n", mutex,
added_mutex);
return 1;
}
return 0;
}
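`trace_mutex_acquire` builds a directed graph: every mutex already held gains an edge to the newly acquired one, and a cycle in that graph is a potential lock-order inversion. A minimal user-space model of the same bookkeeping (hypothetical, dict-based; the real maps live in the kernel):

```python
held = {}      # thread id -> mutexes currently held, in acquisition order
edges = set()  # (m1, m2): m2 was acquired while m1 was held

def acquire(tid, mutex):
    mutexes = held.setdefault(tid, [])
    if mutex in mutexes:
        return  # recursive lock: disallow self edges, as above
    for m in mutexes:
        edges.add((m, mutex))
    mutexes.append(mutex)

def release(tid, mutex):
    if mutex in held.get(tid, []):
        held[tid].remove(mutex)

acquire(1, "A"); acquire(1, "B")  # thread 1 locks A, then B while holding A
print(edges)  # {('A', 'B')}
```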
// The first argument to the user space function we are tracing
// is a pointer to the mutex M held by thread T.
//
// mutexes_held[T].remove(M)
int trace_mutex_release(struct pt_regs *ctx, void *mutex_addr) {
// The upper 32 bits are the process ID; the lower 32 bits are the thread ID
u32 pid = bpf_get_current_pid_tgid();
u64 mutex = (u64)mutex_addr;
struct thread_to_held_mutex_leaf_t *leaf =
thread_to_held_mutexes.lookup(&pid);
if (!leaf) {
// If the leaf does not exist for the pid, then it means we either missed
// the acquire event, or we had no more memory and could not add it.
bpf_trace_printk(
"could not find thread_to_held_mutex, thread: %d, mutex: %p\n", pid,
mutex);
return 1;
}
// For older kernels without "Bpf: allow access into map value arrays"
// (https://lkml.org/lkml/2016/8/30/287) the bpf verifier will fail with an
// invalid memory access on `leaf->held_mutexes[i]` below. On newer kernels,
// we can avoid making this extra copy in `value` and use `leaf` directly.
struct thread_to_held_mutex_leaf_t value = {};
bpf_probe_read(&value, sizeof(struct thread_to_held_mutex_leaf_t), leaf);
#pragma unroll
for (int i = 0; i < MAX_HELD_MUTEXES; ++i) {
// Find the current mutex (if it exists), and clear it.
// Note: Can't use `leaf->` in this if condition, see comment above.
if (value.held_mutexes[i].mutex == mutex) {
leaf->held_mutexes[i].mutex = 0;
leaf->held_mutexes[i].stack_id = 0;
}
}
return 0;
}
// Trace return from clone() syscall in the child thread (return value > 0).
int trace_clone(struct pt_regs *ctx, unsigned long flags, void *child_stack,
void *ptid, void *ctid, struct pt_regs *regs) {
u32 child_pid = PT_REGS_RC(ctx);
if (child_pid <= 0) {
return 1;
}
struct thread_created_leaf_t thread_created_leaf = {};
thread_created_leaf.parent_pid = bpf_get_current_pid_tgid();
thread_created_leaf.stack_id =
stack_traces.get_stackid(ctx, BPF_F_USER_STACK | BPF_F_REUSE_STACKID);
bpf_get_current_comm(&thread_created_leaf.comm,
sizeof(thread_created_leaf.comm));
struct thread_created_leaf_t *insert_result =
thread_to_parent.lookup_or_init(&child_pid, &thread_created_leaf);
if (!insert_result) {
bpf_trace_printk(
"could not add thread_created_key, child: %d, parent: %d\n", child_pid,
thread_created_leaf.parent_pid);
return 1; // Could not insert, no more memory
}
return 0;
}
@@ -197,9 +197,9 @@ def print_event(cpu, data, size):
if args.timestamp:
print("%-8.3f" % (time.time() - start_ts), end="")
ppid = get_ppid(event.pid)
print("%-16s %-6s %-6s %3s %s" % (event.comm, event.pid,
print("%-16s %-6s %-6s %3s %s" % (event.comm.decode(), event.pid,
ppid if ppid > 0 else "?", event.retval,
' '.join(argv[event.pid])))
b' '.join(argv[event.pid]).decode()))
del(argv[event.pid])
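The execsnoop change is also Python 3 porting: `event.comm` and the saved argv entries arrive as `bytes` there, and `' '.join` over bytes raises `TypeError`, so the patch joins with a bytes separator and decodes once at the end:

```python
argv = [b"/bin/ls", b"-l", b"/tmp"]  # per-event strings arrive as bytes
line = b" ".join(argv).decode()      # join as bytes, decode once for printing
print(line)  # /bin/ls -l /tmp
```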
@@ -43,6 +43,7 @@ class Probe(object):
func -- probe a kernel function
lib:func -- probe a user-space function in the library 'lib'
/path:func -- probe a user-space function in binary '/path'
p::func -- same thing as 'func'
p:lib:func -- same thing as 'lib:func'
t:cat:event -- probe a kernel tracepoint
@@ -219,8 +220,11 @@ class Tool(object):
./funccount -Ti 5 'vfs_*' # output every 5 seconds, with timestamps
./funccount -p 185 'vfs_*' # count vfs calls for PID 185 only
./funccount t:sched:sched_fork # count calls to the sched_fork tracepoint
./funccount -p 185 u:node:gc* # count all GC USDT probes in node
./funccount -p 185 u:node:gc* # count all GC USDT probes in node, PID 185
./funccount c:malloc # count all malloc() calls in libc
./funccount go:os.* # count all "os.*" calls in libgo
./funccount -p 185 go:os.* # count all "os.*" calls in libgo, PID 185
./funccount ./test:read* # count "read*" calls in the ./test binary
"""
parser = argparse.ArgumentParser(
description="Count functions, tracepoints, and USDT probes",
@@ -169,7 +169,7 @@ Ctrl-C has been hit.
User functions can be traced in executables or libraries, and per-process
filtering is allowed:
# ./funccount -p 1442 contentions:*
# ./funccount -p 1442 /home/ubuntu/contentions:*
Tracing 15 functions for "/home/ubuntu/contentions:*"... Hit Ctrl-C to end.
^C
FUNC COUNT
@@ -180,6 +180,10 @@ insert_result 87186
is_prime 1252772
Detaching...
If /home/ubuntu is in the $PATH, then the following command will also work:
# ./funccount -p 1442 contentions:*
Counting libc write and read calls using regular expression syntax (-r):
@@ -314,6 +318,8 @@ examples:
./funccount -Ti 5 'vfs_*' # output every 5 seconds, with timestamps
./funccount -p 185 'vfs_*' # count vfs calls for PID 185 only
./funccount t:sched:sched_fork # count calls to the sched_fork tracepoint
./funccount -p 185 u:node:gc* # count all GC USDT probes in node
./funccount -p 185 u:node:gc* # count all GC USDT probes in node, PID 185
./funccount c:malloc # count all malloc() calls in libc
./funccount go:os.* # count all "os.*" calls in libgo
./funccount -p 185 go:os.* # count all "os.*" calls in libgo, PID 185
./funccount ./test:read* # count "read*" calls in the ./test binary
@@ -201,9 +201,10 @@ if not library:
b.attach_kretprobe(event_re=pattern, fn_name="trace_func_return")
matched = b.num_open_kprobes()
else:
b.attach_uprobe(name=library, sym_re=pattern, fn_name="trace_func_entry")
b.attach_uprobe(name=library, sym_re=pattern, fn_name="trace_func_entry",
pid=args.pid or -1)
b.attach_uretprobe(name=library, sym_re=pattern,
fn_name="trace_func_return")
fn_name="trace_func_return", pid=args.pid or -1)
matched = b.num_open_uprobes()
if matched == 0:
@@ -90,7 +90,7 @@ parser.add_argument("-a", "--annotations", action="store_true",
help="add _[k] annotations to kernel frames")
parser.add_argument("-f", "--folded", action="store_true",
help="output folded format, one line per stack (for flame graphs)")
parser.add_argument("--stack-storage-size", default=2048,
parser.add_argument("--stack-storage-size", default=10240,
type=positive_nonzero_int,
help="the number of unique stack traces that can be stored and "
"displayed (default 10240)")
......