@@ -41,7 +41,7 @@ Here is a generic checklist for performance investigations with bcc, first as a
These tools may be installed on your system under /usr/share/bcc/tools, or you can run them from the bcc GitHub repo under /tools, where they have a .py extension. Browse the 50+ tools available for more analysis options.
#### 1. execsnoop
#### 1.1. execsnoop
```
# ./execsnoop
...
...
@@ -59,7 +59,7 @@ It works by tracing exec(), not the fork(), so it will catch many types of new p
More [examples](../tools/execsnoop_example.txt).
#### 2. opensnoop
#### 1.2. opensnoop
```
# ./opensnoop
...
...
@@ -82,7 +82,7 @@ Files that are opened can tell you a lot about how applications work: identifyin
More [examples](../tools/opensnoop_example.txt).
#### 3. ext4slower (or btrfs\*, xfs\*, zfs\*)
#### 1.3. ext4slower (or btrfs\*, xfs\*, zfs\*)
```
# ./ext4slower
...
...
@@ -103,7 +103,7 @@ Similar tools exist in bcc for other file systems: btrfsslower, xfsslower, and z
More [examples](../tools/ext4slower_example.txt).
#### 4. biolatency
#### 1.4. biolatency
```
# ./biolatency
...
...
@@ -135,7 +135,7 @@ This is great for understanding disk I/O latency beyond the average times given
More [examples](../tools/biolatency_example.txt).
#### 5. biosnoop
#### 1.5. biosnoop
```
# ./biosnoop
...
...
@@ -155,7 +155,7 @@ This allows you to examine disk I/O in more detail, and look for time-ordered pa
More [examples](../tools/biosnoop_example.txt).
#### 6. cachestat
#### 1.6. cachestat
```
# ./cachestat
...
...
@@ -175,7 +175,7 @@ Use this to identify a low cache hit ratio, and a high rate of misses: which giv
More [examples](../tools/cachestat_example.txt).
#### 7. tcpconnect
#### 1.7. tcpconnect
```
# ./tcpconnect
...
...
@@ -194,7 +194,7 @@ Look for unexpected connections that may point to inefficiencies in application
More [examples](../tools/tcpconnect_example.txt).
#### 8. tcpaccept
#### 1.8. tcpaccept
```
# ./tcpaccept
...
...
@@ -211,7 +211,7 @@ Look for unexpected connections that may point to inefficiencies in application
More [examples](../tools/tcpaccept_example.txt).
#### 9. tcpretrans
#### 1.9. tcpretrans
```
# ./tcpretrans
...
...
@@ -228,7 +228,7 @@ TCP retransmissions cause latency and throughput issues. For ESTABLISHED retrans
More [examples](../tools/tcpretrans_example.txt).
#### 10. runqlat
#### 1.10. runqlat
```
# ./runqlat
...
...
@@ -259,7 +259,7 @@ This can help quantify time lost waiting for a turn on CPU, during periods of CP
More [examples](../tools/runqlat_example.txt).
#### 11. profile
#### 1.11. profile
```
# ./profile
...
...
@@ -306,6 +306,117 @@ Use this tool to understand the code paths that are consuming CPU resources.
More [examples](../tools/profile_example.txt).
### 2. Observability with Generic Tools
In addition to the performance-tuning tools above, here is a checklist of bcc generic tools, first as a list and then in detail:
1. trace
1. argdist
1. funccount
These generic tools can provide the visibility you need to solve your specific problems.
#### 2.1. trace
##### Example 1
Suppose you want to track file ownership changes. Three syscalls, `chown`, `fchown`, and `lchown`, can be used to change file ownership, and the corresponding syscall entry is `SyS_[f|l]chown`. The following command can be used to print out the syscall parameters and the user id of the calling process. You can use the `id` command to find the uid of a particular user.
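As a rough sketch of what such probes might look like, the Python snippet below assembles one `trace.py` kprobe specification per ownership syscall. The helper names `uid_of` and `chown_probes` are hypothetical, and the format strings are an assumption based on `trace.py`'s `p::function "format", args` probe syntax with its `arg1`..`arg3` and `$uid` placeholders. The `SyS_*` entry names come from the text above and may differ on other kernel versions.

```python
import pwd


def uid_of(username):
    """Hypothetical helper: return the numeric uid, as `id -u USERNAME` would."""
    return pwd.getpwnam(username).pw_uid


def chown_probes():
    """Hypothetical helper: build one trace.py kprobe spec per ownership syscall.

    chown and lchown take a path as their first argument (%s),
    while fchown takes a file descriptor (%d).
    """
    probes = []
    for name, first_label, first_fmt in (
        ("chown", "file", "%s"),
        ("fchown", "fd", "%d"),
        ("lchown", "file", "%s"),
    ):
        fmt = f"{first_label} = {first_fmt}, to_uid = %d, to_gid = %d, from_uid = %d"
        probes.append(f'p::SyS_{name} "{fmt}", arg1, arg2, arg3, $uid')
    return probes


if __name__ == "__main__":
    # Each printed line is intended to be passed to trace.py as one quoted argument.
    for probe in chown_probes():
        print(probe)
```

Each generated string would be passed to `trace.py` as a single quoted argument. The `from_uid` column can then be matched against the value returned by `uid_of` for the user you are interested in.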
##### Example 2
Suppose you want to count nonvoluntary context switches (`nivcsw`) in your BPF-based performance monitoring tools, but you are not sure of the proper method. `/proc/<pid>/status` already reports the number (`nonvoluntary_ctxt_switches`) for a pid, so you can use `trace.py` to run a quick experiment and verify your method. In the kernel source code, `nivcsw` is counted in the function `__schedule` in `linux/kernel/sched/core.c`, under the condition
The `__schedule` function is marked as `notrace`, so the best place to evaluate the above condition seems to be the `sched/sched_switch` tracepoint, which is called inside `__schedule` and defined in `linux/include/trace/events/sched.h`. In `trace.py`, `args` is already a pointer to the tracepoint's `TP_STRUCT__entry`. The above condition in `__schedule` can be represented as
The command below can be used to count the involuntary context switches (per process or per pid) and compare the result with `/proc/<pid>/status` or `/proc/<pid>/task/<task_id>/status` for correctness, since in typical cases involuntary context switches are not very common.
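For the `/proc` side of that comparison, the counters can be read with a few lines of Python. This is only a sketch: the helper names `ctxt_switches` and `read_ctxt_switches` are hypothetical, and it assumes the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields are present in the status file.

```python
def ctxt_switches(status_text):
    """Parse /proc/<pid>/status text and return (voluntary, nonvoluntary) counts."""
    wanted = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")
    counts = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in wanted:
            counts[key] = int(value.strip())
    return counts[wanted[0]], counts[wanted[1]]


def read_ctxt_switches(pid="self"):
    """Read the counters for a pid (or a thread, via /proc/<pid>/task/<tid>)."""
    with open(f"/proc/{pid}/status") as status_file:
        return ctxt_switches(status_file.read())
```

Sampling `read_ctxt_switches(pid)` before and after a tracing run gives a delta in the nonvoluntary count that can be compared with the number of events the trace reported over the same interval.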
##### Example 3
This example is related to issues [1231](https://github.com/iovisor/bcc/issues/1231) and [1516](https://github.com/iovisor/bcc/issues/1516), where uprobes do not work at all in certain cases. First, you can run `strace` as below
The `perf_event_open` syscall returns `-EIO`. Searching for `EIO` in the kernel's uprobe-related code under the `/kernel/trace` and `/kernel/events` directories, the function `uprobe_register` looks the most suspicious. Let us find out whether this function is called and, if so, what it returns. In one terminal, use the following command to print out the return value of `uprobe_register`,
The kernel symbol `empty_aops` does not have `readpage` defined, and hence the above suspicious condition is true. Further examination of the kernel source code shows that `overlayfs` does not provide its own `a_ops`, while some other file systems (e.g., ext4) define their own `a_ops` (e.g., `ext4_da_aops`), and `ext4_da_aops` defines `readpage`. Hence, uprobes work on ext4 but not on overlayfs.
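Given that conclusion, a quick first check when a uprobe fails to attach is to see which file system the target binary lives on. The sketch below is one possible way to do that by scanning mount-table text in the `/proc/self/mounts` format. The helper name `fstype_for` is hypothetical, and the code does not decode escaped characters (such as `\040` for spaces) in mount points.

```python
import os


def fstype_for(path, mounts_text):
    """Return the file system type of the longest mount point containing path.

    mounts_text is expected to be in /proc/self/mounts format:
    device, mount point, fs type, options, dump, pass.
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # ">=" lets a later mount on the same mount point shadow an earlier one.
            if len(mount_point) >= best_len:
                best_len, best_type = len(mount_point), fstype
    return best_type
```

For example, `fstype_for("/path/to/binary", open("/proc/self/mounts").read())` returning `overlay` would suggest the binary sits on overlayfs, the case described above.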