Commit aa1e6ca3 authored by Brenden Blanco's avatar Brenden Blanco

Merge pull request #343 from ceeaspb/master

typo trawl
parents fe884257 7a313519
......@@ -24,11 +24,11 @@ As said earlier: keep it short, neat, and documented (code comments).
A checklist for bcc tool development:
1. **Research the topic landscape**. Learn the existing tools and metrics (incl. from /proc). Determine what real world problems exist and need solving. We have too many tools and metrics as it is, we don't need more "I guess that's useful" tools, we need more "ah-hah! I couldn't do this before!" tools. Consider asking other developers about your idea. Many of us can be found in IRC, in the #iovisor channel on irc.oftc.net. There's also the mailing list (see the README.md), and github for issues.
1. **Create a known workload for testing**. This might involving writing a 10 line C program, using a microbenchmark, or just improvising at the shell. If you don't know how to create a workload, learn! Figuring this out will provide invaluable context and details that you may have otherwise overlooked. Sometimes it's easy, and I'm able to just use dd(1) from /dev/urandom or a disk device to /dev/null. It lets me set the I/O size, count, and provides throughput statistics for cross-checking checking my tool output. But other times I need a micro-benchhark, or some C.
1. **Create a known workload for testing**. This might involving writing a 10 line C program, using a micro-benchmark, or just improvising at the shell. If you don't know how to create a workload, learn! Figuring this out will provide invaluable context and details that you may have otherwise overlooked. Sometimes it's easy, and I'm able to just use dd(1) from /dev/urandom or a disk device to /dev/null. It lets me set the I/O size, count, and provides throughput statistics for cross-checking checking my tool output. But other times I need a micro-benchmark, or some C.
1. **Write the tool to solve the problem and no more**. Unix philosophy: do one thing and do it well. netstat doesn't have an option to dump packets, tcpdump-style. They are two different tools.
1. **Check your tool correctly measures your known workload**. If possible, run a prime number of events (eg, 23) and check that the numbers match. Try other workload variations.
1. **Use other observability tools to perform a cross-check or sanity check**. Eg, imagine you write a PCI bus tool that shows current throughput is 28 Gbytes/sec. How could you sanity test that? Well, what PCI devices are there? Disks and network cards? Measure their throughput (iostat, nicstat, sar), and check if is in the ballpark of 28 Gbytes/sec (which would include PCI frame overheads). Ideally, your numbers match.
1. **Measure the overhead of the tool**. If you are running a microbenchmark, how much slower is it with the tool running. Is more CPU consumed? Try to determine the worst case: run the microbenchmark so that CPU headroom is exhausted, and then run the bcc tool. Can overhead be lowered?
1. **Measure the overhead of the tool**. If you are running a micro-benchmark, how much slower is it with the tool running. Is more CPU consumed? Try to determine the worst case: run the micro-benchmark so that CPU headroom is exhausted, and then run the bcc tool. Can overhead be lowered?
1. **Test again, and stress test**. You want to discover and fix all the bad things before others hit them.
1. **Consider command line options**. Should it have -p for filtering on a PID? -T for timestamps? -i for interval? See other tools for examples, and copy the style: the usage message should list example usage at the end. Remember to keep the tool doing one thing and doing it well. Also, if there's one option that seems to be the common case, perhaps it should just be the first argument and not need a switch (no -X). A special case of this is *stat tools, like iostat/vmstat/etc, where the convention is [interval [count]].
1. **Use pep8 to check Python style**: pep8 --show-source --ignore=E123,E125,E126,E127,E128,E302 filename . Note that it misses some things, like consistent usage, so you'll still need to double check your script.
......@@ -36,5 +36,6 @@ A checklist for bcc tool development:
1. **Read your example.txt file**. Does this sound too niche or convoluted? Are you spending too much time explaining caveats? These can be hints that perhaps you should fix your tool, or abandon it! Perhaps it better belongs as an /example, and not a tool. I've abandoned many tools at this stage.
1. **Write a man page**. Either ROFF (.8), markdown (.md), or plain text (.txt): so long as it documents the important sections, particularly columns (fields) and caveats. These go under man/man8. See the other examples. Include a section on overhead, and pull no punches. It's better for end users to know about high overhead beforehand, than to discover it the hard way. Also explain caveats. Don't assume those will be obvious to tool users.
1. **Read your man page**. For ROFF: nroff -man filename. Like before, this exercise is like saying something out loud. Does it sound too niche or convoluted? Again, hints that you might need to go back and fix things, or abandon it.
1. **Spell check your documentation**. Use a spell checker like aspell to check your document quality before committing.
1. **Add an entry to README.md**.
1. If you made it this far, pull request!
......@@ -177,7 +177,7 @@ section of the [kernel ftrace doc](https://www.kernel.org/doc/Documentation/trac
### Networking
At RedHat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
At Red Hat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
A multi-host vxlan environment is simulated and a BPF program used to monitor
one of the physical interfaces. The BPF program keeps statistics on the inner
and outer IP addresses traversing the interface, and the userspace component
......
......@@ -21,5 +21,5 @@ between 128 and 255 Kbytes in size, and another mode of 211 I/O were between
4 and 7 Kbytes in size.
Understanding this distribution is useful for characterizing workloads and
understanding performance. The existance of this distribution is not visible
understanding performance. The existence of this distribution is not visible
from averages alone.
......@@ -74,7 +74,7 @@ How many I/O fell into this range
distribution
An ASCII bar chart to visualize the distribution (count column)
.SH OVERHEAD
This traces kernel functions and maintains in-kernel timestamps and a histgroam,
This traces kernel functions and maintains in-kernel timestamps and a histogram,
which are asynchronously copied to user-space. This method is very efficient,
and the overhead for most storage I/O rates (< 10k IOPS) should be negligible.
If you have a higher IOPS storage environment, test and quantify the overhead
......
......@@ -34,7 +34,7 @@ Print output every five seconds, three times:
#
.B cachestat 5 3
.TP
Print output with timetsmap every five seconds, three times:
Print output with timestamp every five seconds, three times:
#
.B cachestat -T 5 3
.SH FIELDS
......
.TH funclatency 8 "2015-08-18" "USER COMMANDS"
.SH NAME
funclatency \- Time kernel funcitons and print latency as a histogram.
funclatency \- Time kernel functions and print latency as a histogram.
.SH SYNOPSIS
.B funclatency [\-h] [\-p PID] [\-i INTERVAL] [\-T] [\-u] [\-m] [\-r] [\-F] pattern
.SH DESCRIPTION
......@@ -97,7 +97,7 @@ How many calls fell into this range
distribution
An ASCII bar chart to visualize the distribution (count column)
.SH OVERHEAD
This traces kernel functions and maintains in-kernel timestamps and a histgroam,
This traces kernel functions and maintains in-kernel timestamps and a histogram,
which are asynchronously copied to user-space. While this method is very
efficient, the rate of kernel functions can also be very high (>1M/sec), at
which point the overhead is expected to be measurable. Measure in a test
......
......@@ -73,7 +73,7 @@ This traces kernel functions and maintains in-kernel counts, which
are asynchronously copied to user-space. While the rate of interrupts
be very high (>1M/sec), this is a relatively efficient way to trace these
events, and so the overhead is expected to be small for normal workloads, but
could become noticable for heavy workloads. Measure in a test environment
could become noticeable for heavy workloads. Measure in a test environment
before use.
.SH SOURCE
This is from bcc.
......
......@@ -74,7 +74,7 @@ This traces kernel functions and maintains in-kernel counts, which
are asynchronously copied to user-space. While the rate of interrupts
be very high (>1M/sec), this is a relatively efficient way to trace these
events, and so the overhead is expected to be small for normal workloads, but
could become noticable for heavy workloads. Measure in a test environment
could become noticeable for heavy workloads. Measure in a test environment
before use.
.SH SOURCE
This is from bcc.
......
......@@ -7,7 +7,7 @@ stackcount \- Count kernel function calls and their stack traces. Uses Linux eBP
stackcount traces kernel functions and frequency counts them with their entire
kernel stack trace, summarized in-kernel for efficiency. This allows higher
frequency events to be studied. The output consists of unique stack traces,
and their occurance counts.
and their occurrence counts.
The pattern is a string with optional '*' wildcards, similar to file globbing.
If you'd prefer to use regular expressions, use the \-r option.
......
......@@ -56,7 +56,7 @@ Time of the call, in seconds.
STACK
Kernel stack trace. The first column shows "ip" for instruction pointer, and
"r#" for each return pointer in the stack. The second column is the stack trace
as hexidecimal. The third column is the translated kernel symbol names.
as hexadecimal. The third column is the translated kernel symbol names.
.SH OVERHEAD
This can have significant overhead if frequently called functions (> 1000/s) are
traced, and is only intended for low frequency function calls. This is because
......
......@@ -113,7 +113,7 @@ several types of hooks available:
through the socket/interface.
EBPF programs can be used for many purposes; the main use cases are
dynamic tracing and monitoring, and packet procesisng. We are mostly
dynamic tracing and monitoring, and packet processing. We are mostly
interested in the latter use case in this document.
#### EBPF Tables
......@@ -219,7 +219,7 @@ very complex packet filters and simple packet forwarding engines. In
the spirit of open-source "release early, release often", we expect
that the compiler's capabilities will improve gradually.
* Packet filtering is peformed using the `drop()` action. Packets
* Packet filtering is performed using the `drop()` action. Packets
that are not dropped will be forwarded.
* Packet forwarding is performed by setting the
......@@ -233,7 +233,7 @@ Here are some limitations imposed on the P4 programs:
EBPF program). In the future the compiler should probably generate
two separate EBPF programs.
* arbirary parsers can be compiled, but the BCC compiler will reject
* arbitrary parsers can be compiled, but the BCC compiler will reject
parsers that contain cycles
* arithmetic on data wider than 32 bits is not supported
......@@ -311,7 +311,7 @@ p4toEbpf.py file.p4 -o file.c
The P4 compiler first runs the C preprocessor on the input P4 file.
Some of the command-line options are passed directly to the
preprocesor.
preprocessor.
The following compiler options are available:
......
......@@ -41,7 +41,7 @@ the last row printed, for which there were 2 I/O.
For efficiency, biolatency uses an in-kernel eBPF map to store timestamps
with requests, and another in-kernel map to store the histogram (the "count")
column, which is copied to user-space only when output is printed. These
methods lower the perormance overhead when tracing is performed.
methods lower the performance overhead when tracing is performed.
In the following example, the -m option is used to print a histogram using
......
......@@ -166,7 +166,7 @@ The current implementation can take many seconds to detach from tracing, after
Ctrl-C has been hit.
Couting all vfs functions for process ID 5276 only:
Counting all vfs functions for process ID 5276 only:
# ./funccount -p 5276 'vfs_*'
Tracing... Ctrl-C to end.
......
......@@ -37,12 +37,12 @@ the function began executing (was called) to when it finished (returned).
This example output shows that most of the time, do_sys_open() took between
2048 and 65536 nanoseconds (2 to 65 microseconds). The peak of this distribution
shows 291 calls of between 4096 and 8191 nanoseconds. There was also one
occurrance, an outlier, in the 2 to 4 millisecond range.
occurrence, an outlier, in the 2 to 4 millisecond range.
How this works: the function entry and return are traced using the kernel kprobe
and kretprobe tracer. Timestamps are collected, the delta time calculated, which
is the bucketized and stored as an in-kernel histogram for efficiency. The
histgram is visible in the output: it's the "count" column; everything else is
histogram is visible in the output: it's the "count" column; everything else is
decoration. Only the count column is copied to user-level on output. This is an
efficient way to time kernel functions and examine their latency distribution.
......@@ -242,7 +242,7 @@ USAGE message:
usage: funclatency [-h] [-p PID] [-i INTERVAL] [-T] [-u] [-m] [-F] [-r]
pattern
Time kernel funcitons and print latency as a histogram
Time kernel functions and print latency as a histogram
positional arguments:
pattern search expression for kernel functions
......@@ -260,7 +260,7 @@ optional arguments:
only.
examples:
./funclatency do_sys_open # time the do_sys_open() kenel function
./funclatency do_sys_open # time the do_sys_open() kernel function
./funclatency -u vfs_read # time vfs_read(), in microseconds
./funclatency -m do_nanosleep # time do_nanosleep(), in milliseconds
./funclatency -mTi 5 vfs_read # output every 5 seconds, with timestamps
......
......@@ -743,4 +743,4 @@ examples:
./offcputime 5 # trace for 5 seconds only
./offcputime -f 5 # 5 seconds, and output in folded format
./offcputime -u # don't include kernel threads (user only)
./offcputime -p 185 # trace fo PID 185 only
./offcputime -p 185 # trace for PID 185 only
......@@ -165,7 +165,7 @@ via vfs_read() and the other doing a link_path_walk(). There is also a vmstat(8)
stack showing it sleeping between intervals, and an sshd(8) stack showing it
waiting on a file descriptor for input.
The stack shown at the bottom is the off-CPU stack beloning to the task name
The stack shown at the bottom is the off-CPU stack belonging to the task name
shown after "target:". Then there is a separator, "-", and above it the waker
stack and the waker task name after "waker:". The wakeup stack is printed
in reverse order.
......
......@@ -60,7 +60,7 @@ net_rx_action 15656
This can be useful for quantifying where CPU cycles are spent among the soft
interrupts (summarized as the %softirq column from mpstat(1), and shown as
event counts in /proc/softirqs). The output above shows that most time was spent
processing net_rx_action(), which was around 15 milleconds per second (total
processing net_rx_action(), which was around 15 milliseconds per second (total
time across all CPUs).
......
......@@ -376,7 +376,7 @@ Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
Detaching...
If it wasn't clear how one function called another, knowing the instruction
offset can help you locate the lines of code from a dissassembly dump.
offset can help you locate the lines of code from a disassembly dump.
A wildcard can also be used. Eg, all functions beginning with "tcp_send":
......
......@@ -3,7 +3,7 @@ Demonstrations of stacksnoop, the Linux eBPF/bcc version.
This program traces the given kernel function and prints the kernel stack trace
for every call. This tool is useful for studying low frequency kernel functions,
to see how they were invoked. For exmaple, tracing the ext4_sync_fs() call:
to see how they were invoked. For example, tracing the ext4_sync_fs() call:
# ./stacksnoop ext4_sync_fs
TIME(s) STACK
......
......@@ -467,4 +467,4 @@ examples:
./wakeuptime 5 # trace for 5 seconds only
./wakeuptime -f 5 # 5 seconds, and output in folded format
./wakeuptime -u # don't include kernel threads (user only)
./wakeuptime -p 185 # trace fo PID 185 only
./wakeuptime -p 185 # trace for PID 185 only
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment