Commit 3b9679a3 authored by Alex Bagehot

trawl typos with aspell

parent da7b4234
@@ -24,11 +24,11 @@ As said earlier: keep it short, neat, and documented (code comments).
 A checklist for bcc tool development:
 1. **Research the topic landscape**. Learn the existing tools and metrics (incl. from /proc). Determine what real world problems exist and need solving. We have too many tools and metrics as it is, we don't need more "I guess that's useful" tools, we need more "ah-hah! I couldn't do this before!" tools. Consider asking other developers about your idea. Many of us can be found in IRC, in the #iovisor channel on irc.oftc.net. There's also the mailing list (see the README.md), and github for issues.
-1. **Create a known workload for testing**. This might involving writing a 10 line C program, using a microbenchmark, or just improvising at the shell. If you don't know how to create a workload, learn! Figuring this out will provide invaluable context and details that you may have otherwise overlooked. Sometimes it's easy, and I'm able to just use dd(1) from /dev/urandom or a disk device to /dev/null. It lets me set the I/O size, count, and provides throughput statistics for cross-checking checking my tool output. But other times I need a micro-benchhark, or some C.
+1. **Create a known workload for testing**. This might involving writing a 10 line C program, using a micro-benchmark, or just improvising at the shell. If you don't know how to create a workload, learn! Figuring this out will provide invaluable context and details that you may have otherwise overlooked. Sometimes it's easy, and I'm able to just use dd(1) from /dev/urandom or a disk device to /dev/null. It lets me set the I/O size, count, and provides throughput statistics for cross-checking checking my tool output. But other times I need a micro-benchmark, or some C.
 1. **Write the tool to solve the problem and no more**. Unix philosophy: do one thing and do it well. netstat doesn't have an option to dump packets, tcpdump-style. They are two different tools.
 1. **Check your tool correctly measures your known workload**. If possible, run a prime number of events (eg, 23) and check that the numbers match. Try other workload variations.
 1. **Use other observability tools to perform a cross-check or sanity check**. Eg, imagine you write a PCI bus tool that shows current throughput is 28 Gbytes/sec. How could you sanity test that? Well, what PCI devices are there? Disks and network cards? Measure their throughput (iostat, nicstat, sar), and check if is in the ballpark of 28 Gbytes/sec (which would include PCI frame overheads). Ideally, your numbers match.
-1. **Measure the overhead of the tool**. If you are running a microbenchmark, how much slower is it with the tool running. Is more CPU consumed? Try to determine the worst case: run the microbenchmark so that CPU headroom is exhausted, and then run the bcc tool. Can overhead be lowered?
+1. **Measure the overhead of the tool**. If you are running a micro-benchmark, how much slower is it with the tool running. Is more CPU consumed? Try to determine the worst case: run the micro-benchmark so that CPU headroom is exhausted, and then run the bcc tool. Can overhead be lowered?
 1. **Test again, and stress test**. You want to discover and fix all the bad things before others hit them.
 1. **Consider command line options**. Should it have -p for filtering on a PID? -T for timestamps? -i for interval? See other tools for examples, and copy the style: the usage message should list example usage at the end. Remember to keep the tool doing one thing and doing it well. Also, if there's one option that seems to be the common case, perhaps it should just be the first argument and not need a switch (no -X). A special case of this is *stat tools, like iostat/vmstat/etc, where the convention is [interval [count]].
 1. **Use pep8 to check Python style**: pep8 --show-source --ignore=E123,E125,E126,E127,E128,E302 filename . Note that it misses some things, like consistent usage, so you'll still need to double check your script.
...
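The "known workload" step in the checklist above can be sketched at the shell. A minimal example, assuming a Linux system with GNU dd(1): 23 reads (a prime event count, as step 4 suggests) of 8 Kbytes each from /dev/urandom, discarded to /dev/null. dd's record counts and throughput statistics then give exact numbers to cross-check against a tracing tool's output.

```shell
# Known workload: 23 reads (a prime event count) of 8 KB each.
# dd reports "23+0 records in/out" plus byte count and throughput
# on stderr, giving exact numbers to cross-check a tool against.
dd if=/dev/urandom of=/dev/null bs=8k count=23
```

The prime count makes accidental matches unlikely: a tool that reports 23 events (or 23 * 8192 = 188416 bytes) for this run is probably measuring the right thing.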
@@ -176,7 +176,7 @@ section of the [kernel ftrace doc](https://www.kernel.org/doc/Documentation/trac
 ### Networking
-At RedHat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
+At Red Hat Summit 2015, BCC was presented as part of a [session on BPF](http://www.devnation.org/#7784f1f7513e8542e4db519e79ff5eec).
 A multi-host vxlan environment is simulated and a BPF program used to monitor
 one of the physical interfaces. The BPF program keeps statistics on the inner
 and outer IP addresses traversing the interface, and the userspace component
...
@@ -21,5 +21,5 @@ between 128 and 255 Kbytes in size, and another mode of 211 I/O were between
 4 and 7 Kbytes in size.
 Understanding this distribution is useful for characterizing workloads and
-understanding performance. The existance of this distribution is not visible
+understanding performance. The existence of this distribution is not visible
 from averages alone.
@@ -74,7 +74,7 @@ How many I/O fell into this range
 distribution
 An ASCII bar chart to visualize the distribution (count column)
 .SH OVERHEAD
-This traces kernel functions and maintains in-kernel timestamps and a histgroam,
+This traces kernel functions and maintains in-kernel timestamps and a histogram,
 which are asynchronously copied to user-space. This method is very efficient,
 and the overhead for most storage I/O rates (< 10k IOPS) should be negligible.
 If you have a higher IOPS storage environment, test and quantify the overhead
...
@@ -34,7 +34,7 @@ Print output every five seconds, three times:
 #
 .B cachestat 5 3
 .TP
-Print output with timetsmap every five seconds, three times:
+Print output with timestamp every five seconds, three times:
 #
 .B cachestat -T 5 3
 .SH FIELDS
...
 .TH funclatency 8 "2015-08-18" "USER COMMANDS"
 .SH NAME
-funclatency \- Time kernel funcitons and print latency as a histogram.
+funclatency \- Time kernel functions and print latency as a histogram.
 .SH SYNOPSIS
 .B funclatency [\-h] [\-p PID] [\-i INTERVAL] [\-T] [\-u] [\-m] [\-r] [\-F] pattern
 .SH DESCRIPTION
@@ -97,7 +97,7 @@ How many calls fell into this range
 distribution
 An ASCII bar chart to visualize the distribution (count column)
 .SH OVERHEAD
-This traces kernel functions and maintains in-kernel timestamps and a histgroam,
+This traces kernel functions and maintains in-kernel timestamps and a histogram,
 which are asynchronously copied to user-space. While this method is very
 efficient, the rate of kernel functions can also be very high (>1M/sec), at
 which point the overhead is expected to be measurable. Measure in a test
...
@@ -73,7 +73,7 @@ This traces kernel functions and maintains in-kernel counts, which
 are asynchronously copied to user-space. While the rate of interrupts
 be very high (>1M/sec), this is a relatively efficient way to trace these
 events, and so the overhead is expected to be small for normal workloads, but
-could become noticable for heavy workloads. Measure in a test environment
+could become noticeable for heavy workloads. Measure in a test environment
 before use.
 .SH SOURCE
 This is from bcc.
...
@@ -74,7 +74,7 @@ This traces kernel functions and maintains in-kernel counts, which
 are asynchronously copied to user-space. While the rate of interrupts
 be very high (>1M/sec), this is a relatively efficient way to trace these
 events, and so the overhead is expected to be small for normal workloads, but
-could become noticable for heavy workloads. Measure in a test environment
+could become noticeable for heavy workloads. Measure in a test environment
 before use.
 .SH SOURCE
 This is from bcc.
...
@@ -7,7 +7,7 @@ stackcount \- Count kernel function calls and their stack traces. Uses Linux eBP
 stackcount traces kernel functions and frequency counts them with their entire
 kernel stack trace, summarized in-kernel for efficiency. This allows higher
 frequency events to be studied. The output consists of unique stack traces,
-and their occurance counts.
+and their occurrence counts.
 The pattern is a string with optional '*' wildcards, similar to file globbing.
 If you'd prefer to use regular expressions, use the \-r option.
...
@@ -56,7 +56,7 @@ Time of the call, in seconds.
 STACK
 Kernel stack trace. The first column shows "ip" for instruction pointer, and
 "r#" for each return pointer in the stack. The second column is the stack trace
-as hexidecimal. The third column is the translated kernel symbol names.
+as hexadecimal. The third column is the translated kernel symbol names.
 .SH OVERHEAD
 This can have significant overhead if frequently called functions (> 1000/s) are
 traced, and is only intended for low frequency function calls. This is because
...
@@ -113,7 +113,7 @@ several types of hooks available:
 through the socket/interface.
 EBPF programs can be used for many purposes; the main use cases are
-dynamic tracing and monitoring, and packet procesisng. We are mostly
+dynamic tracing and monitoring, and packet processing. We are mostly
 interested in the latter use case in this document.
 #### EBPF Tables
@@ -219,7 +219,7 @@ very complex packet filters and simple packet forwarding engines. In
 the spirit of open-source "release early, release often", we expect
 that the compiler's capabilities will improve gradually.
-* Packet filtering is peformed using the `drop()` action. Packets
+* Packet filtering is performed using the `drop()` action. Packets
 that are not dropped will be forwarded.
 * Packet forwarding is performed by setting the
@@ -233,7 +233,7 @@ Here are some limitations imposed on the P4 programs:
 EBPF program). In the future the compiler should probably generate
 two separate EBPF programs.
-* arbirary parsers can be compiled, but the BCC compiler will reject
+* arbitrary parsers can be compiled, but the BCC compiler will reject
 parsers that contain cycles
 * arithmetic on data wider than 32 bits is not supported
@@ -311,7 +311,7 @@ p4toEbpf.py file.p4 -o file.c
 The P4 compiler first runs the C preprocessor on the input P4 file.
 Some of the command-line options are passed directly to the
-preprocesor.
+preprocessor.
 The following compiler options are available:
...
@@ -41,7 +41,7 @@ the last row printed, for which there were 2 I/O.
 For efficiency, biolatency uses an in-kernel eBPF map to store timestamps
 with requests, and another in-kernel map to store the histogram (the "count")
 column, which is copied to user-space only when output is printed. These
-methods lower the perormance overhead when tracing is performed.
+methods lower the performance overhead when tracing is performed.
 In the following example, the -m option is used to print a histogram using
...
@@ -166,7 +166,7 @@ The current implementation can take many seconds to detach from tracing, after
 Ctrl-C has been hit.
-Couting all vfs functions for process ID 5276 only:
+Counting all vfs functions for process ID 5276 only:
 # ./funccount -p 5276 'vfs_*'
 Tracing... Ctrl-C to end.
...
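The 'vfs_*' pattern in the funccount example above is a file-glob-style wildcard, as the stackcount hunk earlier notes ("optional '*' wildcards, similar to file globbing"). A hedged sketch of how such a pattern can be turned into an anchored regular expression for matching kernel function names; this is illustrative only, and bcc's actual implementation may differ:

```python
import re

def glob_to_regex(pattern: str) -> str:
    """Convert a '*'-wildcard pattern (e.g. 'vfs_*') into an anchored regex."""
    # Escape regex metacharacters, then re-enable '*' as "match anything".
    return '^' + re.escape(pattern).replace(r'\*', '.*') + '$'

# Match candidate kernel function names against the pattern.
names = ['vfs_read', 'vfs_write', 'sys_open']
matched = [n for n in names if re.match(glob_to_regex('vfs_*'), n)]
# matched == ['vfs_read', 'vfs_write']
```

Anchoring with '^' and '$' matters: without it, 'vfs_*' would also match names that merely contain "vfs_", which is why tools offer a separate -r option for users who want full regex semantics.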
@@ -37,12 +37,12 @@ the function began executing (was called) to when it finished (returned).
 This example output shows that most of the time, do_sys_open() took between
 2048 and 65536 nanoseconds (2 to 65 microseconds). The peak of this distribution
 shows 291 calls of between 4096 and 8191 nanoseconds. There was also one
-occurrance, an outlier, in the 2 to 4 millisecond range.
+occurrence, an outlier, in the 2 to 4 millisecond range.
 How this works: the function entry and return are traced using the kernel kprobe
 and kretprobe tracer. Timestamps are collected, the delta time calculated, which
 is the bucketized and stored as an in-kernel histogram for efficiency. The
-histgram is visible in the output: it's the "count" column; everything else is
+histogram is visible in the output: it's the "count" column; everything else is
 decoration. Only the count column is copied to user-level on output. This is an
 efficient way to time kernel functions and examine their latency distribution.
@@ -242,7 +242,7 @@ USAGE message:
 usage: funclatency [-h] [-p PID] [-i INTERVAL] [-T] [-u] [-m] [-F] [-r]
 pattern
-Time kernel funcitons and print latency as a histogram
+Time kernel functions and print latency as a histogram
 positional arguments:
 pattern search expression for kernel functions
@@ -260,7 +260,7 @@ optional arguments:
 only.
 examples:
-./funclatency do_sys_open # time the do_sys_open() kenel function
+./funclatency do_sys_open # time the do_sys_open() kernel function
 ./funclatency -u vfs_read # time vfs_read(), in microseconds
 ./funclatency -m do_nanosleep # time do_nanosleep(), in milliseconds
 ./funclatency -mTi 5 vfs_read # output every 5 seconds, with timestamps
...
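The funclatency hunks above describe delta times being "bucketized and stored as an in-kernel histogram". A minimal Python model of that power-of-two bucketizing; this is an illustration of the idea, not bcc's in-kernel code, which uses a BPF log2 helper:

```python
from collections import Counter

def log2_bucket(ns: int) -> int:
    """Power-of-two bucket index: bucket i covers [2**(i-1), 2**i - 1]."""
    return ns.bit_length()

def histogram(latencies_ns):
    """Count latencies per log2 bucket, like funclatency's "count" column."""
    return Counter(log2_bucket(ns) for ns in latencies_ns)

# Latencies from 4096 to 8191 ns all land in the same bucket (index 13),
# matching a "4096 -> 8191" row in funclatency-style output.
hist = histogram([4096, 5000, 8191])
```

Keeping only these per-bucket counts in the kernel, and copying them to user space only when output is printed, is what makes the approach efficient: the per-event timestamps never cross the kernel boundary.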
@@ -743,4 +743,4 @@ examples:
 ./offcputime 5 # trace for 5 seconds only
 ./offcputime -f 5 # 5 seconds, and output in folded format
 ./offcputime -u # don't include kernel threads (user only)
-./offcputime -p 185 # trace fo PID 185 only
+./offcputime -p 185 # trace for PID 185 only
@@ -60,7 +60,7 @@ net_rx_action 15656
 This can be useful for quantifying where CPU cycles are spent among the soft
 interrupts (summarized as the %softirq column from mpstat(1), and shown as
 event counts in /proc/softirqs). The output above shows that most time was spent
-processing net_rx_action(), which was around 15 milleconds per second (total
+processing net_rx_action(), which was around 15 milliseconds per second (total
 time across all CPUs).
...
@@ -376,7 +376,7 @@ Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
 Detaching...
 If it wasn't clear how one function called another, knowing the instruction
-offset can help you locate the lines of code from a dissassembly dump.
+offset can help you locate the lines of code from a disassembly dump.
 A wildcard can also be used. Eg, all functions beginning with "tcp_send":
...
@@ -3,7 +3,7 @@ Demonstrations of stacksnoop, the Linux eBPF/bcc version.
 This program traces the given kernel function and prints the kernel stack trace
 for every call. This tool is useful for studying low frequency kernel functions,
-to see how they were invoked. For exmaple, tracing the ext4_sync_fs() call:
+to see how they were invoked. For example, tracing the ext4_sync_fs() call:
 # ./stacksnoop ext4_sync_fs
 TIME(s) STACK
...
@@ -467,4 +467,4 @@ examples:
 ./wakeuptime 5 # trace for 5 seconds only
 ./wakeuptime -f 5 # 5 seconds, and output in folded format
 ./wakeuptime -u # don't include kernel threads (user only)
-./wakeuptime -p 185 # trace fo PID 185 only
+./wakeuptime -p 185 # trace for PID 185 only