Merge pull request #1633 from hMcLauchlan/inject-tool

Add new targeted error injection tool

Commit ddd5dd5e authored Mar 23, 2018 by Brendan Gregg Committed by GitHub Mar 23, 2018
Showing with 628 additions and 0 deletions

README.md +1 -0

man/man8/inject.8 +50 -0

tools/inject.py +453 -0

tools/inject_example.txt +124 -0

--- a/README.md
+++ b/README.md
@@ -113,6 +113,7 @@ pair of .c and .py files, and some are directories of files.
 - tools/[funcslower](tools/funcslower.py): Trace slow kernel or user function calls. [Examples](tools/funcslower_example.txt).
 - tools/[gethostlatency](tools/gethostlatency.py): Show latency for getaddrinfo/gethostbyname[2] calls. [Examples](tools/gethostlatency_example.txt).
 - tools/[hardirqs](tools/hardirqs.py):  Measure hard IRQ (hard interrupt) event time. [Examples](tools/hardirqs_example.txt).
+- tools/[inject](tools/inject.py): Targeted error injection with call chain and predicates. [Examples](tools/inject_example.txt).
 - tools/[killsnoop](tools/killsnoop.py): Trace signals issued by the kill() syscall. [Examples](tools/killsnoop_example.txt).
 - tools/[llcstat](tools/llcstat.py): Summarize CPU cache references and misses by process. [Examples](tools/llcstat_example.txt).
 - tools/[mdflush](tools/mdflush.py): Trace md flush events. [Examples](tools/mdflush_example.txt).

--- a/man/man8/inject.8
+++ b/man/man8/inject.8
+.TH inject 8  "2018-03-16" "USER COMMANDS"
+.SH NAME
+inject \- injects appropriate error into function if input call chain and
+predicates are satisfied. Uses Linux eBPF/bcc.
+.SH SYNOPSIS
+.B inject [-h] [-I header] [-v] mode spec
+.SH DESCRIPTION
+inject injects errors into specified kernel functionality when a given call
+chain and associated predicates are satisfied.
+
+WARNING: This tool injects failures into key kernel functions and may crash the
+kernel. You should know what you're doing if you're using this tool.
+
+This makes use of a Linux 4.16 feature (bpf_override_return()).
+
+Additionally, use of the kmalloc failure mode is only possible with 
+
+	commit f7174d08a5fc ("mm: make should_failslab always available for
+	fault injection")
+
+which is in mm-tree but not yet in mainline (as of 4.16-rc5).
+
+Since this uses BPF, only the root user can use this tool.
+.SH REQUIREMENTS
+CONFIG_BPF, CONFIG_BPF_KPROBE_OVERRIDE, bcc
+.SH OPTIONS
+.TP
+\-h
+Print usage message.
+.TP
+\-v
+Display the generated BPF program, for debugging or modification.
+.TP
+\-I header
+Necessary headers to be included.
+.SH EXAMPLES
+Please see inject_example.txt
+.SH SOURCE
+This is from bcc.
+.IP
+https://github.com/iovisor/bcc
+.PP
+Also look in the bcc distribution for a companion _examples.txt file containing
+example usage, output, and commentary for this tool.
+.SH OS
+Linux
+.SH STABILITY
+Unstable - in development.
+.SH AUTHOR
+Howard McLauchlan
--- a/tools/inject.py
+++ b/tools/inject.py
--- a/tools/inject_example.txt
+++ b/tools/inject_example.txt
+Some examples for inject
+
+inject guarantees the appropriate erroneous return of the specified injection
+mode (kmalloc,bio,etc) given a call chain and an optional set of predicates. You
+can also optionally print out the generated BPF program for
+modification/debugging purposes.
+
+As a simple example, let's say you wanted to fail all mounts. While we cannot
+fail the mount() syscall directly (a patch is in the works), we can easily
+fail do_mount() calls like so:
+
+# ./inject.py kmalloc -v 'do_mount()'
+
+The first argument indicates the mode (or what to fail). Additional headers, if
+needed, are specified with -I. The verbosity flag (-v) prints the generated
+program.
+
+Trying to mount various filesystems will fail and report an inability to
+allocate memory, as expected.
+
+Whenever a predicate is missing, an implicit "(true)" is inserted. The example
+above can be explicitly written as:
+
+# ./inject.py kmalloc -v '(true) => do_mount()(true)'
+
+The "(true)" without an associated function is a predicate for the error
+injection mechanism of the current mode. In the case of kmalloc, the predicate
+would have access to the arguments of:
+
+	int should_failslab(struct kmem_cache *s, gfp_t gfpflags);
+
+The bio mode works similarly, with access to the arguments of:
+	
+	static noinline int should_fail_bio(struct bio *bio)
+
+We also note that it's unnecessary to state the arguments of the function if you
+have no intention to reference them in the associated predicate.
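The implicit "(true)" rule described above can be illustrated with a short Python sketch. This is not the parser used by inject.py; the helper names `split_groups` and `normalize` are hypothetical, and the sketch assumes that "=>" never appears inside a predicate.

```python
def split_groups(text):
    """Return the top-level parenthesised groups in text, outer parens removed."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups


def normalize(spec):
    """Rewrite a spec so that every predicate is written out explicitly."""
    # Assumption: "=>" only ever separates frames.
    frames = [f.strip() for f in spec.split("=>")]
    # A leading frame that starts with "(" is the bare predicate for the
    # injection mechanism of the current mode (no function name attached).
    if frames[0].startswith("("):
        base = split_groups(frames[0])[0]
        frames = frames[1:]
    else:
        base = "true"
    out = ["(%s)" % base]
    for frame in frames:
        name = frame[:frame.index("(")] if "(" in frame else frame
        groups = split_groups(frame)
        signature = groups[0] if groups else ""
        predicate = groups[1] if len(groups) > 1 else "true"
        out.append("%s(%s)(%s)" % (name, signature, predicate))
    return " => ".join(out)
```

With this sketch, `normalize('do_mount()')` produces the explicit form shown in the example, and a spec that is already explicit comes back unchanged.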
+
+Now let's say we want to be a bit more specific; suppose you want to fail
+kmalloc() from mount_subtree() when called from btrfs_mount(). This will fail
+only btrfs mounts:
+
+# ./inject.py kmalloc -v 'mount_subtree() => btrfs_mount()'
+
+Attempting to mount a btrfs filesystem during the execution of this command
+will yield an error, but other filesystems will be fine.
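As a conceptual model only, call-chain matching can be pictured as a counter that advances when the next expected function is entered and steps back when it returns. The Python simulation below is an assumption about how such tracking could work, not the BPF program that inject.py generates; `ChainTracker` is a hypothetical name, and the sketch ignores recursion and per-thread state.

```python
class ChainTracker:
    """Track whether a chain of functions is currently on the call stack."""

    def __init__(self, spec_chain):
        # The spec lists the innermost function first, e.g.
        # ["mount_subtree", "btrfs_mount"]; store it outermost first.
        self.chain = list(reversed(spec_chain))
        self.matched = 0  # how many chain entries are active, in order

    def enter(self, func):
        """Call on function entry; advances if func is the next expected frame."""
        if self.matched < len(self.chain) and func == self.chain[self.matched]:
            self.matched += 1

    def leave(self, func):
        """Call on function return; steps back if func was the last matched frame."""
        if self.matched > 0 and func == self.chain[self.matched - 1]:
            self.matched -= 1

    def should_fail(self):
        """True only while the whole chain is active."""
        return self.matched == len(self.chain)
```

In this model an allocation is failed only while `should_fail()` is true, which is why a mount_subtree() call reached from another filesystem is left alone.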
+
+Next, let's say we want to hit one of the BUG_ONs in fs/btrfs. As of 4.16-rc3,
+there is a BUG_ON in btrfs_prepare_close_one_device() at
+fs/btrfs/volumes.c:1002.
+
+To hit this, we can use the following:
+
+# ./inject.py kmalloc -v 'btrfs_alloc_device() => btrfs_close_devices()'
+
+While the script was executing, I mounted and unmounted btrfs, causing a
+segfault on umount (since that satisfied the call path indicated). A look at
+dmesg will confirm that the erroneous return value injected by the script
+tripped the BUG_ON, causing a segfault down the line.
+
+In general, it's worth noting that the required specificity of the call chain is
+dependent on how much granularity you need. The example above might have
+performed as expected without the intermediate btrfs_alloc_device, but might
+have also done something unexpected (an earlier kmalloc could have failed
+before the one we were targeting).
+
+For hot paths, the approach outlined above isn't enough. If a path is traversed
+very often, we can distinguish distinct calls with function arguments. Let's say
+we want to fail the dentry allocation of a file creatively named 'bananas'. We
+can do the following:
+
+# ./inject.py kmalloc -v 'd_alloc_parallel(struct dentry *parent, const struct
+qstr *name)(STRCMP(name->name, 'bananas'))' 
+
+While this script is executing, any operation that would cause a dentry
+allocation where the name is 'bananas' fails, as expected.
+
+Here, since we're referencing a function argument in our predicate, we need to
+provide the function signature up to the argument we're using.
+
+Note that STRCMP is a workaround for some rewriter issues. It takes input of
+the form (x->...->z, 'literal') and generates equivalent code that the verifier
+is more likely to accept. It's not horribly robust, but works for the purposes
+of making string comparisons a bit easier.
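To give a feel for what "equivalent code" might mean, here is a hedged Python sketch that expands a string comparison into a chain of single-byte C comparisons. The exact code that inject.py emits for STRCMP is not shown in this diff and may well differ; `expand_strcmp` is a hypothetical helper, and it does not escape quotes or backslashes in the literal.

```python
def expand_strcmp(expr, literal):
    """Build a C boolean expression comparing expr to literal byte by byte."""
    # One equality check per character of the literal ...
    checks = ["%s[%d] == '%s'" % (expr, i, ch) for i, ch in enumerate(literal)]
    # ... plus a check for the terminating NUL so that prefixes do not match.
    checks.append("%s[%d] == '\\0'" % (expr, len(literal)))
    return "(" + " && ".join(checks) + ")"
```

For the example above, `expand_strcmp("name->name", "bananas")` yields eight comparisons joined by `&&`: seven for the characters and one for the terminator.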
+
+Finally, we briefly demonstrate how to inject bio failures. The mechanism is
+identical, so any information from above will apply.
+
+Let's say we want to fail bio requests when the request is to some specific
+sector. An example use case would be to fail superblock writes in btrfs. For
+btrfs, we know that there must be a superblock at 65536 bytes, or sector 128.
+This allows us to run the following:
+
+# ./inject.py bio -v -I 'linux/blkdev.h'  '(({struct gendisk *d = bio->bi_disk;
+struct disk_part_tbl *tbl = d->part_tbl; struct hd_struct **parts = (void *)tbl +
+sizeof(struct disk_part_tbl); struct hd_struct **partp = parts + bio->bi_partno;
+struct hd_struct *p = *partp; dev_t disk = p->__dev.devt; disk ==
+MKDEV(254,16);}) && bio->bi_iter.bi_sector == 128)'
+
+The predicate in the command above has two parts. The first is a compound
+statement which shortens to "only if the system is btrfs" (here, the device
+with major 254 and minor 16), but is long due to rewriter/verifier shenanigans.
+The major/minor numbers can be found in a number of ways; I used Python. The
+second part simply checks the starting sector in bi_iter. While executing, this
+script effectively fails writes to the superblock at sector 128 without
+affecting other filesystems.
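One way to look up the major/minor pair from Python is sketched below, using only the standard library. The function names are hypothetical, and `kernel_mkdev` assumes the kernel's internal dev_t layout with a 20-bit minor field, which is what MKDEV() uses in the kernel.

```python
import os
import stat

MINORBITS = 20  # assumed width of the minor field in the kernel's dev_t


def kernel_mkdev(major, minor):
    """Mirror the kernel's MKDEV() macro: (major << MINORBITS) | minor."""
    return (major << MINORBITS) | minor


def dev_numbers(path):
    """Return (major, minor) for the block device node at path."""
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError("%s is not a block device" % path)
    return os.major(st.st_rdev), os.minor(st.st_rdev)
```

Calling `dev_numbers()` on the device node that backs the btrfs filesystem gives the two values to pass to MKDEV() in the predicate.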
+
+As an extension to the above, one could easily fail all btrfs superblock writes
+(we only fail the primary) by calculating the sector number of the mirrors and
+amending the predicate accordingly.
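The sector arithmetic for the mirrors can be sketched in Python. The byte offsets below (64 KiB, 64 MiB, 256 GiB) are an assumption based on the usual btrfs on-disk layout and are not stated in this example file; the helper names are hypothetical, and mirrors that fall beyond the end of a small device simply do not exist.

```python
SECTOR_SIZE = 512  # bytes per sector, as counted by bi_sector

# Assumed btrfs superblock byte offsets: primary, first mirror, second mirror.
SUPERBLOCK_OFFSETS = [64 * 1024, 64 * 1024 ** 2, 256 * 1024 ** 3]


def superblock_sectors():
    """Convert each assumed superblock byte offset to a sector number."""
    return [offset // SECTOR_SIZE for offset in SUPERBLOCK_OFFSETS]


def sector_predicate(sectors):
    """Build a C predicate that matches a bio starting at any listed sector."""
    tests = ["bio->bi_iter.bi_sector == %d" % s for s in sectors]
    return "(" + " || ".join(tests) + ")"
```

The resulting expression would replace the single `bio->bi_iter.bi_sector == 128` test in the command above.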
+
+USAGE message:
+
+usage: inject.py [-h] [-I header] [-v] mode spec
+
+Fail specified kernel functionality when call chain and predicates are met
+
+positional arguments:
+  mode                  indicate which base kernel function to fail
+  spec                  specify call chain
+
+optional arguments:
+  -h, --help            show this help message and exit
+  -I header, --include header
+                        additional header files to include in the BPF program
+  -v, --verbose         print BPF program
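The usage message above has the shape of standard argparse output. A minimal sketch of a parser with the same options follows; it is reconstructed from the usage text alone, so the real inject.py may configure its arguments differently (for instance, whether -I may be repeated is an assumption here).

```python
import argparse


def build_parser():
    """Build a parser matching the options listed in the usage message."""
    parser = argparse.ArgumentParser(
        prog="inject.py",
        description="Fail specified kernel functionality when call chain "
                    "and predicates are met")
    parser.add_argument("mode",
                        help="indicate which base kernel function to fail")
    parser.add_argument("spec", help="specify call chain")
    # Assumption: each -I adds one header to a list.
    parser.add_argument("-I", "--include", action="append", metavar="header",
                        help="additional header files to include in the "
                             "BPF program")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print BPF program")
    return parser
```

For example, `build_parser().parse_args(["kmalloc", "-v", "do_mount()"])` sets `mode` to `kmalloc`, `spec` to `do_mount()`, and `verbose` to true.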
+