Commit 91b76a82 authored by 4ast, committed by GitHub

Merge pull request #818 from ceeaspb/master

offwaketime: update with get_stackid fixes #437
parents 85b702b7 47cecb65
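The substance of this change is visible throughout the diff below: the hand-rolled, x86_64-only frame-pointer walker (get_frame() and its unrolled loops) is replaced by the kernel's BPF stack maps, reached through bcc's BPF_STACK_TRACE table and its get_stackid() call. A minimal sketch of that pattern, separate from this commit (the probe name, table sizes, and duration here are illustrative):

    from time import sleep
    from bcc import BPF

    # a stack-trace table deduplicates captured stacks; get_stackid() records
    # the current stack and returns a small integer id, or a negative errno
    # (such as -ENOMEM once the table fills up)
    b = BPF(text="""
    #include <uapi/linux/ptrace.h>

    BPF_STACK_TRACE(stack_traces, 1024);
    BPF_HASH(counts, int, u64);

    int on_switch(struct pt_regs *ctx) {
        int stack_id = stack_traces.get_stackid(ctx, BPF_F_REUSE_STACKID);
        if (stack_id < 0)
            return 0;    // error; offwaketime counts and reports these
        counts.increment(stack_id);
        return 0;
    }
    """)
    b.attach_kprobe(event="finish_task_switch", fn_name="on_switch")
    sleep(5)

    # walk() turns a stored id back into the list of return addresses
    stack_traces = b.get_table("stack_traces")
    for k, v in b.get_table("counts").items():
        print("%d events:" % v.value)
        for addr in stack_traces.walk(k.value):
            print("    %016x %s" % (addr, b.ksym(addr)))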
 .TH offwaketime 8  "2016-01-30" "USER COMMANDS"
 .SH NAME
-offwaketime \- Summarize blocked time by kernel off-CPU stack + waker stack. Uses Linux eBPF/bcc.
+offwaketime \- Summarize blocked time by off-CPU stack + waker stack. Uses Linux eBPF/bcc.
 .SH SYNOPSIS
-.B offwaketime [\-h] [\-u] [\-p PID] [\-v] [\-f] [duration]
+.B offwaketime [\-h] [\-p PID | \-t TID | \-u | \-k] [\-U | \-K] [\-f] [\-\-stack-storage-size STACK_STORAGE_SIZE] [\-m MIN_BLOCK_TIME] [\-M MAX_BLOCK_TIME] [duration]
 .SH DESCRIPTION
 This program shows kernel stack traces and task names that were blocked and
 "off-CPU", along with the stack traces and task names for the threads that woke
@@ -25,11 +25,6 @@ flame graph".
 See http://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.html
-The stack depth is currently limited to 20 for the off-CPU stack, and 10 for
-the waker stack, and the stack traces are kernel mode only. Check for newer
-versions where this should be improved.
-This currently only works on x86_64. Check for future versions.
 .SH REQUIREMENTS
 CONFIG_BPF and bcc.
 .SH OPTIONS
@@ -38,22 +33,40 @@ CONFIG_BPF and bcc.
 Print usage message.
 .TP
 \-f
-Output stacks in folded format.
+Print output in folded stack format.
+.TP
+\-p PID
+Trace this process ID only (filtered in-kernel).
+.TP
+\-t TID
+Trace this thread ID only (filtered in-kernel).
 .TP
 \-u
-Only trace user threads (not kernel threads).
+Only trace user threads (no kernel threads).
 .TP
-\-v
-Show raw addresses for non-folded mode.
+\-k
+Only trace kernel threads (no user threads).
 .TP
-\-p PID
-Trace this process ID only (filtered in-kernel).
+\-U
+Show stacks from user space only (no kernel space stacks).
+.TP
+\-K
+Show stacks from kernel space only (no user space stacks).
+.TP
+\-\-stack-storage-size STACK_STORAGE_SIZE
+Change the number of unique stack traces that can be stored and displayed.
 .TP
 duration
 Duration to trace, in seconds.
+.TP
+\-m MIN_BLOCK_TIME
+The amount of time in microseconds over which we store traces (default 1).
+.TP
+\-M MAX_BLOCK_TIME
+The amount of time in microseconds under which we store traces (default U64_MAX).
 .SH EXAMPLES
 .TP
-Trace all thread blocking events, and summarize (in-kernel) by kernel off-CPU stack trace, waker stack traces, task names, and total blocked time:
+Trace all thread blocking events, and summarize (in-kernel) by user and kernel off-CPU stack trace, waker stack traces, task names, and total blocked time:
 #
 .B offwaketime
 .TP
@@ -76,8 +89,8 @@ is printed. While these techniques greatly lower overhead, scheduler events are
 still a high frequency event, as they can exceed 1 million events per second,
 and so caution should still be used. Test before production use.
-If the overhead is still a problem, take a look at the MINBLOCK_US tunable in
-the code. If your aim is to chase down longer blocking events, then this could
+If the overhead is still a problem, take a look at the \-m MIN_BLOCK_TIME
+option. If your aim is to chase down longer blocking events, then this could
 be increased to filter shorter blocking events, further lowering overhead.
 .SH SOURCE
 This is from bcc.
...
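A usage note for the OVERHEAD section above: the \-m/\-M thresholds are applied in-kernel (the MINBLOCK_US/MAXBLOCK_US checks later in this diff), so raising the minimum block time is the cheapest way to cut event volume. Two illustrative invocations (the thresholds and durations are invented for the example):

    # ./offwaketime -m 10000 30               # keep only blocks of 10 ms or more, for 30 secs
    # ./offwaketime -f -m 1000 60 > out.folded  # 60 secs, folded output for flame graph tools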
@@ -3,16 +3,7 @@
 # offwaketime    Summarize blocked time by kernel off-CPU stack + waker stack
 #                For Linux, uses BCC, eBPF.
 #
-# USAGE: offwaketime [-h] [-u] [-p PID] [-T] [duration]
-#
-# The current implementation uses an unrolled loop for x86_64, and was written
-# as a proof of concept. This implementation should be replaced in the future
-# with an appropriate bpf_ call, when available.
-#
-# The Off-CPU stack is currently limited to a stack trace depth of 20
-# (maxtdepth), and the waker stack limited to 10 (maxwdepth). This is also
-# limited to kernel stacks, and x86_64 only. Check for future versions, where
-# these limitations should be removed.
+# USAGE: offwaketime [-h] [-p PID | -u | -k] [-U | -K] [-f] [duration]
 #
 # Copyright 2016 Netflix, Inc.
 # Licensed under the Apache License, Version 2.0 (the "License")
@@ -24,38 +15,82 @@
 from bcc import BPF
 from time import sleep
 import argparse
 import signal
+import errno
+from sys import stderr
+
+# arg validation
+def positive_int(val):
+    try:
+        ival = int(val)
+    except ValueError:
+        raise argparse.ArgumentTypeError("must be an integer")
+    if ival < 0:
+        raise argparse.ArgumentTypeError("must be positive")
+    return ival
+
+def positive_nonzero_int(val):
+    ival = positive_int(val)
+    if ival == 0:
+        raise argparse.ArgumentTypeError("must be nonzero")
+    return ival
 
 # arguments
 examples = """examples:
     ./offwaketime             # trace off-CPU + waker stack time until Ctrl-C
     ./offwaketime 5           # trace for 5 seconds only
     ./offwaketime -f 5        # 5 seconds, and output in folded format
-    ./offwaketime -u          # don't include kernel threads (user only)
-    ./offwaketime -p 185      # trace for PID 185 only
+    ./offwaketime -m 1000     # trace only events that last more than 1000 usec
+    ./offwaketime -M 9000     # trace only events that last less than 9000 usec
+    ./offwaketime -p 185      # only trace threads for PID 185
+    ./offwaketime -t 188      # only trace thread 188
+    ./offwaketime -u          # only trace user threads (no kernel)
+    ./offwaketime -k          # only trace kernel threads (no user)
+    ./offwaketime -U          # only show user space stacks (no kernel)
+    ./offwaketime -K          # only show kernel space stacks (no user)
 """
 parser = argparse.ArgumentParser(
     description="Summarize blocked time by kernel stack trace + waker stack",
     formatter_class=argparse.RawDescriptionHelpFormatter,
     epilog=examples)
-parser.add_argument("-u", "--useronly", action="store_true",
+thread_group = parser.add_mutually_exclusive_group()
+# Note: this script provides --pid and --tid flags but their arguments are
+# referred to internally using kernel nomenclature: TGID and PID.
+thread_group.add_argument("-p", "--pid", metavar="PID", dest="tgid",
+    help="trace this PID only", type=positive_int)
+thread_group.add_argument("-t", "--tid", metavar="TID", dest="pid",
+    help="trace this TID only", type=positive_int)
+thread_group.add_argument("-u", "--user-threads-only", action="store_true",
     help="user threads only (no kernel threads)")
-parser.add_argument("-p", "--pid",
-    help="trace this PID only")
-parser.add_argument("-v", "--verbose", action="store_true",
-    help="show raw addresses")
+thread_group.add_argument("-k", "--kernel-threads-only", action="store_true",
    help="kernel threads only (no user threads)")
+stack_group = parser.add_mutually_exclusive_group()
+stack_group.add_argument("-U", "--user-stacks-only", action="store_true",
+    help="show stacks from user space only (no kernel space stacks)")
+stack_group.add_argument("-K", "--kernel-stacks-only", action="store_true",
+    help="show stacks from kernel space only (no user space stacks)")
+parser.add_argument("-d", "--delimited", action="store_true",
+    help="insert delimiter between kernel/user stacks")
 parser.add_argument("-f", "--folded", action="store_true",
     help="output folded format")
+parser.add_argument("--stack-storage-size", default=1024,
+    type=positive_nonzero_int,
+    help="the number of unique stack traces that can be stored and "
+        "displayed (default 1024)")
 parser.add_argument("duration", nargs="?", default=99999999,
+    type=positive_nonzero_int,
     help="duration of trace, in seconds")
+parser.add_argument("-m", "--min-block-time", default=1,
+    type=positive_nonzero_int,
+    help="the amount of time in microseconds over which we " +
+        "store traces (default 1)")
+parser.add_argument("-M", "--max-block-time", default=(1 << 64) - 1,
+    type=positive_nonzero_int,
+    help="the amount of time in microseconds under which we " +
+        "store traces (default U64_MAX)")
 args = parser.parse_args()
 folded = args.folded
 duration = int(args.duration)
-debug = 0
-maxwdepth = 10    # and MAXWDEPTH
-maxtdepth = 20    # and MAXTDEPTH
-if args.pid and args.useronly:
-    print("ERROR: use either -p or -u.")
-    exit()
 
 # signal handler
 def signal_ignore(signal, frame):
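The TGID/PID note in the new argument handling above is worth a worked example: what the kernel (and this script internally) calls "pid" is the thread ID that ps reports as TID, and "tgid" is the thread group ID that ps reports as PID. The oncpu handler below splits bpf_get_current_pid_tgid() exactly this way. A small sketch with invented values:

    # invented example: process (TGID) 1234 containing thread (PID) 5678
    pid_tgid = (1234 << 32) | 5678   # layout of bpf_get_current_pid_tgid()

    tid = pid_tgid & 0xffffffff      # kernel "pid"  -> matched by -t/--tid
    pid = pid_tgid >> 32             # kernel "tgid" -> matched by -p/--pid
    assert (tid, pid) == (5678, 1234)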
@@ -66,124 +101,91 @@ bpf_text = """
 #include <uapi/linux/ptrace.h>
 #include <linux/sched.h>
 
-#define MAXWDEPTH	10
-#define MAXTDEPTH	20
-#define MINBLOCK_US	1
+#define MINBLOCK_US	MINBLOCK_US_VALUEULL
+#define MAXBLOCK_US	MAXBLOCK_US_VALUEULL
 
 struct key_t {
     char waker[TASK_COMM_LEN];
     char target[TASK_COMM_LEN];
-    u64 wret[MAXWDEPTH];
-    u64 tret[MAXTDEPTH];
+    int w_k_stack_id;
+    int w_u_stack_id;
+    int t_k_stack_id;
+    int t_u_stack_id;
+    u32 t_pid;
+    u32 w_pid;
+    u32 tgid;
 };
 BPF_HASH(counts, struct key_t);
 BPF_HASH(start, u32);
 struct wokeby_t {
     char name[TASK_COMM_LEN];
-    u64 ret[MAXWDEPTH];
+    int k_stack_id;
+    int u_stack_id;
+    int w_pid;
 };
 BPF_HASH(wokeby, u32, struct wokeby_t);
+BPF_STACK_TRACE(stack_traces, STACK_STORAGE_SIZE)
 
-static u64 get_frame(u64 *bp) {
-    if (*bp) {
-        // The following stack walker is x86_64 specific
-        u64 ret = 0;
-        if (bpf_probe_read(&ret, sizeof(ret), (void *)(*bp+8)))
-            return 0;
-        if (bpf_probe_read(bp, sizeof(*bp), (void *)*bp))
-            *bp = 0;
-        if (ret < __START_KERNEL_map)
-            return 0;
-        return ret;
-    }
-    return 0;
-}
 
 int waker(struct pt_regs *ctx, struct task_struct *p) {
     u32 pid = p->pid;
+    u32 tgid = p->tgid;
 
-    if (!(FILTER))
+    if (!(THREAD_FILTER)) {
         return 0;
+    }
 
-    u64 bp = 0;
     struct wokeby_t woke = {};
-    int depth = 0;
     bpf_get_current_comm(&woke.name, sizeof(woke.name));
-    bp = ctx->bp;
-
-    // unrolled loop (MAXWDEPTH):
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    if (!(woke.ret[depth++] = get_frame(&bp))) goto out;
-    woke.ret[depth] = get_frame(&bp);
-
-out:
+    woke.k_stack_id = KERNEL_STACK_GET;
+    woke.u_stack_id = USER_STACK_GET;
+    woke.w_pid = pid;
     wokeby.update(&pid, &woke);
     return 0;
 }
 
 int oncpu(struct pt_regs *ctx, struct task_struct *p) {
-    u32 pid = p->pid;
+    u32 pid = p->pid, t_pid = pid;
+    u32 tgid = p->tgid;
     u64 ts, *tsp;
 
     // record previous thread sleep time
-    if (FILTER) {
+    if (THREAD_FILTER) {
         ts = bpf_ktime_get_ns();
         start.update(&pid, &ts);
     }
 
     // calculate current thread's delta time
     pid = bpf_get_current_pid_tgid();
+    tgid = bpf_get_current_pid_tgid() >> 32;
     tsp = start.lookup(&pid);
-    if (tsp == 0)
+    if (tsp == 0) {
         return 0;        // missed start or filtered
+    }
     u64 delta = bpf_ktime_get_ns() - *tsp;
     start.delete(&pid);
     delta = delta / 1000;
-    if (delta < MINBLOCK_US)
+    if ((delta < MINBLOCK_US) || (delta > MAXBLOCK_US)) {
         return 0;
+    }
 
     // create map key
-    u64 zero = 0, *val, bp = 0;
-    int depth = 0;
+    u64 zero = 0, *val;
     struct key_t key = {};
     struct wokeby_t *woke;
     bpf_get_current_comm(&key.target, sizeof(key.target));
-    bp = ctx->bp;
-
-    // unrolled loop (MAXTDEPTH):
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    if (!(key.tret[depth++] = get_frame(&bp))) goto out;
-    key.tret[depth] = get_frame(&bp);
-
-out:
+    key.t_k_stack_id = KERNEL_STACK_GET;
+    key.t_u_stack_id = USER_STACK_GET;
+    key.t_pid = t_pid;
     woke = wokeby.lookup(&pid);
     if (woke) {
-        __builtin_memcpy(&key.wret, woke->ret, sizeof(key.wret));
+        key.w_k_stack_id = woke->k_stack_id;
+        key.w_u_stack_id = woke->u_stack_id;
+        key.w_pid = woke->w_pid;
+        key.tgid = tgid;
         __builtin_memcpy(&key.waker, woke->name, TASK_COMM_LEN);
         wokeby.delete(&pid);
     }
@@ -193,15 +195,46 @@
     return 0;
 }
 """
 
+# set thread filter
+thread_context = ""
-if args.pid:
-    filter = 'pid == %s' % args.pid
-elif args.useronly:
-    filter = '!(p->flags & PF_KTHREAD)'
+if args.tgid is not None:
+    thread_context = "PID %d" % args.tgid
+    thread_filter = 'tgid == %d' % args.tgid
+elif args.pid is not None:
+    thread_context = "TID %d" % args.pid
+    thread_filter = 'pid == %d' % args.pid
+elif args.user_threads_only:
+    thread_context = "user threads"
+    thread_filter = '!(p->flags & PF_KTHREAD)'
+elif args.kernel_threads_only:
+    thread_context = "kernel threads"
+    thread_filter = 'p->flags & PF_KTHREAD'
 else:
-    filter = '1'
-bpf_text = bpf_text.replace('FILTER', filter)
-if debug:
-    print(bpf_text)
+    thread_context = "all threads"
+    thread_filter = '1'
+bpf_text = bpf_text.replace('THREAD_FILTER', thread_filter)
+
+# set stack storage size and block-time bounds
+bpf_text = bpf_text.replace('STACK_STORAGE_SIZE', str(args.stack_storage_size))
+bpf_text = bpf_text.replace('MINBLOCK_US_VALUE', str(args.min_block_time))
+bpf_text = bpf_text.replace('MAXBLOCK_US_VALUE', str(args.max_block_time))
+
+# handle stack args
+kernel_stack_get = "stack_traces.get_stackid(ctx, BPF_F_REUSE_STACKID)"
+user_stack_get = \
+    "stack_traces.get_stackid(ctx, BPF_F_REUSE_STACKID | BPF_F_USER_STACK)"
+stack_context = ""
+if args.user_stacks_only:
+    stack_context = "user"
+    kernel_stack_get = "-1"
+elif args.kernel_stacks_only:
+    stack_context = "kernel"
+    user_stack_get = "-1"
+else:
+    stack_context = "user + kernel"
+bpf_text = bpf_text.replace('USER_STACK_GET', user_stack_get)
+bpf_text = bpf_text.replace('KERNEL_STACK_GET', kernel_stack_get)
 
 # initialize BPF
 b = BPF(text=bpf_text)
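Note how the script specializes the embedded C before compiling it: THREAD_FILTER, STACK_STORAGE_SIZE, MINBLOCK_US_VALUE, MAXBLOCK_US_VALUE, and the two *_STACK_GET placeholders are all substituted as plain strings. A worked example of the filter substitution with -p 185 (standalone, not part of the tool):

    src = "    if (!(THREAD_FILTER)) {"
    print(src.replace("THREAD_FILTER", "tgid == 185"))
    # prints:     if (!(tgid == 185)) {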
@@ -214,71 +247,88 @@ if matched == 0:
 # header
 if not folded:
-    print("Tracing blocked time (us) by kernel off-CPU and waker stack",
-        end="")
+    print("Tracing blocked time (us) by %s off-CPU and waker stack" %
+        stack_context, end="")
     if duration < 99999999:
         print(" for %d secs." % duration)
     else:
         print("... Hit Ctrl-C to end.")
 
-# output
-while (1):
-    try:
-        sleep(duration)
-    except KeyboardInterrupt:
-        # as cleanup can take many seconds, trap Ctrl-C:
-        signal.signal(signal.SIGINT, signal_ignore)
-
-    if not folded:
-        print()
-    counts = b.get_table("counts")
-    for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
-        if folded:
-            # fold target stack
-            line = k.target + ";"
-            for i in reversed(range(0, maxtdepth)):
-                if k.tret[i] == 0:
-                    continue
-                line = line + b.ksym(k.tret[i])
-                if i != 0:
-                    line = line + ";"
-
-            # add delimiter
-            line = line + ";-"
-
-            # fold waker stack
-            for i in range(0, maxwdepth):
-                line = line + ";"
-                if k.wret[i] == 0:
-                    break
-                line = line + b.ksym(k.wret[i])
-            if i != 0:
-                line = line + ";" + k.waker
-
-            # print as a line
-            print("%s %d" % (line, v.value))
-        else:
-            # print wakeup name then stack in reverse order
-            print("    %-16s %s" % ("waker:", k.waker))
-            for i in reversed(range(0, maxwdepth)):
-                if k.wret[i] == 0:
-                    continue
-                print("    %-16x %s" % (k.wret[i],
-                    b.ksym(k.wret[i])))
-
-            # print delimiter
-            print("    %-16s %s" % ("-", "-"))
-
-            # print default multi-line stack output
-            for i in range(0, maxtdepth):
-                if k.tret[i] == 0:
-                    break
-                print("    %-16x %s" % (k.tret[i],
-                    b.ksym(k.tret[i])))
-            print("    %-16s %s" % ("target:", k.target))
-            print("        %d\n" % v.value)
-    counts.clear()
-
-    if not folded:
-        print("Detaching...")
-    exit()
+# as cleanup can take many seconds, trap Ctrl-C:
+# print a newline for folded output on Ctrl-C
+signal.signal(signal.SIGINT, signal_ignore)
+
+sleep(duration)
+
+if not folded:
+    print()
+
+missing_stacks = 0
+has_enomem = False
+counts = b.get_table("counts")
+stack_traces = b.get_table("stack_traces")
+for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
+    # handle get_stackid errors
+    # check for an ENOMEM error
+    if k.w_k_stack_id == -errno.ENOMEM or \
+            k.t_k_stack_id == -errno.ENOMEM or \
+            k.w_u_stack_id == -errno.ENOMEM or \
+            k.t_u_stack_id == -errno.ENOMEM:
+        missing_stacks += 1
+        continue
+
+    waker_user_stack = [] if k.w_u_stack_id < 1 else \
+        reversed(list(stack_traces.walk(k.w_u_stack_id))[1:])
+    waker_kernel_stack = [] if k.w_k_stack_id < 1 else \
+        reversed(list(stack_traces.walk(k.w_k_stack_id))[1:])
+    target_user_stack = [] if k.t_u_stack_id < 1 else \
+        stack_traces.walk(k.t_u_stack_id)
+    target_kernel_stack = [] if k.t_k_stack_id < 1 else \
+        stack_traces.walk(k.t_k_stack_id)
+
+    if folded:
+        # print folded stack output
+        line = \
+            [k.target] + \
+            [b.sym(addr, k.tgid)
+                for addr in reversed(list(target_user_stack)[1:])] + \
+            (["-"] if args.delimited else [""]) + \
+            [b.ksym(addr)
+                for addr in reversed(list(target_kernel_stack)[1:])] + \
+            ["--"] + \
+            [b.ksym(addr)
+                for addr in reversed(list(waker_kernel_stack))] + \
+            (["-"] if args.delimited else [""]) + \
+            [b.sym(addr, k.tgid)
+                for addr in reversed(list(waker_user_stack))] + \
+            [k.waker]
+        print("%s %d" % (";".join(line), v.value))
+    else:
+        # print wakeup name then stack in reverse order
+        print("    %-16s %s %s" % ("waker:", k.waker, k.t_pid))
+        for addr in waker_user_stack:
+            print("    %016x %s" % (addr, b.sym(addr, k.tgid)))
+        if args.delimited:
+            print("    -")
+        for addr in waker_kernel_stack:
+            print("    %016x %s" % (addr, b.ksym(addr)))
+
+        # print waker/wakee delimiter
+        print("    %-16s %s" % ("--", "--"))
+
+        # print default multi-line stack output
+        for addr in target_kernel_stack:
+            print("    %016x %s" % (addr, b.ksym(addr)))
+        if args.delimited:
+            print("    -")
+        for addr in target_user_stack:
+            print("    %016x %s" % (addr, b.sym(addr, k.tgid)))
+        print("    %-16s %s %s" % ("target:", k.target, k.w_pid))
+        print("        %d\n" % v.value)
+
+if missing_stacks > 0:
+    enomem_str = " Consider increasing --stack-storage-size."
+    print("WARNING: %d stack traces could not be displayed.%s" %
+        (missing_stacks, enomem_str),
+        file=stderr)
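In the folded (-f) output each aggregated record becomes one semicolon-joined line ending in the total blocked time in microseconds: the target thread's frames run root-to-leaf, "--" separates the two halves, and the waker's frames then run leaf-to-root so the two stacks meet at the wakeup point; with -d a single "-" marks each user/kernel boundary. An illustrative line with all symbols invented, in the format consumed by the flame graph software the man page references:

    mysqld;read;-;sys_read;vfs_read;io_schedule;--;try_to_wake_up;net_rx_action;-;__libc_send;sshd 1520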
@@ -305,23 +305,51 @@ software, which visualizes these.
 
 USAGE message:
 
 # ./offwaketime -h
-usage: offwaketime [-h] [-u] [-p PID] [-v] [-f] [duration]
+usage: offwaketime [-h] [-p PID | -t TID | -u | -k] [-U | -K] [-d] [-f]
+                   [--stack-storage-size STACK_STORAGE_SIZE]
+                   [-m MIN_BLOCK_TIME] [-M MAX_BLOCK_TIME]
+                   [duration]
 
-Summarize blocked time by kernel off-CPU stack + waker stack
+Summarize blocked time by kernel stack trace + waker stack
 
 positional arguments:
   duration              duration of trace, in seconds
 
 optional arguments:
   -h, --help            show this help message and exit
-  -u, --useronly        user threads only (no kernel threads)
-  -p PID, --pid PID     trace this PID only
-  -v, --verbose         show raw addresses
-  -f, --folded          output folded format
+  -p PID, --pid PID     trace this PID only
+  -t TID, --tid TID     trace this TID only
+  -u, --user-threads-only
+                        user threads only (no kernel threads)
+  -k, --kernel-threads-only
+                        kernel threads only (no user threads)
+  -U, --user-stacks-only
+                        show stacks from user space only (no kernel space
+                        stacks)
+  -K, --kernel-stacks-only
+                        show stacks from kernel space only (no user space
+                        stacks)
+  -d, --delimited       insert delimiter between kernel/user stacks
+  -f, --folded          output folded format
+  --stack-storage-size STACK_STORAGE_SIZE
+                        the number of unique stack traces that can be stored
+                        and displayed (default 1024)
+  -m MIN_BLOCK_TIME, --min-block-time MIN_BLOCK_TIME
+                        the amount of time in microseconds over which we
+                        store traces (default 1)
+  -M MAX_BLOCK_TIME, --max-block-time MAX_BLOCK_TIME
+                        the amount of time in microseconds under which we
+                        store traces (default U64_MAX)
 
 examples:
     ./offwaketime             # trace off-CPU + waker stack time until Ctrl-C
-    ./offwaketime 5           # trace for 5 seconds only
-    ./offwaketime -f 5        # 5 seconds, and output in folded format
-    ./offwaketime -u          # don't include kernel threads (user only)
-    ./offwaketime -p 185      # trace for PID 185 only
+    ./offwaketime 5           # trace for 5 seconds only
+    ./offwaketime -f 5        # 5 seconds, and output in folded format
+    ./offwaketime -m 1000     # trace only events that last more than 1000 usec
+    ./offwaketime -M 10000    # trace only events that last less than 10000 usec
+    ./offwaketime -p 185      # only trace threads for PID 185
+    ./offwaketime -t 188      # only trace thread 188
+    ./offwaketime -u          # only trace user threads (no kernel)
+    ./offwaketime -k          # only trace kernel threads (no user)
+    ./offwaketime -U          # only show user space stacks (no kernel)
+    ./offwaketime -K          # only show kernel space stacks (no user)