Commit 797c3ec1 authored by Brendan Gregg's avatar Brendan Gregg Committed by 4ast

add tcplife (#773)

parent 72027c1f
......@@ -121,6 +121,7 @@ Examples:
- tools/[tcpaccept](tools/tcpaccept.py): Trace TCP passive connections (accept()). [Examples](tools/tcpaccept_example.txt).
- tools/[tcpconnect](tools/tcpconnect.py): Trace TCP active connections (connect()). [Examples](tools/tcpconnect_example.txt).
- tools/[tcpconnlat](tools/tcpconnlat.py): Trace TCP active connection latency (connect()). [Examples](tools/tcpconnlat_example.txt).
- tools/[tcplife](tools/tcplife.py): Trace TCP sessions and summarize lifespan. [Examples](tools/tcplife_example.txt).
- tools/[tcpretrans](tools/tcpretrans.py): Trace TCP retransmits and TLPs. [Examples](tools/tcpretrans_example.txt).
- tools/[tcptop](tools/tcptop.py): Summarize TCP send/recv throughput by host. Top for TCP. [Examples](tools/tcptop_example.txt).
- tools/[tplist](tools/tplist.py): Display kernel tracepoints or USDT probes and their formats. [Examples](tools/tplist_example.txt).
......
.TH tcplife 8 "2016-10-19" "USER COMMANDS"
.SH NAME
tcplife \- Trace TCP sessions and summarize lifespan. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B tcplife [\-h] [\-T] [\-t] [\-w] [\-s] [\-p PID] [\-D PORTS] [\-L PORTS]
.SH DESCRIPTION
This tool traces TCP sessions that open and close while tracing, and prints
a line of output to summarize each one. This includes the IP addresses, ports,
duration, and throughput for the session. This is useful for workload
characterisation and flow accounting: identifying what connections are
happening, with the bytes transferred.
This tool works by using kernel dynamic tracing, and will need to be updated
if the kernel implementation changes. Only TCP state changes are traced, so
it is expected that the overhead of this tool is much lower than typical
send/receive tracing.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-s
Comma separated values output (parseable).
.TP
\-t
Include a timestamp column (seconds).
.TP
\-T
Include a time column (HH:MM:SS).
.TP
\-w
Wide column output (fits IPv6 addresses).
.TP
\-p PID
Trace this process ID only (filtered in-kernel).
.TP
\-L PORTS
Comma-separated list of local ports to trace (filtered in-kernel).
.TP
\-D PORTS
Comma-separated list of destination ports to trace (filtered in-kernel).
.SH EXAMPLES
.TP
Trace all TCP sessions, and summarize lifespan and throughput:
#
.B tcplife
.TP
Include a timestamp column, and wide column output:
#
.B tcplife \-tw
.TP
Trace PID 181 only:
#
.B tcplife \-p 181
.TP
Trace connections to local ports 80 and 81 only:
#
.B tcplife \-L 80,81
.TP
Trace connections to remote port 80 only:
#
.B tcplife \-D 80
.SH FIELDS
.TP
TIME
Time of the call, in HH:MM:SS format.
.TP
TIME(s)
Time of the call, in seconds.
.TP
PID
Process ID
.TP
COMM
Process name
.TP
IP
IP address family (4 or 6)
.TP
LADDR
Local IP address.
.TP
DADDR
Remote IP address.
.TP
LPORT
Local port.
.TP
DPORT
Destination port.
.TP
TX_KB
Total transmitted Kbytes.
.TP
RX_KB
Total received Kbytes.
.TP
MS
Lifespan of the session, in milliseconds.
.SH OVERHEAD
This traces the kernel TCP set state function, which should be called much
less often than send/receive tracing, and therefore have lower overhead. The
overhead of the tool is relative to the rate of new TCP sessions: if this is
high, over 10,000 per second, then there may be noticable overhead just to
print out 10k lines of formatted output per second.
You can find out the rate of new TCP sessions using "sar \-n TCP 1", and
adding the active/s and passive/s columns.
As always, test and understand this tools overhead for your types of
workloads before production use.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
tcpaccept(8), tcpconnect(8), tcptop(8)
This diff is collapsed.
Demonstrations of tcplife, the Linux BPF/bcc version.
tcplife summarizes TCP sessions that open and close while tracing. For example:
# ./tcplife
PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS
22597 recordProg 127.0.0.1 46644 127.0.0.1 28527 0 0 0.23
3277 redis-serv 127.0.0.1 28527 127.0.0.1 46644 0 0 0.28
22598 curl 100.66.3.172 61620 52.205.89.26 80 0 1 91.79
22604 curl 100.66.3.172 44400 52.204.43.121 80 0 1 121.38
22624 recordProg 127.0.0.1 46648 127.0.0.1 28527 0 0 0.22
3277 redis-serv 127.0.0.1 28527 127.0.0.1 46648 0 0 0.27
22647 recordProg 127.0.0.1 46650 127.0.0.1 28527 0 0 0.21
3277 redis-serv 127.0.0.1 28527 127.0.0.1 46650 0 0 0.26
[...]
This caught a program, "recordProg" making a few short-lived TCP connections
to "redis-serv", lasting about 0.25 milliseconds each connection. A couple of
"curl" sessions were also traced, connecting to port 80, and lasting 91 and 121
milliseconds.
This tool is useful for workload characterisation and flow accounting:
identifying what connections are happening, with the bytes transferred.
Process names are truncated to 10 characters. By using the wide option, -w,
the column width becomes 16 characters. The IP address columns are also wider
to fit IPv6 addresses:
# ./tcplife -w
PID COMM IP LADDR LPORT RADDR RPORT TX_KB RX_KB MS
26315 recordProgramSt 4 127.0.0.1 44188 127.0.0.1 28527 0 0 0.21
3277 redis-server 4 127.0.0.1 28527 127.0.0.1 44188 0 0 0.26
26320 ssh 6 fe80::8a3:9dff:fed5:6b19 22440 fe80::8a3:9dff:fed5:6b19 22 1 1 457.52
26321 sshd 6 fe80::8a3:9dff:fed5:6b19 22 fe80::8a3:9dff:fed5:6b19 22440 1 1 458.69
26341 recordProgramSt 4 127.0.0.1 44192 127.0.0.1 28527 0 0 0.27
3277 redis-server 4 127.0.0.1 28527 127.0.0.1 44192 0 0 0.32
In this example, I uploaded a 10 Mbyte file to the server, and then downloaded
it again, using scp:
# ./tcplife
PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS
7715 recordProg 127.0.0.1 50894 127.0.0.1 28527 0 0 0.25
3277 redis-serv 127.0.0.1 28527 127.0.0.1 50894 0 0 0.30
7619 sshd 100.66.3.172 22 100.127.64.230 63033 5 10255 3066.79
7770 recordProg 127.0.0.1 50896 127.0.0.1 28527 0 0 0.20
3277 redis-serv 127.0.0.1 28527 127.0.0.1 50896 0 0 0.24
7793 recordProg 127.0.0.1 50898 127.0.0.1 28527 0 0 0.23
3277 redis-serv 127.0.0.1 28527 127.0.0.1 50898 0 0 0.27
7847 recordProg 127.0.0.1 50900 127.0.0.1 28527 0 0 0.24
3277 redis-serv 127.0.0.1 28527 127.0.0.1 50900 0 0 0.29
7870 recordProg 127.0.0.1 50902 127.0.0.1 28527 0 0 0.29
3277 redis-serv 127.0.0.1 28527 127.0.0.1 50902 0 0 0.30
7798 sshd 100.66.3.172 22 100.127.64.230 64925 10265 6 2176.15
[...]
You can see the 10 Mbytes received by sshd, and then later transmitted. Looks
like receive was slower (3.07 seconds) than transmit (2.18 seconds).
Timestamps can be added with -t:
# ./tcplife -t
TIME(s) PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS
0.000000 5973 recordProg 127.0.0.1 47986 127.0.0.1 28527 0 0 0.25
0.000059 3277 redis-serv 127.0.0.1 28527 127.0.0.1 47986 0 0 0.29
1.022454 5996 recordProg 127.0.0.1 47988 127.0.0.1 28527 0 0 0.23
1.022513 3277 redis-serv 127.0.0.1 28527 127.0.0.1 47988 0 0 0.27
2.044868 6019 recordProg 127.0.0.1 47990 127.0.0.1 28527 0 0 0.24
2.044924 3277 redis-serv 127.0.0.1 28527 127.0.0.1 47990 0 0 0.28
3.069136 6042 recordProg 127.0.0.1 47992 127.0.0.1 28527 0 0 0.22
3.069204 3277 redis-serv 127.0.0.1 28527 127.0.0.1 47992 0 0 0.28
This shows that the recordProg process was connecting once per second.
There's also a -T for HH:MM:SS formatted times.
There's a comma separated values mode, -s. Here it is with both -t and -T
timestamps:
# ./tcplife -stT
TIME,TIME(s),PID,COMM,IP,LADDR,LPORT,RADDR,RPORT,TX_KB,RX_KB,MS
23:39:38,0.000000,7335,recordProgramSt,4,127.0.0.1,48098,127.0.0.1,28527,0,0,0.26
23:39:38,0.000064,3277,redis-server,4,127.0.0.1,28527,127.0.0.1,48098,0,0,0.32
23:39:39,1.025078,7358,recordProgramSt,4,127.0.0.1,48100,127.0.0.1,28527,0,0,0.25
23:39:39,1.025141,3277,redis-server,4,127.0.0.1,28527,127.0.0.1,48100,0,0,0.30
23:39:41,2.040949,7381,recordProgramSt,4,127.0.0.1,48102,127.0.0.1,28527,0,0,0.24
23:39:41,2.041011,3277,redis-server,4,127.0.0.1,28527,127.0.0.1,48102,0,0,0.29
23:39:42,3.067848,7404,recordProgramSt,4,127.0.0.1,48104,127.0.0.1,28527,0,0,0.30
23:39:42,3.067914,3277,redis-server,4,127.0.0.1,28527,127.0.0.1,48104,0,0,0.35
[...]
There are options for filtering on local and remote ports. Here is filtering
on local ports 22 and 80:
# ./tcplife.py -L 22,80
PID COMM LADDR LPORT RADDR RPORT TX_KB RX_KB MS
8301 sshd 100.66.3.172 22 100.127.64.230 58671 3 3 1448.52
[...]
USAGE:
# ./tcplife.py -h
usage: tcplife.py [-h] [-T] [-t] [-w] [-s] [-p PID] [-L LOCALPORT]
[-D REMOTEPORT]
Trace the lifespan of TCP sessions and summarize
optional arguments:
-h, --help show this help message and exit
-T, --time include time column on output (HH:MM:SS)
-t, --timestamp include timestamp on output (seconds)
-w, --wide wide column output (fits IPv6 addresses)
-s, --csv comma seperated values output
-p PID, --pid PID trace this PID only
-L LOCALPORT, --localport LOCALPORT
comma-separated list of local ports to trace.
-D REMOTEPORT, --remoteport REMOTEPORT
comma-separated list of remote ports to trace.
examples:
./tcplife # trace all TCP connect()s
./tcplife -t # include time column (HH:MM:SS)
./tcplife -w # wider colums (fit IPv6)
./tcplife -stT # csv output, with times & timestamps
./tcplife -p 181 # only trace PID 181
./tcplife -L 80 # only trace local port 80
./tcplife -L 80,81 # only trace local ports 80 and 81
./tcplife -D 80 # only trace remote port 80
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment