Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
B
bcc
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Issues
0
Issues
0
List
Boards
Labels
Milestones
Merge Requests
0
Merge Requests
0
Analytics
Analytics
Repository
Value Stream
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Commits
Issue Boards
Open sidebar
Kirill Smelkov
bcc
Commits
553f2aa1
Commit
553f2aa1
authored
Feb 14, 2016
by
Brendan Gregg
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
tcpretrans
parent
fb7ae106
Changes
4
Show whitespace changes
Inline
Side-by-side
Showing
4 changed files
with
368 additions
and
0 deletions
+368
-0
README.md
README.md
+1
-0
man/man8/tcpretrans.8
man/man8/tcpretrans.8
+85
-0
tools/tcpretrans.py
tools/tcpretrans.py
+221
-0
tools/tcpretrans_example.txt
tools/tcpretrans_example.txt
+61
-0
No files found.
README.md
View file @
553f2aa1
...
...
@@ -100,6 +100,7 @@ Tools:
-
tools/
[
syncsnoop
](
tools/syncsnoop.py
)
: Trace sync() syscall.
[
Examples
](
tools/syncsnoop_example.txt
)
.
-
tools/
[
tcpaccept
](
tools/tcpaccept.py
)
: Trace TCP passive connections (accept()).
[
Examples
](
tools/tcpaccept_example.txt
)
.
-
tools/
[
tcpconnect
](
tools/tcpconnect.py
)
: Trace TCP active connections (connect()).
[
Examples
](
tools/tcpconnect_example.txt
)
.
-
tools/
[
tcpretrans
](
tools/tcpretrans.py
)
: Trace TCP retransmits and TLPs.
[
Examples
](
tools/tcpretrans_example.txt
)
.
-
tools/
[
vfscount
](
tools/vfscount.py
)
tools/
[
vfscount.c
](
tools/vfscount.c
)
: Count VFS calls.
[
Examples
](
tools/vfscount_example.txt
)
.
-
tools/
[
vfsstat
](
tools/vfsstat.py
)
tools/
[
vfsstat.c
](
tools/vfsstat.c
)
: Count some VFS calls, with column output.
[
Examples
](
tools/vfsstat_example.txt
)
.
-
tools/
[
wakeuptime
](
tools/wakeuptime.py
)
: Summarize sleep to wakeup time by waker kernel stack.
[
Examples
](
tools/wakeuptime_example.txt
)
.
...
...
man/man8/tcpretrans.8
0 → 100644
View file @
553f2aa1
.TH tcpaccept 8 "2016-02-14" "USER COMMANDS"
.SH NAME
tcpaccept \- Trace TCP retransmits and TLPs. Uses Linux eBPF/bcc.
.SH SYNOPSIS
.B tcpaccept [\-h] [\-l]
.SH DESCRIPTION
This traces TCP retransmits, showing address, port, and TCP state information,
and sometimes the PID (although usually not, since retransmits are usually
sent by the kernel on timeouts). To keep overhead very low, only
the TCP retransmit functions are traced. This does not trace every packet
(like tcpdump(8) or a packet sniffer).
This uses dynamic tracing of the kernel tcp_retransmit_skb() and
tcp_send_loss_probe() functions, and will need to be updated to
match kernel changes to these functions.
Since this uses BPF, only the root user can use this tool.
.SH REQUIREMENTS
CONFIG_BPF and bcc.
.SH OPTIONS
.TP
\-h
Print usage message.
.TP
\-l
Include tail loss probe attempts (in some cases the kernel may not
complete the TLP send).
.SH EXAMPLES
.TP
Trace TCP retransmits:
#
.B tcpretrans
.TP
Trace TCP retransmits and TLP attempts:
#
.B tcpconnect \-l
.SH FIELDS
.TP
TIME
Time of the retransmit.
.TP
PID
Process ID that was on-CPU. This is less useful than it might sound, as it
may usually be 0, for the kernel, for timer-based retransmits.
.TP
IP
IP address family (4 or 6).
.TP
LADDR
Local IP address. IPv4 as a dotted quad, IPv6 shows "..." then the last 4
bytes (check for newer versions of this tool for the full address).
.TP
LPORT
Local port.
.TP
T>
Type of event: R> == retransmit, L> == tail loss probe.
.TP
RADDR
Remote IP address. IPv4 as a dotted quad, IPv6 shows "..." then the last 4
bytes (check for newer versions of this tool for the full address).
.TP
RPORT
Remote port.
.TP
STATE
TCP session state.
.SH OVERHEAD
Should be negligible: TCP retransmit events should be low (<1000/s), and the
low overhead this tool adds to each event should make the cost negligible.
.SH SOURCE
This is from bcc.
.IP
https://github.com/iovisor/bcc
.PP
Also look in the bcc distribution for a companion _examples.txt file containing
example usage, output, and commentary for this tool.
.SH OS
Linux
.SH STABILITY
Unstable - in development.
.SH AUTHOR
Brendan Gregg
.SH SEE ALSO
tcpconnect(8), tcpaccept(8)
tools/tcpretrans.py
0 → 100755
View file @
553f2aa1
#!/usr/bin/python
# @lint-avoid-python-3-compatibility-imports
#
# tcpretrans Trace TCP retransmits and TLPs.
# For Linux, uses BCC, eBPF. Embedded C.
#
# USAGE: tcpretrans [-h] [-l]
#
# This uses dynamic tracing of kernel functions, and will need to be updated
# to match kernel changes.
#
# IPv4 addresses are printed as dotted quads. For IPv6 addresses, the last four
# bytes are printed after "..."; check for future versions with better IPv6
# support.
#
# Copyright 2016 Netflix, Inc.
# Licensed under the Apache License, Version 2.0 (the "License")
#
# 14-Feb-2016 Brendan Gregg Created this.
from
__future__
import
print_function
from
bcc
import
BPF
import
argparse
from
time
import
strftime
import
ctypes
as
ct
# arguments
examples
=
"""examples:
./tcpretrans # trace TCP retransmits
./tcpretrans -l # include TLP attempts
"""
parser
=
argparse
.
ArgumentParser
(
description
=
"Trace TCP retransmits"
,
formatter_class
=
argparse
.
RawDescriptionHelpFormatter
,
epilog
=
examples
)
parser
.
add_argument
(
"-l"
,
"--lossprobe"
,
action
=
"store_true"
,
help
=
"include tail loss probe attempts"
)
args
=
parser
.
parse_args
()
# define BPF program
bpf_text
=
"""
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <bcc/proto.h>
#define RETRANSMIT 1
#define TLP 2
// separate data structs for ipv4 and ipv6
struct ipv4_data_t {
// XXX: switch some to u32's when supported
u64 pid;
u64 ip;
u64 saddr;
u64 daddr;
u64 lport;
u64 dport;
u64 state;
u64 type;
};
BPF_PERF_OUTPUT(ipv4_events);
struct ipv6_data_t {
// XXX: update to transfer full ipv6 addrs
u64 pid;
u64 ip;
u64 saddr;
u64 daddr;
u64 lport;
u64 dport;
u64 state;
u64 type;
};
BPF_PERF_OUTPUT(ipv6_events);
static int trace_event(struct pt_regs *ctx, struct sock *sk, int type)
{
if (sk == NULL)
return 0;
u32 pid = bpf_get_current_pid_tgid();
struct sock *skp = NULL;
bpf_probe_read(&skp, sizeof(skp), &sk);
// pull in details
u16 family = 0, lport = 0, dport = 0;
char state = 0;
bpf_probe_read(&family, sizeof(family), &skp->__sk_common.skc_family);
bpf_probe_read(&lport, sizeof(lport), &skp->__sk_common.skc_num);
bpf_probe_read(&dport, sizeof(dport), &skp->__sk_common.skc_dport);
bpf_probe_read(&state, sizeof(state), (void *)&skp->__sk_common.skc_state);
if (family == AF_INET) {
struct ipv4_data_t data4 = {.pid = pid, .ip = 4, .type = type};
bpf_probe_read(&data4.saddr, sizeof(u32),
&skp->__sk_common.skc_rcv_saddr);
bpf_probe_read(&data4.daddr, sizeof(u32),
&skp->__sk_common.skc_daddr);
data4.lport = lport;
data4.dport = dport;
data4.state = state;
ipv4_events.perf_submit(ctx, &data4, sizeof(data4));
} else if (family == AF_INET6) {
struct ipv6_data_t data6 = {.pid = pid, .ip = 6, .type = type};
// just grab the last 4 bytes for now
u32 saddr = 0, daddr = 0;
bpf_probe_read(&saddr, sizeof(saddr),
&skp->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr32[3]);
bpf_probe_read(&daddr, sizeof(daddr),
&skp->__sk_common.skc_v6_daddr.in6_u.u6_addr32[3]);
data6.saddr = bpf_ntohl(saddr);
data6.daddr = bpf_ntohl(daddr);
data6.lport = lport;
data6.dport = dport;
data6.state = state;
ipv6_events.perf_submit(ctx, &data6, sizeof(data6));
}
// else drop
return 0;
}
int trace_retransmit(struct pt_regs *ctx, struct sock *sk)
{
trace_event(ctx, sk, RETRANSMIT);
return 0;
}
int trace_tlp(struct pt_regs *ctx, struct sock *sk)
{
trace_event(ctx, sk, TLP);
return 0;
}
"""
# event data
TASK_COMM_LEN
=
16
# linux/sched.h
class
Data_ipv4
(
ct
.
Structure
):
_fields_
=
[
(
"pid"
,
ct
.
c_ulonglong
),
(
"ip"
,
ct
.
c_ulonglong
),
(
"saddr"
,
ct
.
c_ulonglong
),
(
"daddr"
,
ct
.
c_ulonglong
),
(
"lport"
,
ct
.
c_ulonglong
),
(
"dport"
,
ct
.
c_ulonglong
),
(
"state"
,
ct
.
c_ulonglong
),
(
"type"
,
ct
.
c_ulonglong
)
]
class
Data_ipv6
(
ct
.
Structure
):
_fields_
=
[
(
"pid"
,
ct
.
c_ulonglong
),
(
"ip"
,
ct
.
c_ulonglong
),
(
"saddr"
,
ct
.
c_ulonglong
),
(
"daddr"
,
ct
.
c_ulonglong
),
(
"lport"
,
ct
.
c_ulonglong
),
(
"dport"
,
ct
.
c_ulonglong
),
(
"state"
,
ct
.
c_ulonglong
),
(
"type"
,
ct
.
c_ulonglong
)
]
# from bpf_text:
type
=
{}
type
[
1
]
=
'R'
type
[
2
]
=
'L'
def
inet_ntoa
(
addr
):
dq
=
''
for
i
in
range
(
0
,
4
):
dq
=
dq
+
str
(
addr
&
0xff
)
if
(
i
!=
3
):
dq
=
dq
+
'.'
addr
=
addr
>>
8
return
dq
# from include/net/tcp_states.h:
tcpstate
=
{}
tcpstate
[
1
]
=
'ESTABLISHED'
tcpstate
[
2
]
=
'SYN_SENT'
tcpstate
[
3
]
=
'SYN_RECV'
tcpstate
[
4
]
=
'FIN_WAIT1'
tcpstate
[
5
]
=
'FIN_WAIT2'
tcpstate
[
6
]
=
'TIME_WAIT'
tcpstate
[
7
]
=
'CLOSE'
tcpstate
[
8
]
=
'CLOSE_WAIT'
tcpstate
[
9
]
=
'LAST_ACK'
tcpstate
[
10
]
=
'LISTEN'
tcpstate
[
11
]
=
'CLOSING'
tcpstate
[
12
]
=
'NEW_SYN_RECV'
# process event
def
print_ipv4_event
(
cpu
,
data
,
size
):
event
=
ct
.
cast
(
data
,
ct
.
POINTER
(
Data_ipv4
)).
contents
print
(
"%-8s %-6d %-2d %-20s %1s> %-20s %s"
%
(
strftime
(
"%H:%M:%S"
),
event
.
pid
,
event
.
ip
,
"%s:%s"
%
(
inet_ntoa
(
event
.
saddr
),
event
.
lport
),
type
[
event
.
type
],
"%s:%s"
%
(
inet_ntoa
(
event
.
daddr
),
event
.
dport
),
tcpstate
[
event
.
state
]))
def
print_ipv6_event
(
cpu
,
data
,
size
):
event
=
ct
.
cast
(
data
,
ct
.
POINTER
(
Data_ipv6
)).
contents
print
(
"%%-8s -6d %-2d %-20s %1s> %-20s %s"
%
(
strftime
(
"%H:%M:%S"
),
event
.
pid
,
event
.
ip
,
"...%x:%d"
%
(
event
.
saddr
,
event
.
lport
),
type
[
event
.
type
],
"...%x:%d"
%
(
event
.
daddr
,
event
.
dport
),
tcpstate
[
event
.
state
]))
# initialize BPF
b
=
BPF
(
text
=
bpf_text
)
b
.
attach_kprobe
(
event
=
"tcp_retransmit_skb"
,
fn_name
=
"trace_retransmit"
)
b
.
attach_kprobe
(
event
=
"tcp_send_loss_probe"
,
fn_name
=
"trace_tlp"
)
# header
print
(
"%-8s %-6s %-2s %-20s %1s> %-20s %-4s"
%
(
"TIME"
,
"PID"
,
"IP"
,
"LADDR:LPORT"
,
"T"
,
"RADDR:RPORT"
,
"STATE"
))
# read events
b
[
"ipv4_events"
].
open_perf_buffer
(
print_ipv4_event
)
b
[
"ipv6_events"
].
open_perf_buffer
(
print_ipv6_event
)
while
1
:
b
.
kprobe_poll
()
tools/tcpretrans_example.txt
0 → 100644
View file @
553f2aa1
Demonstrations of tcpretrans, the Linux eBPF/bcc version.
This tool traces the kernel TCP retransmit function to show details of these
retransmits. For example:
# ./tcpretrans
TIME PID IP LADDR:LPORT T> RADDR:RPORT STATE
01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED
01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED
01:55:17 0 4 10.153.223.157:22 R> 69.53.245.40:22957 ESTABLISHED
[...]
This output shows three TCP retransmits, the first two were for an IPv4
connection from 10.153.223.157 port 22 to 69.53.245.40 port 34619. The TCP
state was "ESTABLISHED" at the time of the retransmit. The on-CPU PID at the
time of the retransmit is printed, in this case 0 (the kernel, which will
be the case most of the time).
Retransmits are usually a sign of poor network health, and this tool is
useful for their investigation. Unlike using tcpdump, this tool has very
low overhead, as it only traces the retransmit function. It also prints
additional kernel details: the state of the TCP session at the time of the
retransmit.
A -l option will include TCP tail loss probe attempts:
# ./tcpretrans -l
TIME PID IP LADDR:LPORT T> RADDR:RPORT STATE
01:55:45 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED
01:55:46 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED
01:55:46 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED
01:55:53 0 4 10.153.223.157:22 L> 69.53.245.40:46444 ESTABLISHED
01:56:06 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:06 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:08 1938 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED
[...]
See the "L>" in the "T>" column. These are attempts: the kernel probably
sent a TLP, but in some cases it might not have been ultimately sent.
USAGE message:
# ./tcpretrans -h
usage: tcpretrans [-h] [-l]
Trace TCP retransmits
optional arguments:
-h, --help show this help message and exit
-l, --lossprobe include tail loss probe attempts
examples:
./tcpretrans # trace TCP retransmits
./tcpretrans -l # include TLP attempts
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment