Commits · ffd3d18c20b8df281a18940ee80a99b28114d4b7 · Kirill Smelkov / linux

17 Jan, 2018 11 commits

perf tools: Add ARM Statistical Profiling Extensions (SPE) support · ffd3d18c

Kim Phillips authored Jan 14, 2018

'perf record' and 'perf report --dump-raw-trace' are supported in this
release.

Example usage:

 # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
 # perf report --dump-raw-trace

Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.

Output will contain raw SPE data and its textual representation, such
as:

0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000  offset: 0  ref: 0x1891ad0e  idx: 1  tid: 2227  cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
.  00000000:  49 00                                           LD
.  00000002:  b2 c0 3b 29 0f 00 00 ff ff                      VA 0xffff00000f293bc0
.  0000000b:  b3 c0 eb 24 fb 00 00 00 80                      PA 0xfb24ebc0 ns=1
.  00000014:  9a 00 00                                        LAT 0 XLAT
.  00000017:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
.  00000019:  b0 00 c4 15 08 00 00 ff ff                      PC 0xff00000815c400 el3 ns=1
.  00000022:  98 00 00                                        LAT 0 TOT
.  00000025:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
.  0000002e:  49 00                                           LD
.  00000030:  b2 80 3c 29 0f 00 00 ff ff                      VA 0xffff00000f293c80
.  00000039:  b3 80 ec 24 fb 00 00 00 80                      PA 0xfb24ec80 ns=1
.  00000042:  9a 00 00                                        LAT 0 XLAT
.  00000045:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS
.  00000047:  b0 f4 11 16 08 00 00 ff ff                      PC 0xff0000081611f4 el3 ns=1
.  00000050:  98 00 00                                        LAT 0 TOT
.  00000053:  71 36 6c 21 2c 09 00 00 00                      TS 39395093558
.  0000005c:  48 00                                           INSN-OTHER
.  0000005e:  42 02                                           EV RETIRED
.  00000060:  b0 2c ef 7f 08 00 00 ff ff                      PC 0xff0000087fef2c el3 ns=1
.  00000069:  98 00 00                                        LAT 0 TOT
.  0000006c:  71 d1 6f 21 2c 09 00 00 00                      TS 39395094481
...

Other release notes:

- applies to acme's perf/{core,urgent} branches, likely elsewhere

- Report is self-contained within the tool.
  Record requires enabling the kernel SPE driver by
  setting CONFIG_ARM_SPE_PMU.

- The intel-bts implementation was used as a starting point; its
  min/default/max buffer sizes and power of 2 pages granularity need to be
  revisited for ARM SPE

- Recording across multiple SPE clusters/domains not supported

- Snapshot support (record -S), and conversion to native perf events
  (e.g., via 'perf inject --itrace'), are also not supported

- Technically both cs-etm and spe can be used simultaneously; however,
  this is disabled for simplicity in this release
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
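
In the raw dump quoted in the SPE commit above, the bytes that follow
each packet header appear to be a little-endian payload: the TS packet
bytes "36 6c 21 2c 09 00 00 00" print as 39395093558. A minimal sketch of
that byte order, assuming a hypothetical helper name payload_le() that is
not part of the perf SPE decoder:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: assemble an unsigned value from up to 8
 * little-endian payload bytes (lowest-order byte first).
 */
uint64_t payload_le(const uint8_t *buf, size_t len)
{
	uint64_t v = 0;

	for (size_t i = 0; i < len && i < 8; i++)
		v |= (uint64_t)buf[i] << (8 * i);

	return v;
}
```

Fed the eight VA payload bytes from the dump (c0 3b 29 0f 00 00 ff ff),
the same helper yields 0xffff00000f293bc0.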

tools lib traceevent: Fix get_field_str() for dynamic strings · d777f8de

Steven Rostedt (VMware) authored Jan 11, 2018

If a field is a dynamic string, get_field_str() returned just the
offset/size value and not the string. Have it parse the offset/size
correctly to return the actual string. Otherwise filtering fails when
trying to filter fields that are dynamic strings.
Reported-by: Gopanapalli Pradeep <prap_hai@yahoo.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004823.146333275@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
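
For the dynamic string fix above, the idea can be sketched as follows.
This assumes the usual layout of dynamic fields, where the field slot
holds a 32-bit word with the offset in the low 16 bits and the length in
the high 16 bits; dyn_string() is a hypothetical name, not the library's
get_field_str():

```c
#include <stdint.h>

/*
 * Hypothetical helper: `loc` is the 32-bit offset/size word read from the
 * field slot. Return a pointer to the string inside the record and store
 * its length in *len, instead of handing back the raw offset/size value.
 */
const char *dyn_string(const void *record, uint32_t loc, unsigned int *len)
{
	unsigned int offset = loc & 0xffff;	/* low 16 bits: offset */

	*len = loc >> 16;			/* high 16 bits: length */
	return (const char *)record + offset;
}
```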

tools lib traceevent: Fix missing break in FALSE case of pevent_filter_clear_trivial() · 806efaed

Taeung Song authored Jan 11, 2018

Currently the FILTER_TRIVIAL_FALSE case is missing a break statement. If
the trivial type is FALSE, execution also falls through into the TRUE
case, so the filter is always skipped: the TRUE case continues the loop
on the inverse of the condition tested in the FALSE case.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004823.012918807@goodmis.org
Link: http://lkml.kernel.org/r/1493218540-12296-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
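
The fall-through described in the missing-break fix above can be
illustrated with a small sketch. The enum and function names here are
hypothetical stand-ins, not the ones used in pevent_filter_clear_trivial():

```c
#include <stdbool.h>

enum trivial_type { TRIVIAL_FALSE, TRIVIAL_TRUE, TRIVIAL_BOTH };

/*
 * Hypothetical helper: decide whether a trivial filter whose constant
 * result is `value` should be cleared for the requested `type`.
 */
bool should_clear(enum trivial_type type, bool value)
{
	switch (type) {
	case TRIVIAL_FALSE:
		if (value)
			return false;	/* filter is trivially TRUE: keep it */
		break;	/* without this break, control falls into TRIVIAL_TRUE */
	case TRIVIAL_TRUE:
		if (!value)
			return false;	/* filter is trivially FALSE: keep it */
		break;
	default:
		break;
	}
	return true;
}
```

With the break removed from the first case, a FALSE filter would reach
the second test, fail it, and never be cleared.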

tools lib traceevent: Add UL suffix to MISSING_EVENTS · 6d36ce26

Michael Sartain authored Jan 11, 2018

Add UL suffix to MISSING_EVENTS since ints shouldn't be left shifted by 31.
Signed-off-by: Michael Sartain <mikesart@fastmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20171016165542.13038-4-mikesart@fastmail.com
Link: http://lkml.kernel.org/r/20180112004822.829533885@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
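
As background for the UL suffix change above: with a 32-bit int,
'1 << 31' shifts a signed value into the sign bit, which is undefined
behavior in C, while '1UL << 31' is well defined. A minimal sketch,
where has_missing_events() is a hypothetical helper and only the
MISSING_EVENTS name comes from the commit:

```c
#include <stdbool.h>

/* Unsigned long constant, so bit 31 is a plain value bit */
#define MISSING_EVENTS (1UL << 31)

/* Hypothetical helper: test the flag in an unsigned long flags word */
bool has_missing_events(unsigned long flags)
{
	return (flags & MISSING_EVENTS) != 0;
}
```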

tools lib traceevent: Use asprintf when possible · 67dfc376

Federico Vaga authored Jan 11, 2018

It makes the code clearer and less error prone.

clearer:
- less code
- the code is now using the same format to create strings dynamically

less error prone:
- no magic number +2 +9 +5 to compute the size
- no copy&paste of the strings to compute the size and to concatenate

The function `asprintf` is not POSIX standard but the program
was already using it. Later it can be decided to use only POSIX
functions, then we can easily replace all the `asprintf(3)` with a local
implementation of that function.
Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Federico Vaga <federico.vaga@vaga.pv.it>
Link: http://lkml.kernel.org/r/20170802221558.9684-2-federico.vaga@vaga.pv.it
Link: http://lkml.kernel.org/r/20180112004822.686281649@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
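
The asprintf commit above mentions a possible local implementation.
One way it could look, using only C99 functions, is sketched below.
local_asprintf() is a hypothetical name and this is not code from the
library:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for asprintf(3): measure the formatted length
 * with vsnprintf(), allocate exactly that much, then format for real.
 * Returns the string length, or -1 on error.
 */
int local_asprintf(char **strp, const char *fmt, ...)
{
	va_list ap, ap2;
	int len;

	va_start(ap, fmt);
	va_copy(ap2, ap);			/* the list is consumed twice */
	len = vsnprintf(NULL, 0, fmt, ap);	/* first pass: size only */
	va_end(ap);

	if (len < 0) {
		va_end(ap2);
		return -1;
	}

	*strp = malloc((size_t)len + 1);
	if (!*strp) {
		va_end(ap2);
		return -1;
	}

	len = vsnprintf(*strp, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return len;
}
```

The caller frees the returned string, as with asprintf(3). No size
arithmetic with magic numbers is needed at the call sites.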

tools lib traceevent: Show contents (in hex) of data of unrecognized type records · e8773728

Steven Rostedt (VMware) authored Jan 11, 2018

When a record has an unrecognized type, an error message is reported,
but it would also be helpful to see the contents of that record. At
least show what it is in hex, instead of just showing a blank line.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.542204577@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib traceevent: Handle new pointer processing of bprint strings · 37db96bb

Steven Rostedt (VMware) authored Jan 11, 2018

The Linux kernel printf() has some extended use cases that dereference
the pointer. This is dangerous for tracing because the pointer that is
dereferenced can change or even be unmapped. It also causes issues when
the trace data is extracted, because user space does not have access to
the contents of the pointer even if it still exists.

To handle this, the kernel was updated to process these dereferenced
pointers at the time they are recorded, and not post processed. Now they
exist in the tracing buffer, and no dereference is needed at the time of
reading the trace.

The event parsing library needs to handle this new case.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.403349289@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib traceevent: Simplify pointer print logic and fix %pF · 38d70b7c

Steven Rostedt (VMware) authored Jan 11, 2018

When processing %pX in pretty_print(), simplify the logic slightly by
incrementing the ptr to the format string if isalnum(ptr[1]) is true.
This follows the logic a bit more closely to what is in the kernel.

Also, this fixes a small bug where %pF was not giving the offset of the
function.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004822.260262257@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib traceevent: Print value of unknown symbolic fields · d6344473

Jan Kiszka authored Jan 11, 2018

Aligns trace-cmd with the behavior of the kernel.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com
Link: http://lkml.kernel.org/r/20180112004822.118332436@goodmis.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

tools lib traceevent: Show value of flags that have not been parsed · 3df76c9a

Steven Rostedt (VMware) authored Jan 11, 2018

If the value contains bits that are not defined by print_flags() helper,
then show the remaining bits. This aligns with the functionality of the
kernel.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com
Link: http://lkml.kernel.org/r/20180112004821.976225232@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
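
The behavior described in the flags commit above (print the names of
the known flags, then show whatever bits are left over) can be sketched
like this. The struct, the function names and the hex format of the
remainder are assumptions for illustration, not the library code:

```c
#include <stdio.h>
#include <string.h>

struct flag_name {
	unsigned long mask;
	const char *name;
};

/* Append `s` to buf, preceded by `sep` when buf already has content */
static void append(char *buf, size_t size, const char *sep, const char *s)
{
	size_t used = strlen(buf);

	if (used < size)
		snprintf(buf + used, size - used, "%s%s", used ? sep : "", s);
}

/*
 * Hypothetical helper: write the names of all flags set in `val`, joined
 * by `delim`, then append any bits not covered by the table in hex.
 */
void format_flags(char *buf, size_t size, unsigned long val,
		  const struct flag_name *tbl, size_t n, const char *delim)
{
	char rest[2 + 2 * sizeof(val) + 1];

	if (!size)
		return;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].mask && (val & tbl[i].mask) == tbl[i].mask) {
			append(buf, size, delim, tbl[i].name);
			val &= ~tbl[i].mask;	/* this bit is accounted for */
		}
	}

	if (val) {	/* bits with no name in the table */
		snprintf(rest, sizeof(rest), "0x%lx", val);
		append(buf, size, delim, rest);
	}
}
```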

tools lib traceevent: Fix bad force_token escape sequence · 952a99cc

Michael Sartain authored Jan 11, 2018

Older kernels have a bug that creates invalid symbols. event-parse.c
handles them by replacing them with a "%s" token. But the fix included
an extra backslash, and "\%s" was added incorrectly.
Signed-off-by: Michael Sartain <mikesart@fastmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20180112004821.827168881@goodmis.org
Link: http://lkml.kernel.org/r/d320000d37c10ce0912851e1fb78d1e0c946bcd9.1497486273.git.mikesart@fastmail.com
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

12 Jan, 2018 6 commits

perf trace: Fix setting of --call-graph/--max-stack for non-syscall events · 08e26396

Arnaldo Carvalho de Melo authored Jan 12, 2018

The raw_syscalls:sys_{enter,exit} were first supported in 'perf trace',
together with minor and major page faults, then we supported
--call-graph, then --max-stack, but when the other tracepoints got
supported, and bpf, etc, I forgot to make those global call-graph
settings apply to them.

Fix it by realizing that the global --max-stack and --call-graph
settings are done via:

        OPT_CALLBACK(0, "call-graph", &trace.opts,
                     "record_mode[,record_size]", record_callchain_help,
                     &record_parse_callchain_opt),

And then, when we go to parse the events in -e via:

        OPT_CALLBACK('e', "event", &trace, "event",
                     "event/syscall selector. use 'perf list' to list available events",
                     trace__parse_events_option),

And trace__parse_events_option() calls:

                struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",
                                               "event selector. use 'perf list' to list available events",
                                               parse_events_option);
                err = parse_events_option(&o, lists[0], 0);

parse_events_option() will override the global --call-graph and
--max-stack if the "call-graph" and/or "max-stack" terms are in the
event definition, such as in the probe_libc:inet_pton event in one of the
examples below (-e probe_libc:inet_pton/max-stack=1/).

Before:

  # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
       1.525 (         ): probe_libc:inet_pton:(7f77f3ac9350))
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.071 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.071/0.071/0.071/0.000 ms
       1.677 ( 0.081 ms): ping/31296 sendto(fd: 3, buff: 0x55681b652720, len: 64, addr: 0x55681b650640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa97e4bc9cef] (/usr/bin/ping)
                                         [0xffffaa97e4bc656d] (/usr/bin/ping)
                                         [0xffffaa97e4bc7d0a] (/usr/bin/ping)
                                         [0xffffaa97e4bca447] (/usr/bin/ping)
                                         [0xffffaa97e4bc2f91] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa97e4bc3379] (/usr/bin/ping)
  #

After:

  # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.089 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.089/0.089/0.089/0.000 ms
       1.955 (         ): probe_libc:inet_pton:(7f383a311350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
                                         [0xffffaa5d91444f3f] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d91445379] (/usr/bin/ping)
       2.140 ( 0.101 ms): ping/32047 sendto(fd: 3, buff: 0x55a26edd0720, len: 64, addr: 0x55a26edce640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d9144bcef] (/usr/bin/ping)
                                         [0xffffaa5d9144856d] (/usr/bin/ping)
                                         [0xffffaa5d91449d0a] (/usr/bin/ping)
                                         [0xffffaa5d9144c447] (/usr/bin/ping)
                                         [0xffffaa5d91444f91] (/usr/bin/ping)
                                         __libc_start_main (/usr/lib64/libc-2.26.so)
                                         [0xffffaa5d91445379] (/usr/bin/ping)
  #

Same thing for --max-stack, the global one:

  # perf trace --max-stack 3 -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.097 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.097/0.097/0.097/0.000 ms
       1.577 (         ): probe_libc:inet_pton:(7f32f3957350))
                                         __inet_pton (inlined)
                                         gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
                                         __GI_getaddrinfo (inlined)
       1.738 ( 0.108 ms): ping/32103 sendto(fd: 3, buff: 0x55c3132d7720, len: 64, addr: 0x55c3132d5640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa3cecf44cef] (/usr/bin/ping)
                                         [0xffffaa3cecf4156d] (/usr/bin/ping)
  #

And then setting up a global setting (dwarf, max-stack=4), that will
affect the raw_syscall:sys_enter for the 'sendto' syscall and that will
be overridden in the probe_libc:inet_pton call to just one entry.

  # perf trace --max-stack=4 --call-graph dwarf -e sendto -e probe_libc:inet_pton/max-stack=1/ ping -6 -c 1 ::1
  PING ::1(::1) 56 data bytes
  64 bytes from ::1: icmp_seq=1 ttl=64 time=0.090 ms

  --- ::1 ping statistics ---
  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.090/0.090/0.090/0.000 ms
       2.140 (         ): probe_libc:inet_pton:(7f9fe9337350))
                                         __GI___inet_pton (/usr/lib64/libc-2.26.so)
       2.283 ( 0.103 ms): ping/31804 sendto(fd: 3, buff: 0x55c7f3e19720, len: 64, addr: 0x55c7f3e17640, addr_len: 28) = 64
                                         __libc_sendto (/usr/lib64/libc-2.26.so)
                                         [0xffffaa380c402cef] (/usr/bin/ping)
                                         [0xffffaa380c3ff56d] (/usr/bin/ping)
                                         [0xffffaa380c400d0a] (/usr/bin/ping)
  #

Install iputils-debuginfo to get those /usr/bin/ping addresses resolved,
those routines are not in its .dynsym nor .symtab :-)

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qgl2gse8elhh9zztw4ajopg3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf evsel: Check if callchain is enabled before setting it up · 1688c2fd

Arnaldo Carvalho de Melo authored Jan 12, 2018

The construct:

	if (callchain_param.enabled)
		perf_evsel__config_callchain(evsel, opts, &callchain_param);

happens in several places, so make perf_evsel__config_callchain() work
just like free(NULL), do nothing if param->enabled is not set.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ykk0qzxnxwx3o611ctjnmxav@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
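
The free(NULL)-like behavior described in the callchain commit above
amounts to moving the check into the callee. A minimal sketch with
hypothetical, heavily simplified types (the real perf structs carry
many more fields):

```c
#include <stdbool.h>

/* Simplified stand-ins for the perf structures, for illustration only */
struct cc_param {
	bool enabled;
	int max_stack;
};

struct cc_evsel {
	bool callchain_set;
	int max_stack;
};

/* Safe to call unconditionally: a disabled param is a no-op */
void config_callchain(struct cc_evsel *evsel, const struct cc_param *param)
{
	if (!param->enabled)
		return;

	evsel->callchain_set = true;
	evsel->max_stack = param->max_stack;
}
```

Callers can then drop their own 'if (enabled)' guard around each call.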

perf tools: Fix copyfile_offset update of output offset · fa1195cc

Jiri Olsa authored Jan 09, 2018

We need to increase output offset in each iteration, not decrease it as
we currently do.

I guess we were lucky to finish in the first iteration in most cases, so
the bug never showed. However, it shows up a lot when working with big
(~4GB) data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 9c9f5a2f ("perf tools: Introduce copyfile_offset() function")
Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
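
The offset handling in the copyfile_offset fix above can be illustrated
with a generic chunked copy loop. This sketch uses pread(2)/pwrite(2)
and a hypothetical copy_range() helper with a chunk-size parameter; it
is not the perf implementation, only the same bookkeeping:

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `size` bytes from ifd at off_in to ofd at
 * off_out, at most `chunk` bytes per iteration. Returns 0 on success.
 */
int copy_range(int ifd, off_t off_in, int ofd, off_t off_out,
	       size_t size, size_t chunk)
{
	char buf[4096];

	if (chunk == 0 || chunk > sizeof(buf))
		chunk = sizeof(buf);

	while (size) {
		size_t want = size < chunk ? size : chunk;
		ssize_t got = pread(ifd, buf, want, off_in);

		if (got <= 0)
			return -1;
		if (pwrite(ofd, buf, (size_t)got, off_out) != got)
			return -1;

		off_in  += got;
		off_out += got;	/* must advance; decreasing it corrupts output */
		size    -= (size_t)got;
	}
	return 0;
}
```

When everything is written in one iteration the direction of the output
offset update never matters, which is why such a bug can stay hidden
until a copy needs more than one pass.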

perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitly · 236d812c

Arnaldo Carvalho de Melo authored Jan 12, 2018

Since 75562573 ("perf tools: Add support for
PERF_SAMPLE_IDENTIFIER") we don't need to explicitly set
PERF_SAMPLE_IDENTIFIER, as perf_evlist__config() will do this for us,
i.e. when there are more than one evsel in an evlist, it will check if
some evsel has a sample_type different than the one on the first evsel
in the list, setting PERF_SAMPLE_IDENTIFIER in that case.

So, to simplify 'perf trace' codebase, ditch that check.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-12xq6orhwttee2tdtu96ucrp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
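
The check that the PERF_SAMPLE_IDENTIFIER commit above attributes to
perf_evlist__config() can be sketched roughly as follows. The struct,
the constant name and the function are simplified, hypothetical
stand-ins, and setting the bit on every evsel is an assumption of this
sketch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PERF_SAMPLE_IDENTIFIER, for illustration only */
#define SKETCH_SAMPLE_IDENTIFIER (1ULL << 16)

struct sid_evsel {
	uint64_t sample_type;
};

/*
 * If any evsel has a sample_type different from the first one, turn on
 * the identifier bit on all of them. Returns true when that happened.
 */
bool config_sample_id(struct sid_evsel *evsels, size_t n)
{
	bool mixed = false;

	for (size_t i = 1; i < n; i++) {
		if (evsels[i].sample_type != evsels[0].sample_type) {
			mixed = true;
			break;
		}
	}

	if (mixed) {
		for (size_t i = 0; i < n; i++)
			evsels[i].sample_type |= SKETCH_SAMPLE_IDENTIFIER;
	}
	return mixed;
}
```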

perf script python: Add script to profile and resolve physical mem type · 41013f0c

Kan Liang authored Jan 04, 2018

There can be different types of memory in a system, e.g. normal System
Memory and Persistent Memory. To understand how a workload maps to those
memories, it's important to know their I/O statistics. Perf can collect
physical addresses, but those are raw data that need extra work to
resolve. Provide a script that resolves the physical addresses and
summarizes the I/O statistics.

Profile with MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS
event if any of them is available.

Look up the /proc/iomem and resolve the physical address.  Provide
memory type summary.

Here is an example output:

  # perf script report mem-phys-addr
  Event: mem_inst_retired.all_loads:P
  Memory type                                    count   percentage
  ----------------------------------------  -----------  -----------
  System RAM                                        74        53.2%
  Persistent Memory                                 55        39.6%
  N/A

  ---

Changes since V2:
 - Apply the new license rules.
 - Add comments for globals

Changes since V1:
 - Do not mix DLA and Load Latency. Do not compare the loads and stores.
   Only profile the loads.
 - Use event name to replace the RAW event
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/r/1515099595-34770-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
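
The script added by the mem-phys-addr commit above is written in
Python; the address resolution step it describes can be sketched in C
as a range lookup. The struct and function names are hypothetical, and
the table is assumed to have been filled from the inclusive
"start-end : type" lines of /proc/iomem:

```c
#include <stddef.h>
#include <stdint.h>

/* One resource range; `end` is inclusive, as in /proc/iomem */
struct mem_range {
	uint64_t start;
	uint64_t end;
	const char *type;
};

/* Return the memory type covering `addr`, or "N/A" when none matches */
const char *resolve_phys(const struct mem_range *ranges, size_t n,
			 uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= ranges[i].start && addr <= ranges[i].end)
			return ranges[i].type;
	}
	return "N/A";
}
```

Counting the returned type per sample gives the kind of summary shown
in the example output.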

perf evlist: Remove trailing semicolon · dd8bd53a

Luis de Bethencourt authored Jan 11, 2018

The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180111155020.9782-1-luisbg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

11 Jan, 2018 2 commits

perf evsel: Fix incorrect handling of type _TERM_DRV_CFG · 2178790b

Mathieu Poirier authored Jan 10, 2018

Commit d0565132 ("perf evsel: Enable type checking for
perf_evsel_config_term types") assumes PERF_EVSEL__CONFIG_TERM_DRV_CFG
isn't used and as such adds a BUG_ON().

Since the enumeration type is used in macro ADD_CONFIG_TERM(), the change
breaks CoreSight trace acquisition.

This patch restores the original code.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: d0565132 ("perf evsel: Enable type checking for perf_evsel_config_term types")
Link: http://lkml.kernel.org/r/1515617211-32024-1-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-core-for-mingo-4.16-20180110' of... · 1ccb8fed

Ingo Molnar authored Jan 11, 2018

Merge tag 'perf-core-for-mingo-4.16-20180110' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- The 'perf test bpf' entry hooked a eBPF proggie to the
  SyS_epoll_wait() kernel function and expected it to be hit when calling
  the epoll_wait() libc wrapper, which changed recently, in systems such
  as Fedora 27, with the glibc wrapper calling instead the epoll_pwait()
  syscall, so switch to epoll_pwait() for both the kernel and libc
  function, getting it to work both in old and new systems (Arnaldo Carvalho de Melo)

- Beautify 'gettid' syscall result in 'perf trace', and in doing so
  noticed that we need to handle namespaces in 'perf trace', will be
  dealt with in follow up patches where we'll try to figure out if
  the recent support for namespace in tools/perf/ can be used for this
  purpose as well. (Arnaldo Carvalho de Melo)

- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
  info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)

- Synchronize kernel <-> tooling headers wrt meltdown/spectre changes
  (Arnaldo Carvalho de Melo)

- Fix a wrong offset issue when using /proc/kcore (Jin Yao)

- Fix bug that prevented annotating symbols in perf.data files
  generated with 'perf record --branch-any'  (Jin Yao)

- Add infrastructure to record first and last sample time to the
  perf.data file header, so that when processing all samples in
  a 'perf record' session, such as when doing build-id processing,
  or when specifically requesting that that info be recorded, use
  that in 'perf report --time', that also got support for percent
  slices in addition to absolute ones.

  I.e. now it is possible to ask for the samples in the 10%-20%
  time slice of a perf.data file (Jin Yao)

- Enable building with libbabeltrace by default (Jiri Olsa)

- Display perf_event_attr::namespaces when duping the attributes
  in verbose mode (Jiri Olsa)

- Allocate context task_ctx_data for child event (Jiri Olsa)

- Update comments for PERF_RECORD_ITRACE_START and PERF_RECORD_MISC_* (Jiri Olsa)

- Add support for showing PERF_RECORD_LOST events in 'perf script' (Jiri Olsa)

- Add 'perf report --stats' option to display quick statistics about
  metadata events (PERF_RECORD_*) i.e. what we get at the end of 'perf
  report -D' (Jiri Olsa)

- Fix compile error with libunwind x86 (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
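
The percent time slices mentioned in the list above (e.g. the 10%-20%
slice of a perf.data file) boil down to simple arithmetic once the
first and last sample times are known. A sketch with hypothetical
names, not the 'perf report --time' parser:

```c
#include <stdint.h>

struct time_window {
	uint64_t start;
	uint64_t end;
};

/*
 * Map a percent range onto [first, last]. Timestamps are plain integers
 * (e.g. nanoseconds); very large spans could overflow the multiplication,
 * which this sketch does not guard against.
 */
struct time_window percent_slice(uint64_t first, uint64_t last,
				 unsigned int from_pct, unsigned int to_pct)
{
	uint64_t span = last - first;
	struct time_window w = {
		first + span * from_pct / 100,
		first + span * to_pct / 100,
	};

	return w;
}
```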

10 Jan, 2018 7 commits

tools headers: Synchronize kernel <-> tooling headers · 5d64db29

Arnaldo Carvalho de Melo authored Jan 10, 2018

Two kernel headers got modified recently due to meltdown/spectre, in:

  a89f040f ("x86/cpufeatures: Add X86_BUG_CPU_INSECURE")

which are used by tooling as well:

  arch/x86/include/asm/cpufeatures.h
  arch/x86/include/asm/disabled-features.h

None of those changes have an effect on tooling, so do a plain copy.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qqzcs8ri3vks8cypg0puk0ae@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Introduce --mmaps · 6439d7d1

Arnaldo Carvalho de Melo authored Jan 09, 2018

Similar to --tasks, producing the same output plus lines similar to
/proc/<PID>/maps for each mmap record present in a perf.data file.

Please note that not all mmaps are stored, for instance, some of the
non-executable mmaps are only stored when 'perf record --data' is used,
when the user wants to resolve data accesses in addition to asking for
executable mmaps to get the DSO with symtabs.

E.g.:

  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]
  [root@jouet ~]# perf report --mmaps
  #      pid      tid     ppid  comm
           0        0       -1 |swapper
        4137     4137       -1 |sleep
                                  5628a35a1000-5628a37aa000 r-xp 00000000 3147148 /usr/bin/sleep
                                  7fb65ad51000-7fb65b134000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
                                  7fb65b134000-7fb65b35e000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
                                  7ffd94b9f000-7ffd94ba1000 r-xp 00000000 0 [vdso]
  #
  # perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data (8 samples) ]
  # perf report --mmaps
  #      pid      tid     ppid  comm
           0        0       -1 |swapper
        4161     4161       -1 |sleep
                                  55afae69a000-55afae8a3000 r-xp 00000000 3147148 /usr/bin/sleep
                                  7f569f00d000-7f569f3f0000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
                                  7f569f3f0000-7f569f61a000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
                                  7fff6fffe000-7fff70000000 r-xp 00000000 0 [vdso]
  #
  # perf record time sleep 1
  0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2156maxresident)k
  0inputs+0outputs (0major+73minor)pagefaults 0swaps
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ]
  # perf report --mmaps
  #      pid      tid     ppid  comm
           0        0       -1 |swapper
        4281     4281       -1 |time
                                  560560dca000-560560fcf000 r-xp 00000000 3190458 /usr/bin/time
                                  7fc175196000-7fc175579000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
                                  7fc175579000-7fc1757a3000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
                                  7ffc924f6000-7ffc924f8000 r-xp 00000000 0 [vdso]
        4282     4282     4281 | sleep
                                   560560dca000-560560fcf000 r-xp 00000000 3190458 /usr/bin/time
                                   564b4de3c000-564b4e045000 r-xp 00000000 3147148 /usr/bin/sleep
                                   7f6a5a716000-7f6a5aaf9000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
                                   7f6a5aaf9000-7f6a5ad23000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
                                   7fc175196000-7fc175579000 r-xp 00000000 3149795 /usr/lib64/libc-2.26.so
                                   7fc175579000-7fc1757a3000 r-xp 00000000 3149715 /usr/lib64/ld-2.26.so
                                   7ffc924f6000-7ffc924f8000 r-xp 00000000 0 [vdso]
                                   7ffcec7e6000-7ffcec7e8000 r-xp 00000000 0 [vdso]
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zulwdlg5rfowogr1qznorvvc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf report: Add --tasks option to display monitored tasks · 930f8b34

Jiri Olsa authored Jan 07, 2018

Add --tasks option to display monitored tasks stored in perf.data.
Displaying pid/tid/ppid plus the command string aligned to distinguish
parent and child tasks.

  $ perf record -a
  ...
  $ perf report --tasks
  #     pid     tid    ppid  comm
          0       0      -1 |swapper
          2       2       0 | kthreadd
      14080   14080       2 |  kworker/u17:1
          4       4       2 |  kworker/0:0H
          6       6       2 |  mm_percpu_wq
  ...
          1       1       0 | systemd
      23242   23242       1 |  firefox
      23242   23298   23242 |   Cache2 I/O
      23242   23304   23242 |   GMPThread
  ...
       1195    1195       1 |  login
       1611    1611    1195 |   bash
       1639    1639    1611 |    startx
       1663    1663    1639 |     xinit
       1673    1673    1663 |      xmonad-x86_64-l
      23939   23939    1673 |       xterm
      23941   23941   23939 |        bash
      23963   23963   23941 |         mutt
      24954   24954   23963 |          offlineimap
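
The alignment above can be reproduced by indenting each command string by
its depth in the parent/child tree. The following is a minimal sketch in
Python, not the perf implementation: format_tasks() and its column widths
are hypothetical, and tasks are printed in input order rather than being
sorted into tree order.

```python
def format_tasks(tasks):
    """Format (pid, tid, ppid, comm) tuples, indenting comm by tree depth."""
    # Hypothetical helper: map each thread id to its parent id.
    parent_of = {tid: ppid for _pid, tid, ppid, _comm in tasks}

    def depth(tid):
        # Count the ancestors that are themselves present in the task list.
        levels, seen = 0, {tid}
        ppid = parent_of[tid]
        while ppid in parent_of and ppid not in seen:
            seen.add(ppid)  # guards against malformed, cyclic input
            levels += 1
            ppid = parent_of[ppid]
        return levels

    lines = ["# {:>8} {:>8} {:>8}  {}".format("pid", "tid", "ppid", "comm")]
    for pid, tid, ppid, comm in tasks:
        lines.append("  {:8d} {:8d} {:8d} |{}{}".format(
            pid, tid, ppid, " " * depth(tid), comm))
    return lines


if __name__ == "__main__":
    sample = [
        (0, 0, -1, "swapper"),
        (1, 1, 0, "systemd"),
        (1195, 1195, 1, "login"),
        (1611, 1611, 1195, "bash"),
    ]
    print("\n".join(format_tasks(sample)))
```
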
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-13-jolsa@kernel.org
[ Make it --tasks, plural; --task works as well, as it's unambiguous ]
[ Use machine__find_thread(), not findnew(), as pointed out by Namhyung ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

930f8b34

perf trace: Beautify 'gettid' syscall result · 2d1073de

Arnaldo Carvalho de Melo authored Jan 09, 2018

Before:

  # trace -a -e gettid sleep 0.01
<SNIP>
     4.863 ( 0.005 ms): Chrome_ChildIO/26241 gettid() = 26241
     4.931 ( 0.004 ms): Chrome_IOThrea/26154 gettid() = 26154
     4.942 ( 0.001 ms): Chrome_IOThrea/26154 gettid() = 26154
     4.946 ( 0.001 ms): Chrome_IOThrea/26154 gettid() = 26154
     4.970 ( 0.002 ms): Chrome_IOThrea/26154 gettid() = 26154
  #

After:

  # trace -a -e gettid sleep 0.01
     0.000 ( 0.009 ms): Chrome_IOThrea/26154 gettid() = 26154 (Chrome_IOThread)
<SNIP>
     3.416 ( 0.002 ms): Chrome_ChildIO/26241 gettid() = 26241 (Chrome_ChildIOT)
     3.424 ( 0.001 ms): Chrome_ChildIO/26241 gettid() = 26241 (Chrome_ChildIOT)
     3.343 ( 0.002 ms): chrome/26116 gettid() = 26116 (chrome)
     3.386 ( 0.002 ms): Chrome_IOThrea/26154 gettid() = 26154 (Chrome_IOThread)
     4.003 ( 0.003 ms): Chrome_ChildIO/26241 gettid() = 26241 (Chrome_ChildIOT)
     4.031 ( 0.002 ms): Chrome_IOThrea/26154 gettid() = 26154 (Chrome_IOThread)
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-kyg4gz2yy0vkrrh2vtq29u71@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

2d1073de

perf report: Add --stats option to display quick data statistics · a4a4d0a7

Jiri Olsa authored Jan 07, 2018

Add --stats option to display quick data statistics of event numbers,
without any further processing, like the one at the end of the perf
report -D command.

  $ perf report --stats

  Aggregated stats:
             TOTAL events:       4566
              MMAP events:        113
              LOST events:         19
              COMM events:          3
              FORK events:        400
            SAMPLE events:       3315
             MMAP2 events:         32
    FINISHED_ROUND events:        681
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1

I found this useful when hunting lost events for another change.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-12-jolsa@kernel.org
[ Rename it to --stats, plural ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

a4a4d0a7

perf tools: Make the tool's warning messages optional · 075ca1eb

Jiri Olsa authored Jan 07, 2018

The next patch displays the pure event statistics, and the tool's
warnings are superfluous in that output. Make the warnings optional;
they remain enabled by default.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

075ca1eb

perf script: Add support to display lost events · 3d7c27b6

Jiri Olsa authored Jan 07, 2018

Adding option to display lost events:

  $ perf script --show-lost-events ...
   mplayer 13810 [002] 468011.402396:        100 cycles:ppp:  ff..
   mplayer 13810 [002] 468011.402396: PERF_RECORD_LOST lost 3880
   mplayer 13810 [002] 468011.402397:        100 cycles:ppp:  ff..
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-10-jolsa@kernel.org
[ Use PRIu64 when printing u64 values, fixing the build in some arches ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

3d7c27b6

08 Jan, 2018 14 commits

perf script: Add support to display sample misc field · 28a0b398

Jiri Olsa authored Jan 07, 2018

Adding support to display sample misc field in form
of letter for each bit:

  # perf script -F +misc ...
   sched-messaging  1414 K     28690.636582:       4590 cycles ...
   sched-messaging  1407 U     28690.636600:     325620 cycles ...
   sched-messaging  1414 K     28690.636608:      19473 cycles ...
  misc field  __________/

The misc bits are assigned to the following letters:

  PERF_RECORD_MISC_KERNEL        K
  PERF_RECORD_MISC_USER          U
  PERF_RECORD_MISC_HYPERVISOR    H
  PERF_RECORD_MISC_GUEST_KERNEL  G
  PERF_RECORD_MISC_GUEST_USER    g
  PERF_RECORD_MISC_MMAP_DATA*    M
  PERF_RECORD_MISC_COMM_EXEC     E
  PERF_RECORD_MISC_SWITCH_OUT    S
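
The mapping above can be sketched as follows. This is an illustration in
Python, not the perf script code: misc_letters() is a hypothetical helper,
and the constant values are assumed to match the perf_event.h uapi header.
Because bit 13 is shared, the sketch uses the record type to decide which
letter applies.

```python
# Values assumed to match include/uapi/linux/perf_event.h.
PERF_RECORD_MISC_CPUMODE_MASK = 7 << 0
PERF_RECORD_MISC_BIT13 = 1 << 13  # MMAP_DATA, COMM_EXEC and SWITCH_OUT

CPUMODE_LETTERS = {
    1: "K",  # PERF_RECORD_MISC_KERNEL
    2: "U",  # PERF_RECORD_MISC_USER
    3: "H",  # PERF_RECORD_MISC_HYPERVISOR
    4: "G",  # PERF_RECORD_MISC_GUEST_KERNEL
    5: "g",  # PERF_RECORD_MISC_GUEST_USER
}

PERF_RECORD_MMAP = 1
PERF_RECORD_COMM = 3
PERF_RECORD_SAMPLE = 9
PERF_RECORD_MMAP2 = 10
PERF_RECORD_SWITCH = 14
PERF_RECORD_SWITCH_CPU_WIDE = 15


def misc_letters(misc, record_type):
    """Return the letters describing a perf_event_header::misc value."""
    letters = CPUMODE_LETTERS.get(misc & PERF_RECORD_MISC_CPUMODE_MASK, "")
    if misc & PERF_RECORD_MISC_BIT13:
        # The meaning of bit 13 depends on the record type.
        if record_type in (PERF_RECORD_MMAP, PERF_RECORD_MMAP2):
            letters += "M"
        elif record_type == PERF_RECORD_COMM:
            letters += "E"
        elif record_type in (PERF_RECORD_SWITCH, PERF_RECORD_SWITCH_CPU_WIDE):
            letters += "S"
    return letters


if __name__ == "__main__":
    print(misc_letters(1, PERF_RECORD_SAMPLE))
    print(misc_letters(2 | PERF_RECORD_MISC_BIT13, PERF_RECORD_COMM))
```
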
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

28a0b398

perf: Update PERF_RECORD_MISC_* comment for perf_event_header::misc bit 13 · 972c1488

Jiri Olsa authored Jan 07, 2018

The perf_event_header::misc bit 13 is shared by different events, and
the next patch adds yet another bit 13 user.  Update the comment to
make it more structured and to make clear which events use bit 13.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-8-jolsa@kernel.org
[ Update the tools/include/uapi/linux copy ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

972c1488

perf: Return empty callchain instead of NULL · 99e818cc

Jiri Olsa authored Jan 07, 2018

It simplifies the code a bit, because we dump the callchain
even if it's empty. With an 'empty' callchain we can remove
all the NULL-checking code paths.

Link: http://lkml.kernel.org/n/tip-uqp7qd6aif47g39glnbu95yl@git.kernel.org

Original-patch-from: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

99e818cc

perf: Make perf_callchain function static · 8cf7e0e2

Jiri Olsa authored Jan 07, 2018

And move it to core.c, because there's no caller of this function other
than the one in core.c
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8cf7e0e2

perf: Add sample_id to PERF_RECORD_ITRACE_START event comment · 81df978c

Jiri Olsa authored Jan 07, 2018

Adding missing sample_id line into PERF_RECORD_ITRACE_START
event comment.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-5-jolsa@kernel.org
[ Update the tools/include/uapi/linux copy ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

81df978c

perf: Allocate context task_ctx_data for child event · 313ccb96

Jiri Olsa authored Jan 07, 2018

Currently we use perf_event_context::task_ctx_data to save and restore
the LBR status when the task is scheduled out and in.

We don't allocate it for child contexts, which results in a shorter LBR
stack for the task, because we don't save the history from the previous
run and start over every time the task is scheduled in.

I made a test that generates samples with LBR call stacks and got higher
counts at larger chain depths:

                            before:     after:
  LBR call chain: nr: 1       60561     498127
  LBR call chain: nr: 2           0          0
  LBR call chain: nr: 3      107030       2172
  LBR call chain: nr: 4      466685      62758
  LBR call chain: nr: 5     2307319     878046
  LBR call chain: nr: 6       48713     495218
  LBR call chain: nr: 7        1040       4551
  LBR call chain: nr: 8         481        172
  LBR call chain: nr: 9         878        120
  LBR call chain: nr: 10       2377       6698
  LBR call chain: nr: 11      28830     151487
  LBR call chain: nr: 12      29347     339867
  LBR call chain: nr: 13          4         22
  LBR call chain: nr: 14          3         53
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 4af57ef2 ("perf: Add pmu specific data for perf task context")
Link: http://lkml.kernel.org/r/20180107160356.28203-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

313ccb96

perf tools: Display perf_event_attr::namespaces debug info · db9fc765

Jiri Olsa authored Jan 07, 2018

Display namespaces bit in -vv debug display:

  $ perf record -vv --namespaces ...
  ...
  perf_event_attr:
    size                             112
    ...
    namespaces                       1
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

db9fc765

perf tools: Enable LIBBABELTRACE by default · 24787afb

Jiri Olsa authored Jan 07, 2018

There's no reason anymore to treat babel trace in a special way, because
a) we no longer display its state b) the needed babeltrace library is
now out and well adopted among distros.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180107160356.28203-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

24787afb

perf script: Support time percent and multiple time ranges · 2ab046cd

Jin Yao authored Dec 08, 2017

perf script has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

   perf script --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

   perf script --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
    No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
    in perf.data. If it's not recorded, return an error message to the
    user.

v4: Remove perf_time__skip_sample, only use perf_time__ranges_skip_sample.

v3: The definitions of first_sample_time/last_sample_time moved from
    perf_session to perf_evlist, so change the related code.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

2ab046cd

perf report: Support time percent and multiple time ranges · 5b969bc7

Jin Yao authored Dec 08, 2017

perf report has a --time option to limit the time range of output.  It
only supports absolute time.

Now this option is extended to support multiple time ranges and support
the percent of time.

For example:

1. Select the first and second 10% time slices:

   perf report --time 10%/1,10%/2

2. Select from 0% to 10% and 30% to 40% slices:

   perf report --time 0%-10%,30%-40%

Changelog:

v6: Fix the merge issue with latest perf/core branch.
    No functional changes.

v5: Add checking of first/last sample time to detect if it's recorded
    in perf.data. If it's not recorded, return an error message to the
    user.

v4: Remove perf_time__skip_sample, only use perf_time__ranges_skip_sample.

v3: The definitions of first_sample_time/last_sample_time moved from
    perf_session to perf_evlist, so change the related code.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-6-git-send-email-yao.jin@linux.intel.com
[ Add missing colons at end of examples in the man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

5b969bc7

perf tools: Create function to perform multiple time range checking · 9a9b8b4b

Jin Yao authored Dec 08, 2017

The previous patch adds support for multiple time ranges.

For example, select the first and second 10% time slices:

   perf report --time 10%/1,10%/2

We need a function to check whether a timestamp is in the ranges
[0, 10%) and [10%, 20%].

Note that the last range, [10%, 20%], includes its end point, but
[0, 10%) does not. This avoids overlap between adjacent ranges.

This patch implements a new function, perf_time__ranges_skip_sample,
for this check.
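
The boundary rule can be sketched as below. This is an illustration in
Python under stated assumptions, not the C function added by the patch:
ranges_skip_sample() is a hypothetical name, and ranges are plain
(start, end) tuples. Every range is half-open except the last one, which
also accepts its end point.

```python
def ranges_skip_sample(ranges, timestamp):
    """Return True when timestamp lies outside every (start, end) range."""
    last = len(ranges) - 1
    for i, (start, end) in enumerate(ranges):
        if timestamp < start:
            continue
        # Only the final range is closed on the right.
        if timestamp < end or (i == last and timestamp == end):
            return False
    return True


if __name__ == "__main__":
    slices = [(0, 10), (10, 20)]  # [0, 10%) and [10%, 20%]
    for ts in (0, 10, 20, 21):
        print(ts, ranges_skip_sample(slices, ts))
```

With adjacent slices, a timestamp that sits exactly on the shared boundary
is matched by the second range only, so it is never counted twice.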

Change log:

v4: Let perf_time__ranges_skip_sample be compatible with
    perf_time__skip_sample when there is only one time range.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-5-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

9a9b8b4b

perf tools: Create function to parse time percent · 13a70f35

Jin Yao authored Dec 08, 2017

Current perf report/script/... have a --time option to limit the time
range of output, but right now it only supports absolute time. Add
support for time percentage.

For example:

1. Select the second 10% time slice
   perf report --time 10%/2

2. Select from 0% to 10% time slice
   perf report --time 0%-10%

It also supports multiple time ranges.

3. Select the first and second 10% time slices
   perf report --time 10%/1,10%/2

4. Select from 0% to 10% and 30% to 40% slices
   perf report --time 0%-10%,30%-40%
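
The four forms above can be turned into absolute ranges roughly as
follows. This is a minimal sketch in Python, not the parser added by the
patch: parse_time_option() and parse_percent_range() are hypothetical
names, only integer percentages are handled, and 'first' and 'last' stand
for the first and last sample timestamps. Each item must match in full, so
trailing garbage is rejected.

```python
import re

_SLICE = re.compile(r"(\d+)%/(\d+)")   # e.g. 10%/2
_SPAN = re.compile(r"(\d+)%-(\d+)%")   # e.g. 0%-10%


def parse_percent_range(spec, first, last):
    """Convert one percent expression into an absolute (start, end) pair."""
    total = last - first
    m = _SLICE.fullmatch(spec)
    if m:
        pct, index = int(m.group(1)), int(m.group(2))
        if index < 1 or pct * index > 100:
            raise ValueError("slice out of range: %r" % spec)
        return (first + total * pct * (index - 1) // 100,
                first + total * pct * index // 100)
    m = _SPAN.fullmatch(spec)
    if m:
        lo, hi = int(m.group(1)), int(m.group(2))
        if lo > hi or hi > 100:
            raise ValueError("span out of range: %r" % spec)
        return (first + total * lo // 100, first + total * hi // 100)
    raise ValueError("invalid time percent: %r" % spec)


def parse_time_option(option, first, last):
    """Parse a comma-separated list of percent expressions."""
    return [parse_percent_range(s, first, last) for s in option.split(",")]


if __name__ == "__main__":
    print(parse_time_option("10%/1,10%/2", 1000, 2000))
    print(parse_time_option("0%-10%,30%-40%", 1000, 2000))
```
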

Changelog:

v4: An issue was found: the following invalid input was accepted.
    perf script --time 10%/10x12321xsdfdasfdsafdsafdsa

    Now strtol is used instead of atoi.

Committer notes:

This just puts in place the infrastructure, so the examples in this cset
comment will only work later, after more patches in this series are
applied.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

13a70f35

perf record: Record the first and last sample time in the header · 68588baf

Jin Yao authored Dec 08, 2017

In the default 'perf record' configuration, all samples are processed,
to create the HEADER_BUILD_ID table. So it's very easy to get the
first/last samples and save the time to perf file header via the
function write_sample_time().

Later, at post processing time, perf report/script will fetch the time
from perf file header.
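
The bookkeeping this relies on is small. The sketch below, in Python,
tracks the earliest and latest timestamp while samples are processed and
prints them in the seconds.microseconds style shown by
'perf report --header'. SampleTimeBounds and format_ns() are hypothetical
names, 0 is assumed to mean "not recorded", and the min/max update is an
assumption of this sketch rather than the logic in the patch.

```python
class SampleTimeBounds:
    """Track the first and last sample time, in nanoseconds."""

    def __init__(self):
        self.first = 0  # 0 is treated as "not recorded"
        self.last = 0

    def update(self, timestamp):
        # Called once per processed sample.
        if self.first == 0 or timestamp < self.first:
            self.first = timestamp
        if timestamp > self.last:
            self.last = timestamp


def format_ns(ns):
    """Format nanoseconds as seconds.microseconds."""
    return "{}.{:06d}".format(ns // 1_000_000_000, (ns % 1_000_000_000) // 1000)


if __name__ == "__main__":
    bounds = SampleTimeBounds()
    for ts in (22947909226101, 22947909229928, 22948910704034):
        bounds.update(ts)
    print("# time of first sample :", format_ns(bounds.first))
    print("# time of last sample :", format_ns(bounds.last))
```
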

Committer testing:

  # perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 2.099 MB perf.data (1101 samples) ]
  [root@jouet home]# perf report --header | grep "time of "
  # time of first sample : 22947.909226
  # time of last sample : 22948.910704
  #
  # perf report -D | grep PERF_RECORD_SAMPLE\(
  0 22947909226101 0x20bb68 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa21b1af3 period: 1 addr: 0
  0 22947909229928 0x20bb98 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa200d204 period: 1 addr: 0
  <SNIP>
  3 22948910397351 0x219360 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 28251/28251: 0xffffffffa22071d8 period: 169518 addr: 0
  0 22948910652380 0x20f120 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 198807 addr: 0
  2 22948910704034 0x2172d0 [0x30]: PERF_RECORD_SAMPLE(IP, 0x4001): 0/0: 0xffffffffa2856816 period: 88111 addr: 0
  #

Changelog:

v7: Just update the patch description according to Arnaldo's suggestion.

v6: Currently '--buildid-all' is not enabled by default, so walking
    all samples is the default operation. There is no big overhead
    in calculating the timestamp boundary in the process_sample_event
    handler once we already go through all samples. So the timestamp
    boundary calculation is enabled by default when '--buildid-all'
    is not enabled.

    If '--buildid-all' is enabled, a new option "--timestamp-boundary"
    lets the user decide whether to enable the timestamp boundary
    calculation.

v5: There is an issue that the sample walking can only work when
    '--buildid-all' is not enabled. So we need to let the walking
    work even if '--buildid-all' is enabled, and let the processing
    skip the dso hit marking in this case.

    At first, I wanted to provide a new option "--record-time-boundaries".
    After consideration, I think a new option is not really necessary.

v3: Remove the definitions of first_sample_time and last_sample_time
    from struct record and directly save them in perf_evlist.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

68588baf

perf header: Add infrastructure to record first and last sample time · 6011518d

Jin Yao authored Dec 08, 2017

perf report/script/... have a --time option to limit the time range of
output. That's very useful to slice large traces, e.g. when processing
the output of perf script for some analysis.

But right now --time only supports absolute time. Also there is no fast
way to get the start/end times of a given trace except for looking at
it.  This makes it hard to e.g. only decode the first half of the trace,
which is useful for parallelization of scripts.

Another problem is that perf records are variable size and there is no
synchronization mechanism. So the only way to find the last sample
reliably would be to walk all samples. But we want to avoid that in perf
report/...  because it is already quite expensive. That is why storing
the first sample time and last sample time in perf record is better.

This patch creates a new header feature type HEADER_SAMPLE_TIME and
related ops. Save the first sample time and the last sample time to the
feature section in perf file header. That will be done when, for
instance, processing build-ids, where we already have to process all
samples to create the build-id table, take advantage of that to further
amortize that processing by storing HEADER_SAMPLE_TIME to make 'perf
report/script' faster when using --time.

Committer testing:

After this patch is applied the header is written with zeroes, we need
the next patch, for "perf record" to actually write the timestamps:

  # perf report -D | grep PERF_RECORD_SAMPLE\(
  22501155244406 0x44f0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21be8c5 period: 1 addr: 0
  <SNIP>
  22501155793625 0x4a30 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 25016/25016: 0xffffffffa21ffd50 period: 2828043 addr: 0
  # perf report --header | grep "time of "
  # time of first sample : 0.000000
  # time of last sample : 0.000000
  #

Changelog:

v7: 1. Rebase to latest perf/core branch.

    2. Add following clarification in patch description according to
       Arnaldo's suggestion.

       "That will be done when, for instance, processing build-ids,
	where we already have to process all samples to create the
	build-id table, take advantage of that to further amortize
	that processing by storing HEADER_SAMPLE_TIME to make
	'perf report/script' faster when using --time."

v4: Use perf script time style for timestamp printing. Also add
    printing of the sample duration.

v3: Remove the definitions of first_sample_time/last_sample_time from
    perf_session. Just define them in perf_evlist.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1512738826-2628-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

6011518d