1. 10 Aug, 2018 1 commit
  2. 09 Aug, 2018 1 commit
    • perf probe powerpc: Fix trace event post-processing · 354b064b
      Sandipan Das authored
      In some cases, a symbol may have multiple aliases. Attempting to add an
      entry probe for such symbols results in a probe being added at an
      incorrect location while it fails altogether for return probes. This is
      only applicable for binaries with debug information.
      
      During the arch-dependent post-processing, the offset from the start of
      the symbol at which the probe is to be attached is determined and added
      to the start address of the symbol to get the probe's location. When the
      symbol has multiple aliases, this offset is added once per alias, so the
      resulting probe location is incorrect.
      
      This can be verified on a powerpc64le system as shown below.
      
        $ nm /lib/modules/$(uname -r)/build/vmlinux | grep "sys_open$"
        ...
        c000000000414290 T __se_sys_open
        c000000000414290 T sys_open
      
        $ objdump -d /lib/modules/$(uname -r)/build/vmlinux | grep -A 10 "<__se_sys_open>:"
      
        c000000000414290 <__se_sys_open>:
        c000000000414290:       19 01 4c 3c     addis   r2,r12,281
        c000000000414294:       70 c4 42 38     addi    r2,r2,-15248
        c000000000414298:       a6 02 08 7c     mflr    r0
        c00000000041429c:       e8 ff a1 fb     std     r29,-24(r1)
        c0000000004142a0:       f0 ff c1 fb     std     r30,-16(r1)
        c0000000004142a4:       f8 ff e1 fb     std     r31,-8(r1)
        c0000000004142a8:       10 00 01 f8     std     r0,16(r1)
        c0000000004142ac:       c1 ff 21 f8     stdu    r1,-64(r1)
        c0000000004142b0:       78 23 9f 7c     mr      r31,r4
        c0000000004142b4:       78 1b 7e 7c     mr      r30,r3
      
        For both the entry probe and the return probe, the probe location
        should be _text+4276888 (0xc000000000414298). Since another alias
        exists for 'sys_open', the post-processing code will end up adding
        the offset (8 for powerpc64le) twice and perf will attempt to add
        the probe at _text+4276896 (0xc0000000004142a0) instead.
      
      Before:
      
        # perf probe -v -a sys_open
      
        probe-definition(0): sys_open
        symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
        Try to find probe point from debuginfo.
        Symbol sys_open address found : c000000000414290
        Matched function: __se_sys_open [2ad03a0]
        Probe point found: __se_sys_open+0
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing/kprobe_events write=1
        Writing event: p:probe/sys_open _text+4276896
        Added new event:
          probe:sys_open       (on sys_open)
        ...
      
        # perf probe -v -a sys_open%return $retval
      
        probe-definition(0): sys_open%return
        symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
        Try to find probe point from debuginfo.
        Symbol sys_open address found : c000000000414290
        Matched function: __se_sys_open [2ad03a0]
        Probe point found: __se_sys_open+0
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing/README write=0
        Opening /sys/kernel/debug/tracing/kprobe_events write=1
        Parsing probe_events: p:probe/sys_open _text+4276896
        Group:probe Event:sys_open probe:p
        Writing event: r:probe/sys_open__return _text+4276896
        Failed to write event: Invalid argument
          Error: Failed to add events. Reason: Invalid argument (Code: -22)
      
      After:
      
        # perf probe -v -a sys_open
      
        probe-definition(0): sys_open
        symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
        Try to find probe point from debuginfo.
        Symbol sys_open address found : c000000000414290
        Matched function: __se_sys_open [2ad03a0]
        Probe point found: __se_sys_open+0
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing/kprobe_events write=1
        Writing event: p:probe/sys_open _text+4276888
        Added new event:
          probe:sys_open       (on sys_open)
        ...
      
        # perf probe -v -a sys_open%return $retval
      
        probe-definition(0): sys_open%return
        symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null)
        0 arguments
        Looking at the vmlinux_path (8 entries long)
        Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
        Try to find probe point from debuginfo.
        Symbol sys_open address found : c000000000414290
        Matched function: __se_sys_open [2ad03a0]
        Probe point found: __se_sys_open+0
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing/README write=0
        Opening /sys/kernel/debug/tracing/kprobe_events write=1
        Parsing probe_events: p:probe/sys_open _text+4276888
        Group:probe Event:sys_open probe:p
        Writing event: r:probe/sys_open__return _text+4276888
        Added new event:
          probe:sys_open__return (on sys_open%return)
        ...
      Reported-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
      Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Fixes: 99e608b5 ("perf probe ppc64le: Fix probe location when using DWARF")
      Link: http://lkml.kernel.org/r/20180809161929.35058-1-sandipan@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 08 Aug, 2018 37 commits
    • perf map: Optimize maps__fixup_overlappings() · 6a9405b5
      Konstantin Khlebnikov authored
      This function splits and removes overlapping areas.
      
      Maps in the tree are ordered by start address, so we can find the first
      overlap and stop as soon as the next map does not overlap.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/153365189407.435244.7234821822450484712.stgit@buzz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf map: Synthesize maps only for thread group leader · e5adfc3e
      Konstantin Khlebnikov authored
      Threads share map_groups, and all map events are merged into it.

      Thus we can send mmaps only for the thread group leader. Otherwise it
      takes ages to attach to and record from processes with many vmas
      and threads.

      The thread group leader could already be dead, but it seems perf cannot
      handle this case anyway.
      
      Testing dummy:
      
        #include <stdio.h>
        #include <stdlib.h>
        #include <sys/mman.h>
        #include <pthread.h>
        #include <unistd.h>
      
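        /* Each thread just blocks until a signal arrives. */
        void *thread(void *arg) {
                (void)arg;
                pause();
                return NULL;
        }

        int main(int argc, char **argv) {
                int threads = 10000;
                int vmas = 50000;
                pthread_t th;

                /* Many threads sharing one address space... */
                for (int i = 0; i < threads; i++)
                        pthread_create(&th, NULL, thread, NULL);
                /* ...and many vmas: alternating protections keep them from merging. */
                for (int i = 0; i < vmas; i++)
                        mmap(NULL, 4096, (i & 1) ? PROT_READ : PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
                sleep(60);
                return 0;
        }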
      
      Comment by Jiri Olsa:
      
      We actually synthesize the group leader (if we found one) for the thread
      even if it's not present in the thread_map, so the process maps are
      always in the data.
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/153363294102.396323.6277944760215058174.stgit@buzz
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Wire up the augmented syscalls with the syscalls:sys_enter_FOO beautifier · 88cf7084
      Arnaldo Carvalho de Melo authored
      We just check that the evsel is the one we associated with the
      bpf-output event associated with the "__augmented_syscalls__" eBPF map,
      to show that the formatting is done properly:
      
        # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): __augmented_syscalls__:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC
           0.006 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC
           0.007 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x43e06da8, flags: CLOEXEC                 ) = 3
           0.029 (         ): __augmented_syscalls__:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC
           0.030 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC
           0.031 ( 0.004 ms): cat/11486 openat(dfd: CWD, filename: 0x4400ece0, flags: CLOEXEC                 ) = 3
           0.249 (         ): __augmented_syscalls__:dfd: CWD, filename: 0xc3700d6
           0.250 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xc3700d6
           0.252 ( 0.003 ms): cat/11486 openat(dfd: CWD, filename: 0xc3700d6                                  ) = 3
        #
      
      Now we just need to get the full-blown enter/exit handlers to check if the
      evsel being processed is the augmented_syscalls one, so they can pick the
      pointer payloads from the end of the payload.

      We also need to state somehow what the layout is for multi-pointer-arg syscalls.

      It would also be handy to have a BTF file with the struct definitions used in
      syscalls, compact, generated at kernel build time and available for use in eBPF
      programs.

      Until we get there we can go on doing some manual coupling of the most relevant
      syscalls with some hand-built beautifiers.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-r6ba5izrml82nwfmwcp7jpkm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Setup the augmented syscalls bpf-output event fields · d3d1c4bd
      Arnaldo Carvalho de Melo authored
      The payload that is put in place by the eBPF script attached to
      syscalls:sys_enter_openat (and other syscalls with pointers, in the
      future) can be consumed by the existing sys_enter beautifiers if
      evsel->priv is set up with a struct syscall_tp with struct tp_fields for
      the 'syscall_id' and 'args' fields expected by the beautifiers. This
      patch does just that.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-xfjyog8oveg2fjys9r1yy1es@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Make bpf__setup_output_event() return the bpf-output event · 78e890ea
      Arnaldo Carvalho de Melo authored
      We're calling it to set up that event, and we'll need it later to decide
      if the bpf-output event we're handling is the one set up for a specific
      purpose, so return it, using ERR_PTR to report errors.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-zhachv7il2n1lopt9aonwhu7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Handle "bpf-output" events associated with "__augmented_syscalls__" BPF map · e0b6d2ef
      Arnaldo Carvalho de Melo authored
      Add an example BPF script that writes syscalls:sys_enter_openat raw
      tracepoint payloads augmented with the first 64 bytes of the "filename"
      syscall pointer arg.
      
      Then catch it and print it just like things written to the
      "__bpf_stdout__" map associated with a PERF_COUNT_SW_BPF_OUTPUT software
      event: the default tracepoint handler in 'perf trace',
      trace__event_handler(), uses bpf_output__fprintf(trace, sample), as it
      does for all other PERF_COUNT_SW_BPF_OUTPUT events, i.e. it just dumps
      the payload, so that we can check that what is being printed has at
      least the first 64 bytes of the "filename" arg.
      
      The augmented_syscalls.c eBPF script:
      
        # cat tools/perf/examples/bpf/augmented_syscalls.c
        // SPDX-License-Identifier: GPL-2.0
      
        #include <stdio.h>
      
        struct bpf_map SEC("maps") __augmented_syscalls__ = {
             .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
             .key_size = sizeof(int),
             .value_size = sizeof(u32),
             .max_entries = __NR_CPUS__,
        };
      
        struct syscall_enter_openat_args {
      	unsigned long long common_tp_fields;
      	long		   syscall_nr;
      	long		   dfd;
      	char		   *filename_ptr;
      	long		   flags;
      	long		   mode;
        };
      
        struct augmented_enter_openat_args {
      	struct syscall_enter_openat_args args;
      	char				 filename[64];
        };
      
        int syscall_enter(openat)(struct syscall_enter_openat_args *args)
        {
      	struct augmented_enter_openat_args augmented_args;
      
      	probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
      	probe_read_str(&augmented_args.filename, sizeof(augmented_args.filename), args->filename_ptr);
      	perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU,
      			  &augmented_args, sizeof(augmented_args));
      	return 1;
        }
      
        license(GPL);
        #
      
      So it will just prepare a raw_syscalls:sys_enter payload for the
      "openat" syscall.
      
      This will eventually be done for all syscalls with pointer args,
      globally or just when the user asks, using some spec stating which args
      of which syscalls should be "expanded" this way. We'll probably start
      with all the syscalls that have char * pointers with familiar names, the
      ones we already handle with the probe:vfs_getname kprobe, when it is in
      place, hooking the kernel getname_flags() function used to copy paths
      from userspace.
      
      Running it we get:
      
        # perf trace -e perf/tools/perf/examples/bpf/augmented_syscalls.c,openat cat /etc/passwd > /dev/null
           0.000 (         ): __augmented_syscalls__:X?.C......................`\..................../etc/ld.so.cache..#......,....ao.k...............k......1.".........
           0.006 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC
           0.008 ( 0.005 ms): cat/31292 openat(dfd: CWD, filename: 0x5c600da8, flags: CLOEXEC                 ) = 3
           0.036 (         ): __augmented_syscalls__:X?.C.......................\..................../lib64/libc.so.6......... .\....#........?.......=.C..../.".........
           0.037 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC
           0.039 ( 0.007 ms): cat/31292 openat(dfd: CWD, filename: 0x5c808ce0, flags: CLOEXEC                 ) = 3
           0.323 (         ): __augmented_syscalls__:X?.C.....................P....................../etc/passwd......>.C....@................>.C.....,....ao.>.C........
           0.325 (         ): syscalls:sys_enter_openat:dfd: CWD, filename: 0xe8be50d6
           0.327 ( 0.004 ms): cat/31292 openat(dfd: CWD, filename: 0xe8be50d6                                 ) = 3
        #
      
      We need to go on optimizing this to avoid sending trash or zeroes in the
      pointer content payload, using the return from bpf_probe_read_str(), but
      to keep things simple at this stage and make incremental progress, let's
      leave it at that for now.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-g360n1zbj6bkbk6q0qo11c28@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Add wrappers to BPF_FUNC_probe_read(_str) functions · 8fa25f30
      Arnaldo Carvalho de Melo authored
      Will be used shortly in the augmented syscalls work together with a
      PERF_COUNT_SW_BPF_OUTPUT software event to insert syscalls + pointer
      contents in the perf ring buffer, to be consumed by 'perf trace'
      beautifiers.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-ajlkpz4cd688ulx1u30htkj3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Add bpf__setup_output_event() strerror() counterpart · aa31be3a
      Arnaldo Carvalho de Melo authored
      That is just bpf__strerror_setup_stdout() renamed to the more general
      "setup_output_event" method, keep the existing stdout() as a wrapper.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-nwnveo428qn0b48axj50vkc7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Generalize bpf__setup_stdout() · 92bbe8d8
      Arnaldo Carvalho de Melo authored
      We will use it to set up other bpf-output events, for instance to
      generate augmented syscall entry tracepoints with pointer contents.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-4r7kw0nsyi4vyz6xm1tzx6a3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Make bpf__for_each_stdout_map() generic · 5941d856
      Arnaldo Carvalho de Melo authored
      By passing a 'name' arg, which will eventually be used to set up more
      "bpf-output" events, e.g. an event on which to create raw_syscalls-like
      events that, in addition to the syscall arguments, also copy the
      pointer contents being passed from/to userspace.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-talrnxps9p3qozk3aeh91fgv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Add bpf/stdio.h wrapper to bpf_perf_event_output function · 53a5d7b8
      Arnaldo Carvalho de Melo authored
      That, together with the __bpf_stdout__ map that is already handled by
      'perf trace' to print that event's contents as strings, provides a
      debugging facility. To show it in use, print a simple string every time
      the syscalls:sys_enter_openat syscall tracepoint is hit:
      
        # cat tools/perf/examples/bpf/hello.c
        #include <stdio.h>
      
        int syscall_enter(openat)(void *args)
        {
      	  puts("Hello, world\n");
      	  return 0;
        }
      
        license(GPL);
        #
        # perf trace -e openat,tools/perf/examples/bpf/hello.c cat /etc/passwd > /dev/null
           0.016 (         ): __bpf_stdout__:Hello, world
           0.018 ( 0.010 ms): cat/9079 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
           0.057 (         ): __bpf_stdout__:Hello, world
           0.059 ( 0.011 ms): cat/9079 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
           0.417 (         ): __bpf_stdout__:Hello, world
           0.419 ( 0.009 ms): cat/9079 openat(dfd: CWD, filename: /etc/passwd) = 3
        #
      
      This is part of ongoing experimentation on making eBPF scripts consumed
      by perf as concise as possible, using familiar concepts such as stdio.h
      functions that end up just wrapping the existing BPF functions, hiding
      as much boilerplate as possible with just conventions and C preprocessor
      tricks.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-4tiaqlx5crf0fwpe7a6j84x7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf: Add struct bpf_map struct · 7402e543
      Arnaldo Carvalho de Melo authored
      A helper structure used by eBPF C programs to describe map attributes to
      the elf_bpf loader, to be used initially by the special __bpf_stdout__ map
      used to print strings into the perf ring buffer in BPF scripts.

      For example, using the upcoming stdio.h and puts() macros to add strings
      to the ring buffer via the __bpf_stdout__ map:
      
        # cat tools/perf/examples/bpf/hello.c
        #include <stdio.h>
      
        int syscall_enter(openat)(void *args)
        {
      	  puts("Hello, world\n");
      	  return 0;
        }
      
        license(GPL);
        #
        # cat ~/.perfconfig
        [llvm]
      	dump-obj = true
        # perf trace -e openat,tools/perf/examples/bpf/hello.c/call-graph=dwarf/ cat /etc/passwd > /dev/null
        LLVM: dumping tools/perf/examples/bpf/hello.o
           0.016 (         ): __bpf_stdout__:Hello, world
           0.018 ( 0.010 ms): cat/9079 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
           0.057 (         ): __bpf_stdout__:Hello, world
           0.059 ( 0.011 ms): cat/9079 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC           ) = 3
           0.417 (         ): __bpf_stdout__:Hello, world
           0.419 ( 0.009 ms): cat/9079 openat(dfd: CWD, filename: /etc/passwd                                ) = 3
        #
        # file tools/perf/examples/bpf/hello.o
        tools/perf/examples/bpf/hello.o: ELF 64-bit LSB relocatable, *unknown arch 0xf7* version 1 (SYSV), not stripped
        # readelf -SW tools/perf/examples/bpf/hello.o
        There are 10 section headers, starting at offset 0x208:
      
        Section Headers:
          [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
          [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
          [ 1] .strtab           STRTAB          0000000000000000 000188 00007f 00      0   0  1
          [ 2] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
          [ 3] syscalls:sys_enter_openat PROGBITS        0000000000000000 000040 000088 00  AX  0   0  8
          [ 4] .relsyscalls:sys_enter_openat REL             0000000000000000 000178 000010 10      9   3  8
          [ 5] maps              PROGBITS        0000000000000000 0000c8 00001c 00  WA  0   0  4
          [ 6] .rodata.str1.1    PROGBITS        0000000000000000 0000e4 00000e 01 AMS  0   0  1
          [ 7] license           PROGBITS        0000000000000000 0000f2 000004 00  WA  0   0  1
          [ 8] version           PROGBITS        0000000000000000 0000f8 000004 00  WA  0   0  4
          [ 9] .symtab           SYMTAB          0000000000000000 000100 000078 18      1   1  8
        Key to Flags:
          W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
          L (link order), O (extra OS processing required), G (group), T (TLS),
          C (compressed), x (unknown), o (OS specific), E (exclude),
          p (processor specific)
        # readelf -s tools/perf/examples/bpf/hello.o
      
        Symbol table '.symtab' contains 5 entries:
         Num:    Value          Size Type    Bind   Vis      Ndx Name
           0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
           1: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    5 __bpf_stdout__
           2: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    7 _license
           3: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    8 _version
           4: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    3 syscall_enter_openat
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-81fg60om2ifnatsybzwmiga3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Add --percent-type option · e6902d1b
      Jiri Olsa authored
      Set the annotation percent type from the following choices:

        global-period, local-period, global-hits, local-hits

      With the following report option, the percent type is passed to the
      annotation browser:
      
        $ perf report --percent-type period-local
      
      The local/global keywords set whether the percentage is computed in the
      scope of the function (local) or of the whole data (global).  The
      period/hits keywords set the base the percentage is computed on: the
      sample period or the number of samples (hits).
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-21-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Add --percent-type option · 88c21190
      Jiri Olsa authored
      Add the --percent-type option to set the annotation percent type from the
      following choices:
      
        global-period, local-period, global-hits, local-hits
      
      Examples:
      
        $ perf annotate --percent-type period-local --stdio | head -1
         Percent         |      Source code ... es, percent: local period)
        $ perf annotate --percent-type hits-local --stdio | head -1
         Percent         |      Source code ... es, percent: local hits)
        $ perf annotate --percent-type hits-global --stdio | head -1
         Percent         |      Source code ... es, percent: global hits)
        $ perf annotate --percent-type period-global --stdio | head -1
         Percent         |      Source code ... es, percent: global period)
      
      The local/global keywords set whether the percentage is computed in the
      scope of the function (local) or of the whole data (global).

      The period/hits keywords set the base the percentage is computed on:
      the sample period or the number of samples (hits).
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-20-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Display percent type in stdio output · 4c04868f
      Jiri Olsa authored
      In following patches we will allow switching the percent type even for
      stdio annotation outputs. Add the percent type value to the title of
      the annotation outputs.
      
        $ perf annotate --stdio
         Percent         |      Sou ... instructions:u } (2805 samples, percent: local period)
        --------------------------- ... ------------------------------------------------------
        ...
      
        $ perf annotate --stdio2
        Samples: 2K of events 'anon ...  count (approx.): 156525487, [percent: local period]
        safe_write.c() /usr/bin/yes
        Percent
        ...
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-19-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Make local period the default percent type · addba8b6
      Jiri Olsa authored
      Currently we display the percentages in annotation output based on the
      number of sample hits. Switch to period-based percentages by default,
      because they correspond better to the time spent on the line.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-18-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      addba8b6
    • Jiri Olsa's avatar
      perf annotate: Add support to toggle percent type · 3e0d7953
      Jiri Olsa authored
      Add new key bindings to toggle percent type/base in annotation UI browser:
      
       'p' to switch between local and global percent type
       'b' to switch between hits and period percent base
      
      Add the following help messages to the UI browser '?' window:
      
        ...
        p             Toggle percent type [local/global]
        b             Toggle percent base [period/hits]
        ...
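      
      A rough sketch of how such toggles could map onto the four percent
      values. The enum and the demo_* helpers are hypothetical names chosen
      for illustration and are not taken from the perf sources: 'p' flips
      local/global and keeps the base, 'b' flips hits/period and keeps the
      type.

```c
/* Hypothetical names; the real perf identifiers may differ. */
enum demo_percent {
	DEMO_HITS_LOCAL,
	DEMO_HITS_GLOBAL,
	DEMO_PERIOD_LOCAL,
	DEMO_PERIOD_GLOBAL,
};

/* 'p': switch between local and global, keeping the hits/period base. */
static enum demo_percent demo_toggle_type(enum demo_percent cur)
{
	switch (cur) {
	case DEMO_HITS_LOCAL:    return DEMO_HITS_GLOBAL;
	case DEMO_HITS_GLOBAL:   return DEMO_HITS_LOCAL;
	case DEMO_PERIOD_LOCAL:  return DEMO_PERIOD_GLOBAL;
	case DEMO_PERIOD_GLOBAL: return DEMO_PERIOD_LOCAL;
	}
	return cur;
}

/* 'b': switch between hits and period, keeping the local/global type. */
static enum demo_percent demo_toggle_base(enum demo_percent cur)
{
	switch (cur) {
	case DEMO_HITS_LOCAL:    return DEMO_PERIOD_LOCAL;
	case DEMO_PERIOD_LOCAL:  return DEMO_HITS_LOCAL;
	case DEMO_HITS_GLOBAL:   return DEMO_PERIOD_GLOBAL;
	case DEMO_PERIOD_GLOBAL: return DEMO_HITS_GLOBAL;
	}
	return cur;
}

/* Dispatch a key press; any other key leaves the selection unchanged. */
static enum demo_percent demo_handle_key(enum demo_percent cur, int key)
{
	switch (key) {
	case 'p': return demo_toggle_type(cur);
	case 'b': return demo_toggle_base(cur);
	default:  return cur;
	}
}
```

      Pressing the same key twice returns to the starting value, so each
      key acts as a plain two-state toggle.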
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-17-jolsa@kernel.org
      [ Moved percent_type to be the last arg to sym_title(), it's an arg to what is being formatted (buf, size) ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3e0d7953
    • Jiri Olsa's avatar
      perf annotate: Pass browser percent_type in annotate_browser__calc_percent() · d4265b1a
      Jiri Olsa authored
      Pass browser percent_type in annotate_browser__calc_percent().
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-16-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d4265b1a
    • Jiri Olsa's avatar
      perf annotate: Pass 'struct annotation_options' to map_symbol__annotation_dump() · 4c650ddc
      Jiri Olsa authored
      Pass 'struct annotation_options' to map_symbol__annotation_dump(), to
      carry on and pass the percent_type value.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-15-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4c650ddc
    • Jiri Olsa's avatar
      perf annotate: Pass struct annotation_options to symbol__calc_lines() · c849c12c
      Jiri Olsa authored
      Pass struct annotation_options to symbol__calc_lines(), to carry on and
      pass the percent_type value.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-14-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c849c12c
    • Jiri Olsa's avatar
      perf annotate: Add percent_type to struct annotation_options · 796ca33d
      Jiri Olsa authored
      It will be used to carry user selection of percent type for annotation
      output.
      
      Passing the percent_type to the annotation_line__print function as the
      first step and making it default to current percentage type
      (PERCENT_HITS_LOCAL) value.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-13-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      796ca33d
    • Jiri Olsa's avatar
      perf annotate: Add PERCENT_PERIOD_GLOBAL percent value · e58684df
      Jiri Olsa authored
      Adding and computing global period percent value for annotation line.
      Storing it in struct annotation_data percent array under new
      PERCENT_PERIOD_GLOBAL index.
      
      At the moment it's not displayed, it's coming in following patches.
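      
      A minimal sketch of how the four percent values relate, assuming that
      "local" means relative to the annotated symbol's totals and "global"
      means relative to the totals of the whole event. The struct and
      function names (sketch_*) are hypothetical and only illustrate the
      arithmetic; they are not the perf implementation.

```c
#include <stdint.h>

/* Hypothetical counters: number of sample hits and the summed period. */
struct sketch_counts {
	uint64_t nr_samples;
	uint64_t period;
};

/* The four percent values for one annotation line. */
struct sketch_percents {
	double hits_local;
	double hits_global;
	double period_local;
	double period_global;
};

/* Percentage of part in total; 0.0 when there is nothing to divide by. */
static double sketch_ratio(uint64_t part, uint64_t total)
{
	return total ? 100.0 * (double)part / (double)total : 0.0;
}

/*
 * line:   counters for one annotated line
 * symbol: totals for the symbol being annotated (the "local" base)
 * all:    totals for the whole event (the assumed "global" base)
 */
static struct sketch_percents sketch_calc(struct sketch_counts line,
					  struct sketch_counts symbol,
					  struct sketch_counts all)
{
	struct sketch_percents p;

	p.hits_local    = sketch_ratio(line.nr_samples, symbol.nr_samples);
	p.hits_global   = sketch_ratio(line.nr_samples, all.nr_samples);
	p.period_local  = sketch_ratio(line.period, symbol.period);
	p.period_global = sketch_ratio(line.period, all.period);
	return p;
}
```

      With a line of 5 hits and period 300, a symbol of 20 hits and period
      600, and event totals of 100 hits and period 3000, the hits-based
      local value is 25% while the period-based local value is 50%, which
      shows how the two bases can disagree for the same line.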
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-12-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e58684df
    • Jiri Olsa's avatar
      perf annotate: Add PERCENT_PERIOD_LOCAL percent value · ab371169
      Jiri Olsa authored
      Adding and computing local period percent value for annotation line.
      Storing it in struct annotation_data percent array under new
      PERCENT_PERIOD_LOCAL index.
      
      At the moment it's not displayed, it's coming in following patches.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-11-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ab371169
    • Jiri Olsa's avatar
      perf annotate: Add PERCENT_HITS_GLOBAL percent value · 75a8c1ff
      Jiri Olsa authored
      Adding and computing global hits percent value for annotation line.
      Storing it in struct annotation_data percent array under new
      PERCENT_HITS_GLOBAL index.
      
      At the moment it's not displayed, it's coming in following patches.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-10-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      75a8c1ff
    • Jiri Olsa's avatar
      perf annotate: Switch struct annotation_data::percent to array · 6d9f0c2d
      Jiri Olsa authored
      So we can hold multiple percent values for annotation line.
      
      The first member of this array is current local hits percent value
      (PERCENT_HITS_LOCAL index), so no functional change is expected.
      
      Adding annotation_data__percent function to return requested percent
      value from struct annotation_data.
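      
      The shape of this change can be sketched as follows. The SKETCH_*
      identifiers are hypothetical stand-ins for the names mentioned in this
      series. Only the position of the local hits value as the first slot
      comes from the description above; the order of the remaining slots,
      the bounds check and the -1.0 fallback are assumptions.

```c
/* Hypothetical index names mirroring the ones mentioned in this series. */
enum sketch_percent_index {
	SKETCH_PERCENT_HITS_LOCAL,	/* first slot: the pre-existing value */
	SKETCH_PERCENT_HITS_GLOBAL,	/* order of later slots is assumed */
	SKETCH_PERCENT_PERIOD_LOCAL,
	SKETCH_PERCENT_PERIOD_GLOBAL,
	SKETCH_PERCENT_MAX,
};

/* One array slot per percent type instead of a single double. */
struct sketch_annotation_data {
	double percent[SKETCH_PERCENT_MAX];
};

/* Return the requested percent value, or -1.0 for an out-of-range index. */
static double sketch_data_percent(const struct sketch_annotation_data *data,
				  unsigned int which)
{
	return which < SKETCH_PERCENT_MAX ? data->percent[which] : -1.0;
}
```

      Callers that previously read the single percent field would ask for
      the SKETCH_PERCENT_HITS_LOCAL slot and see the same number as before.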
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-9-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6d9f0c2d
    • Jiri Olsa's avatar
      perf annotate: Loop group events directly in annotation__calc_percent() · 2bcf7306
      Jiri Olsa authored
      We need to bring in 'struct hists' object and for that we need 'struct
      perf_evsel' object in the scope.
      
      Switching the group data loop with the evsel group loop.  It does the
      same thing, but it brings in the evsel object, which we can use later to
      get the 'struct hists' object.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-8-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2bcf7306
    • Jiri Olsa's avatar
      perf annotate: Rename hist to sym_hist in annotation__calc_percent · 48a1e4f2
      Jiri Olsa authored
      We will need to bring in 'struct hists' variable in this scope, so it's
      better we do this rename first.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-7-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      48a1e4f2
    • Jiri Olsa's avatar
      perf annotate: Rename local sample variables to data · 0440af74
      Jiri Olsa authored
      Based on previous rename, changing also the local variable names to fit
      properly.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-6-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0440af74
    • Jiri Olsa's avatar
      perf annotate: Rename struct annotation_line::samples* to data* · c2f938ba
      Jiri Olsa authored
      The name 'samples*' is a little confusing because we have nested 'struct
      sym_hist_entry' under annotation_line struct, which holds 'nr_samples'
      as well.
      
      Also the holding struct name is 'annotation_data' so the 'data' name
      fits better.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-5-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c2f938ba
    • Jiri Olsa's avatar
      perf annotate: Get rid of annotation__scnprintf_samples_period() · 0683d13c
      Jiri Olsa authored
      We have a more current function to get the title for annotation,
      which is hists__scnprintf_title. Both produce the same output as
      far as the annotation's header line goes.
      
      They differ in how nr_samples is counted: hists__scnprintf_title
      provides a more accurate number based on the setup of the
      symbol_conf.filter_relative variable.
      
      Plus it also displays any uid/thread/dso/socket filters/zooms
      if any are set, which annotation__scnprintf_samples_period
      does not.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-4-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0683d13c
    • Jiri Olsa's avatar
      perf annotate: Make annotation_line__max_percent static · 5ecf7d30
      Jiri Olsa authored
      There's no outside user of it.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/r/20180804130521.11408-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ecf7d30
    • Jiri Olsa's avatar
      perf annotate: Make symbol__annotate_fprintf2() local · 7a3e71e0
      Jiri Olsa authored
      There's no outside user of it.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lkml.kernel.org/r/20180804130521.11408-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7a3e71e0
    • Arnaldo Carvalho de Melo's avatar
      perf bpf: Add 'syscall_enter' probe helper for syscall enter tracepoints · dda9ac96
      Arnaldo Carvalho de Melo authored
      Allowing one to hook into the syscalls:sys_enter_NAME tracepoints,
      an example is provided that hooks into the 'openat' syscall.
      
      Using it with the probe:vfs_getname probe into getname_flags to get the
      filename args as it is copied from userspace:
      
        # perf probe -l
        probe:vfs_getname    (on getname_flags:73@acme/git/linux/fs/namei.c with pathname)
        # perf trace -e probe:*getname,tools/perf/examples/bpf/sys_enter_openat.c cat /etc/passwd > /dev/null
           0.000 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/ld.so.preload"
           0.022 syscalls:sys_enter_openat:dfd: CWD, filename: 0xafbe8da8, flags: CLOEXEC
           0.027 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/ld.so.cache"
           0.054 syscalls:sys_enter_openat:dfd: CWD, filename: 0xafdf0ce0, flags: CLOEXEC
           0.057 probe:vfs_getname:(ffffffffbd2a8983) pathname="/lib64/libc.so.6"
           0.316 probe:vfs_getname:(ffffffffbd2a8983) pathname="/usr/lib/locale/locale-archive"
           0.375 syscalls:sys_enter_openat:dfd: CWD, filename: 0xe2b2b0b4
           0.379 probe:vfs_getname:(ffffffffbd2a8983) pathname="/etc/passwd"
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-2po9jcqv1qgj0koxlg8kkg30@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dda9ac96
    • Yury Norov's avatar
      perf tools: Drop unneeded bitmap_zero() calls · 3c8b8186
      Yury Norov authored
      bitmap_zero() is called after bitmap_alloc() in perf code. But
      bitmap_alloc() internally uses calloc() which guarantees that allocated
      area is zeroed. So following bitmap_zero is unneeded. Drop it.
      
      This happened because of the confusing name of the bitmap allocator. It
      should be named bitmap_zalloc instead of bitmap_alloc.
      
      This series:
      
        https://lkml.org/lkml/2018/6/18/841
      
      introduces a new API for bitmap allocations in the kernel, and the
      functions there are named correctly. A following patch propagates the API
      to tools and fixes the naming issue.
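      
      Why the extra clearing step is redundant can be illustrated with a
      small stand-alone sketch. The sketch_* names and macros below are
      hypothetical and only mimic a calloc()-backed bitmap allocator; they
      are not the tools/ implementation.

```c
#include <limits.h>
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG	(CHAR_BIT * sizeof(unsigned long))
#define SKETCH_BITS_TO_LONGS(n)	\
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/*
 * calloc() returns zero-filled memory, so every bit starts out cleared
 * and no separate zeroing pass is needed after allocation.
 */
static unsigned long *sketch_bitmap_alloc(size_t nbits)
{
	return calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

/* Return 1 when no bit is set in any word backing the bitmap, else 0. */
static int sketch_bitmap_empty(const unsigned long *map, size_t nbits)
{
	size_t i;

	for (i = 0; i < SKETCH_BITS_TO_LONGS(nbits); i++) {
		if (map[i])
			return 0;
	}
	return 1;
}
```

      A freshly allocated bitmap already reports empty, which is the state
      the dropped bitmap_zero() calls were re-establishing.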
      Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Link: http://lkml.kernel.org/r/20180623073502.16321-1-ynorov@caviumnetworks.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3c8b8186
    • Sean V Kelley's avatar
      perf vendor events arm64: Enable JSON events for eMAG · 704089e7
      Sean V Kelley authored
      This patch adds the Ampere Computing eMAG file.  This platform follows
      the ARMv8 recommended IMPLEMENTATION DEFINED events, where applicable.
      Signed-off-by: Sean V Kelley <seanvk.dev@oregontracks.org>
      Reviewed-by: John Garry <john.garry@huawei.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: linux-arm-kernel@lists.infradead.org
      LPU-Reference: 20180803041811.17065-1-seanvk.dev@oregontracks.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      704089e7
    • Thomas Richter's avatar
      perf report: Add GUI report support for s390 auxiliary trace · 33d9e183
      Thomas Richter authored
      Add support for s390 auxiliary traces.
      
      Use 'perf record -e rbd000 -- ls' to create the perf.data file.
      
      Use 'perf report' to display the auxiliary trace data.
      
      Output before:
      
        [root@s35lp76 perf]# ./perf report --stdio
        0x128 [0x10]: failed to process type: 70
        Error:
        failed to process sample
        [root@s35lp76 perf]#
      
      Output after:
      
        [root@s35lp76 perf]# ./perf report --stdio
      
            18.21%    18.21%  ls     [kernel.kallsyms]       [k] ftrace_likely_update
             9.52%     9.52%  ls     [kernel.kallsyms]       [k] lock_acquire
             9.38%     9.38%  ls     [kernel.kallsyms]       [k] lock_release
             3.45%     3.45%  ls     [kernel.kallsyms]       [k] lock_acquired
             2.88%     2.88%  ls     [kernel.kallsyms]       [k] link_path_walk
             2.63%     2.63%  ls     [kernel.kallsyms]       [k] __d_lookup
             2.38%     2.38%  ls     [kernel.kallsyms]       [k] __d_lookup_rcu
             2.04%     2.04%  ls     [kernel.kallsyms]       [k] ___might_sleep
             1.83%     1.83%  ls     [kernel.kallsyms]       [k] debug_lockdep_rcu_enabled
             1.44%     1.44%  ls     [kernel.kallsyms]       [k] dput
           ....
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180802074622.13641-4-tmricht@linux.ibm.com
      [ Use PRI[xd]64 to fix the build on debian:experimental-x-mips (gcc 8.1.0) and others ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      33d9e183
    • Thomas Richter's avatar
      perf report: Add raw report support for s390 auxiliary trace · 2b1444f2
      Thomas Richter authored
      Add support for s390 auxiliary traces.
      
      Use 'perf record -e rbd000' to create the perf.data file.  The event
      also has the symbolic name SF_CYCLES_BASIC_DIAG, using 'perf record -e
      SF_CYCLES_BASIC_DIAG' is equivalent.
      
      Use 'perf report -D' to display the auxiliary trace data.
      
      Output before:
      
       0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000
                       offset: 0  ref: 0  idx: 4  tid: -1  cpu: 4
           Nothing else
      
      Output after:
      
       0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000
                        offset: 0  ref: 0  idx: 4  tid: -1  cpu: 4
       .
       . ... s390 AUX data: size 262144 bytes
          [00000000] Basic   Def:0001 Inst:0000 TW   AS:3 ASN:0xffff IA:0x0000000000c2f1bc
      		CL:1 HPP:0x8000000000000000 GPP:000000000000000000
          [0x000020] Diag    Def:8005
          [0x0000bf] Basic   Def:0001 Inst:0000 TW   AS:3 ASN:0xffff IA:0x0000000000c2f1bc
      		CL:1 HPP:0x8000000000000000 GPP:000000000000000000
          [0x0000df] Diag    Def:8005
          [0x00017e] Basic   Def:0001 Inst:0000 TW   AS:3 ASN:0xffff IA:0x0000000000c2f1bc
      		CL:1 HPP:0x8000000000000000 GPP:000000000000000000
          ....
          [0x000fc0] Trailer F T bsdes:32 dsdes:159 Overflow:0 Time:0xd4ab59a8450fa108
      		C:1 TOD:0xd4ab4ec98ceb3832 1:0x8000000000000000 2:0xd4ab4ec98ceb3832
      
      This output is shown for every sampled data block. The
      output contains the
      
       - basic-sampling data entry
      
       - diagnostic-sampling data entry
      
       - trailer entry
      
      The basic sampling entry and diagnostic sampling entry sizes can be
      extracted using the trailer entries in the SDB.  On older hardware these
      values (bsdes and dsdes in the trailer entry) are reserved and zero.
      Older hardware uses hard-coded values based on the s390 machine type.
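      
      The offsets in the sample output above follow directly from those two
      sizes. A small sketch, assuming basic and diagnostic entries strictly
      alternate as in that output and that the trailer starts at the offset
      shown there (0xfc0); the sketch_* names are hypothetical and not taken
      from the perf sources.

```c
#include <stddef.h>

/* Hypothetical description of one sampling data block (SDB). */
struct sketch_sdb_layout {
	size_t bsdes;		/* basic-sampling entry size from the trailer */
	size_t dsdes;		/* diagnostic-sampling entry size from the trailer */
	size_t trailer_off;	/* offset of the trailer entry inside the SDB */
};

/* Offset of the n-th basic entry, counting from 0. */
static size_t sketch_basic_off(const struct sketch_sdb_layout *l, size_t n)
{
	return n * (l->bsdes + l->dsdes);
}

/* Offset of the diagnostic entry that follows the n-th basic entry. */
static size_t sketch_diag_off(const struct sketch_sdb_layout *l, size_t n)
{
	return sketch_basic_off(l, n) + l->bsdes;
}

/* Number of complete basic+diagnostic pairs that fit before the trailer. */
static size_t sketch_pairs_per_sdb(const struct sketch_sdb_layout *l)
{
	size_t pair = l->bsdes + l->dsdes;

	return pair ? l->trailer_off / pair : 0;
}
```

      With bsdes:32 and dsdes:159 this gives 0x20, 0xbf, 0xdf and 0x17e for
      the first entries after offset 0, matching the dump above.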
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Link: http://lkml.kernel.org/r/20180802074622.13641-3-tmricht@linux.ibm.com
      Link: http://lkml.kernel.org/r/eda2632e-7919-5ffd-5f68-821e77d216fa@linux.ibm.com
      [ Merged a fix for a 'type punned' problem reported by Michael Ellerman, see last Link tag. ]
      [ Removed __packed from two structs, they're already naturally packed and having that
        attribute breaks the build in gcc 8.1.1 mips, 4.4.7 x86_64, 7.1.1 ARCompact ISA, etc. ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2b1444f2
  4. 03 Aug, 2018 1 commit
    • Thomas Richter's avatar
      perf auxtrace: Support for perf report -D for s390 · b96e6615
      Thomas Richter authored
      Add initial support for s390 auxiliary traces using the CPU-Measurement
      Sampling Facility.
      
      Support and ignore PERF_RECORD_AUXTRACE_INFO records in the perf data
      file. Later patches will show the contents of the auxiliary traces.
      
      Set up the auxtrace queues and data structures for s390.  A raw dump of
      the perf.data file now does not show an error when an auxtrace event is
      encountered.
      
      Output before:
      
        [root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace
        0x128 [0x10]: failed to process type: 70
        Error:
        failed to process sample
      
        0x128 [0x10]: event: 70
        .
        . ... raw event: size 16 bytes
        .  0000:  00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00  ...F............
      
        0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0
        [root@s35lp76 perf]#
      
      Output after:
      
         # ./perf report -D -i perf.data.auxtrace |fgrep PERF_RECORD_AUXTRACE
        0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5
        0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000
      	   offset: 0  ref: 0  idx: 4  tid: -1  cpu: 4
        ....
      
      Additional notes about the underlying hardware and software
      implementation, provided by Hendrik Brueckner (see Link: below).
      
      =============================================================================
      
      The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain
      performance information on the mainframe.  Basically, it was introduced
      with System z10 years ago for the z/Architecture, that means, 64-bit.
      For Linux, there are two facilities of interest, counter facility and sampling
      facility.  The counter facility provides hardware counters for instructions,
      cycles, crypto-activities, and many more.
      
      The sampling facility is a hardware sampler that when started will write
      samples at a particular interval into a sampling buffer.  At some point,
      for example, if a sample block is full, it generates an interrupt to collect
      samples (while the sampler continues to run).
      
      A few years ago, I started to provide a perf PMU to use the counter
      and sampling facilities.  Recently, the device driver was updated to also
      "export" the sampling buffer into the AUX area.  Thomas now completed the
      related perf work to interpret and process these AUX data.
      
      If people are more interested in the sampling facility, they can have a
      look into:
      
      - The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05
        http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a
      
      and to learn how to use it for Linux on Z, have a look at chapter 54,
      "Using the CPU-measurement facilities" in the:
      
      - Device Drivers, Features, and Commands, SC33-8411-34
        http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf
      
      =============================================================================
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b96e6615