1. 26 Jul, 2017 10 commits
  2. 25 Jul, 2017 7 commits
    • perf evsel: Add verbose output for sys_perf_event_open fallback · 2b04e0f8
      Jiri Olsa authored
      Adding info about what is being switched off in the sys_perf_event_open
      fallback.
      
      New output (notice the 'switching off' lines):
      
        $ perf stat -e '{cycles,instructions}' -vvv ls
        Using CPUID GenuineIntel-6-3D
        intel_pt default config: tsc
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0x8
        sys_perf_event_open failed, error -22
        switching off cloexec flag
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
        sys_perf_event_open failed, error -22
        switching off exclude_guest, exclude_host
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
        ------------------------------------------------------------
        sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
        sys_perf_event_open failed, error -22
        switching off sample_id_all
        ------------------------------------------------------------
        perf_event_attr:
          size                             112
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
          disabled                         1
          inherit                          1
          enable_on_exec                   1
        ...
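
      The fallback order visible in the output above can be sketched roughly as
      follows. This is a simplified illustration, not the perf implementation:
      struct attr_sketch, open_fn, old_kernel_open() and open_with_fallback() are
      hypothetical names, and the real code may decide what to switch off in a
      different way.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, trimmed-down settings; not the real struct perf_event_attr. */
struct attr_sketch {
	bool cloexec;		/* stands in for the close-on-exec open flag */
	bool exclude_guest;
	bool exclude_host;
	bool sample_id_all;
};

/* Hypothetical opener: returns 0 on success or a negative errno value. */
typedef int (*open_fn)(const struct attr_sketch *attr);

/* Stand-in for an old kernel that rejects every optional feature. */
static int old_kernel_open(const struct attr_sketch *attr)
{
	if (attr->cloexec || attr->exclude_guest || attr->exclude_host ||
	    attr->sample_id_all)
		return -EINVAL;
	return 0;
}

/* Retry the open, switching off one feature per failure and saying which. */
static int open_with_fallback(struct attr_sketch *attr, open_fn try_open)
{
	for (;;) {
		int err = try_open(attr);

		if (err != -EINVAL)
			return err;

		fprintf(stderr, "sys_perf_event_open failed, error %d\n", err);
		if (attr->cloexec) {
			fprintf(stderr, "switching off cloexec flag\n");
			attr->cloexec = false;
		} else if (attr->exclude_guest || attr->exclude_host) {
			fprintf(stderr, "switching off exclude_guest, exclude_host\n");
			attr->exclude_guest = attr->exclude_host = false;
		} else if (attr->sample_id_all) {
			fprintf(stderr, "switching off sample_id_all\n");
			attr->sample_id_all = false;
		} else {
			return err;	/* nothing left to switch off */
		}
	}
}
```

      Printing the name of each feature as it is dropped is what makes a failing
      open sequence readable in -vvv output.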

      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20170721121212.21414-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf jvmti: Fix linker error when libelf config is disabled · 5d90faf4
      Sudeep Holla authored
      When libelf is disabled in the configuration, we get the following
      linker error:
      
        LINK     libperf-jvmti.so
        ld: cannot find -lelf
        Makefile.perf:515: recipe for target 'libperf-jvmti.so' failed
      
      Jiri pointed out that neither librt nor libelf is really required, so
      this patch fixes the linker error by dropping the unneeded libraries
      from the link stage.

      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Acked-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 209045ad ("perf tools: add JVMTI agent library")
      Link: http://lkml.kernel.org/r/20170719011839.99399-5-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Process tracing data in pipe mode · f4849599
      David Carrillo-Cisneros authored
      'perf annotate' was missing the handler for tracing data records.
      
      Prior to this patch we obtained "unhandled" records when piping trace
      events to perf annotate (using -D option to show the dump_printf
      messages in process_event_synth_tracing_data_stub):
      
        $ perf record -o - -e block:bio_free sleep 2 | perf annotate -D --stdio
        ...
        0x78 [0xc]: PERF_RECORD_TRACING_DATA: unhandled!
        ...

      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170719011839.99399-4-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add EXCLUDE_EXTLIBS and EXTRA_PERFLIBS to makefile · cb281fea
      David Carrillo-Cisneros authored
      The goal is to allow users to override linking of libraries that
      were automatically added to PERFLIBS.
      
      EXCLUDE_EXTLIBS contains linker flags to be removed from LIBS
      while EXTRA_PERFLIBS contains linker flags to be added.
      
      My use case is to force a certain library to be linked statically,
      e.g. for libelf:
      
        EXCLUDE_EXTLIBS=-lelf EXTRA_PERFLIBS=path/libelf.a

      Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Paul Turner <pjt@google.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20170719011839.99399-3-davidcc@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cgroup: Fix refcount usage · cd8dd032
      Arnaldo Carvalho de Melo authored
      When converting from atomic_t to refcount_t we didn't follow the usual
      step of initializing it to one before taking any new reference, which
      trips the check against taking a reference on a freed refcount_t. Fix
      it.
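
      A minimal sketch of the rule described above, assuming a simplified
      stand-in for refcount_t rather than the tools/include implementation.
      struct refcount, struct cgroup_like and cgroup_like__new() are
      hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for refcount_t; not the tools/include implementation. */
struct refcount {
	unsigned int refs;
};

static void refcount_set(struct refcount *r, unsigned int n)
{
	r->refs = n;
}

/* Taking a reference is only valid while the object is alive (refs != 0). */
static bool refcount_inc_not_zero(struct refcount *r)
{
	if (r->refs == 0)
		return false;
	r->refs++;
	return true;
}

/* Drop a reference; returns true when the last one is gone. */
static bool refcount_dec_and_test(struct refcount *r)
{
	return --r->refs == 0;
}

/* Hypothetical object owning a refcount, loosely modelled on cgroup_sel. */
struct cgroup_like {
	struct refcount refcnt;
};

static struct cgroup_like *cgroup_like__new(void)
{
	struct cgroup_like *cgrp = calloc(1, sizeof(*cgrp));

	if (cgrp)
		refcount_set(&cgrp->refcnt, 1);	/* creator holds the first reference */
	return cgrp;
}
```

      With a zero-initialized counter, the first refcount_inc_not_zero() fails,
      which is the condition the assertion in the report below trips over.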
      
      Brendan's report:
      
       ---
      It's 4.12-rc7, with node v4.4.1. I'm building 4.13-rc1 now, as I hit
      what I think is another unrelated perf bug and I'm starting to wonder
      what else is broken on that version:
      
      (root) /mnt/src/linux-4.12-rc7/tools/perf # ./perf record -F 99 -a -e
      cpu-clock --cgroup=docker/f9e9d5df065b14646e8a11edc837a13877fd90c171137b2ba3feb67a0201cb65
      -g
      perf: /mnt/src/linux-4.12-rc7/tools/include/linux/refcount.h:108:
      refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
      Aborted
      
      that used to work...
       ---
      
      Testing it:
      
      Before:
      
        # perf stat -e cycles -C 0 --cgroup /
        perf: /home/acme/git/linux/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
        Aborted (core dumped)
        #
      
      After:
      
        # perf stat -e cycles -C 0 --cgroup /
      ^C
        Performance counter stats for 'CPU(s) 0':
      
             132,081,393      cycles                    /
      
             2.492942763 seconds time elapsed
      
        #

      Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Acked-by: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Krister Johansen <kjlx@templeofstupid.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sudeep Holla <Sudeep.Holla@arm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 79c5fe6d ("perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t")
      Link: http://lkml.kernel.org/n/tip-l7ovfblq14ip2i08m1g0fkhv@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix kernel symbol adjustment for s390x · cf6383f7
      Thomas Richter authored
      On s390x the kernel text segment starts at address 0x0.  When perf
      report reads kernel symbols from vmlinux file it adds an offset of
      0x1000.
      
      For example see symbol set_reset_devices:
      
        [root@s8360047 linux-devel]# nm -A vmlinux| fgrep set_reset_devices
        vmlinux:0000000001379000 t set_reset_devices
        [root@s8360047 linux-devel]#
      
        [root@s8360047 linux-devel]# fgrep set_reset_devices /proc/kallsyms
        0000000001379000 t set_reset_devices
        [root@s8360047 linux-devel]#
      
      The kernel symbol table and the vmlinux file have the same address for
      symbol set_reset_devices namely 1379000.
      
      When perf report reads this symbol, it displays it with the address:
      symbol__new: set_reset_devices 0x137a000-0x137a018
      
      There is a difference between perf report and vmlinux of 0x1000.
      
      The reason for the difference is at kernel symbol load time in function
      dso__load_sym(). The vmlinux file is investigated with its ELF header.
      Command readelf shows this:
      
        Section Headers:
        [Nr] Name              Type             Address           Offset
             Size              EntSize          Flags  Link  Info  Align
        [ 0]                   NULL             0000000000000000  00000000
             0000000000000000  0000000000000000           0     0     0
        [ 1] .text             PROGBITS         0000000000000000  00001000
             0000000000b0e0c2  0000000000000000  AX       0     0     128
      
      This leads to an invalid calculation of the symbol start address, see
      file util/symbol-elf.c line 974:
      
              /* Adjust symbol to map to file offset */
              if (adjust_kernel_syms)
                      sym.st_value -= shdr.sh_addr - shdr.sh_offset;
      
      With shdr.sh_addr set to 0x0 and shdr.sh_offset set to 0x1000 as read
      from the ELF .text section 0x1000 is added to the symbol address.
      
      I would like to fix this by introducing an architecture-specific function
      named elf__needs_adjust_symbols(). This is the same approach as done by
      PowerPC.  The function currently does not exist for s390x and the
      default weak one is used.  The s390x-specific one returns false when
      symsrc_init() is invoked for kernel symbols and results in variable
      adjust_kernel_syms being false.  This omits the adjustment and the
      correct address is displayed (when symbol resolution does not work).
      
      The s390x-specific function returns false for kernel symbol adjustment
      and returns true for kernel modules, processes and shared libraries.
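
      The arithmetic behind the 0x1000 difference can be checked with a small
      sketch. It reuses the adjustment quoted above with the .text values from
      the readelf output; adjust_symbol() and the TEXT_* macros are hypothetical
      names, not part of perf:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the commit message for the s390x vmlinux. */
#define TEXT_SH_ADDR	0x0ULL		/* .text section address */
#define TEXT_SH_OFFSET	0x1000ULL	/* .text file offset */

/* Same adjustment as the quoted symbol-elf.c snippet; helper name is hypothetical. */
static uint64_t adjust_symbol(uint64_t st_value, uint64_t sh_addr,
			      uint64_t sh_offset, bool adjust_kernel_syms)
{
	if (adjust_kernel_syms)
		st_value -= sh_addr - sh_offset;	/* 0x0 - 0x1000 adds 0x1000 */
	return st_value;
}
```

      For set_reset_devices at 0x1379000, the adjusted value is 0x137a000, the
      address perf report displayed. With adjust_kernel_syms false, the value
      stays at 0x1379000 and matches /proc/kallsyms.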

      Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      LPU-Reference: 20170713130252.6167-1-tmricht@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate stdio: Fix --show-total-period · 585d93c5
      Taeung Song authored
      We were showing the total number of samples, not the total period as
      asked by the user, fix it.

      Reported-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Martin Liška <mliska@suse.cz>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Link: http://lkml.kernel.org/n/tip-lh2nh89rtqn5x5vbfthw6qml@git.kernel.org
      Fixes: 0c4a5bce ("perf annotate: Display total number of samples with --show-total-period")
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 21 Jul, 2017 5 commits
  4. 20 Jul, 2017 18 commits
    • tools lib: Update copy of strtobool from the kernel sources · b99e4850
      Arnaldo Carvalho de Melo authored
      Getting support for "on", "off" introduced in a81a5a17 ("lib: add
      "on"/"off" support to kstrtobool") and making it check for NULL,
      introduced in ef951599 ("lib: move strtobool() to kstrtobool()").
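
      The behaviour described above can be sketched as follows. This is an
      illustration based on the description, not the copied source: parse_bool()
      is a hypothetical name and the accepted spellings are assumed to follow
      the kernel's first-character convention.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns 0 and stores the value in *res, or -EINVAL for NULL/unknown input. */
static int parse_bool(const char *s, bool *res)
{
	if (!s)
		return -EINVAL;		/* the NULL check mentioned above */

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		/* "on" / "off": the second character decides */
		switch (s[1]) {
		case 'n': case 'N':
			*res = true;
			return 0;
		case 'f': case 'F':
			*res = false;
			return 0;
		default:
			break;
		}
		break;
	default:
		break;
	}
	return -EINVAL;
}
```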
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/n/tip-mu8ghin4rklacmmubzwv8td7@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include: Adopt strstarts() from the kernel · 8e99b6d4
      Arnaldo Carvalho de Melo authored
      Replacing prefixcmp(), same purpose, inverted result, so standardize on
      the kernel variant, to reduce silly differences among tools/ and the
      kernel sources, making it easier for people to work in both codebases.
      
      And then doing:
      
      	if (strstarts(option, "no-"))
      
      Looks clearer than doing:
      
      	if (!prefixcmp(option, "no-"))
      
      To figure out if option starts with "no-".
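
      For comparison, a sketch of the two conventions side by side. The
      strstarts() body follows the usual strncmp() idiom; the prefixcmp() body
      is an assumed reconstruction of the strcmp()-like variant being replaced,
      shown only to make the inverted result visible:

```c
#include <stdbool.h>
#include <string.h>

/* Kernel-style helper: true when str begins with prefix. */
static inline bool strstarts(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Older convention (assumed): 0 on match, non-zero otherwise, like strcmp(). */
static inline int prefixcmp(const char *str, const char *prefix)
{
	for (; ; str++, prefix++) {
		if (!*prefix)
			return 0;	/* whole prefix consumed: match */
		if (*str != *prefix)
			return (unsigned char)*prefix - (unsigned char)*str;
	}
}
```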
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kaei42gi7lpa8subwtv7eug8@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Filter out 'sshd' in the tracer ancestry in syswide tracing · 082ab9a1
      Arnaldo Carvalho de Melo authored
      Avoiding a loop, so now it's quite convenient to ssh to a machine and
      then simply do:
      
      	# perf trace
      
      To trace all syscalls without causing a loop.
      
      This was possible using --filter-pids, i.e. once you noticed the loop,
      get the sshd pid and add it to --filter-pids, restarting the 'perf
      trace'.
      
      Now to figure out how to do that in an X terminal, the other common
      scenario, which is way more involved, as there are multiple processes
      communicating to process terminal activity...
      
      Using --filter-pids + '-e \!syscall,names,you,dont,need' may be a good
      approximation when having to do syswide tracing on your workstation.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-68rjeao9wnpylla41htk7xps@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Introduce filter_loop_pids() · dd1a5037
      Arnaldo Carvalho de Melo authored
      No change in functionality, just to make clearer that what we want when
      filtering the tracer pid in a system wide tracing session is to avoid a
      feedback loop.
      
      This also paves the way for a more interesting loop avoidance algorithm,
      one that tries to figure out if we are in an ssh session, xterm, etc.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-5fcttc5kdjkcyp9404ezkuy9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace beauty clone: Suppress unused args according to 'flags' arg · 15bed274
      Arnaldo Carvalho de Melo authored
      The 'parent_tidptr', 'child_tidptr' and 'tls' arguments to the 'clone'
      syscall are only used when certain flags are set in 'flags', suppress
      them when those aren't there.
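
      A rough sketch of that decision, assuming the flag-to-argument mapping
      that the sample output suggests. clone_args_to_show() and the SHOW_*
      values are hypothetical names; the actual beautifier may represent the
      suppression differently:

```c
#include <linux/sched.h>	/* CLONE_* flag definitions */

/* Hypothetical bit set naming the optional clone() arguments worth printing. */
enum {
	SHOW_PARENT_TIDPTR = 1 << 0,
	SHOW_CHILD_TIDPTR  = 1 << 1,
	SHOW_TLS           = 1 << 2,
};

/* Pick the optional arguments that the given clone flags actually use. */
static unsigned int clone_args_to_show(unsigned long flags)
{
	unsigned int show = 0;

	if (flags & CLONE_PARENT_SETTID)
		show |= SHOW_PARENT_TIDPTR;
	if (flags & (CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID))
		show |= SHOW_CHILD_TIDPTR;
	if (flags & CLONE_SETTLS)
		show |= SHOW_TLS;
	return show;
}
```

      With flags CHILD_CLEARTID|CHILD_SETTID|0x11, only child_tidptr remains,
      matching the fetchmail lines in the example output.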
      
      E.g:
      
         9886.919 (0.236 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19608 (fetchmail)
        12876.052 (0.249 ms): qemu-system-x8/21238 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f48117fc770, parent_tidptr: 0x7f48117ff9d0, child_tidptr: 0x7f48117ff9d0, tls: 0x7f48117ff700) = 19611 (qemu-system-x86)
        12876.555 (0.048 ms): worker/19611 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f480f7f8770, parent_tidptr: 0x7f480f7fb9d0, child_tidptr: 0x7f480f7fb9d0, tls: 0x7f480f7fb700) = 19612 (worker)
        16575.240 (0.469 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19613 (fetchmail)
        20797.270 (0.335 ms): fetchmail/19298 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fe43f468590) = 19614 (fetchmail)
        21228.585 (0.501 ms): vim/19519 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fbad6ac27d0) = 19615 (vim)
        21232.193 (0.137 ms): bash/19615 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, child_tidptr: 0x7fad8bff49d0) = 19616 (bash)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-0um93djul9knf239gwa5mpcb@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace beauty clone: Beautify syscall arguments · 33396a3a
      Arnaldo Carvalho de Melo authored
      Now, syswide tracing, selected entries:
      
        # trace -e clone
        24417.203 ( 0.158 ms): bash/11323 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11325 (bash)
                ? (     ?   ): bash/11325  ... [continued]: clone()) = 0
        24419.355 ( 0.093 ms): bash/10586 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11326 (bash)
                ? (     ?   ): bash/11326  ... [continued]: clone()) = 0
        24419.744 ( 0.102 ms): bash/11326 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11327 (bash)
                ? (     ?   ): bash/11327  ... [continued]: clone()) = 0
        24420.138 ( 0.105 ms): bash/11327 clone(flags: CHILD_CLEARTID|CHILD_SETTID|0x11, child_stack: 0, parent_tidptr: 0, child_tidptr: 0x7f0778e5c9d0, tls: 0x7f0778e5c700) = 11328 (bash)
                ? (     ?   ): bash/11328  ... [continued]: clone()) = 0
        35747.722 ( 0.044 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff0755f6ff0, parent_tidptr: 0x7ff0755f79d0, child_tidptr: 0x7ff0755f79d0, tls: 0x7ff0755f7700) = 11329 (gpg-agent)
                ? (     ?   ): gpg-agent/11329  ... [continued]: clone()) = 0
        35748.359 ( 0.022 ms): gpg-agent/18087 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ff075df7ff0, parent_tidptr: 0x7ff075df89d0, child_tidptr: 0x7ff075df89d0, tls: 0x7ff075df8700) = 11330 (gpg-agent)
                ? (     ?   ): gpg-agent/11330  ... [continued]: clone()) = 0
        35781.422 ( 0.452 ms): NetworkManager/1112 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7f2f1fffedb0, parent_tidptr: 0x7f2f1ffff9d0, child_tidptr: 0x7f2f1ffff9d0, tls: 0x7f2f1ffff700) = 11331 (NetworkManager)
                ? (     ?   ): NetworkManager/11331  ... [continued]: clone()) = 0
      
      Need to improve the formatting of the second return, to the child, this
      cset only focused on the argument formatting.
      
      If we trace just one pid:
      
        # trace -e clone -p 19863
           0.349 ( 0.025 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb84eaac70, parent_tidptr: 0x7ffb84eab9d0, child_tidptr: 0x7ffb84eab9d0, tls: 0x7ffb84eab700) = 11637 (Chrome_IOThread)
           0.392 ( 0.013 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb664b8c70, parent_tidptr: 0x7ffb664b99d0, child_tidptr: 0x7ffb664b99d0, tls: 0x7ffb664b9700) = 11638 (Chrome_IOThread)
           0.573 ( 0.015 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb6046cc70, parent_tidptr: 0x7ffb6046d9d0, child_tidptr: 0x7ffb6046d9d0, tls: 0x7ffb6046d700) = 11639 (Chrome_IOThread)
           0.617 ( 0.014 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb730dcc70, parent_tidptr: 0x7ffb730dd9d0, child_tidptr: 0x7ffb730dd9d0, tls: 0x7ffb730dd700) = 11640 (Chrome_IOThread)
           4.350 ( 0.065 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb720d9c70, parent_tidptr: 0x7ffb720da9d0, child_tidptr: 0x7ffb720da9d0, tls: 0x7ffb720da700) = 11642 (Chrome_IOThread)
           5.642 ( 0.079 ms): Chrome_IOThrea/19863 clone(flags: VM|FS|FILES|SIGHAND|THREAD|SYSVSEM|SETTLS|PARENT_SETTID|CHILD_CLEARTID, child_stack: 0x7ffb718d8c70, parent_tidptr: 0x7ffb718d99d0, child_tidptr: 0x7ffb718d99d0, tls: 0x7ffb718d9700) = 11643 (Chrome_IOThread)
      ^C#
      
      We'll also have to fix the argument ordering in different arches,
      probably having multiple syscall_fmt entries with each possible order
      and then use perf_evsel__env_arch() (if dealing with a perf.data file)
      or the current system info, for live sessions.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-am068uyubgj83snepolwhbfe@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools include uapi: Grab a copy of linux/sched.h · 450c86c9
      Arnaldo Carvalho de Melo authored
      So that we make sure we have recent enough defines for things
      such as 'perf trace' system call argument beautifiers.
      
      For instance, the 'clone' syscall argument 'flag' needs to use
      CLONE_NEWCGROUP, and that is not available in RHEL7.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-81sln0ng4a2lcxrth14vcov4@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Allow specifying names to syscall arguments formatters · c51bdfec
      Arnaldo Carvalho de Melo authored
      For tracepointless syscalls, like clone, otherwise get them from the
      tracepoint's /format file.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-ml5qvv1w5k96ghwhxpzzsmm3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Allow specifying number of syscall args for tracepointless syscalls · 332337da
      Arnaldo Carvalho de Melo authored
      When we don't have syscalls:sys_{enter,exit}_NAME, we had to resort to
      dumping all the 6 syscall arguments, fix it by providing that info for
      such syscalls, like 'clone'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-dfq1jtrxj8dqvqoeqqpr3slu@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Ditch __syscall__arg_val() variant, not needed anymore · 325f5091
      Arnaldo Carvalho de Melo authored
      All callers now can use syscall__arg_val(arg, idx), be it to iterate
      thru the syscall arguments while taking into account alignment, or to
      get values for other arguments that affect how the current argument
      should be formatted (think of fcntl's 'cmd' and 'arg' arguments).
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-wm5b156d8kro1r4y3b33eyta@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Use the syscall_fmt formatters without a tracepoint · d032d79e
      Arnaldo Carvalho de Melo authored
      Previously we only used the syscall_fmt when we had sc->tp_format set,
      i.e. when we found the (enter, exit) pair in tracefs/events/syscalls/.
      
      But we really only need to use what is in sc->arg_fmt to apply the arg
      beautifiers to the syscall argument values, so do it.
      
      With this we will be able to provide formatters to the "clone" syscall,
      which doesn't have entries in tracefs/events/syscalls/.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-y41nl41jrayjo5ucnde2peix@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint · 5e58fcfa
      Arnaldo Carvalho de Melo authored
      At least "clone" doesn't have (enter, exit) entries in tracefs/events/syscalls/,
      but we can provide a syscall_fmt and use it instead, as will be done for
      "clone" in the next cset.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-o12kejgcxddyovn2hlg4gbim@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace beauty mmap: Ignore 'fd' and 'offset' args for MAP_ANONYMOUS · d57da8c9
      Arnaldo Carvalho de Melo authored
      Just suppress them, not used by the kernel.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-atpt07y2x9a8ttlwja94ow3j@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Add missing ' = ' in the default formatting of syscall returns · 6f8fe61e
      Arnaldo Carvalho de Melo authored
      We lost it recently, put it back.
      
      Before:
      
        789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) 4328
      
      After:
      
        789.499 ( 0.001 ms): libvirtd/1175 lseek(fd: 22, whence: CUR) = 4328
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: 1f63139c ("perf trace beauty: Simplify syscall return formatting")
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Always set no branch for dummy event · 91a8c5b8
      Kan Liang authored
      An earlier kernel patch allowed enabling PT and LBR at the same time on
      Goldmont.
      
      commit ccbebba4 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity
      if the core supports it")
      
      However, users still cannot use Intel PT and LBRs simultaneously:
      
        $ sudo perf record -e cycles,intel_pt//u -b -- sleep 1
        Error: PMU Hardware doesn't support sampling/overflow-interrupts.
      
      The perf tool implicitly adds a dummy event for PT. The dummy event is a
      software event, which doesn't support LBR.
      
      Always set no branch for the dummy event in Intel PT.
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170630141656.1626-2-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      91a8c5b8
    • Kan Liang's avatar
      perf intel-pt: Set no_aux_samples for the tracking event · 69d8bd8a
      Kan Liang authored
      The tracking event (a dummy software event) was introduced to collect
      side-band information, so additional sampling on it is wasteful.
      Set no_aux_samples for the tracking event.
      
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170630141656.1626-1-kan.liang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      69d8bd8a
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-4.13-20170718' of... · 510457ec
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.13-20170718' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - Initial support for namespaces, using setns to access files in
        namespaces, grabbing their build-ids, etc. We still need to work
        more to deal with namespaces that vanish before we can get the
        needed data to do analysis, but this should be as good as what is
        in bcc now (Krister Johansen)
      
      - Add header record types to pipe-mode, now this command:
      
        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header
      
        Will show the same as in non-pipe mode, i.e. involving a perf.data
        file (David Carrillo-Cisneros)
      
      - Implement a visual marker for fused x86 instructions in the annotate
        TUI browser, available now in 'perf report', more work needed to have
        it available as well in 'perf top' (Jin Yao)
      
        Further explanation from one of Jin's patches:
      
             │   ┌──cmpl   $0x0,argp_program_version_hook
       81.93 │   ├──je     20
             │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
             │   │↓ jne    29
             │   │↓ jmp    43
       11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
      
        That means the cmpl+je is a fused instruction pair and they should be
        considered together.
      
      - Record the branch type and then show statistics and info about it
        in callchain entries (Jin Yao)
      
        Example from one of Jin's patches:
      
        # perf record -g -j any,save_type
        # perf report --branch-history --stdio --no-children
      
        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)
      
      - Beautify the fcntl syscall, which is an interesting one in the sense
        that infrastructure had to be put in place to change the formatters of
        some arguments according to the value in a previous one, i.e. cmd
        dictates how arg and the syscall return will be formatted.
        (Arnaldo Carvalho de Melo)
      
      Infrastructure changes:
      
      - 'perf test attr' fixes (Jiri Olsa)
      
      Vendor events changes:
      
      - Add POWER9 PMU events (Sukadev Bhattiprolu)
      
      - Support additional POWER8+ PVR in PMU mapfile (Shriya)
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      510457ec
    • Alexander Shishkin's avatar
      perf/core: Fix scheduling regression of pinned groups · 3bda69c1
      Alexander Shishkin authored
      Vince Weaver reported:
      
      > I was tracking down some regressions in my perf_event_test testsuite.
      > Some of the tests broke in the 4.11-rc1 timeframe.
      >
      > I've bisected one of them, this report is about
      >	tests/overflow/simul_oneshot_group_overflow
      > This test creates an event group containing two sampling events, set
      > to overflow to a signal handler (which disables and then refreshes the
      > event).
      >
      > On a good kernel you get the following:
      > 	Event perf::instructions with period 1000000
      > 	Event perf::instructions with period 2000000
      > 		fd 3 overflows: 946 (perf::instructions/1000000)
      > 		fd 4 overflows: 473 (perf::instructions/2000000)
      > 	Ending counts:
      > 		Count 0: 946379875
      > 		Count 1: 946365218
      >
      > With the broken kernels you get:
      > 	Event perf::instructions with period 1000000
      > 	Event perf::instructions with period 2000000
      > 		fd 3 overflows: 938 (perf::instructions/1000000)
      > 		fd 4 overflows: 318 (perf::instructions/2000000)
      > 	Ending counts:
      > 		Count 0: 946373080
      > 		Count 1: 653373058
      
      The root cause of the bug is that the following commit:
      
        487f05e1 ("perf/core: Optimize event rescheduling on active contexts")
      
      erroneously assumed that an event's 'pinned' setting determines whether the
      event belongs to a pinned group or not, but in fact, it's the group
      leader's pinned state that matters.
      
      This was discovered by Vince in the test case described above, where two instruction
      counters are grouped, the group leader is pinned, but the other event is not;
      in the regressed case the counters were off by 33% (the difference between events'
      periods), but should be the same within the error margin.
      
      Fix the problem by looking at the group leader's pinning.
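      
      The distinction can be sketched as follows. This is an illustrative
      model, not the kernel code: struct sketch_event, enum sketch_event_type
      and sketch_group_type() are hypothetical names, and the only assumption
      carried over from the description above is that a group leader points
      to itself and that the leader's pinned state decides for every member.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification used when rescheduling events. */
enum sketch_event_type { SKETCH_FLEXIBLE, SKETCH_PINNED };

/* Hypothetical, simplified stand-in for a perf event. */
struct sketch_event {
	bool pinned;                       /* this event's own 'pinned' attribute */
	struct sketch_event *group_leader; /* a leader points to itself */
};

/*
 * Classify by group, not by individual event: if the group leader is
 * pinned, every member of the group is treated as pinned, regardless
 * of the member's own 'pinned' attribute.
 */
static enum sketch_event_type sketch_group_type(const struct sketch_event *event)
{
	const struct sketch_event *leader = event->group_leader;

	return leader->pinned ? SKETCH_PINNED : SKETCH_FLEXIBLE;
}
```

      In the reported test case, the sibling has pinned unset while its
      leader has it set; classifying by the sibling's own attribute would
      put it in the flexible set, whereas the leader-based check keeps the
      whole group together.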
      
      Reported-by: Vince Weaver <vincent.weaver@maine.edu>
      Tested-by: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 487f05e1 ("perf/core: Optimize event rescheduling on active contexts")
      Link: http://lkml.kernel.org/r/87lgnmvw7h.fsf@ashishki-desk.ger.corp.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      3bda69c1