1. 27 Mar, 2017 9 commits
    • Milian Wolff's avatar
      perf report: Enable sorting by srcline as key · 5dfa210e
      Milian Wolff authored
      Often it is interesting to know how costly a given source line is in
      total. Previously, one had to build these sums manually based on all
      addresses that pointed to the same source line. This patch introduces
      srcline as a sort key, which will do the aggregation for us.
      
      Paired with the recent addition of showing inline frames, this makes
      perf report much more useful for many C++ work loads.
      
      The following shows the new feature in action. First, let's show the
      status quo output when we sort by address. The result contains many hist
      entries that generate the same output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g address
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--60.31%--hypot +20
                  |          |          |
                  |          |          |--8.52%--__hypot_finite +273
                  |          |          |
                  |          |          |--7.32%--__hypot_finite +411
      ...
                   --35.34%--_start +4194346
                             __libc_start_main +241
                             |
                             |--6.65%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--2.70%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
                             |
                             |--1.69%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      
      With this patch and `-g srcline` we instead get the following output:
      
        ~~~~~~~~~~~~~~~~
        $ perf report --stdio --inline -g srcline
        # Children      Self  Command       Shared Object        Symbol
        # ........  ........  ............  ...................  .........................................
        #
            99.89%    35.34%  cpp-inlining  cpp-inlining         [.] main
                  |
                  |--64.55%--main complex:655
                  |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                  |          /usr/include/c++/6.3.1/complex:664 (inline)
                  |          |
                  |          |--64.02%--hypot
                  |          |          |
                  |          |           --59.81%--__hypot_finite
                  |          |
                  |           --0.53%--cabs
                  |
                   --35.34%--_start
                             __libc_start_main
                             |
                             |--12.48%--main random.tcc:3326
                             |          /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
                             |          /usr/include/c++/6.3.1/bits/random.h:185 (inline)
        ...
        ~~~~~~~~~~~~~~~~
      Signed-off-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Yao Jin <yao.jin@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5dfa210e
    • Jin Yao's avatar
      perf report: Show inline stack for browser mode · 0d3eb0b7
      Jin Yao authored
      If the address belongs to an inlined function, the source information
      back to the first non-inlined function will be printed.
      
      For example:
      
      1. Show inlined function name
         perf report -g function --inline
      
      -    0.69%     0.00%  inline   ld-2.23.so           [.] dl_main
         - dl_main
              0.56% _dl_relocate_object
               _dl_relocate_object (inline)
               elf_dynamic_do_Rela (inline)
      
      2. Show the file/line information
         perf report -g address --inline
      
      -    0.69%     0.00%  inline   ld-2.23.so           [.] _dl_start
           _dl_start rtld.c:307
            /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
         + _dl_sysdep_start dl-sysdep.c:250
      Signed-off-by: default avatarYao Jin <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-6-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0d3eb0b7
    • Jin Yao's avatar
      perf report: Show inline stack for stdio mode · 0db64dd0
      Jin Yao authored
      If the address belongs to an inlined function, the source information
      back to the first non-inlined function will be printed.
      
      For example:
      
      1. Show inlined function name
         perf report --stdio -g function --inline
      
           0.69%     0.00%  inline   ld-2.23.so           [.] dl_main
                  |
                  ---dl_main
                     |
                      --0.56%--_dl_relocate_object
                                _dl_relocate_object (inline)
                                elf_dynamic_do_Rela (inline)
      
      2. Show the file/line information
         perf report --stdio -g address --inline
      
           0.69%     0.00%  inline   ld-2.23.so           [.] _dl_start_user
                  |
                  ---_dl_start_user .:0
                     _dl_start rtld.c:307
                     /build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
                     _dl_sysdep_start dl-sysdep.c:250
                     |
                      --0.56%--dl_main rtld.c:2076
      
      Committer tests:
      
        # perf record --call-graph dwarf ~/bin/perf stat usleep 1
      
       Performance counter stats for 'usleep 1':
      
                0.443020      task-clock (msec)         #    0.449 CPUs utilized
                       1      context-switches          #    0.002 M/sec
                       0      cpu-migrations            #    0.000 K/sec
                      52      page-faults               #    0.117 M/sec
               1,049,423      cycles                    #    2.369 GHz
                 801,456      instructions              #    0.76  insn per cycle
                 155,609      branches                  #  351.246 M/sec
                   7,026      branch-misses             #    4.52% of all branches
      
             0.000987570 seconds time elapsed
      
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.553 MB perf.data (66 samples) ]
        # perf report --stdio --inline fs__get_mountpoint
        <SNIP>
           1.73%     0.00%  perf     perf           [.] fs__get_mountpoint
                  |
                  ---fs__get_mountpoint
                     fs__get_mountpoint (inline)
                     fs__check_mounts (inline)
                     __statfs
                     entry_SYSCALL_64
                     sys_statfs
                     SYSC_statfs
                     user_statfs
                     user_path_at_empty
                     filename_lookup
                     path_lookupat
                     link_path_walk
                     inode_permission
                     __inode_permission
                     kernfs_iop_permission
                     kernfs_refresh_inode
                     security_inode_notifysecctx
                     selinux_inode_notifysecctx
                     selinux_inode_setsecurity
                     security_context_to_sid
                     security_context_to_sid_core
                     string_to_context_struct
                     symcmp
      Signed-off-by: default avatarYao Jin <yao.jin@linux.intel.com>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-5-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0db64dd0
    • Jin Yao's avatar
      perf report: Introduce --inline option · f3a60646
      Jin Yao authored
      It takes some time to look for inline stack for callgraph addresses.  So
      it provides new option "--inline" to let user decide if enable this
      feature.
      
        --inline:
      
        If a callgraph address belongs to an inlined function, the inline stack
        will be printed. Each entry is the inline function name or file/line.
      Signed-off-by: default avatarYao Jin <yao.jin@linux.intel.com>
      Tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f3a60646
    • Jin Yao's avatar
      perf report: Find the inline stack for a given address · a64489c5
      Jin Yao authored
      It would be useful for perf to support a mode to query the inline stack
      for a given callgraph address. This would simplify finding the right
      code in code that does a lot of inlining.
      
      The srcline.c has contained the code which supports to translate the
      address to filename:line_nr. This patch just extends the function to let
      it support getting the inline stacks.
      
      It introduces the inline_list which will store the inline function
      result (filename:line_nr and funcname).
      
      If BFD lib is not supported, the result is only filename:line_nr.
      Signed-off-by: default avatarYao Jin <yao.jin@linux.intel.com>
      Tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-3-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      a64489c5
    • Jin Yao's avatar
      perf report: Refactor common code in srcline.c · 5580338d
      Jin Yao authored
      Introduce dso__name() and filename_split() out of existing code because
      these codes will be used in several places in next patch.
      
      For filename_split(), it may also solve a potential memory leak in
      existing code. In existing addr2line(),
      
              sep = strchr(filename, ':');
              if (sep) {
                      *sep++ = '\0';
                      *file = filename;
                      *line_nr = strtoul(sep, NULL, 0);
                      ret = 1;
              }
      
      out:
              pclose(fp);
              return ret;
      
      If sep is NULL, filename is not freed or returned via file.
      Signed-off-by: default avatarYao Jin <yao.jin@linux.intel.com>
      Tested-by: default avatarMilian Wolff <milian.wolff@kdab.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Link: http://lkml.kernel.org/r/1490474069-15823-2-git-send-email-yao.jin@linux.intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      5580338d
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Remove unused 'prefix' from builtin functions · b0ad8ea6
      Arnaldo Carvalho de Melo authored
      We got it from the git sources but never used it for anything, with the
      place where this would be somehow used remaining:
      
        static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
        {
      	prefix = NULL;
      	if (p->option & RUN_SETUP)
      		prefix = NULL; /* setup_perf_directory(); */
      
      Ditch it.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      b0ad8ea6
    • Ravi Bangoria's avatar
      perf list sdt: Show option in man page · 6963d3c3
      Ravi Bangoria authored
      Commit 40218dae ("perf list: Show SDT and pre-cached events") added
      sdt support in perf list, but it missed to update documentation.
      
      Show sdt option in man perf-list.
      Signed-off-by: default avatarRavi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Link: http://lkml.kernel.org/r/20170327025538.1753-1-ravi.bangoria@linux.vnet.ibm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6963d3c3
    • Adrian Hunter's avatar
      perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms() · c3a0bbc7
      Adrian Hunter authored
      Address filtering with kernel symbols incorrectly resulted in the error
      "Cannot determine size of symbol" because the no_size logic was the wrong
      way around.
      Signed-off-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Tested-by: default avatarAndi Kleen <ak@linux.intel.com>
      Cc: stable@vger.kernel.org # v4.9+
      Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c3a0bbc7
  2. 24 Mar, 2017 4 commits
  3. 23 Mar, 2017 7 commits
  4. 21 Mar, 2017 14 commits
  5. 20 Mar, 2017 4 commits
  6. 17 Mar, 2017 2 commits