1. 13 Feb, 2015 5 commits
    • perf trace: Handle multiple threads better wrt syscalls being intermixed · e596663e
      Arnaldo Carvalho de Melo authored
       $ trace time taskset -c 0 usleep 1
         0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
         0.865 ( 0.008 ms): time/16723 execve(arg0: 140733595272004, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = -2
         2.395 ( 1.523 ms): taskset/16723 execve(arg0: 140733595272013, arg1: 140733595272720, arg2: 140733595272768, arg3: 139755107218496, arg4: 7307199665339051828, arg5: 3) = 0
         2.411 ( 0.002 ms): taskset/16723 brk(                                                                  ) = 0x1915000
         3.300 ( 0.058 ms): usleep/16723 nanosleep(rqtp: 0x7ffff4ada190                                        ) = 0
       <SNIP>
         3.305 ( 0.000 ms): usleep/16723 exit_group(
         3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
         3.366 ( 0.001 ms): time/16722 rt_sigaction(sig: INT, act: 0x7fff17f44160, oact: 0x7fff17f44200, sigsetsize: 8) = 0
      
      We were not seeing this line:
      
        0.845 ( 0.021 ms): time/16722 wait4(upid: 4294967295, stat_addr: 0x7fff17f443d4, ru: 0x7fff17f44438 ) ...
      
      just the one when it finishes:
      
        3.363 ( 2.539 ms): time/16722  ... [continued]: wait4()) = 16723
      
      Still some issues left till we move to ordered_samples when multiple
      CPUs/threads are involved...
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-zq9x30a1ky3djqewqn2v3ja3@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Print thread info when following children · 42052bea
      Arnaldo Carvalho de Melo authored
      The default for 'trace workload' is to set perf_event_attr.inherit to 1,
      i.e. to make it equivalent to 'strace -f workload', so we were ending
      up with syscalls from multiple processes mixed together. Fix it by
      printing the thread info (comm/tid) on each line:
      
      Before:
      
        [root@ssdandy ~]# trace -e brk time usleep 1
           0.071 ( 0.002 ms): brk(              ) = 0x100e000
           0.802 ( 0.001 ms): brk(              ) = 0x1d99000
           1.132 ( 0.003 ms): brk(              ) = 0x1d99000
           1.136 ( 0.003 ms): brk(brk: 0x1dba000) = 0x1dba000
           1.140 ( 0.001 ms): brk(              ) = 0x1dba000
        0.00user 0.00system 0:00.00elapsed 63%CPU (0avgtext+0avgdata 528maxresident)k
        0inputs+0outputs (0major+181minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      After:
      
        [root@ssdandy ~]# trace -f -e brk time usleep 1
           0.072 ( 0.002 ms): time/26308 brk(               ) = 0x1e6e000
           0.860 ( 0.001 ms): usleep/26309 brk(             ) = 0xb91000
           1.193 ( 0.003 ms): usleep/26309 brk(             ) = 0xb91000
           1.197 ( 0.003 ms): usleep/26309 brk(brk: 0xbb2000) = 0xbb2000
           1.201 ( 0.001 ms): usleep/26309 brk(             ) = 0xbb2000
        0.00user 0.00system 0:00.00elapsed 0%CPU (0avgtext+0avgdata 524maxresident)k
        0inputs+0outputs (0major+180minor)pagefaults 0swaps
        [root@ssdandy ~]#
      
      BTW: to achieve the 'strace workload' behaviour, i.e. without an explicit
      '-f', one has to use --no-inherit.
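
The before/after output above differs only in the comm/tid prefix. A minimal sketch of that idea follows; `format_trace_line` and its parameters are hypothetical names chosen for illustration, and the format string only approximates the output shown here. It is not the code from 'perf trace'.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from perf: build one trace line and
 * prepend "comm/tid" only when more than one thread may show up in the
 * output (e.g. when children are followed).
 */
static int format_trace_line(char *buf, size_t len, double ts_ms, double dur_ms,
			     const char *comm, int tid, bool show_thread,
			     const char *body)
{
	if (show_thread)
		return snprintf(buf, len, "%.3f (%6.3f ms): %s/%d %s",
				ts_ms, dur_ms, comm, tid, body);

	return snprintf(buf, len, "%.3f (%6.3f ms): %s", ts_ms, dur_ms, body);
}
```

With `show_thread` set, each line identifies which process issued the syscall, which is what makes the interleaved 'time' and 'usleep' calls distinguishable.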
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-2wu2d5n65msxoq1i7vtcaft2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Place the header text in its right position · 619a303c
      Yunlong Song authored
      The header text 'List of pre-defined events (to be used in -e):' is
      printed from the wrong function, which produces abnormal output: e.g.
      'perf list hw' shows no header text at all, and 'perf list hw
      L1-dcache*' shows the header text in the middle of the output.
      
      Example
      Before this patch:
      
       $ perf list hw L1-dcache*
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
       List of pre-defined events (to be used in -e):              <-- incorrect position
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      
      After this patch:
      
       $ perf list hw L1-dcache*
      
       List of pre-defined events (to be used in -e):              <-- correct position
      
         branch-instructions OR branches                    [Hardware event]
         branch-misses                                      [Hardware event]
         bus-cycles                                         [Hardware event]
         cache-misses                                       [Hardware event]
         cache-references                                   [Hardware event]
         cpu-cycles OR cycles                               [Hardware event]
         instructions                                       [Hardware event]
         stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
         stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      
         L1-dcache-load-misses                              [Hardware cache event]
         L1-dcache-loads                                    [Hardware cache event]
         L1-dcache-prefetch-misses                          [Hardware cache event]
         L1-dcache-prefetches                               [Hardware cache event]
         L1-dcache-store-misses                             [Hardware cache event]
         L1-dcache-stores                                   [Hardware cache event]
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-8-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf: Remove the extra validity check on nr_pages · 74390aa5
      Kaixu Xia authored
      The function is_power_of_2() also does the check on nr_pages, so the
      first check performed is unnecessary. On the other hand, the key point
      is to ensure @nr_pages is a power-of-two number, and @nr_pages is
      mostly a nonzero value, so in most cases the function is_power_of_2()
      will be called anyway.
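
To make the redundancy concrete, here is a small sketch. It assumes the removed check was a plain zero test, which the message implies but does not show; `nr_pages_valid_before` and `nr_pages_valid_after` are hypothetical names, and `is_power_of_2()` is reproduced in userspace form with the same logic as the kernel helper.

```c
#include <stdbool.h>

/*
 * Userspace copy of the logic of the kernel's is_power_of_2():
 * zero is rejected up front, so it is never reported as a power of two.
 */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical "before" shape: explicit zero test, then the helper. */
static bool nr_pages_valid_before(unsigned long nr_pages)
{
	if (nr_pages == 0)
		return false;

	return is_power_of_2(nr_pages);
}

/* Hypothetical "after" shape: the helper alone gives the same answer. */
static bool nr_pages_valid_after(unsigned long nr_pages)
{
	return is_power_of_2(nr_pages);
}
```

Both variants agree for every input, including zero, so dropping the explicit test changes no behaviour under this assumption.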
      Signed-off-by: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Link: http://lkml.kernel.org/r/1422352512-75150-1-git-send-email-xiakaixu@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix a bug of segmentation fault · 3a03005f
      Yunlong Song authored
      Fix a segmentation fault in 'perf list --list-cmds', which also
      happens with other commands (e.g. record, report ...). The bug is
      triggered when there are no cmds to list at all.
      
      Example:
      
      Before this patch:
      
        $ perf list --list-cmds
        Segmentation fault
        $
      
      After this patch:

        $ perf list --list-cmds
        $
      
      As shown above, the command now prints nothing instead of crashing
      with a segmentation fault. The empty output means 'perf list' has no
      cmds to display at this time.
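
The general pattern behind this kind of crash can be sketched as below. This is an assumption about the shape of the bug, not the actual perf code: `list_cmds` is a hypothetical function that walks a NULL-terminated array of subcommand names and treats a missing array as an empty list.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical sketch: print each subcommand name and return how many
 * were printed. A NULL array means "no cmds to list"; without the guard,
 * the loop would dereference a NULL pointer and crash.
 */
static int list_cmds(const char *const *cmds, FILE *out)
{
	int n = 0;

	if (cmds == NULL)
		return 0;

	for (; cmds[n] != NULL; n++)
		fprintf(out, "%s\n", cmds[n]);

	return n;
}
```

Called with no subcommand array, the function prints nothing and returns 0, matching the empty output shown after the patch.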
      Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1423833115-11199-5-git-send-email-yunlong.song@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 12 Feb, 2015 35 commits