    perf diff: Report noisy for cycles diff · cebf7d51
    Jin Yao authored
    This patch prints the stddev and a histogram for the cycles diff of each
    program block. This helps show whether the cycles measurement is noisy.
    
    This patch is inspired by Andi Kleen's patch:
    
      https://lwn.net/Articles/600471/
    
    We add a new option, '--cycles-hist'.
    
    Example:
    
      perf record -b ./div
      perf record -b ./div
      perf diff -c cycles
    
      # Baseline                                [Program Block Range] Cycles Diff  Shared Object      Symbol
      # ........  .......................................................... ....  .................  ............................
      #
          46.72%                                      [div.c:40 -> div.c:40]    0  div                [.] main
          46.72%                                      [div.c:42 -> div.c:44]    0  div                [.] main
          46.72%                                      [div.c:42 -> div.c:39]    0  div                [.] main
          20.54%                          [random_r.c:357 -> random_r.c:394]    1  libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:357 -> random_r.c:380]    0  libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:388 -> random_r.c:388]    0  libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:388 -> random_r.c:391]    0  libc-2.27.so       [.] __random_r
          17.04%                              [random.c:288 -> random.c:291]    0  libc-2.27.so       [.] __random
          17.04%                              [random.c:291 -> random.c:291]    0  libc-2.27.so       [.] __random
          17.04%                              [random.c:293 -> random.c:293]    0  libc-2.27.so       [.] __random
          17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
          17.04%                              [random.c:295 -> random.c:295]    0  libc-2.27.so       [.] __random
          17.04%                              [random.c:298 -> random.c:298]    0  libc-2.27.so       [.] __random
           8.40%                                      [div.c:22 -> div.c:25]    0  div                [.] compute_flag
           8.40%                                      [div.c:27 -> div.c:28]    0  div                [.] compute_flag
           5.14%                                    [rand.c:26 -> rand.c:27]    0  libc-2.27.so       [.] rand
           5.14%                                    [rand.c:28 -> rand.c:28]    0  libc-2.27.so       [.] rand
           2.15%                                  [rand@plt+0 -> rand@plt+0]    0  div                [.] rand@plt
           0.00%                                                                   [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
           0.00%                                [do_mmap+714 -> do_mmap+732]  -10  [kernel.kallsyms]  [k] do_mmap
           0.00%                                [do_mmap+737 -> do_mmap+765]    1  [kernel.kallsyms]  [k] do_mmap
           0.00%                                [do_mmap+262 -> do_mmap+299]    0  [kernel.kallsyms]  [k] do_mmap
           0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7  [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
           0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  [kernel.kallsyms]  [k] native_sched_clock
           0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  [kernel.kallsyms]  [k] native_write_msr
    
    With the option '--cycles-hist' enabled, the output is:
    
      perf diff -c cycles --cycles-hist
    
      # Baseline                                [Program Block Range] Cycles Diff        stddev/Hist  Shared Object      Symbol
      # ........  .......................................................... ....  .................  .................  ............................
      #
          46.72%                                      [div.c:40 -> div.c:40]    0  ± 37.8% ▁█▁▁██▁█   div                [.] main
          46.72%                                      [div.c:42 -> div.c:44]    0  ± 49.4% ▁▁▂█▂▂▂▂   div                [.] main
          46.72%                                      [div.c:42 -> div.c:39]    0  ± 24.1% ▃█▂▄▁▃▂▁   div                [.] main
          20.54%                          [random_r.c:357 -> random_r.c:394]    1  ± 33.5% ▅▂▁█▃▁▂▁   libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:357 -> random_r.c:380]    0  ± 39.4% ▁▁█▁██▅▁   libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:388 -> random_r.c:388]    0                     libc-2.27.so       [.] __random_r
          20.54%                          [random_r.c:388 -> random_r.c:391]    0  ± 41.2% ▁▃▁▂█▄▃▁   libc-2.27.so       [.] __random_r
          17.04%                              [random.c:288 -> random.c:291]    0  ± 48.8% ▁▁▁▁███▁   libc-2.27.so       [.] __random
          17.04%                              [random.c:291 -> random.c:291]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
          17.04%                              [random.c:293 -> random.c:293]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
          17.04%                              [random.c:295 -> random.c:295]    0  ±100.0% ▁█▁▁▁▁▁▁   libc-2.27.so       [.] __random
          17.04%                              [random.c:295 -> random.c:295]    0                     libc-2.27.so       [.] __random
          17.04%                              [random.c:298 -> random.c:298]    0  ± 75.6% ▃█▁▁▁▁▁▁   libc-2.27.so       [.] __random
           8.40%                                      [div.c:22 -> div.c:25]    0  ± 42.1% ▁▃▁▁███▁   div                [.] compute_flag
           8.40%                                      [div.c:27 -> div.c:28]    0  ± 41.8% ██▁▁▄▁▁▄   div                [.] compute_flag
           5.14%                                    [rand.c:26 -> rand.c:27]    0  ± 37.8% ▁▁▁████▁   libc-2.27.so       [.] rand
           5.14%                                    [rand.c:28 -> rand.c:28]    0                     libc-2.27.so       [.] rand
           2.15%                                  [rand@plt+0 -> rand@plt+0]    0                     div                [.] rand@plt
           0.00%                                                                                      [kernel.kallsyms]  [k] __x86_indirect_thunk_rax
           0.00%                                [do_mmap+714 -> do_mmap+732]  -10                     [kernel.kallsyms]  [k] do_mmap
           0.00%                                [do_mmap+737 -> do_mmap+765]    1                     [kernel.kallsyms]  [k] do_mmap
           0.00%                                [do_mmap+262 -> do_mmap+299]    0                     [kernel.kallsyms]  [k] do_mmap
           0.00%  [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0]    7                     [kernel.kallsyms]  [k] __x86_indirect_thunk_r15
           0.00%            [native_sched_clock+0 -> native_sched_clock+119]   -1  ± 38.5% ▄█▁        [kernel.kallsyms]  [k] native_sched_clock
           0.00%                 [native_write_msr+0 -> native_write_msr+16]  -13  ± 47.1% ▁█▇▃▁▁     [kernel.kallsyms]  [k] native_write_msr
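
    The Hist column above draws each sample as one of eight block glyphs.
    A minimal sketch of that kind of mapping follows. It is not the perf
    implementation: the names spark_level() and spark_render() are
    hypothetical, and the scaling rule (linear between the smallest and
    largest sample) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPARK_LEVELS 8

/* The eight block glyphs seen in the Hist column, lowest to highest. */
static const char *const spark_glyphs[SPARK_LEVELS] = {
	"▁", "▂", "▃", "▄", "▅", "▆", "▇", "█"
};

/*
 * Hypothetical helper: map a value in [min, max] to a level in 0..7.
 * A flat series (max == min) maps to level 0. The multiplication can
 * overflow for very large ranges; this sketch does not guard against it.
 */
static int spark_level(int64_t val, int64_t min, int64_t max)
{
	if (max == min)
		return 0;
	return (int)(((val - min) * (SPARK_LEVELS - 1)) / (max - min));
}

/*
 * Hypothetical helper: write a UTF-8 sparkline for vals[0..n) into buf.
 * Returns the number of bytes written, not counting the trailing NUL.
 * Output is truncated at a glyph boundary if buf is too small.
 */
static size_t spark_render(const int64_t *vals, size_t n, char *buf, size_t size)
{
	int64_t min, max;
	size_t i, len = 0;

	if (size == 0)
		return 0;
	buf[0] = '\0';
	if (n == 0)
		return 0;

	/* Find the range of the samples so they can be scaled linearly. */
	min = max = vals[0];
	for (i = 1; i < n; i++) {
		if (vals[i] < min)
			min = vals[i];
		if (vals[i] > max)
			max = vals[i];
	}

	for (i = 0; i < n; i++) {
		const char *g = spark_glyphs[spark_level(vals[i], min, max)];
		size_t glen = strlen(g);

		if (len + glen + 1 > size)
			break;
		memcpy(buf + len, g, glen);
		len += glen;
	}
	buf[len] = '\0';
	return len;
}
```

    Each glyph takes three bytes in UTF-8, so a buffer for eight samples
    needs at least 25 bytes including the terminator.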
    
     v8:
     ---
     Rebase to perf/core branch
    
     v7:
     ---
     1. v6 got Jiri's ACK.
     2. Rebase to latest perf/core branch.
    
     v6:
     ---
     1. Jiri provides better code for using data__hpp_register() in ui_init().
        Use this code in v6.
    
     v5:
     ---
     1. Refine the use of data__hpp_register() in ui_init() according to
        Jiri's suggestion.
    
     v4:
     ---
     1. Rename the new option from '--noisy' to '--cycles-hist'
     2. Remove the option '-n'.
     3. Only update the spark value and stats when '--cycles-hist' is enabled.
     4. Remove the code of printing '..'.
    
     v3:
     ---
     1. Move the histogram to a separate column
     2. Move the svals[] out of struct stats
    
     v2:
     ---
     Jiri got a compile error:
    
      CC       builtin-diff.o
      builtin-diff.c: In function ‘compute_cycles_diff’:
      builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value]
      712 |          labs(pair->block_info->cycles_spark[i] -
          |          ^~~~
    
     This happens because the result of u64 - u64 is still u64, so labs()
     has no effect on it. The type of cycles_spark[] is now s64.
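
     The difference between the unsigned and signed cases can be sketched
     as follows. The helper names abs_cycles_diff() and wrapped_diff() are
     hypothetical and only illustrate the arithmetic; they are not taken
     from builtin-diff.c. The typedefs stand in for the kernel's u64/s64.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t u64;
typedef int64_t s64;

/*
 * Hypothetical helper: with signed operands the subtraction can go
 * negative, so taking the absolute value is meaningful.
 */
static s64 abs_cycles_diff(s64 a, s64 b)
{
	return llabs(a - b);
}

/*
 * What the unsigned version computes: the subtraction wraps modulo 2^64
 * and is never negative, which is why the compiler warns that taking
 * its absolute value has no effect.
 */
static u64 wrapped_diff(u64 a, u64 b)
{
	return a - b;
}
```

     For example, abs_cycles_diff(3, 10) is 7, while wrapped_diff(3, 10)
     is 2^64 - 7.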

    Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Kan Liang <kan.liang@linux.intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lore.kernel.org/lkml/20190925011446.30678-1-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>