1. 12 Feb, 2024 2 commits
    • perf maps: Switch from rbtree to lazily sorted array for addresses · 659ad349
      Ian Rogers authored
      Maps is a collection of maps primarily sorted by the starting address
      of the map. Prior to this change the maps were held in an rbtree
      requiring 4 pointers per node. Prior to reference count checking, the
      rbnode was embedded in the map so 3 pointers per node were
      necessary. This change switches the rbtree to an array lazily sorted
      by address, much like the existing array that sorts nodes by name. 1
      pointer is needed per node, but to avoid excessive resizing the backing
      array may be twice the number of used elements, meaning the memory
      overhead is roughly half that of the rbtree. For a perf record of
      "true" with "--no-bpf-event -g -a", the memory overhead of perf inject
      is reduced from 3.3MB to 3MB, so 10% or 300KB is saved.
      
      Map inserts always happen at the end of the array. The code tracks
      whether the insertion violates the sorting property. O(log n) rb-tree
      complexity is switched to O(1).
      
      Remove slides the array, so O(log n) rb-tree complexity is degraded to
      O(n).
      
      A find may need to sort the array using qsort which is O(n*log n), but
      in general the maps should be sorted and so average performance should
      be O(log n) as with the rbtree.
      
      An rbtree node consumes a cache line, but with the array 4 nodes fit
      on a cache line. Iteration is simplified to scanning an array rather
      than pointer chasing.
      
      Overall it is expected the performance after the change should be
      comparable to before, but with half of the memory consumed.
      
      To avoid a list and repeated logic around splitting maps,
      maps__merge_in is rewritten in terms of
      maps__fixup_overlap_and_insert. maps__merge_in splits the given mapping
      inserting remaining gaps. maps__fixup_overlap_and_insert splits the
      existing mappings, then adds the incoming mapping. By adding the new
      mapping first, then re-inserting the existing mappings the splitting
      behavior matches.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Song Liu <song@kernel.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Artem Savkov <asavkov@redhat.com>
      Cc: bpf@vger.kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240210031746.4057262-2-irogers@google.com
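The trade-offs described in this commit message (O(1) append, O(n) remove, sort only when a find needs it) can be sketched roughly as follows. This is a minimal Python illustration, not the perf C code: the `Map` and `LazySortedMaps` names and their fields are hypothetical, and the maps are assumed not to overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Map:
    start: int
    end: int  # exclusive end address
    name: str = ""


@dataclass
class LazySortedMaps:
    """Hypothetical container for illustration; not perf's 'struct maps'."""
    items: list = field(default_factory=list)
    sorted_by_address: bool = True

    def insert(self, m):
        # Always append at the end (amortised O(1)); only record whether
        # the append broke the address ordering.
        if self.items and m.start < self.items[-1].start:
            self.sorted_by_address = False
        self.items.append(m)

    def remove(self, m):
        # Removing slides the tail of the array down: O(n), order preserved.
        self.items.remove(m)

    def find(self, addr):
        # Sort lazily, only if an earlier insert violated the ordering.
        if not self.sorted_by_address:
            self.items.sort(key=lambda m: m.start)
            self.sorted_by_address = True
        # Binary search over non-overlapping [start, end) ranges: O(log n).
        lo, hi = 0, len(self.items)
        while lo < hi:
            mid = (lo + hi) // 2
            m = self.items[mid]
            if addr < m.start:
                hi = mid
            elif addr >= m.end:
                lo = mid + 1
            else:
                return m
        return None
```

If inserts mostly arrive in address order, the flag stays set and a find never pays for the sort, which is the "in general the maps should be sorted" case the message relies on.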
    • Merge branch 'perf-tools' into perf-tools-next · 39d14c0d
      Namhyung Kim authored
      To get some fixes in the perf test and JSON metrics into the development
      branch.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
  2. 10 Feb, 2024 1 commit
  3. 09 Feb, 2024 7 commits
    • perf stat: Support per-cluster aggregation · cbc917a1
      Yicong Yang authored
      Some platforms have 'cluster' topology and CPUs in the cluster will
      share resources like L3 Cache Tag (for HiSilicon Kunpeng SoC) or L2
      cache (for Intel Jacobsville). Parsing and building the cluster
      topology has been supported since [1].
      
      perf stat already supports aggregation for other topologies like
      die or socket. It would be useful to aggregate per-cluster to find
      problems like L3T bandwidth contention.
      
      This patch adds support for the "--per-cluster" option for per-cluster
      aggregation. It also updates the docs and the related test. The output
      will be like:
      
      [root@localhost tmp]# perf stat -a -e LLC-load --per-cluster -- sleep 5
      
       Performance counter stats for 'system wide':
      
      S56-D0-CLS158    4      1,321,521,570      LLC-load
      S56-D0-CLS594    4        794,211,453      LLC-load
      S56-D0-CLS1030    4             41,623      LLC-load
      S56-D0-CLS1466    4             41,646      LLC-load
      S56-D0-CLS1902    4             16,863      LLC-load
      S56-D0-CLS2338    4             15,721      LLC-load
      S56-D0-CLS2774    4             22,671      LLC-load
      [...]
      
      On a legacy system without clusters or cluster support, the output will
      look like:
      [root@localhost perf]# perf stat -a -e cycles --per-cluster -- sleep 1
      
       Performance counter stats for 'system wide':
      
      S56-D0-CLS0   64         18,011,485      cycles
      S7182-D0-CLS0   64         16,548,835      cycles
      
      Note that this patch doesn't mix the cluster information in the outputs
      of --per-core to avoid breaking any tools/scripts using it.
      
      Note that perf recently gained "--per-cache" aggregation, but it's not
      the same as per-cluster aggregation, although cluster CPUs may share some cache
      resources. For example on my machine all clusters within a die share the
      same L3 cache:
      $ cat /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_list
      0-31
      $ cat /sys/devices/system/cpu/cpu0/topology/cluster_cpus_list
      0-3
      
      [1] commit c5e22fef ("topology: Represent clusters of CPUs within a die")
      Tested-by: Jie Zhan <zhanjie9@hisilicon.com>
      Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
      Cc: james.clark@arm.com
      Cc: 21cnbao@gmail.com
      Cc: prime.zeng@hisilicon.com
      Cc: Jonathan.Cameron@huawei.com
      Cc: fanghao11@huawei.com
      Cc: linuxarm@huawei.com
      Cc: tim.c.chen@intel.com
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240208024026.2691-1-yangyicong@huawei.com
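To make the idea of per-cluster aggregation concrete, here is a small Python sketch that expands a sysfs CPU list such as the `cluster_cpus_list` value shown above and sums per-CPU counts by cluster. It is an illustration only: `parse_cpu_list`, `aggregate_per_cluster` and the shape of their inputs are hypothetical and do not reflect how perf stat builds its aggregation ids internally.

```python
from collections import defaultdict


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def aggregate_per_cluster(per_cpu_counts, cluster_of_cpu):
    """Sum per-CPU counter values by cluster.

    per_cpu_counts: {cpu: count}, cluster_of_cpu: {cpu: cluster id}
    (both hypothetical inputs). Returns {cluster id: (ncpus, total)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for cpu, count in per_cpu_counts.items():
        entry = totals[cluster_of_cpu[cpu]]
        entry[0] += 1       # number of CPUs contributing to this cluster
        entry[1] += count   # aggregated counter value
    return {cluster: tuple(v) for cluster, v in totals.items()}
```

With the values from the message, `parse_cpu_list("0-3")` covers the 4 CPUs of one cluster while `parse_cpu_list("0-31")` covers the 32 CPUs sharing the L3 cache, which is why the two aggregation modes differ on that machine.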
    • perf tools: Remove misleading comments on map functions · 9a440bb2
      Namhyung Kim authored
      Where the code converts a sample IP to or from an objdump-capable one,
      there's a comment saying that kernel modules have DSO_SPACE__USER.  But
      commit 02213cec ("perf maps: Mark module DSOs with kernel type") changed
      that, which makes the comment confusing.  Let's get rid of it.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Link: https://lore.kernel.org/r/20240208181025.1329645-1-namhyung@kernel.org
    • perf thread_map: Free strlist on normal path in thread_map__new_by_tid_str() · 1eb3d924
      Yang Jihong authored
      slist needs to be freed in both error path and normal path in
      thread_map__new_by_tid_str().
      
      Fixes: b52956c9 ("perf tools: Allow multiple threads or processes in record, stat, top")
      Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206083228.172607-6-yangjihong1@huawei.com
    • perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() · bd2cdf26
      Yang Jihong authored
      The curr_pid and cpu_last_switched are used only for the
      'perf sched replay/latency/map'. Put their initialization in
      perf_sched__{lat|map|replay}() to reduce unnecessary actions in other
      commands.
      
      Simple functional testing:
      
        # perf sched record perf bench sched messaging
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 0.209 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 16.456 MB perf.data (147907 samples) ]
      
        # perf sched lat
      
         -------------------------------------------------------------------------------------------------------------------------------------------
          Task                  |   Runtime ms  | Switches | Avg delay ms    | Max delay ms    | Max delay start           | Max delay end          |
         -------------------------------------------------------------------------------------------------------------------------------------------
          sched-messaging:(401) |   2990.699 ms |    38705 | avg:   0.661 ms | max:  67.046 ms | max start: 456532.624830 s | max end: 456532.691876 s
          qemu-system-x86:(7)   |    179.764 ms |     2191 | avg:   0.152 ms | max:  21.857 ms | max start: 456532.576434 s | max end: 456532.598291 s
          sshd:48125            |      0.522 ms |        2 | avg:   0.037 ms | max:   0.046 ms | max start: 456532.514610 s | max end: 456532.514656 s
        <SNIP>
          ksoftirqd/11:82       |      0.063 ms |        1 | avg:   0.005 ms | max:   0.005 ms | max start: 456532.769366 s | max end: 456532.769371 s
          kworker/9:0-mm_:34624 |      0.233 ms |       20 | avg:   0.004 ms | max:   0.007 ms | max start: 456532.690804 s | max end: 456532.690812 s
          migration/13:93       |      0.000 ms |        1 | avg:   0.004 ms | max:   0.004 ms | max start: 456532.512669 s | max end: 456532.512674 s
         -----------------------------------------------------------------------------------------------------------------
          TOTAL:                |   3180.750 ms |    41368 |
         ---------------------------------------------------
      
        # echo $?
        0
      
        # perf sched map
          *A0                                                               456532.510141 secs A0 => migration/0:15
          *.                                                                456532.510171 secs .  => swapper:0
           .  *B0                                                           456532.510261 secs B0 => migration/1:21
           .  *.                                                            456532.510279 secs
        <SNIP>
           L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .   .    456532.785979 secs
           L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .    456532.786054 secs
           L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .    456532.786127 secs
           L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .    456532.786197 secs
           L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7   456532.786270 secs
        # echo $?
        0
      
        # perf sched replay
        run measurement overhead: 108 nsecs
        sleep measurement overhead: 66473 nsecs
        the run test took 1000002 nsecs
        the sleep test took 1082686 nsecs
        nr_run_events:        49334
        nr_sleep_events:      50054
        nr_wakeup_events:     34701
        target-less wakeups:  165
        multi-target wakeups: 766
        task      0 (             swapper:         0), nr_events: 15419
        task      1 (             swapper:         1), nr_events: 1
        task      2 (             swapper:         2), nr_events: 1
        <SNIP>
        task    715 (     sched-messaging:    110248), nr_events: 1438
        task    716 (     sched-messaging:    110249), nr_events: 512
        task    717 (     sched-messaging:    110250), nr_events: 500
        task    718 (     sched-messaging:    110251), nr_events: 537
        task    719 (     sched-messaging:    110252), nr_events: 823
        ------------------------------------------------------------
        #1  : 1325.288, ravg: 1325.29, cpu: 7823.35 / 7823.35
        #2  : 1363.606, ravg: 1329.12, cpu: 7655.53 / 7806.56
        #3  : 1349.494, ravg: 1331.16, cpu: 7544.80 / 7780.39
        #4  : 1311.488, ravg: 1329.19, cpu: 7495.13 / 7751.86
        #5  : 1309.902, ravg: 1327.26, cpu: 7266.65 / 7703.34
        #6  : 1309.535, ravg: 1325.49, cpu: 7843.86 / 7717.39
        #7  : 1316.482, ravg: 1324.59, cpu: 7854.41 / 7731.09
        #8  : 1366.604, ravg: 1328.79, cpu: 7955.81 / 7753.57
        #9  : 1326.286, ravg: 1328.54, cpu: 7466.86 / 7724.90
        #10 : 1356.653, ravg: 1331.35, cpu: 7566.60 / 7709.07
        # echo $?
        0
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206083228.172607-5-yangjihong1@huawei.com
    • perf sched: Move curr_thread initialization to perf_sched__map() · 5e895278
      Yang Jihong authored
      The curr_thread is used only for 'perf sched map'. Put its initialization
      in perf_sched__map() to reduce unnecessary actions in other commands.
      
      Simple functional testing:
      
        # perf sched record perf bench sched messaging
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 0.197 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 15.526 MB perf.data (140095 samples) ]
      
        # perf sched map
          *A0                                                               451264.532445 secs A0 => migration/0:15
          *.                                                                451264.532468 secs .  => swapper:0
           .  *B0                                                           451264.532537 secs B0 => migration/1:21
           .  *.                                                            451264.532560 secs
           .   .  *C0                                                       451264.532644 secs C0 => migration/2:27
           .   .  *.                                                        451264.532668 secs
           .   .   .  *D0                                                   451264.532753 secs D0 => migration/3:33
           .   .   .  *.                                                    451264.532778 secs
           .   .   .   .  *E0                                               451264.532861 secs E0 => migration/4:39
           .   .   .   .  *.                                                451264.532886 secs
           .   .   .   .   .  *F0                                           451264.532973 secs F0 => migration/5:45
        <SNIP>
           A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .   .    451264.790785 secs
           A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .    451264.790858 secs
           A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .    451264.790934 secs
           A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .    451264.791004 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .    451264.791075 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .    451264.791143 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .    451264.791232 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .    451264.791336 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .    451264.791407 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .    451264.791484 secs
           A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7   451264.791553 secs
        # echo $?
        0
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206083228.172607-4-yangjihong1@huawei.com
    • perf sched: Fix memory leak in perf_sched__map() · ef76a5af
      Yang Jihong authored
      perf_sched__map() needs to free the memory of map_cpus, color_pids and
      color_cpus on the normal path and roll back allocated memory on the
      error path.
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206083228.172607-3-yangjihong1@huawei.com
    • perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() · c6907863
      Yang Jihong authored
      The start_work_mutex and work_done_wait_mutex are used only for the
      'perf sched replay'. Put their initialization in perf_sched__replay() to
      reduce unnecessary actions in other commands.
      
      Simple functional testing:
      
        # perf sched record perf bench sched messaging
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 0.197 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 14.952 MB perf.data (134165 samples) ]
      
        # perf sched replay
        run measurement overhead: 108 nsecs
        sleep measurement overhead: 65658 nsecs
        the run test took 999991 nsecs
        the sleep test took 1079324 nsecs
        nr_run_events:        42378
        nr_sleep_events:      43102
        nr_wakeup_events:     31852
        target-less wakeups:  17
        multi-target wakeups: 712
        task      0 (             swapper:         0), nr_events: 10451
        task      1 (             swapper:         1), nr_events: 3
        task      2 (             swapper:         2), nr_events: 1
        <SNIP>
        task    717 (     sched-messaging:     74483), nr_events: 152
        task    718 (     sched-messaging:     74484), nr_events: 1944
        task    719 (     sched-messaging:     74485), nr_events: 73
        task    720 (     sched-messaging:     74486), nr_events: 163
        task    721 (     sched-messaging:     74487), nr_events: 942
        task    722 (     sched-messaging:     74488), nr_events: 78
        task    723 (     sched-messaging:     74489), nr_events: 1090
        ------------------------------------------------------------
        #1  : 1366.507, ravg: 1366.51, cpu: 7682.70 / 7682.70
        #2  : 1410.072, ravg: 1370.86, cpu: 7723.88 / 7686.82
        #3  : 1396.296, ravg: 1373.41, cpu: 7568.20 / 7674.96
        #4  : 1381.019, ravg: 1374.17, cpu: 7531.81 / 7660.64
        #5  : 1393.826, ravg: 1376.13, cpu: 7725.25 / 7667.11
        #6  : 1401.581, ravg: 1378.68, cpu: 7594.82 / 7659.88
        #7  : 1381.337, ravg: 1378.94, cpu: 7371.22 / 7631.01
        #8  : 1373.842, ravg: 1378.43, cpu: 7894.92 / 7657.40
        #9  : 1364.697, ravg: 1377.06, cpu: 7324.91 / 7624.15
        #10 : 1363.613, ravg: 1375.72, cpu: 7209.55 / 7582.69
        # echo $?
        0
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206083228.172607-2-yangjihong1@huawei.com
  4. 08 Feb, 2024 3 commits
    • perf test: Skip metric w/o event name on arm64 in stat STD output linter · 5f70c6c5
      Yicong Yang authored
      stat+std_output.sh test fails on my arm64 machine:
      [root@localhost shell]# ./stat+std_output.sh
      Checking STD output: no args Unknown event name in TopDownL1                 #     0.18 retiring
      [root@localhost shell]# ./stat+std_output.sh
      Checking STD output: no args [Success]
      Checking STD output: system wide [Success]
      Checking STD output: interval [Success]
      Checking STD output: per thread Unknown event name in tmux: server-1114960                                                   #     0.41 frontend_bound
      
      When no args are specified, `perf stat` will add the TopdownL1 metric group
      and the output will be like:
      [root@localhost shell]# perf stat -- stress-ng --vm 1 --timeout 1
      stress-ng: info:  [3351733] setting to a 1 second run per stressor
      stress-ng: info:  [3351733] dispatching hogs: 1 vm
      stress-ng: info:  [3351733] successful run completed in 1.02s
      
       Performance counter stats for 'stress-ng --vm 1 --timeout 1':
      
                1,037.71 msec task-clock                       #    1.000 CPUs utilized
                      13      context-switches                 #   12.528 /sec
                       1      cpu-migrations                   #    0.964 /sec
                  67,544      page-faults                      #   65.090 K/sec
           2,691,932,561      cycles                           #    2.594 GHz                         (74.56%)
           6,571,333,653      instructions                     #    2.44  insn per cycle              (74.92%)
             521,863,142      branches                         #  502.901 M/sec                       (75.21%)
                 425,879      branch-misses                    #    0.08% of all branches             (87.57%)
                              TopDownL1                 #     0.61 retiring                    (87.67%)
                                                        #     0.03 frontend_bound              (87.67%)
                                                        #     0.02 bad_speculation             (87.67%)
                                                        #     0.34 backend_bound               (74.61%)
      
             1.038138390 seconds time elapsed
      
             0.844849000 seconds user
             0.189053000 seconds sys
      
      Metrics in the TopDownL1 group don't have an event name on arm64, but
      they are missing from the $skip_metric list where they should be
      listed. Add them to the skip list, as was done for x86 platforms in [1].
      
      [1] commit 4d60e83d ("perf test: Skip metrics w/o event name in stat STD output linter")
      Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: linuxarm@huawei.com
      Cc: kan.liang@linux.intel.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240207091222.54096-1-yangyicong@huawei.com
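The kind of check that fails in the output above can be sketched in a few lines of Python. This is only an illustration of the skip-list idea: the real test is a shell script with its own event and skip lists, and the `lint_std_output` function, its parameters and the sample lists used here are hypothetical.

```python
def lint_std_output(output, event_names, skip_metrics):
    """Return output lines naming neither a known event nor a skipped metric."""
    failures = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Metric-only lines carry no event name, so they are skipped up front.
        if any(metric in line for metric in skip_metrics):
            continue
        if not any(event in line for event in event_names):
            failures.append(line)
    return failures
```

Called with an empty skip list on the TopDownL1 lines from the `perf stat` output above, this reports them as unknown; adding the metric names to the skip list makes the same input pass.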
    • perf symbols: Slightly improve module file executable section mappings · 94a830d7
      Adrian Hunter authored
      Currently perf does not record module section addresses except for
      the .text section. In general that means perf cannot get module section
      mappings correct (except for .text) when loading symbols from a kernel
      module file. (Note using --kcore does not have this issue)
      
      Improve that situation slightly by identifying executable sections that
      use the same mapping as the .text section. That happens when an
      executable section comes directly after the .text section, both in memory
      and on file, something that can be determined by following the same layout
      rules used by the kernel; refer to the kernel's layout_sections(). Note whether
      that happens is somewhat arbitrary, so this is not a final solution.
      
      Example from tracing a virtual machine process:
      
       Before:
      
        $ perf script | grep unknown
               CPU 0/KVM    1718   203.511270:     318341 cpu-cycles:P:  ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
        $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
        Map: 0-7e0 41430 [kvm_intel].noinstr.text
        Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
      
       After:
      
        $ perf script | grep 203.511270
               CPU 0/KVM    1718   203.511270:     318341 cpu-cycles:P:  ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
        $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
        Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
      Reported-by: Like Xu <like.xu.linux@gmail.com>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240208085326.13432-3-adrian.hunter@intel.com
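The numbers in the example can be checked by hand. The Python sketch below assumes that each "Map:" line reads `<start>-<end> <pgoff> <name>` in hex, and that translating an address through a map is `ip - start + pgoff`; the function names are hypothetical. Under those assumptions, the sampled IP maps to an offset that falls inside the range that .noinstr.text occupies, which is consistent with that section sharing the .text mapping.

```python
def parse_map_line(line):
    """Parse 'Map: <start>-<end> <pgoff> <name>' (numbers in hex).

    The field meaning is an assumption based on the output shown above.
    """
    _, addr_range, pgoff, name = line.split(None, 3)
    start, end = (int(x, 16) for x in addr_range.split("-"))
    return start, end, int(pgoff, 16), name


def ip_to_map_offset(ip, start, pgoff):
    # Address translation through a map: ip - start + pgoff.
    return ip - start + pgoff


text_map = ("Map: ffffffffc13a7000-ffffffffc1421000 a0 "
            "/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko")
noinstr_map = "Map: 0-7e0 41430 [kvm_intel].noinstr.text"

start, end, pgoff, _ = parse_map_line(text_map)
offset = ip_to_map_offset(0xffffffffc13e8a70, start, pgoff)

# The section map starts at 0, so its end is its size.
_, sec_size, sec_pgoff, _ = parse_map_line(noinstr_map)
in_noinstr = sec_pgoff <= offset < sec_pgoff + sec_size
print(hex(offset), in_noinstr)
```

Here the offset works out to 0x41b10, inside 0x41430..0x41c10, so the sample resolves through the main module map once .noinstr.text is recognised as part of it.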
    • perf tools: Make it possible to see perf's kernel and module memory mappings · 0bdfbd04
      Adrian Hunter authored
      Dump kmaps if using 'perf --debug kmaps' or verbose > 2 (e.g. -vvv) for
      tools 'perf script' and 'perf report' if there is no browser.
      
      Example:
      
        $ perf --debug kmaps script 2>&1 >/dev/null | grep kvm.intel
        build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
        Map: 0-3a3 4f5d8 [kvm_intel].modinfo
        Map: 0-5240 5f280 [kvm_intel]__versions
        Map: 0-30 64 [kvm_intel].note.Linux
        Map: 0-14 644c0 [kvm_intel].orc_header
        Map: 0-5297 43680 [kvm_intel].rodata
        Map: 0-5bee 3b837 [kvm_intel].text.unlikely
        Map: 0-7e0 41430 [kvm_intel].noinstr.text
        Map: 0-2080 713c0 [kvm_intel].bss
        Map: 0-26 705c8 [kvm_intel].data..read_mostly
        Map: 0-5888 6a4c0 [kvm_intel].data
        Map: 0-22 70220 [kvm_intel].data.once
        Map: 0-40 705f0 [kvm_intel].data..percpu
        Map: 0-1685 41d20 [kvm_intel].init.text
        Map: 0-4b8 6fd60 [kvm_intel].init.data
        Map: 0-380 70248 [kvm_intel]__dyndbg
        Map: 0-8 70218 [kvm_intel].exit.data
        Map: 0-438 4f980 [kvm_intel]__param
        Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
        Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
        Map: 0-e0 70640 [kvm_intel].data..ro_after_init
        Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
        Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
      
      The example above shows how the module section mappings are all wrong
      except for the main .text mapping at 0xffffffffc13a7000.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Like Xu <like.xu.linux@gmail.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240208085326.13432-2-adrian.hunter@intel.com
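A dump like the one above is easy to post-process. As a rough sketch, the Python below picks out the "Map:" lines and lists the maps whose start address is still 0, which is how the wrongly placed section mappings show up in this example. The regular expression and function names are hypothetical and assume the `Map: <start>-<end> <pgoff> <name>` layout seen in the output.

```python
import re

# Assumed layout of a dump line: "Map: <start>-<end> <pgoff> <name>", hex numbers.
MAP_LINE = re.compile(r"^Map: ([0-9a-f]+)-([0-9a-f]+) ([0-9a-f]+) (.+)$")


def parse_kmaps(text):
    """Return (start, end, pgoff, name) tuples for every 'Map:' line."""
    maps = []
    for line in text.splitlines():
        m = MAP_LINE.match(line.strip())
        if m:
            start, end, pgoff = (int(g, 16) for g in m.groups()[:3])
            maps.append((start, end, pgoff, m.group(4)))
    return maps


def maps_without_address(maps):
    """Names of maps that start at address 0, i.e. have no load address."""
    return [name for start, _end, _pgoff, name in maps if start == 0]
```

Fed the dump above, `maps_without_address` would list every `[kvm_intel]` section map and leave out the kvm-intel.ko map at 0xffffffffc13a7000.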
  5. 07 Feb, 2024 3 commits
    • perf record: Display data size on pipe mode · 5b9e4eef
      Namhyung Kim authored
      Currently pipe mode doesn't set the file size and it results in a
      misleading message of 0 data size at the end.  Although it might miss
      some accounting for pipe header or more, just displaying the data size
      would reduce the possible confusion.
      
      Before:
        $ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]    <======  (here)
            99.58%  perf     perf                  [.] noploop
      
      After:
        $ perf record -o- perf test -w noploop | perf report -i- -q --percent-limit=1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.229 MB - ]
            99.46%  perf     perf                  [.] noploop
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Link: https://lore.kernel.org/r/20240112231340.779469-1-namhyung@kernel.org
    • perf script: Print source line for each jump in brstackinsn · 112c5547
      Kan Liang authored
      With the srcline option, perf script only prints a source line at
      the beginning of a sample with call/ret from functions, but not for
      each jump in brstackinsn. It's useful to print a source line for each
      jump in brstackinsn when the end user analyzes the full assembler
      sequences of branch sequences for the sample.
      
      The srccode option can also be used to locate the source code line.
      However, it's printed for almost every line and makes the output less
      readable.
      
       $ perf script -F +brstackinsn,+srcline --xed
      
      Before the patch,
      
       tchain_edit_deb 1463275 15228549.107820:     282495 instructions:u:            401133 f3+0xd (/home/kan/os.li>
        tchain_edit.c:22
              f3+40:  tchain_edit.c:20
              000000000040114e                        jle 0x401133                    # PRED 6 cycles [6]
              0000000000401133                        movl  -0x4(%rbp), %eax
              0000000000401136                        and $0x1, %eax
              0000000000401139                        test %eax, %eax
              000000000040113b                        jz 0x401143
              000000000040113d                        addl  $0x1, -0x4(%rbp)
              0000000000401141                        jmp 0x401147                    # PRED 3 cycles [9] 2.00 IPC
              0000000000401147                        cmpl  $0x3e7, -0x4(%rbp)
              000000000040114e                        jle 0x401133                    # PRED 6 cycles [15] 0.33 IPC
      
      After the patch,
      
       tchain_edit_deb 1463275 15228549.107820:     282495 instructions:u:            401133 f3+0xd (/home/kan/os.li>
        tchain_edit.c:22
              f3+40:  tchain_edit.c:20
              000000000040114e                        jle 0x401133                     srcline: tchain_edit.c:20      # PRED 6 cycles [6]
              0000000000401133                        movl  -0x4(%rbp), %eax
              0000000000401136                        and $0x1, %eax
              0000000000401139                        test %eax, %eax
              000000000040113b                        jz 0x401143
              000000000040113d                        addl  $0x1, -0x4(%rbp)
              0000000000401141                        jmp 0x401147                     srcline: tchain_edit.c:23      # PRED 3 cycles [9] 2.00 IPC
              0000000000401147                        cmpl  $0x3e7, -0x4(%rbp)
              000000000040114e                        jle 0x401133                     srcline: tchain_edit.c:20      # PRED 6 cycles [15] 0.33 IPC
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: ahmad.yasin@intel.com
      Cc: amiri.khalil@intel.com
      Cc: ak@linux.intel.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240205145819.1943114-1-kan.liang@linux.intel.com
    • perf kvm powerpc: Fix build · 8ce5fa4d
      Ian Rogers authored
      Updates to struct parse_events_error needed to be carried through to
      PowerPC specific event parsing.
      
      Fixes: fd7b8e8f ("perf parse-events: Print all errors")
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung <namhyung@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240206235902.2917395-1-irogers@google.com
  6. 06 Feb, 2024 1 commit
  7. 03 Feb, 2024 2 commits
  8. 02 Feb, 2024 9 commits
  9. 31 Jan, 2024 1 commit
  10. 30 Jan, 2024 4 commits
    • Arnaldo Carvalho de Melo's avatar
      tools include UAPI: Sync linux/mount.h copy with the kernel sources · 1f8c43b0
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        35e27a57 ("fs: keep struct mnt_id_req extensible")
        b4c2bea8 ("add listmount(2) syscall")
        46eae99e ("add statmount(2) syscall")
      
      That doesn't change anything in tools this time as nothing that is
      harvested by the beauty scripts got changed:
      
        $ ls -1 tools/perf/trace/beauty/*mount*sh
        tools/perf/trace/beauty/fsmount.sh
        tools/perf/trace/beauty/mount_flags.sh
        tools/perf/trace/beauty/move_mount_flags.sh
        $
      
      This addresses this perf build warning:
      
        Warning: Kernel ABI header differences:
          diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h
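
      The warning above is what a stale header copy looks like from the build's
      point of view. As a rough model only (the real check is a shell script in
      the perf build and may tolerate some expected differences), the idea can
      be sketched as a byte-for-byte comparison of each tools copy against its
      kernel original. The helper names, file names and macro names below are
      hypothetical.

```python
import filecmp
import os
import tempfile


def stale_copies(pairs):
    """Return the (tools_copy, kernel_original) pairs whose contents differ."""
    return [(copy, orig) for copy, orig in pairs
            if not filecmp.cmp(copy, orig, shallow=False)]


def format_warning(stale):
    """Render the stale pairs in the style of the perf build warning."""
    lines = ["Warning: Kernel ABI header differences:"]
    lines += [f"  diff -u {copy} {orig}" for copy, orig in stale]
    return "\n".join(lines)


def write(path, text):
    with open(path, "w") as f:
        f.write(text)


# Demo with two throwaway files standing in for the tools copy and the
# kernel header (contents are made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    copy = os.path.join(tmp, "tools_mount.h")
    orig = os.path.join(tmp, "kernel_mount.h")
    kernel_text = "#define EXAMPLE_OLD 1\n#define EXAMPLE_NEW 2\n"
    write(copy, "#define EXAMPLE_OLD 1\n")
    write(orig, kernel_text)
    stale_before = stale_copies([(copy, orig)])  # copy lags behind
    write(copy, kernel_text)                     # "sync" the copy
    stale_after = stale_copies([(copy, orig)])   # nothing left to report
```

      Syncing the copy, as this commit does for mount.h, is what empties the
      list and silences the warning.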
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZbkMiB7ZcOsLP2V5@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • James Clark's avatar
      perf evlist: Fix evlist__new_default() for > 1 core PMU · 7814fe24
      James Clark authored
      The 'Session topology' test currently fails with this message when
      evlist__new_default() opens more than one event:
      
        32: Session topology                                                :
        --- start ---
        templ file: /tmp/perf-test-vv5YzZ
        Using CPUID 0x00000000410fd070
        Opening: unknown-hardware:HG
        ------------------------------------------------------------
        perf_event_attr:
          type                             0 (PERF_TYPE_HARDWARE)
          config                           0xb00000000
          disabled                         1
        ------------------------------------------------------------
        sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 4
        Opening: unknown-hardware:HG
        ------------------------------------------------------------
        perf_event_attr:
          type                             0 (PERF_TYPE_HARDWARE)
          config                           0xa00000000
          disabled                         1
        ------------------------------------------------------------
        sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 5
        non matching sample_type
        FAILED tests/topology.c:73 can't get session
        ---- end ----
        Session topology: FAILED!
      
      This is because when re-opening the file and parsing the header, Perf
      expects that any file that has more than one event has the sample ID
      flag set. Perf record already sets the flag in a similar way when there
      is more than one event, so add the same logic to evlist__new_default().
      
      evlist__new_default() is only currently used in tests, so I don't
      expect this change to have any other side effects. The other tests that
      use it don't save and re-open the file so don't hit this issue.
      
      The session topology test has been failing on Arm big.LITTLE platforms
      since commit 251aa040 ("perf parse-events: Wildcard most
      "numeric" events") when evlist__new_default() started opening multiple
      events for 'cycles'.
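
      The rule described above can be modelled in a few lines. This is a
      simplified sketch, not perf's C code: event lists are reduced to plain
      lists of sample_type bitmasks, the reader-side check is a stand-in for
      whatever produces "non matching sample_type", and both function names
      are hypothetical. The bit values are the ones defined in
      include/uapi/linux/perf_event.h.

```python
PERF_SAMPLE_IP = 1 << 0
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_ID = 1 << 6


def with_sample_id_if_needed(sample_types):
    """Writer side: if there is more than one event, set the sample ID bit
    on every event; a single event is left untouched."""
    if len(sample_types) > 1:
        return [st | PERF_SAMPLE_ID for st in sample_types]
    return list(sample_types)


def session_can_open(sample_types):
    """Reader side (simplified assumption): a file with more than one event
    is only accepted when every event carries the sample ID bit."""
    if len(sample_types) <= 1:
        return True
    return all(st & PERF_SAMPLE_ID for st in sample_types)


base = PERF_SAMPLE_IP | PERF_SAMPLE_TID

# One core PMU: a single 'cycles' event, no ID needed.
single = [base]
# Two core PMUs (for example big.LITTLE or an Intel hybrid system): two events.
hybrid_before_fix = [base, base]
hybrid_after_fix = with_sample_id_if_needed(hybrid_before_fix)
```

      In this model the two-event list is rejected before the fix and accepted
      after it, while the single-event case is unaffected either way.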
      
      Fixes: 251aa040 ("perf parse-events: Wildcard most "numeric" events")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      [ This was failing as well on a Raptor Lake Refresh/14700k Intel hybrid system - Arnaldo ]
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Ian Rogers <irogers@google.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Closes: https://lore.kernel.org/lkml/CAP-5=fWVQ-7ijjK3-w1q+k2WYVNHbAcejb-xY0ptbjRw476VKA@mail.gmail.com/
      Link: https://lore.kernel.org/r/20240124094358.489372-1-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench' · efe80f9c
      Arnaldo Carvalho de Melo authored
      This is to get the changes from:
      
        94ea9c05 ("x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>")
        10f4c9b9 ("x86/asm: Fix build of UML with KASAN")
      
      This addresses these perf tools build warnings:
      
        Warning: Kernel ABI header differences:
          diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
          diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Link: https://lore.kernel.org/lkml/ZbkIKpKdNqOFdMwJ@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools headers x86 cpufeatures: Sync with the kernel sources to pick TDX, Zen, APIC MSR fence changes · 15d6daad
      Arnaldo Carvalho de Melo authored
      To pick the changes from:
      
        1e536e10 ("x86/cpu: Detect TDX partial write machine check erratum")
        765a0542 ("x86/virt/tdx: Detect TDX during kernel boot")
        30fa9283 ("x86/CPU/AMD: Add ZenX generations flags")
        04c30245 ("x86/barrier: Do not serialize MSR accesses on AMD")
      
      This causes these perf files to be rebuilt and brings some X86_FEATURE
      that will be used when updating the copies of
      tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:
      
            CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
            CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
      
      And addresses this perf build warning:
      
        Warning: Kernel ABI header differences:
          diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kai Huang <kai.huang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 29 Jan, 2024 1 commit
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync unistd.h to pick {list,stat}mount, lsm_{[gs]et_self_attr,list_modules} syscall numbers · 21fdd8dd
      Arnaldo Carvalho de Melo authored
      To pick the changes in these csets:
      
        d8b0f546 ("wire up syscalls for statmount/listmount")
        5f423759 ("LSM: wireup Linux Security Module syscalls")
      
      Used in some architectures to create syscall tables.
      
      This addresses this perf build warning:
      
        Warning: Kernel ABI header differences:
          diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
      
      Cc: Casey Schaufler <casey@schaufler-ca.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Miklos Szeredi <mszeredi@redhat.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZbfMuAlUMRO9Hqa6@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 27 Jan, 2024 2 commits
    • Ian Rogers's avatar
      perf vendor events intel: Alderlake/sapphirerapids metric fixes · becc24e9
      Ian Rogers authored
      As events are deduplicated by name, ensure PMU prefixes are always
      used in metrics. Previously they may be missed on the first event in a
      formula.
      
      Update metric constraints for architectures with topdown l2 events.
      
      Conversion script updated in:
      https://github.com/intel/perfmon/pull/128

      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Closes: https://lore.kernel.org/lkml/ZZam-EG-UepcXtWw@kernel.org/
      Link: https://lore.kernel.org/r/20240104231903.775717-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools headers UAPI: Sync kvm headers with the kernel sources · e30dca91
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        a5d3df8a ("KVM: remove deprecated UAPIs")
        6d722835 ("KVM x86/xen: add an override for PVCLOCK_TSC_STABLE_BIT")
        89ea60c2 ("KVM: x86: Add support for "protected VMs" that can utilize private memory")
        8dd2eee9 ("KVM: x86/mmu: Handle page fault for private memory")
        a7800aa8 ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
        5a475554 ("KVM: Introduce per-page memory attributes")
        16f95f3b ("KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace")
        bb58b90b ("KVM: Introduce KVM_SET_USER_MEMORY_REGION2")
        3f9cd0ca ("KVM: arm64: Allow userspace to get the writable masks for feature ID registers")
      
      That automatically adds support for some new ioctls and removes a bunch
      of deprecated ones.
      
      As a result the new binary no longer knows about the deprecated ioctls,
      so when used on an older system it will not be able to resolve those
      codes to strings.
      
        $ tools/perf/trace/beauty/kvm_ioctl.sh > before
        $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
        $ tools/perf/trace/beauty/kvm_ioctl.sh > after
        $ diff -u before after
        --- before	2024-01-27 14:48:16.523014020 -0300
        +++ after	2024-01-27 14:48:24.183932866 -0300
        @@ -14,6 +14,7 @@
         	[0x46] = "SET_USER_MEMORY_REGION",
         	[0x47] = "SET_TSS_ADDR",
         	[0x48] = "SET_IDENTITY_MAP_ADDR",
        +	[0x49] = "SET_USER_MEMORY_REGION2",
         	[0x60] = "CREATE_IRQCHIP",
         	[0x61] = "IRQ_LINE",
         	[0x62] = "GET_IRQCHIP",
        @@ -22,14 +23,8 @@
         	[0x65] = "GET_PIT",
         	[0x66] = "SET_PIT",
         	[0x67] = "IRQ_LINE_STATUS",
        -	[0x69] = "ASSIGN_PCI_DEVICE",
         	[0x6a] = "SET_GSI_ROUTING",
        -	[0x70] = "ASSIGN_DEV_IRQ",
         	[0x71] = "REINJECT_CONTROL",
        -	[0x72] = "DEASSIGN_PCI_DEVICE",
        -	[0x73] = "ASSIGN_SET_MSIX_NR",
        -	[0x74] = "ASSIGN_SET_MSIX_ENTRY",
        -	[0x75] = "DEASSIGN_DEV_IRQ",
         	[0x76] = "IRQFD",
         	[0x77] = "CREATE_PIT2",
         	[0x78] = "SET_BOOT_CPU_ID",
        @@ -66,7 +61,6 @@
         	[0x9f] = "GET_VCPU_EVENTS",
         	[0xa0] = "SET_VCPU_EVENTS",
         	[0xa3] = "ENABLE_CAP",
        -	[0xa4] = "ASSIGN_SET_INTX_MASK",
         	[0xa5] = "SIGNAL_MSI",
         	[0xa6] = "GET_XCRS",
         	[0xa7] = "SET_XCRS",
        @@ -97,6 +91,8 @@
         	[0xcd] = "SET_SREGS2",
         	[0xce] = "GET_STATS_FD",
         	[0xd0] = "XEN_HVM_EVTCHN_SEND",
        +	[0xd2] = "SET_MEMORY_ATTRIBUTES",
        +	[0xd4] = "CREATE_GUEST_MEMFD",
         	[0xe0] = "CREATE_DEVICE",
         	[0xe1] = "SET_DEVICE_ATTR",
         	[0xe2] = "GET_DEVICE_ATTR",
        $
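
      The effect of the table change on name resolution can be sketched as
      follows. This is an illustrative model only: the two dictionaries hold a
      handful of entries copied from the diff above, and the fallback of
      printing the raw number in hex is an assumption made for the example,
      not a statement about how perf trace formats unknown codes.

```python
# A few entries as generated before the header sync (taken from the diff).
BEFORE = {
    0x46: "SET_USER_MEMORY_REGION",
    0x69: "ASSIGN_PCI_DEVICE",
    0x70: "ASSIGN_DEV_IRQ",
    0xa4: "ASSIGN_SET_INTX_MASK",
}

# The same slice of the table after the sync: deprecated entries are gone,
# new ioctls are present.
AFTER = {
    0x46: "SET_USER_MEMORY_REGION",
    0x49: "SET_USER_MEMORY_REGION2",
    0xd2: "SET_MEMORY_ATTRIBUTES",
    0xd4: "CREATE_GUEST_MEMFD",
}


def ioctl_name(table, nr):
    """Resolve an ioctl number to its name, or fall back to the raw hex
    value when the table has no entry (fallback format is hypothetical)."""
    return table.get(nr, hex(nr))


# A new binary tracing an older system that still issues a deprecated ioctl:
old_code_on_new_binary = ioctl_name(AFTER, 0x69)
# A new ioctl, which only the new binary can name:
new_code_on_new_binary = ioctl_name(AFTER, 0x49)
new_code_on_old_binary = ioctl_name(BEFORE, 0x49)
```

      The deprecated code 0x69 resolves with the old table but not with the
      new one, which is the trade-off the commit message describes.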
      
      This silences these perf build warnings:
      
        Warning: Kernel ABI header differences:
          diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
          diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Chao Peng <chao.p.peng@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jing Zhang <jingzhangos@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paul Durrant <pdurrant@amazon.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/lkml/ZbVLbkngp4oq13qN@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 26 Jan, 2024 4 commits