1. 12 Apr, 2024 25 commits
    • perf arch x86: Add shellcheck to build · ec440763
      Ian Rogers authored
      Add shellcheck for:
      
        tools/perf/arch/x86/tests/gen-insn-x86-dat.sh
        tools/perf/arch/x86/entry/syscalls/syscalltbl.sh
      
      Address a minor quoting issue.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Link: https://lore.kernel.org/r/20240409023216.2342032-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Add shellcheck to tools/perf scripts · 646e22eb
      Ian Rogers authored
      Address shell check errors/warnings in perf-archive.sh and
      perf-completion.sh.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Link: https://lore.kernel.org/r/20240409023216.2342032-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Escape '\r' in JSON output · 20b0027c
      Ian Rogers authored
      Events such as those for sapphirerapids have '\r' in the uncore
      descriptions. The non-escaped versions of these fail JSON validation in
      the 'perf list' test.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240410222353.1722840-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
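
      The escaping described in the '\r' commit above can be sketched as
      follows. This is only an illustration: json_escape() is a hypothetical
      helper, not the function used by 'perf list', and the set of characters
      it handles is an assumption based on what JSON strings disallow.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the perf implementation): copy src into dst,
 * escaping characters that may not appear raw inside a JSON string.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if dst is too small.
 */
static int json_escape(char *dst, size_t dst_len, const char *src)
{
	size_t out = 0;

	for (; *src; src++) {
		const char *esc = NULL;
		char buf[7];	/* room for "\uXXXX" plus NUL */

		switch (*src) {
		case '"':  esc = "\\\""; break;
		case '\\': esc = "\\\\"; break;
		case '\n': esc = "\\n";  break;
		case '\r': esc = "\\r";  break;	/* the case the commit adds */
		case '\t': esc = "\\t";  break;
		default:
			/* Any other control character becomes \u00XX. */
			if ((unsigned char)*src < 0x20) {
				snprintf(buf, sizeof(buf), "\\u%04x",
					 (unsigned char)*src);
				esc = buf;
			}
			break;
		}

		if (esc) {
			size_t n = strlen(esc);

			if (out + n >= dst_len)
				return -1;
			memcpy(dst + out, esc, n);
			out += n;
		} else {
			if (out + 1 >= dst_len)
				return -1;
			dst[out++] = *src;
		}
	}

	if (dst_len == 0)
		return -1;
	dst[out] = '\0';
	return (int)out;
}
```

      With this sketch, a description containing a raw carriage return comes
      out with the two characters '\' and 'r' in its place, which a JSON
      validator accepts.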
    • perf dsos: Switch more loops to dsos__for_each_dso() · 0ffc8fca
      Ian Rogers authored
      Switch loops within dsos.c, add a version that isn't locked. Switch
      some unlocked loops to hold the read lock.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240410064214.2755936-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dso: Move dso functions out of dsos.c · 1d6eff93
      Ian Rogers authored
      Move dso and dso_id functions to dso.c to match the struct declarations.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240410064214.2755936-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dsos: Introduce dsos__for_each_dso() · 73f3fea2
      Ian Rogers authored
      To better abstract the dsos internals, introduce dsos__for_each_dso that
      does a callback on each dso.
      
      This also means the read lock can be correctly held.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240410064214.2755936-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
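
      The callback-style iteration introduced by the dsos__for_each_dso()
      commit above might look roughly like the sketch below. All toy_* names
      and types are hypothetical stand-ins rather than the perf structures,
      and a plain reader counter is assumed in place of a real read/write
      lock so that the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the perf structures. */
struct toy_dso {
	const char *name;
};

struct toy_dsos {
	struct toy_dso *items;
	size_t nr;
	int readers;	/* stand-in for a read/write lock: active readers */
};

typedef int (*toy_dso_cb)(struct toy_dso *dso, void *data);

/*
 * Call cb on every dso while the "read lock" is held. A non-zero return
 * from cb stops the walk early and is passed back to the caller.
 */
static int toy_dsos__for_each_dso(struct toy_dsos *dsos, toy_dso_cb cb,
				  void *data)
{
	int err = 0;

	dsos->readers++;		/* take the read lock */
	for (size_t i = 0; i < dsos->nr; i++) {
		err = cb(&dsos->items[i], data);
		if (err)
			break;
	}
	dsos->readers--;		/* dropped on every exit path */
	return err;
}

/* Example callback: count the entries that were visited. */
static int toy_count_cb(struct toy_dso *dso, void *data)
{
	(void)dso;
	(*(size_t *)data)++;
	return 0;
}
```

      Because the lock is taken and released inside the iterator, callers
      never touch the container internals, which is the abstraction the
      commit message is after.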
    • perf dsos: Tidy reference counting and locking · f649ed80
      Ian Rogers authored
      Move more functionality into dsos.c generally from machine.c, renaming
      functions to match their new usage.
      
      The find function is made to always "get" before returning a dso.
      
      Reduce the scope of locks in vdso to match the locking paradigm.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240410064214.2755936-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf dsos: Attempt to better abstract DSOs internals · 83acca9f
      Ian Rogers authored
      Move functions from machine and build-id to dsos. Pass 'struct dsos'
      rather than internal state.
      
      Rename some functions to better represent which data structure they
      operate on.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Anne Macedo <retpolanne@posteo.net>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Chengen Du <chengen.du@canonical.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Markus Elfring <Markus.Elfring@web.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Sun Haiyong <sunhaiyong@loongson.cn>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Link: https://lore.kernel.org/r/20240410064214.2755936-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Fix debug message placement for test consumption · 792bc998
      Adrian Hunter authored
      evlist__config() might mess up the debug output consumed by test
      "Test per-thread recording" in "Miscellaneous Intel PT testing".
      
      Move it out from between the debug prints:
      
        "perf record opening and mmapping events" and
        "perf record done opening and mmapping events"
      
      Fixes: da406202 ("perf tools: Add debug messages and comments for testing")
      Closes: https://lore.kernel.org/linux-perf-users/ZhVfc5jYLarnGzKa@x1/
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240411075447.17306-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Skip DSOs not found · 873a8373
      Namhyung Kim authored
      In some data files, I see the following messages repeated.  It seems
      the DSOs are not present on the system and dso->binary_type is set to
      DSO_BINARY_TYPE__NOT_FOUND.  Let's skip them to avoid these messages.
      
        No output from objdump  --start-address=0x0000000000000000 --stop-address=0x00000000000000d4  -d --no-show-raw-insn       -C "$1"
        Error running objdump  --start-address=0x0000000000000000 --stop-address=0x0000000000000631  -d --no-show-raw-insn       -C "$1"
        ...
      
      Closes: https://lore.kernel.org/linux-perf-users/15e1a2847b8cebab4de57fc68e033086aa6980ce.camel@yandex.ru/
      Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240410185117.1987239-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Do not collect sample histogram unnecessarily · 6cdd977e
      Namhyung Kim authored
      The data type profiling alone doesn't need the sample histogram for
      functions.  It only needs the histogram for the types.
      
      Let's remove the condition in the report_callback to check if data type
      profiling is selected and make sure the annotation has the 'struct
      annotated_source' instantiated before calling symbol__disassemble().
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-8-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Add a menu item to annotate data type in TUI · 0bfbe661
      Namhyung Kim authored
      When the hist entry has the type info, it should be able to display the
      annotation browser for the type like in `perf annotate --data-type`.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-7-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Support event group display in TUI · 2b08f219
      Namhyung Kim authored
      Like in stdio, it should print all events in a group together.
      
      Committer notes:
      
      Collect it:
      
        root@number:~# perf record -a -e '{cpu_core/mem-loads,ldlat=30/P,cpu_core/mem-stores/P}'
        ^C[ perf record: Woken up 8 times to write data ]
        [ perf record: Captured and wrote 4.980 MB perf.data (55825 samples) ]
        root@number:~#
      
      Then do it in stdio:
      
        root@number:~# perf annotate --stdio --data-type
      
        Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples):
         event[0] = cpu_core/mem-loads,ldlat=30/P
         event[1] = cpu_core/mem-stores/P
        ============================================================================
                 Percent     offset       size  field
          100.00  100.00          0         40  union    {
          100.00  100.00          0         40      struct __pthread_mutex_s    __data {
           48.61   23.46          0          4          int     __lock;
            0.00    0.48          4          4          unsigned int    __count;
            6.38   41.32          8          4          int     __owner;
            8.74   34.02         12          4          unsigned int    __nusers;
           35.66    0.26         16          4          int     __kind;
            0.61    0.45         20          2          short int       __spins;
            0.00    0.00         22          2          short int       __elision;
            0.00    0.00         24         16          __pthread_list_t        __list {
            0.00    0.00         24          8              struct __pthread_internal_list*     __prev;
            0.00    0.00         32          8              struct __pthread_internal_list*     __next;
                                                        };
                                                    };
            0.00    0.00          0          0      char*       __size;
           48.61   23.94          0          8      long int    __align;
                                                };
      
      Now with TUI before this patch:
      
        root@number:~# perf annotate --tui --data-type
        Annotate type: 'union ' (790 samples)
            Percent     Offset       Size  Field
             100.00          0         40  union  {
             100.00          0         40      struct __pthread_mutex_s __data {
              48.61          0          4          int  __lock;
               0.00          4          4          unsigned int __count;
               6.38          8          4          int  __owner;
               8.74         12          4          unsigned int __nusers;
              35.66         16          4          int  __kind;
               0.61         20          2          short int    __spins;
               0.00         22          2          short int    __elision;
               0.00         24         16          __pthread_list_t     __list {
               0.00         24          8              struct __pthread_internal_list*  __prev;
               0.00         32          8              struct __pthread_internal_list*  __next;
      
               0.00          0          0      char*    __size;
              48.61          0          8      long int __align;
                                           };
      
      And now after this patch:
      
      Annotate type: 'union ' (790 samples)
                     Percent     Offset       Size  Field
           100.00     100.00          0         40  union  {
           100.00     100.00          0         40      struct __pthread_mutex_s      __data {
            48.61      23.46          0          4          int       __lock;
             0.00       0.48          4          4          unsigned int      __count;
             6.38      41.32          8          4          int       __owner;
             8.74      34.02         12          4          unsigned int      __nusers;
            35.66       0.26         16          4          int       __kind;
             0.61       0.45         20          2          short int __spins;
             0.00       0.00         22          2          short int __elision;
             0.00       0.00         24         16          __pthread_list_t  __list {
             0.00       0.00         24          8              struct __pthread_internal_list*       __prev;
             0.00       0.00         32          8              struct __pthread_internal_list*       __next;
                                                            };
                                                        };
             0.00       0.00          0          0      char* __size;
            48.61      23.94          0          8      long int      __align;
                                                    };
      
      In a follow-up patch the --tui output should gain the header
      information that is currently only present in --stdio:

          Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples):
           event[0] = cpu_core/mem-loads,ldlat=30/P
           event[1] = cpu_core/mem-stores/P
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Add hist_entry__annotate_data_tui() · d001c7a7
      Namhyung Kim authored
      Support data type profiling output on TUI.
      
      Testing from Arnaldo:
      
      First make sure that the debug information for your workload binaries
      is embedded in them, either by building with '-g' or by installing the
      debuginfo packages. Since our workload is 'find':
      
        root@number:~# type find
        find is hashed (/usr/bin/find)
        root@number:~# rpm -qf /usr/bin/find
        findutils-4.9.0-5.fc39.x86_64
        root@number:~# dnf debuginfo-install findutils
        <SNIP>
        root@number:~#
      
      Then collect some data:
      
        root@number:~# echo 1 > /proc/sys/vm/drop_caches
        root@number:~# perf mem record find / > /dev/null
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
        root@number:~#
      
      Finally, do data-type annotation with the following command, which
      defaults to the --tui mode, as 'perf report' does, with lines colored
      to highlight the hotspots, etc.
      
        root@number:~# perf annotate --data-type
        Annotate type: 'struct predicate' (58 samples)
            Percent     Offset       Size  Field
             100.00          0        312  struct predicate {
               0.00          0          8      PRED_FUNC        pred_func;
               0.00          8          8      char*    p_name;
               0.00         16          4      enum predicate_type      p_type;
               0.00         20          4      enum predicate_precedence        p_prec;
               0.00         24          1      _Bool    side_effects;
               0.00         25          1      _Bool    no_default_print;
               0.00         26          1      _Bool    need_stat;
               0.00         27          1      _Bool    need_type;
               0.00         28          1      _Bool    need_inum;
               0.00         32          4      enum EvaluationCost      p_cost;
               0.00         36          4      float    est_success_rate;
               0.00         40          1      _Bool    literal_control_chars;
               0.00         41          1      _Bool    artificial;
               0.00         48          8      char*    arg_text;
        <SNIP>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Add hist_entry__annotate_data_tty() · 9b561be1
      Namhyung Kim authored
      And move the related code into util/annotate-data.c file.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Show progress of sample processing · d9aedc12
      Namhyung Kim authored
      Like 'perf report', it can take a while to process samples.
      
      Show a progress window to inform users that it is not stuck.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Skip sample histogram for stack canary · eb833488
      Namhyung Kim authored
      It's a pseudo data type and has no field.
      
      Fixes: b3c95109 ("perf annotate-data: Add stack canary type")
      Closes: https://lore.kernel.org/lkml/Zhb6jJneP36Z-or0@x1
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240411033256.2099646-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Remove dependency on lscpu · 7aa87499
      James Clark authored
      This check can be done with uname which is more portable. At the same
      time re-arrange it into a standard if statement so that it's more
      readable.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Spoorthy S <spoorts2@in.ibm.com>
      Link: https://lore.kernel.org/r/20240410103458.813656-5-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf map: Remove kernel map before updating start and end addresses · df12e21d
      James Clark authored
      In a debug build there is validation that mmap lists are sorted when
      taking a lock. In machine__update_kernel_mmap() the start and end
      addresses are updated resulting in an unsorted list before the map is
      removed from the list. When the map is removed, the lock is taken which
      triggers the validation and the failure:
      
        $ perf test "object code reading"
        --- start ---
        perf: util/maps.c:88: check_invariants: Assertion `map__start(prev) <= map__start(map)' failed.
        Aborted
      
      Fix it by updating the addresses after removal, but before insertion.
      The bug depends on the ordering and type of debug info on the system and
      doesn't reproduce everywhere.
      
      Fixes: 659ad349 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Spoorthy S <spoorts2@in.ibm.com>
      Link: https://lore.kernel.org/r/20240410103458.813656-4-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • James Clark's avatar
      perf tests: Apply attributes to all events in object code reading test · 2dade41a
      James Clark authored
      PERF_PMU_CAP_EXTENDED_HW_TYPE results in multiple events being opened on
      heterogeneous systems. Currently this test only sets its required
      attributes on the first event. Leaving enable_on_exec set on the other
      events causes the test to fail because the forked objdump processes are
      sampled. No tracking event is opened, so Perf only knows about its own
      mappings, and the objdump samples produce the following error:
      
        $ perf test -vvv "object code reading"
      
        Reading object code for memory address: 0xffff9aaa55ec
        thread__find_map failed
        ---- end(-1) ----
        24: Object code reading              : FAILED!
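
      A minimal sketch of the idea behind the fix, assuming a plain array
      of events: the same attribute is applied to every opened event rather
      than only the first one. struct toy_attr, struct toy_event and
      toy_configure_all() are hypothetical names used for illustration and
      are not perf's evsel or evlist API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of an event's attributes. */
struct toy_attr {
	bool enable_on_exec;
};

/* Hypothetical event; on heterogeneous systems several may be opened. */
struct toy_event {
	struct toy_attr attr;
};

/* Clear enable_on_exec on every event, not just events[0]. */
static void toy_configure_all(struct toy_event *events, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		events[i].attr.enable_on_exec = false;
}
```

      Configuring only events[0] would leave enable_on_exec set on the
      remaining events, so processes forked later would still be sampled
      through them.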
      
      Fixes: 251aa040 ("perf parse-events: Wildcard most "numeric" events")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Spoorthy S <spoorts2@in.ibm.com>
      Link: https://lore.kernel.org/r/20240410103458.813656-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Make "test data symbol" more robust on Neoverse N1 · 256ef072
      James Clark authored
      To prevent anyone from seeing a test failure appear as a regression and
      thinking that it was caused by their code change, insert some noise into
      the loop which makes it immune to sampling bias issues (erratum 1694299).
      
      The "test data symbol" test can fail with any unrelated change that
      shifts the loop into an unfortunate position in the Perf binary, and the
      root cause of such a failure is almost impossible to debug.
      Ultimately it's caused by the referenced erratum.
      
      Fixes: 60abedb8 ("perf test: Introduce script for data symbol testing")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Spoorthy S <spoorts2@in.ibm.com>
      Link: https://lore.kernel.org/r/20240410103458.813656-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metrics: Remove the "No_group" metric group · 4b5ee6db
      Ian Rogers authored
      Rather than place metrics without a metric group in "No_group", place
      them in a metric group that is their name. Such metrics can still be
      selected if "No_group" is passed; this change only impacts perf
      list.
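
      The rule above can be sketched as two small helpers, assuming each
      metric carries at most one group name. struct toy_metric and both
      toy_* functions are hypothetical and only illustrate the behaviour;
      they are not taken from perf's metric code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical metric record; field names are illustrative only. */
struct toy_metric {
	const char *name;
	const char *group;	/* NULL or "" when no group was given */
};

static bool toy_metric_has_group(const struct toy_metric *m)
{
	return m->group && m->group[0] != '\0';
}

/* Group shown when listing: the metric's own name if it has no group. */
static const char *toy_metric_list_group(const struct toy_metric *m)
{
	return toy_metric_has_group(m) ? m->group : m->name;
}

/* Selection: "No_group" still matches metrics without a group. */
static bool toy_metric_matches_group(const struct toy_metric *m,
				     const char *wanted)
{
	if (!toy_metric_has_group(m))
		return strcmp(wanted, "No_group") == 0 ||
		       strcmp(wanted, m->name) == 0;
	return strcmp(wanted, m->group) == 0;
}
```

      With this split, only the listing path changes what is displayed,
      while selection by "No_group" keeps working for ungrouped metrics.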
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240403164636.3429091-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Get rid of symbol__ensure_annotate() · 0235abd8
      Namhyung Kim authored
      Now symbol__annotate() is reentrant and it doesn't need to remove
      non-instruction lines.  Let's get rid of symbol__ensure_annotate() and
      call symbol__annotate() directly.  Also we can use it to get the arch
      pointer instead of calling evsel__get_arch() directly.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240405211800.1412920-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Do not delete non-asm lines · 879ebf3c
      Namhyung Kim authored
      For data type profiling, non-instruction lines were removed from the list
      of annotation lines.  This simplified the code that works on
      instructions, such as calculating the PC-relative address and searching
      for the shortest path to the target instruction or basic block.
      
      But it means that it removes all the comments and debug information in
      the annotate output like source file name and line numbers.  To support
      both code annotation and data type annotation, it'd be better to keep
      the non-instruction lines as well.
      
      So this change is to skip those lines during the data type profiling
      and to display them in the normal perf annotate output.
      
      No functional changes intended (other than having more lines).
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240405211800.1412920-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate-data: Fix global variable lookup · 65785213
      Namhyung Kim authored
      The recent change in the global variable handling introduced a bug: the
      return value was not set even when a data type was found.  Fix that, and
      also add the type name to the debug message.
      
      Fixes: 1ebb5e17 ("perf annotate-data: Add get_global_var_type()")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240405211800.1412920-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 08 Apr, 2024 11 commits
  3. 05 Apr, 2024 2 commits
    • perf script: Add capstone support for '-F +brstackdisasm' · d8120446
      Andi Kleen authored
      Support capstone output for the '-F +brstackinsn' branch dump.
      
      The new output is enabled with the new field 'brstackdisasm'.
      
      This was possible before with --xed, but now it also works for users
      who don't have xed, using the builtin capstone support.
      
      Before:
      
        perf record -b emacs -Q --batch '()'
        perf script -F +brstackinsn
        ...
                  emacs   55778 1814366.755945:     151564 cycles:P:      7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s>        intel_check_word.constprop.0+237:
                00007f0ab2d1711d        insn: 75 e6                     # PRED 3 cycles [3]
                00007f0ab2d17105        insn: 73 51
                00007f0ab2d17107        insn: 48 89 c1
                00007f0ab2d1710a        insn: 48 39 ca
                00007f0ab2d1710d        insn: 73 96
                00007f0ab2d1710f        insn: 48 8d 04 11
                00007f0ab2d17113        insn: 48 d1 e8
                00007f0ab2d17116        insn: 49 8d 34 c1
                00007f0ab2d1711a        insn: 44 3a 06
                00007f0ab2d1711d        insn: 75 e6                     # PRED 3 cycles [6] 3.00 IPC
                00007f0ab2d17105        insn: 73 51                     # PRED 1 cycles [7] 1.00 IPC
                00007f0ab2d17158        insn: 48 8d 50 01
                00007f0ab2d1715c        insn: eb 92                     # PRED 1 cycles [8] 2.00 IPC
                00007f0ab2d170f0        insn: 48 39 ca
                00007f0ab2d170f3        insn: 73 b0                     # PRED 1 cycles [9] 2.00 IPC
      
      After (perf must be compiled with capstone):
      
        perf script -F +brstackdisasm
      
        ...
                   emacs   55778 1814366.755945:     151564 cycles:P:      7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s>        intel_check_word.constprop.0+237:
                00007f0ab2d1711d        jne intel_check_word.constprop.0+0xd5   # PRED 3 cycles [3]
                00007f0ab2d17105        jae intel_check_word.constprop.0+0x128
                00007f0ab2d17107        movq %rax, %rcx
                00007f0ab2d1710a        cmpq %rcx, %rdx
                00007f0ab2d1710d        jae intel_check_word.constprop.0+0x75
                00007f0ab2d1710f        leaq (%rcx, %rdx), %rax
                00007f0ab2d17113        shrq $1, %rax
                00007f0ab2d17116        leaq (%r9, %rax, 8), %rsi
                00007f0ab2d1711a        cmpb (%rsi), %r8b
                00007f0ab2d1711d        jne intel_check_word.constprop.0+0xd5   # PRED 3 cycles [6] 3.00 IPC
                00007f0ab2d17105        jae intel_check_word.constprop.0+0x128  # PRED 1 cycles [7] 1.00 IPC
                00007f0ab2d17158        leaq 1(%rax), %rdx
                00007f0ab2d1715c        jmp intel_check_word.constprop.0+0xc0   # PRED 1 cycles [8] 2.00 IPC
                00007f0ab2d170f0        cmpq %rcx, %rdx
                00007f0ab2d170f3        jae intel_check_word.constprop.0+0x75   # PRED 1 cycles [9] 2.00 IPC
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Link: https://lore.kernel.org/r/20240401210925.209671-3-ak@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Support 32bit code under 64bit OS with capstone · 38ab6013
      Andi Kleen authored
      Use the DSO to resolve whether an IP is 32bit or 64bit and use that to
      configure capstone to the correct mode. This allows 32bit code to be
      disassembled correctly under a 64bit OS.
      
        % cat > loop.c
        volatile int var;
        int main(void)
        {
        	int i;
        	for (i = 0; i < 100000; i++)
        		var++;
        }
        % gcc -m32 -o loop loop.c
        % perf record -e cycles:u ./loop
        % perf script -F +disasm
          loop   82665 1833176.618023:      1 cycles:u:   f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2)   movl %esp, %eax
          loop   82665 1833176.618029:      1 cycles:u:   f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2)   movl %esp, %eax
          loop   82665 1833176.618031:      7 cycles:u:   f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2)   movl %esp, %eax
          loop   82665 1833176.618034:     91 cycles:u:   f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2)   movl %esp, %eax
          loop   82665 1833176.618036:   1242 cycles:u:   f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2)   movl %esp, %eax
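
      One plausible way to decide between the two modes for a binary is
      its ELF class byte. The sketch below is an assumption about how such
      a per-DSO 32bit/64bit decision could be derived, not the code from
      this commit; enum toy_disasm_mode and toy_mode_from_elf_ident() are
      hypothetical names standing in for the disassembler's real mode
      constants (capstone itself provides CS_MODE_32 and CS_MODE_64).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the disassembler's 32bit/64bit modes. */
enum toy_disasm_mode {
	TOY_MODE_UNKNOWN,
	TOY_MODE_32,
	TOY_MODE_64,
};

/*
 * ELF identification bytes: 0-3 hold the magic "\x7fELF" and byte 4
 * (EI_CLASS) is 1 for 32bit objects and 2 for 64bit objects.
 */
static enum toy_disasm_mode toy_mode_from_elf_ident(const unsigned char *ident,
						    size_t len)
{
	if (len < 5 || ident[0] != 0x7f || ident[1] != 'E' ||
	    ident[2] != 'L' || ident[3] != 'F')
		return TOY_MODE_UNKNOWN;

	switch (ident[4]) {
	case 1:
		return TOY_MODE_32;
	case 2:
		return TOY_MODE_64;
	default:
		return TOY_MODE_UNKNOWN;
	}
}
```

      A caller would read the first bytes of the mapped file for a sampled
      IP and pick the disassembler mode from the result, falling back to
      the host's default when the class is unknown.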
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Link: https://lore.kernel.org/r/20240401210925.209671-2-ak@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 04 Apr, 2024 2 commits
    • perf stat: Do not fail on metrics on s390 z/VM systems · c2f3d7df
      Thomas Richter authored
      On s390 z/VM virtual machines the command 'perf list' also displays metrics:
      
        # perf list | grep -A 20 'Metric Groups:'
        Metric Groups:
      
        No_group:
         cpi
              [Cycles per Instruction]
         est_cpi
              [Estimated Instruction Complexity CPI infinite Level 1]
         finite_cpi
              [Cycles per Instructions from Finite cache/memory]
         l1mp
              [Level One Miss per 100 Instructions]
         l2p
              [Percentage sourced from Level 2 cache]
         l3p
              [Percentage sourced from Level 3 on same chip cache]
         l4lp
              [Percentage sourced from Level 4 Local cache on same book]
         l4rp
              [Percentage sourced from Level 4 Remote cache on different book]
         memp
              [Percentage sourced from memory]
         ....
        #
      
      The command
      
        # perf stat -M cpi -- true
        event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/.....'
                              \___ Bad event or PMU
      
        Unable to find PMU or event on a PMU of 'CPU_CYCLES'
      
         event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/...'
                              \___ Cannot find PMU `CPU_CYCLES'.
                                   Missing kernel support?
       #
      
      fails. 'perf stat' should not fail on metrics when the referenced CPU
      Counter Measurement PMU is not available.
      
      Output after:
      
        # perf stat -M est_cpi -- sleep 1
      
        Performance counter stats for 'sleep 1':
      
           1,000,887,494 ns   duration_time   #     0.00 est_cpi
      
             1.000887494 seconds time elapsed
      
             0.000143000 seconds user
             0.000662000 seconds sys
      
       #
      
      Fixes: 7f76b311 ("perf list: Add IBM z16 event description for s390")
      Suggested-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240404064806.1362876-2-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix PAI counter names for s390 virtual machines · b74bc5a6
      Thomas Richter authored
      s390 introduced the Processor Activity Instrumentation (PAI) counter
      facility on LPARs and z/VM virtual machines for models 3931 and 3932.
      
      These counters are stored as raw data in the perf.data file and are
      displayed with:
      
       # perf report -i /tmp//perfout-635468 -D | grep Counter
      	Counter:007 <unknown> Value:0x00000000000186a0
      	Counter:032 <unknown> Value:0x0000000000000001
      	Counter:032 <unknown> Value:0x0000000000000001
      	Counter:032 <unknown> Value:0x0000000000000001
       #
      
      However on z/VM virtual machines, the counter names are not retrieved
      from the PMU and are shown as '<unknown>'.  This is caused by the CPU
      string saved in the mapfile.csv for this machine:
      
         ^IBM.393[12].*3\.7.[[:xdigit:]]+$,3,cf_z16,core
      
      This string contains the CPU Measurement facility first and second
      version number and authorization level (3\.7.[[:xdigit:]]+).  These
      numbers do not apply to the PAI counter facility.  In fact they can be
      omitted.
      
      Shorten the CPU identification string for this machine to manufacturer
      and model. This is sufficient for all PMU devices.
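
      The effect of shortening the pattern can be sketched with POSIX
      extended regular expressions. The long pattern below is the one
      quoted above; the short pattern and both CPU identification strings
      are assumptions made up for illustration (the exact strings reported
      on real machines are not shown in this message), and
      toy_cpuid_matches() is a hypothetical helper.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if cpuid matches the POSIX extended regex in pattern. */
static bool toy_cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	bool match;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	match = regexec(&re, cpuid, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}

/* Pattern from the mapfile entry quoted above. */
static const char toy_long_pattern[] = "^IBM.393[12].*3\\.7.[[:xdigit:]]+$";

/* Assumed shortened form: manufacturer and model only. */
static const char toy_short_pattern[] = "^IBM.393[12].*$";
```

      With a hypothetical string such as "IBM,3931,704,A01,3.7,002f" both
      patterns match, while a string without the trailing version and
      authorization fields, for example "IBM,3931,704,A01", only matches
      the shortened pattern.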
      
      Output after:
      
       # perf report -i /tmp//perfout-635468 -D | grep Counter
      	Counter:007 km_aes_128 Value:0x00000000000186a0
      	Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
      	Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
      	Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
       #
      
      Fixes: b539deaf ("perf report: Add s390 raw data interpretation for PAI counters")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240404064806.1362876-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>