1. 21 Mar, 2024 40 commits
    • perf trace: Fix 'newfstatat'/'fstatat' argument pretty printing · 0831638e
      Arnaldo Carvalho de Melo authored
      There were two needless entries, one for 'newfstatat' and another for
      'fstatat'. Keep just one and pretty print its 'flags' argument using the
      fs_at_flags scnprintf that is also used by other FS syscalls such as
      'stat'. Now:
      
        root@number:~# perf trace -e newfstatat --max-events=5
             0.000 ( 0.010 ms): abrt-dump-jour/1400 newfstatat(dfd: 7, filename: "", statbuf: 0x7fff0d127000, flag: EMPTY_PATH) = 0
             0.020 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 9, filename: "", statbuf: 0x55752507b0e8, flag: EMPTY_PATH) = 0
             0.039 ( 0.004 ms): abrt-dump-jour/1400 newfstatat(dfd: 19, filename: "", statbuf: 0x557525061378, flag: EMPTY_PATH) = 0
             0.047 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 20, filename: "", statbuf: 0x5575250b8cc8, flag: EMPTY_PATH) = 0
             0.053 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 22, filename: "", statbuf: 0x5575250535d8, flag: EMPTY_PATH) = 0
        root@number:~#
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-6-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Beautify the 'flags' arg of unlinkat · 4d923282
      Arnaldo Carvalho de Melo authored
      Reusing the fs_at_flags array done for the 'stat' syscall.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-5-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf beauty: Introduce faccessat2 flags scnprintf routine · b8171a84
      Arnaldo Carvalho de Melo authored
      The faccessat and faccessat2 syscalls now have beautifiers for their arguments.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-4-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf beauty: Introduce scrape script for the 'statx' syscall 'mask' argument · f122b3d6
      Arnaldo Carvalho de Melo authored
      It was using the first variation on producing a string representation
      for a binary flag, one that used the system's stat.h and preprocessor
      tricks that had to be updated every time a new flag was introduced.
      
      Use the more recent scrape script + strarray +
      strarray__scnprintf_flags() combo.
      
        $ tools/perf/trace/beauty/statx_mask.sh
        static const char *statx_mask[] = {
        	[ilog2(0x00000001) + 1] = "TYPE",
        	[ilog2(0x00000002) + 1] = "MODE",
        	[ilog2(0x00000004) + 1] = "NLINK",
        	[ilog2(0x00000008) + 1] = "UID",
        	[ilog2(0x00000010) + 1] = "GID",
        	[ilog2(0x00000020) + 1] = "ATIME",
        	[ilog2(0x00000040) + 1] = "MTIME",
        	[ilog2(0x00000080) + 1] = "CTIME",
        	[ilog2(0x00000100) + 1] = "INO",
        	[ilog2(0x00000200) + 1] = "SIZE",
        	[ilog2(0x00000400) + 1] = "BLOCKS",
        	[ilog2(0x00000800) + 1] = "BTIME",
        	[ilog2(0x00001000) + 1] = "MNT_ID",
        	[ilog2(0x00002000) + 1] = "DIOALIGN",
        	[ilog2(0x00004000) + 1] = "MNT_ID_UNIQUE",
        };
        $
      
      Now we need a copy of uapi/linux/stat.h from tools/include/ in the
      scrape only directory tools/perf/trace/beauty/include.
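
      The idea behind the scrape script + strarray + flags printing combo can
      be pictured with the minimal C sketch below. It is not the perf
      implementation: flag_names, append() and flags_scnprintf() are
      hypothetical names, the table holds only the first four entries from the
      output above, and the name for bit N is placed at index N + 1 by hand
      instead of through ilog2().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table: the name for bit N lives at index N + 1. */
static const char *flag_names[] = {
	[0 + 1] = "TYPE",	/* 0x00000001 */
	[1 + 1] = "MODE",	/* 0x00000002 */
	[2 + 1] = "NLINK",	/* 0x00000004 */
	[3 + 1] = "UID",	/* 0x00000008 */
};

#define NR_FLAG_NAMES (sizeof(flag_names) / sizeof(flag_names[0]))

/* Append sep + s to bf, returning the number of characters actually stored. */
static size_t append(char *bf, size_t size, const char *sep, const char *s)
{
	int n = snprintf(bf, size, "%s%s", sep, s);

	if (n < 0)
		return 0;
	return (size_t)n >= size ? size - 1 : (size_t)n;
}

/* Format the set bits of 'flags' as "A|B"; bits without a name are shown in hex. */
static size_t flags_scnprintf(char *bf, size_t size, unsigned long flags)
{
	unsigned long unknown = 0;
	size_t printed = 0;

	if (size == 0)
		return 0;
	bf[0] = '\0';

	for (unsigned int bit = 0; bit < sizeof(flags) * 8; bit++) {
		unsigned long mask = 1UL << bit;
		const char *name = bit + 1 < NR_FLAG_NAMES ? flag_names[bit + 1] : NULL;

		if (!(flags & mask))
			continue;
		if (name == NULL) {
			/* Collect bits the table does not know about. */
			unknown |= mask;
			continue;
		}
		printed += append(bf + printed, size - printed, printed ? "|" : "", name);
	}

	if (unknown) {
		char hex[2 + sizeof(unknown) * 2 + 1];

		snprintf(hex, sizeof(hex), "%#lx", unknown);
		printed += append(bf + printed, size - printed, printed ? "|" : "", hex);
	}
	return printed;
}
```

      With a table like this, a new flag only needs a new table entry, which is
      what a scrape script can regenerate from the header.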
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-3-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf beauty: Introduce scrape script for various fs syscalls 'flags' arguments · 3d6cfbaf
      Arnaldo Carvalho de Melo authored
      It was using the first variation on producing a string representation
      for a binary flag, one that used the system's fcntl.h and preprocessor
      tricks that had to be updated every time a new flag was introduced.
      
      Use the more recent scrape script + strarray + strarray__scnprintf_flags() combo.
      
        $ tools/perf/trace/beauty/fs_at_flags.sh
        static const char *fs_at_flags[] = {
        	[ilog2(0x100) + 1] = "SYMLINK_NOFOLLOW",
        	[ilog2(0x200) + 1] = "REMOVEDIR",
        	[ilog2(0x400) + 1] = "SYMLINK_FOLLOW",
        	[ilog2(0x800) + 1] = "NO_AUTOMOUNT",
        	[ilog2(0x1000) + 1] = "EMPTY_PATH",
        	[ilog2(0x0000) + 1] = "STATX_SYNC_AS_STAT",
        	[ilog2(0x2000) + 1] = "STATX_FORCE_SYNC",
        	[ilog2(0x4000) + 1] = "STATX_DONT_SYNC",
        	[ilog2(0x8000) + 1] = "RECURSIVE",
        	[ilog2(0x80000000) + 1] = "GETATTR_NOSEC",
        };
        $
      
      Now we need a copy of uapi/linux/fcntl.h from tools/include/ in the
      scrape only directory tools/perf/trace/beauty/include and will use that
      fs_at_flags array for other fs syscalls.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/20240320193115.811899-2-acme@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Run tests in parallel by default · 4cef0e7a
      Ian Rogers authored
      Switch from running tests sequentially to running in parallel by
      default. Change the opt-in '-p' or '--parallel' flag to '-S' or
      '--sequential'.
      
      On an 8-core Tiger Lake machine, an AddressSanitizer build's run time changes from:
      
        326.54user 622.73system 6:59.91elapsed 226%CPU
      
      to:
      
        973.02user 583.98system 3:01.17elapsed 859%CPU
      
      So it is over twice as fast, saving about 4 minutes of elapsed time.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301174711.2646944-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf help: Lower levenshtein penalty for deleting character · 7aea01ea
      Ian Rogers authored
      The levenshtein penalty for deleting a character was far higher than
      substituting or inserting a character. Lower the penalty to match that
      of inserting a character.
      
      Before:
      
        $ perf recccord
        perf: 'recccord' is not a perf-command. See 'perf --help'.
        $
      
      After:
      
        $ perf recccord
        perf: 'recccord' is not a perf-command. See 'perf --help'.
      
        Did you mean this?
                record
        $
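
      The effect of the deletion penalty can be illustrated with a small
      weighted edit distance in C. This is a sketch only: weighted_distance()
      and the cost values used with it are hypothetical and are not taken from
      the perf sources, and the swap (transposition) operation is left out.

```c
#include <stdlib.h>
#include <string.h>

static int min3(int a, int b, int c)
{
	int m = a < b ? a : b;

	return m < c ? m : c;
}

/*
 * Cost of turning 'typed' into 'cmd' using substitutions, insertions and
 * deletions with separate (hypothetical) costs. Returns -1 if out of memory.
 */
static int weighted_distance(const char *typed, const char *cmd,
			     int sub_cost, int ins_cost, int del_cost)
{
	size_t lt = strlen(typed), lc = strlen(cmd);
	int *prev = malloc((lc + 1) * sizeof(*prev));
	int *cur = malloc((lc + 1) * sizeof(*cur));
	int result;

	if (!prev || !cur) {
		free(prev);
		free(cur);
		return -1;
	}

	/* Building the first j characters of 'cmd' from nothing takes j insertions. */
	for (size_t j = 0; j <= lc; j++)
		prev[j] = (int)j * ins_cost;

	for (size_t i = 1; i <= lt; i++) {
		int *tmp;

		/* Reducing i typed characters to nothing takes i deletions. */
		cur[0] = (int)i * del_cost;
		for (size_t j = 1; j <= lc; j++) {
			int same = typed[i - 1] == cmd[j - 1];

			cur[j] = min3(prev[j - 1] + (same ? 0 : sub_cost),
				      prev[j] + del_cost,
				      cur[j - 1] + ins_cost);
		}
		tmp = prev;
		prev = cur;
		cur = tmp;
	}

	result = prev[lc];
	free(prev);
	free(cur);
	return result;
}
```

      Turning "recccord" into "record" needs two deletions, so with an assumed
      deletion cost of 4 the distance is 8, while with a deletion cost of 1
      (equal to the assumed insertion cost) it drops to 2, which makes the
      command a much closer match.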
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301201306.2680986-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Suggest inbuilt commands for unknown command · f664d515
      Ian Rogers authored
      The existing unknown command code looks for perf scripts like
      perf-archive.sh and perf-iostat.sh, but inbuilt commands aren't
      suggested. Add the inbuilt commands so they may be suggested too.
      
      Before:
      
        $ perf reccord
        perf: 'reccord' is not a perf-command. See 'perf --help'.
        $
      
      After:
      
        $ perf reccord
        perf: 'reccord' is not a perf-command. See 'perf --help'.
      
        Did you mean this?
                record
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20240301201306.2680986-1-irogers@google.com
      [ Added some fixes from Ian to problems I noticed while testing ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Read child test 10 times a second rather than 1 · 5f2f051a
      Ian Rogers authored
      Make the perf test output smoother by timing out the poll of the child
      process after 100ms rather than 1s.
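
      A poll with a 100ms timeout can be sketched in C as below. This is an
      illustration using the POSIX poll() and read() calls, not the perf test
      code; read_with_timeout() is a hypothetical helper name.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd, then read it.
 * Returns the number of bytes read, 0 on timeout or end of file,
 * and -1 on error.
 */
static ssize_t read_with_timeout(int fd, char *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready = poll(&pfd, 1, timeout_ms);

	if (ready < 0)
		return -1;
	if (ready == 0)
		return 0;	/* nothing arrived within timeout_ms */
	return read(fd, buf, len);
}
```

      Calling such a helper in a loop with a timeout of 100 instead of 1000
      lets the caller refresh its output up to ten times a second while the
      child is still running.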
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Use a single fd for the child process out/err · e120f709
      Ian Rogers authored
      Switch from dumping err then out, to a single file descriptor for both
      of them. This allows the err and output to be correctly interleaved in
      verbose output.
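
      One way to get both streams on a single descriptor is to point the
      child's stdout and stderr at the same pipe, as in the C sketch below. It
      uses plain fork(), pipe() and dup2() and is not the perf test harness;
      spawn_merged(), emit() and demo_child() are hypothetical names.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write a string to fd, giving up on error (demo helper). */
static void emit(int fd, const char *s)
{
	if (write(fd, s, strlen(s)) < 0)
		_exit(1);
}

/* Demo child body: alternates between stdout and stderr. */
static void demo_child(void)
{
	emit(STDOUT_FILENO, "out 1\n");
	emit(STDERR_FILENO, "err 1\n");
	emit(STDOUT_FILENO, "out 2\n");
}

/*
 * Fork a child whose stdout and stderr share one pipe. On success the
 * read end is stored in *out_fd and the child's pid is returned;
 * -1 is returned on failure.
 */
static pid_t spawn_merged(int *out_fd, void (*child_fn)(void))
{
	int fds[2];
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		/* Child: both streams go to the write end of the same pipe. */
		close(fds[0]);
		if (dup2(fds[1], STDOUT_FILENO) < 0 || dup2(fds[1], STDERR_FILENO) < 0)
			_exit(127);
		close(fds[1]);
		child_fn();
		_exit(0);
	}

	/* Parent: keep only the read end. */
	close(fds[1]);
	*out_fd = fds[0];
	return pid;
}
```

      Because both streams land in the same pipe, the parent reads the lines
      in the order the child wrote them rather than all of one stream followed
      by all of the other.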
      
      Fixes: b482f5f8 ("perf tests: Add option to run tests in parallel")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Stat output per thread of just the parent process · f68c981b
      Ian Rogers authored
      Per-thread mode requires either system-wide (-a), a pid (-p) or a tid
      (-t).
      
      The stat output tests were using system-wide mode but this is racy when
      threads are starting and exiting - something that happens a lot when
      running the tests in parallel (perf test -p).
      
      Avoid the race conditions by using pid mode with the pid of the parent
      process.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Delete session after stopping sideband thread · 88ce0106
      Ian Rogers authored
      The session has a header in it which contains a perf env with
      bpf_progs. The bpf_progs are accessed by the sideband thread, so
      the sideband thread must be stopped before the session is deleted to
      avoid a use-after-free. This error was detected by AddressSanitizer
      in the following:
      
        ==2054673==ERROR: AddressSanitizer: heap-use-after-free on address 0x61d000161e00 at pc 0x55769289de54 bp 0x7f9df36d4ab0 sp 0x7f9df36d4aa8
        READ of size 8 at 0x61d000161e00 thread T1
            #0 0x55769289de53 in __perf_env__insert_bpf_prog_info util/env.c:42
            #1 0x55769289dbb1 in perf_env__insert_bpf_prog_info util/env.c:29
            #2 0x557692bbae29 in perf_env__add_bpf_info util/bpf-event.c:483
            #3 0x557692bbb01a in bpf_event__sb_cb util/bpf-event.c:512
            #4 0x5576928b75f4 in perf_evlist__poll_thread util/sideband_evlist.c:68
            #5 0x7f9df96a63eb in start_thread nptl/pthread_create.c:444
            #6 0x7f9df9726a4b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
      
        0x61d000161e00 is located 384 bytes inside of 2136-byte region [0x61d000161c80,0x61d0001624d8)
        freed by thread T0 here:
            #0 0x7f9dfa6d7288 in __interceptor_free libsanitizer/asan/asan_malloc_linux.cpp:52
            #1 0x557692978d50 in perf_session__delete util/session.c:319
            #2 0x557692673959 in __cmd_record tools/perf/builtin-record.c:2884
            #3 0x55769267a9f0 in cmd_record tools/perf/builtin-record.c:4259
            #4 0x55769286710c in run_builtin tools/perf/perf.c:349
            #5 0x557692867678 in handle_internal_command tools/perf/perf.c:402
            #6 0x557692867a40 in run_argv tools/perf/perf.c:446
            #7 0x557692867fae in main tools/perf/perf.c:562
            #8 0x7f9df96456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
      
      Fixes: 657ee553 ("perf evlist: Introduce side band thread")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Link: https://lore.kernel.org/r/20240301074639.2260708-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add/use PMU reverse lookup from config to name · 67ee8e71
      Ian Rogers authored
      Add perf_pmu__name_from_config that does a reverse lookup from a
      config number to an alias name. The lookup is expensive as the config
      is computed for every alias by filling in a perf_event_attr, but this
      is only done when verbose output is enabled. The lookup also only
      considers config, and not config1, config2 or config3.
      
      An example of the output:
      
        $ perf stat -vv -e data_read true
        ...
        perf_event_attr:
          type                             24 (uncore_imc_free_running_0)
          size                             136
          config                           0x20ff (data_read)
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ...
      
      Committer notes:
      
      Fix the python binding build by adding dummies for not strictly
      needed perf_pmu__name_from_config() and perf_pmus__find_by_type().
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-7-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Use pmus to describe type from attribute · 70938820
      Ian Rogers authored
      When dumping a perf_event_attr, use pmus to find the PMU and its name
      by the type number. This allows dynamically added PMUs to be described.
      
      Before:
      
        $ perf stat -vv -e data_read true
        ...
        perf_event_attr:
          type                             24
          size                             136
          config                           0x20ff
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ...
      
      After:
      
        $ perf stat -vv -e data_read true
        ...
        perf_event_attr:
          type                             24 (uncore_imc_free_running_0)
          size                             136
          config                           0x20ff
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          exclude_guest                    1
        ...
      
      However, it also means that when we have a PMU name we prefer it to a
      hard coded name:
      
      Before:
      
        $ perf stat -vv -e faults true
        ...
        perf_event_attr:
          type                             1 (PERF_TYPE_SOFTWARE)
          size                             136
          config                           0x2 (PERF_COUNT_SW_PAGE_FAULTS)
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ...
      
      After:
      
        $ perf stat -vv -e faults true
        ...
        perf_event_attr:
          type                             1 (software)
          size                             136
          config                           0x2 (PERF_COUNT_SW_PAGE_FAULTS)
          sample_type                      IDENTIFIER
          read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
          disabled                         1
          inherit                          1
          enable_on_exec                   1
          exclude_guest                    1
        ...
      
      It feels more consistent to do this, rather than only prefer a PMU
      name when a hard coded name isn't available.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Give more details about raw event encodings · 4ccf3bb7
      Ian Rogers authored
      List all the PMUs, not just the first core one, and list real format
      specifiers with value ranges.
      
      Before:
      
        $ perf list
        ...
          rNNN                                               [Raw hardware event descriptor]
          cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
               [(see 'man perf-list' on how to encode it)]
          mem:<addr>[/len][:access]                          [Hardware breakpoint]
        ...
      
      After:
      
        $ perf list
        ...
          rNNN                                               [Raw event descriptor]
          cpu/event=0..255,pc,edge,.../modifier              [Raw event descriptor]
               [(see 'man perf-list' or 'man perf-record' on how to encode it)]
          breakpoint//modifier                               [Raw event descriptor]
          cstate_core/event=0..0xffffffffffffffff/modifier   [Raw event descriptor]
          cstate_pkg/event=0..0xffffffffffffffff/modifier    [Raw event descriptor]
          i915/i915_eventid=0..0x1fffff/modifier             [Raw event descriptor]
          intel_bts//modifier                                [Raw event descriptor]
          intel_pt/ptw,event,cyc_thresh=0..15,.../modifier   [Raw event descriptor]
          kprobe/retprobe/modifier                           [Raw event descriptor]
          msr/event=0..0xffffffffffffffff/modifier           [Raw event descriptor]
          power/event=0..255/modifier                        [Raw event descriptor]
          software//modifier                                 [Raw event descriptor]
          tracepoint//modifier                               [Raw event descriptor]
          uncore_arb/event=0..255,edge,inv,.../modifier      [Raw event descriptor]
          uncore_cbox/event=0..255,edge,inv,.../modifier     [Raw event descriptor]
          uncore_clock/event=0..255/modifier                 [Raw event descriptor]
          uncore_imc_free_running/event=0..255,umask=0..255/modifier[Raw event descriptor]
          uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier[Raw event descriptor]
          mem:<addr>[/len][:access]                          [Hardware breakpoint]
        ...
      
      With '--details', provide more details on each format's encoding:
      
        cpu/event=0..255,pc,edge,.../modifier              [Raw event descriptor]
             [(see 'man perf-list' or 'man perf-record' on how to encode it)]
              cpu/event=0..255,pc,edge,offcore_rsp=0..0xffffffffffffffff,ldlat=0..0xffff,inv,
              umask=0..255,frontend=0..0xffffff,cmask=0..255,config=0..0xffffffffffffffff,
              config1=0..0xffffffffffffffff,config2=0..0xffffffffffffffff,config3=0..0xffffffffffffffff,
              name=string,period=number,freq=number,branch_type=(u|k|hv|any|...),time,
              call-graph=(fp|dwarf|lbr),stack-size=number,max-stack=number,nr=number,inherit,no-inherit,
              overwrite,no-overwrite,percore,aux-output,aux-sample-size=number/modifier
        breakpoint//modifier                               [Raw event descriptor]
              breakpoint//modifier
        cstate_core/event=0..0xffffffffffffffff/modifier   [Raw event descriptor]
              cstate_core/event=0..0xffffffffffffffff/modifier
        cstate_pkg/event=0..0xffffffffffffffff/modifier    [Raw event descriptor]
              cstate_pkg/event=0..0xffffffffffffffff/modifier
        i915/i915_eventid=0..0x1fffff/modifier             [Raw event descriptor]
              i915/i915_eventid=0..0x1fffff/modifier
        intel_bts//modifier                                [Raw event descriptor]
              intel_bts//modifier
        intel_pt/ptw,event,cyc_thresh=0..15,.../modifier   [Raw event descriptor]
              intel_pt/ptw,event,cyc_thresh=0..15,pt,notnt,branch,tsc,pwr_evt,fup_on_ptw,cyc,noretcomp,
              mtc,psb_period=0..15,mtc_period=0..15/modifier
        kprobe/retprobe/modifier                           [Raw event descriptor]
              kprobe/retprobe/modifier
        msr/event=0..0xffffffffffffffff/modifier           [Raw event descriptor]
              msr/event=0..0xffffffffffffffff/modifier
        power/event=0..255/modifier                        [Raw event descriptor]
              power/event=0..255/modifier
        software//modifier                                 [Raw event descriptor]
              software//modifier
        tracepoint//modifier                               [Raw event descriptor]
              tracepoint//modifier
        uncore_arb/event=0..255,edge,inv,.../modifier      [Raw event descriptor]
              uncore_arb/event=0..255,edge,inv,umask=0..255,cmask=0..31/modifier
        uncore_cbox/event=0..255,edge,inv,.../modifier     [Raw event descriptor]
              uncore_cbox/event=0..255,edge,inv,umask=0..255,cmask=0..31/modifier
        uncore_clock/event=0..255/modifier                 [Raw event descriptor]
              uncore_clock/event=0..255/modifier
        uncore_imc_free_running/event=0..255,umask=0..255/modifier[Raw event descriptor]
              uncore_imc_free_running/event=0..255,umask=0..255/modifier
        uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier[Raw event descriptor]
              uprobe/ref_ctr_offset=0..0xffffffff,retprobe/modifier
      
      Committer notes:
      
      Address this build error in various distros:
      
        55    58.44 ubuntu:24.04                  : FAIL gcc version 13.2.0 (Ubuntu 13.2.0-17ubuntu2)
          util/pmu.c:1638:70: error: '_Static_assert' with no message is a C2x extension [-Werror,-Wc2x-extensions]
           1638 |         _Static_assert(ARRAY_SIZE(terms) == __PARSE_EVENTS__TERM_TYPE_NR - 6);
                |                                                                             ^
                |                                                                             , ""
          1 error generated.
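
      The message-less form of _Static_assert is only standard from C23 (C2x)
      on, so pre-C23 builds need the two-argument form, as the compiler's
      fix-it hint above suggests. A minimal C sketch, where demo_term_type and
      demo_terms are hypothetical stand-ins for the real enum and array:

```c
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical term types, standing in for the real enum. */
enum demo_term_type {
	DEMO_TERM_EVENT,
	DEMO_TERM_UMASK,
	DEMO_TERM_EDGE,
	DEMO_TERM_NR,
};

static const char *const demo_terms[] = { "event", "umask", "edge" };

/* The explicit message (even an empty "") keeps pre-C23 compilers quiet. */
_Static_assert(ARRAY_SIZE(demo_terms) == DEMO_TERM_NR,
	       "demo_terms[] must have one entry per term type");
```

      If an enumerator is added without a matching array entry, the build
      fails at compile time with the given message.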
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Allow wordwrap to wrap on commas · aa1f4ad2
      Ian Rogers authored
      A raw event encoding may be a block with terms separated by commas. If
      wrapping such a string it would be useful to break at the commas, so
      add this ability to wordwrap.
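
      Breaking at commas can be sketched in C as follows. This is a simplified
      illustration rather than perf's wordwrap: wrap_on_commas() is a
      hypothetical function that only breaks after commas and writes into a
      caller-supplied buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy 's' into 'out', starting a new line before any comma-terminated
 * term that would push the current line past 'width' columns. Output is
 * truncated at a term boundary if 'out' is too small.
 * Returns the number of characters stored, not counting the final '\0'.
 */
static size_t wrap_on_commas(const char *s, size_t width, char *out, size_t out_size)
{
	size_t col = 0, n = 0;

	if (out_size == 0)
		return 0;

	while (*s) {
		/* A term runs up to and including the next comma, if any. */
		const char *comma = strchr(s, ',');
		size_t len = comma ? (size_t)(comma - s) + 1 : strlen(s);

		if (col > 0 && col + len > width) {
			if (n + 1 >= out_size)
				break;
			out[n++] = '\n';
			col = 0;
		}
		if (n + len >= out_size)
			break;
		memcpy(out + n, s, len);
		n += len;
		col += len;
		s += len;
	}
	out[n] = '\0';
	return n;
}
```

      For example, wrapping "event=0..255,edge,inv,umask=0..255" at 20 columns
      puts "event=0..255,edge," on the first line and "inv,umask=0..255" on the
      second.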
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Drop "default_core" from alias names · 39aa4ff6
      Ian Rogers authored
      "default_core" is used by jevents.py for json events' PMU name when
      none is specified. On x86 the "default_core" is typically the PMU
      "cpu". When creating an alias see if the event's PMU name is
      "default_core" in which case don't record it. This means in places
      like "perf list" the PMU's name will be used in its place.
      
      Before:
      
      $ perf list --details
        ...
        cache:
          l1d.replacement
               [Counts the number of cache lines replaced in L1 data cache]
                default_core/event=0x51,period=0x186a3,umask=0x1/
        ...
      
      After:
      
      $ perf list --details
        ...
        cache:
          l1d.replacement
               [Counts the number of cache lines replaced in L1 data cache. Unit: cpu]
                cpu/event=0x51,period=0x186a3,umask=0x1/
        ...
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      39aa4ff6
    • Ian Rogers's avatar
      perf list: Add tracepoint encoding to detailed output · 525615ef
      Ian Rogers authored
      The tracepoint id holds the config value and is probed in determining
      what an event is. Add reading of the id so that we can display the
      event encoding as:
      
        $ perf list --details
        ...
          alarmtimer:alarmtimer_cancel                       [Tracepoint event]
                tracepoint/config=0x18c/
        ...
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20240308001915.4060155-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      525615ef
    • Arnaldo Carvalho de Melo's avatar
      perf beauty: Introduce scrape script for 'clone' syscall 'flags' argument · 2316ef58
      Arnaldo Carvalho de Melo authored
      The 'clone' beautifier was using the first variation on producing a
      string representation for a binary flag, one that used the copy of
      uapi/linux/sched.h with preprocessor tricks that had to be updated
      every time a new flag was introduced.
      
      Use the more recent scrape script + strarray + strarray__scnprintf_flags() combo.
      
        $ tools/perf/trace/beauty/clone.sh | head -5
        static const char *clone_flags[] = {
        	[ilog2(0x00000100) + 1] = "VM",
        	[ilog2(0x00000200) + 1] = "FS",
        	[ilog2(0x00000400) + 1] = "FILES",
        	[ilog2(0x00000800) + 1] = "SIGHAND",
        $
      
      Now we can move uapi/linux/sched.h from tools/include/, which is used
      for building perf, to the scrape-only directory
      tools/perf/trace/beauty/include.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZfnULIn3XKDq0bpc@x1
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2316ef58
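      To illustrate how a table shaped like the clone.sh output above can be
      turned into a string, here is a minimal sketch.  flags_to_str() is a
      hypothetical stand-in for strarray__scnprintf_flags(), whose real
      signature and behaviour may differ, and the ilog2() values are written
      out by hand instead of using the macro.

```c
#include <stdio.h>
#include <stddef.h>

/* Same shape as the generated table: the index is ilog2(flag) + 1. */
static const char *clone_flags[] = {
	[8 + 1]  = "VM",	/* 0x00000100 */
	[9 + 1]  = "FS",	/* 0x00000200 */
	[10 + 1] = "FILES",	/* 0x00000400 */
	[11 + 1] = "SIGHAND",	/* 0x00000800 */
};

/* Hypothetical helper: print set bits as "A|B", unknown bits as hex. */
static size_t flags_to_str(char *bf, size_t size, unsigned long flags)
{
	size_t nr = sizeof(clone_flags) / sizeof(clone_flags[0]);
	size_t printed = 0;
	unsigned int bit;

	if (size == 0)
		return 0;
	bf[0] = '\0';

	for (bit = 0; bit < sizeof(flags) * 8 && printed < size - 1; bit++) {
		unsigned long mask = 1UL << bit;
		const char *name = bit + 1 < nr ? clone_flags[bit + 1] : NULL;
		int n;

		if (!(flags & mask))
			continue;

		if (name)
			n = snprintf(bf + printed, size - printed, "%s%s",
				     printed ? "|" : "", name);
		else
			n = snprintf(bf + printed, size - printed, "%s%#lx",
				     printed ? "|" : "", mask);
		if (n < 0)
			break;
		/* snprintf() reports the untruncated length; clamp it. */
		printed += (size_t)n < size - printed ?
			   (size_t)n : size - printed - 1;
	}
	return printed;
}
```

      The point of the table-driven form is that a new flag only needs a new
      table entry, which the scrape script generates from the header.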
    • Namhyung Kim's avatar
      perf annotate-data: Do not retry for invalid types · bd62de08
      Namhyung Kim authored
      In some cases, it can find a type or location info (for a per-cpu
      variable) but cannot match it because of an invalid offset or missing
      global information.  In those cases, it's meaningless to go to the
      outer scope and retry because there will be no additional information.
      
      Let's change the return type of find_matching_type() and bail out if it
      returns -1 for those cases.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-24-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bd62de08
    • Namhyung Kim's avatar
      perf annotate-data: Add a cache for global variable types · 55ee3d00
      Namhyung Kim authored
      Global variable types are often looked up from many different places.
      Let's add a cache for them to reduce duplicate DWARF accesses.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-23-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55ee3d00
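      The idea behind "perf annotate-data: Add a cache for global variable
      types" can be pictured with a tiny address-keyed cache.  This is only
      a sketch under assumptions: the struct and function names are
      hypothetical, the type is a plain string instead of a DWARF die, and a
      fixed array with a linear scan is used for brevity; the data structure
      used in perf itself is not shown in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define GVAR_CACHE_MAX 64

struct gvar_entry {
	uint64_t addr;		/* address of the global variable */
	const char *type;	/* stand-in for the resolved type */
};

struct gvar_cache {
	struct gvar_entry entries[GVAR_CACHE_MAX];
	size_t nr;
};

/* Return the cached type for 'addr', or NULL on a cache miss. */
static const char *gvar_cache_find(const struct gvar_cache *c, uint64_t addr)
{
	size_t i;

	for (i = 0; i < c->nr; i++)
		if (c->entries[i].addr == addr)
			return c->entries[i].type;
	return NULL;
}

/* Remember a lookup result; returns -1 when the cache is full. */
static int gvar_cache_add(struct gvar_cache *c, uint64_t addr,
			  const char *type)
{
	if (c->nr == GVAR_CACHE_MAX)
		return -1;
	c->entries[c->nr].addr = addr;
	c->entries[c->nr].type = type;
	c->nr++;
	return 0;
}
```

      A caller would try gvar_cache_find() first and fall back to the DWARF
      search only on a miss, adding the result afterwards.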
    • Namhyung Kim's avatar
      perf annotate-data: Add stack canary type · b3c95109
      Namhyung Kim authored
      When the stack protector is enabled, the compiler generates code to
      check for stack overflow at runtime using a special value called the
      'stack canary'.  On x86_64, GCC hard-codes the stack canary as %gs:40.
      
      While there's a definition of fixed_percpu_data in asm/processor.h,
      it seems that the header is not included everywhere, so in many places
      the type info cannot be found.  As it's in a well-known location (at
      %gs:40), let's add a pseudo stack canary type to handle it specially.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-22-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b3c95109
    • Namhyung Kim's avatar
      perf annotate-data: Handle ADD instructions · eb9190af
      Namhyung Kim authored
      There are different patterns for percpu variable access using a constant
      value added to the base.
      
        2aeb:  mov    -0x7da0f7e0(,%rax,8),%r14  # r14 = __per_cpu_offset[cpu]
        2af3:  mov    $0x34740,%rax              # rax = address of runqueues
      * 2afa:  add    %rax,%r14                  # r14 = &per_cpu(runqueues, cpu)
        2afd:  cmpl   $0x0,0x10(%r14)            # cpu_rq(cpu)->has_blocked_load
        2b03:  je     0x2b36
      
      After the first instruction, r14 holds the __per_cpu_offset of the CPU.
      Then rax is loaded with an immediate value, which is added to r14 to
      calculate the address of a per-cpu variable.  So it needs to track
      immediate values and ADD instructions.

      A similar but slightly different case uses "this_cpu_off" instead of
      "__per_cpu_offset" for the current CPU.  This time the variable address
      comes with PC-relative addressing.
      
        89:  mov     $0x34740,%rax                # rax = address of runqueues
      * 90:  add     %gs:0x7f015f60(%rip),%rax    # 19a78  <this_cpu_off>
        98:  incl    0xd8c(%rax)                  # cpu_rq(cpu)->sched_count
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-21-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb9190af
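      The two ADD patterns from "perf annotate-data: Handle ADD
      instructions" can be modelled with a very small register state
      machine.  This is a sketch only: the enum, struct and function names
      are hypothetical and the real type state in perf carries DWARF type
      information rather than a bare address.

```c
#include <stdint.h>

enum reg_kind {
	REG_UNKNOWN,
	REG_CONST,		/* holds an immediate value */
	REG_PERCPU_BASE,	/* holds a per-cpu offset (base) */
	REG_PERCPU_VAR,		/* holds the address of a per-cpu variable */
};

struct reg {
	enum reg_kind kind;
	uint64_t value;		/* immediate or variable address */
};

/* mov $imm, %dst */
static void track_mov_imm(struct reg *dst, uint64_t imm)
{
	dst->kind = REG_CONST;
	dst->value = imm;
}

/* add %src, %dst (src may also model a %gs:this_cpu_off operand) */
static void track_add(const struct reg *src, struct reg *dst)
{
	if (src->kind == REG_CONST && dst->kind == REG_PERCPU_BASE) {
		/* base + constant: dst now points to the variable */
		dst->kind = REG_PERCPU_VAR;
		dst->value = src->value;
	} else if (src->kind == REG_PERCPU_BASE && dst->kind == REG_CONST) {
		/* constant + base: value already holds the variable address */
		dst->kind = REG_PERCPU_VAR;
	} else {
		/* anything else is not tracked in this sketch */
		dst->kind = REG_UNKNOWN;
		dst->value = 0;
	}
}
```

      With the first listing above, r14 starts as REG_PERCPU_BASE, rax gets
      the constant 0x34740, and the ADD leaves r14 as a per-cpu variable at
      0x34740.  The second listing is the mirrored case with rax as the
      destination.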
    • Namhyung Kim's avatar
      perf annotate-data: Support general per-cpu access · f5b09592
      Namhyung Kim authored
      This supports per-cpu variable accesses, which often lack a matching
      DWARF entry.  For some reason, I sometimes cannot find debug info for
      per-cpu variables.  Such accesses use a more complex pattern to
      calculate the address of the per-cpu variable, like below.
      
        2b7d:  mov    -0x1e0(%rbp),%rax           ; rax = cpu
        2b84:  mov    -0x7da0f7e0(,%rax,8),%rcx   ; rcx = __per_cpu_offset[cpu]
      * 2b8c:  mov    0x34870(%rcx),%rax          ; *(__per_cpu_offset[cpu] + 0x34870)
      
      Let's assume the rax register is loaded with a CPU number at 2b7d.  The
      next instruction gets the per-cpu offset for that CPU.  The displacement
      -0x7da0f7e0 is 0xffffffff825f0820 as a u64, which is the address of the
      '__per_cpu_offset' array on my system.  So it gets the actual offset
      of that CPU's per-cpu region and saves it to the rcx register.

      Then, at 2b8c, accesses using rcx can be handled the same as global
      variable accesses.  To handle this case, it should check whether the
      displacement of the instruction matches the address of
      '__per_cpu_offset'.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-20-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f5b09592
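      The displacement arithmetic in "perf annotate-data: Support general
      per-cpu access" is plain sign extension of a 32-bit value to 64 bits.
      A minimal sketch, with hypothetical helper names and the symbol
      address taken from the commit message (it will differ on other
      systems):

```c
#include <stdint.h>

/* Sign-extend a 32-bit x86 displacement to a 64-bit address. */
static uint64_t disp32_to_addr(int32_t disp)
{
	return (uint64_t)(int64_t)disp;
}

/* Hypothetical check: does the displacement refer to this symbol? */
static int disp_matches_symbol(int32_t disp, uint64_t sym_addr)
{
	return disp32_to_addr(disp) == sym_addr;
}
```

      For the listing above, disp32_to_addr(-0x7da0f7e0) gives
      0xffffffff825f0820, which is then compared with the address of
      '__per_cpu_offset'.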
    • Namhyung Kim's avatar
      perf annotate-data: Track instructions with a this-cpu variable · ad62edbf
      Namhyung Kim authored
      Like global variables, this-cpu variables should be tracked
      correctly.  Factor out get_global_var_type() to handle both global
      and per-cpu (for this cpu) variables in the same manner.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-19-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ad62edbf
    • Namhyung Kim's avatar
      perf annotate-data: Handle this-cpu variables in kernel · 02e17ca9
      Namhyung Kim authored
      On x86, the kernel gets the current task using the current macro like
      below:
      
        #define current  get_current()
      
        static __always_inline struct task_struct *get_current(void)
        {
            return this_cpu_read_stable(pcpu_hot.current_task);
        }
      
      So it returns the current_task field of struct pcpu_hot which is the
      first member.  On my build, it's located at 0x32940.
      
        $ nm vmlinux | grep pcpu_hot
        0000000000032940 D pcpu_hot
      
      And the current macro generates the instructions like below:
      
        mov  %gs:0x32940, %rcx
      
      So the %gs segment register points to the beginning of the per-cpu
      region of this CPU, and the constant offset selects the variable.

      Let's update the instruction location info to have a segment register
      and handle %gs in the kernel to look up a global variable.  Treat it as
      a global variable by changing the register number to DWARF_REG_PC.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-18-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      02e17ca9
    • Namhyung Kim's avatar
      perf annotate: Parse x86 segment register location · cbaf89a8
      Namhyung Kim authored
      Add a segment field to struct annotated_insn_loc and save it for
      segment-based addressing like %gs:0x28.  For simplicity, it handles
      only the %gs register for now.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-17-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cbaf89a8
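      Parsing an operand such as %gs:0x28, as described in "perf annotate:
      Parse x86 segment register location", could look roughly like the
      sketch below.  struct seg_loc, the enum and parse_segment_operand()
      are hypothetical names for illustration; they are not the fields of
      struct annotated_insn_loc.

```c
#include <stdlib.h>
#include <string.h>

enum seg_reg { SEG_NONE = 0, SEG_GS };	/* hypothetical names */

struct seg_loc {
	enum seg_reg segment;
	long offset;
};

/* Parse "%gs:<offset>" operands.  Returns 0 on success, -1 otherwise. */
static int parse_segment_operand(const char *op, struct seg_loc *loc)
{
	char *end;

	loc->segment = SEG_NONE;
	loc->offset = 0;

	if (strncmp(op, "%gs:", 4) != 0)
		return -1;	/* only %gs is handled, as in the commit */

	/* Base 0 accepts both "40" and "0x28". */
	loc->offset = strtol(op + 4, &end, 0);
	if (end == op + 4)
		return -1;	/* no number after the colon */

	loc->segment = SEG_GS;
	return 0;
}
```

      Parsing stops at the first character that is not part of the number,
      so a trailing "(%rip)" is ignored by this sketch.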
    • Namhyung Kim's avatar
      perf annotate-data: Check register state for type · bdc80ace
      Namhyung Kim authored
      As instruction tracking updates the type state for each register, check
      the final type info for the target register at the given instruction.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-16-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bdc80ace
    • Namhyung Kim's avatar
      perf annotate-data: Implement instruction tracking · eb8a55e0
      Namhyung Kim authored
      If it failed to find a variable for the location directly, it might be
      due to a missing variable in the source code.  For example, accessing
      pointer variables in a chain can result in the case like below:
      
        struct foo *foo = ...;
      
        int i = foo->bar->baz;
      
      The DWARF debug information is created for each variable so it'd have
      one for 'foo'.  But there's no variable for 'foo->bar' and then it
      cannot know the type of 'bar' and 'baz'.
      
      The above source code can be compiled to the following x86 instructions:
      
        mov  0x8(%rax), %rcx
        mov  0x4(%rcx), %rdx   <=== PMU sample
        mov  %rdx, -4(%rbp)
      
      Let's say 'foo' is located in %rax, so it holds a pointer to struct
      foo.  But the perf sample is captured at the second instruction and
      there is no variable or type info for %rcx.

      It'd be great if the compiler could generate debug info for %rcx, but
      we should handle it on our side.  So this patch implements the logic to
      iterate instructions and update the type table for each location.
      
      As it already collected a list of scopes including the target
      instruction, we can use it to construct the type table smartly.
      
        +----------------  scope[0] subprogram
        |
        | +--------------  scope[1] lexical_block
        | |
        | | +------------  scope[2] inlined_subroutine
        | | |
        | | | +----------  scope[3] inlined_subroutine
        | | | |
        | | | | +--------  scope[4] lexical_block
        | | | | |
        | | | | |     ***  target instruction
        ...
      
      Imagine the target instruction has 5 scopes; each scope will have its
      own variables and parameters.  Then it can start with the innermost
      scope (4).  So it'd search the shortest path from the start of scope[4]
      to the target address and build a list of basic blocks.  Then it
      iterates the basic blocks with the variables in the scope and updates
      the table.  If it finds a type at the target instruction, it returns it.

      Otherwise, it moves to the upper scope[3].  Now it'd search the shortest
      path from the start of scope[3] to the start of scope[4].  Then it
      connects it to the existing basic block list.  Then it'd iterate the
      blocks with variables for both scopes.  It can repeat this until it
      finds a type at the target instruction or reaches the top scope[0].
      
      As the basic blocks contain the shortest path, it won't worry about
      branches and can update the table simply.
      
      The final check will be done by find_matching_type() in the next patch.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-15-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb8a55e0
    • Namhyung Kim's avatar
      perf annotate-data: Handle call instructions · cffb7910
      Namhyung Kim authored
      When updating instruction states, the call instruction should play a
      role since it changes the register states.  For simplicity, mark some
      registers as caller-saved registers (should be arch-dependent), and
      invalidate them all after a function call.
      
      If the function returns something, the designated register (ret_reg)
      will have the type info.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-14-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cffb7910
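      The rule from "perf annotate-data: Handle call instructions" can be
      sketched as below.  The table layout and function names are
      hypothetical, and the caller-saved set is the x86-64 System V one
      expressed in DWARF register numbers (rax=0, rdx=1, rcx=2, rsi=4,
      rdi=5, r8-r11), chosen here as an assumption for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_REGS 16
#define RET_REG 0	/* rax in the x86-64 DWARF register numbering */

struct reg_state {
	bool ok;		/* true if 'type' is valid */
	bool caller_saved;	/* clobbered by a function call */
	const char *type;	/* stand-in for a DWARF type die */
};

struct type_table {
	struct reg_state regs[NR_REGS];
};

static void init_table(struct type_table *t)
{
	/* rax, rdx, rcx, rsi, rdi, r8-r11 (assumed caller-saved set) */
	static const int caller_saved[] = { 0, 1, 2, 4, 5, 8, 9, 10, 11 };
	size_t i;

	for (i = 0; i < NR_REGS; i++)
		t->regs[i] = (struct reg_state){ false, false, NULL };
	for (i = 0; i < sizeof(caller_saved) / sizeof(caller_saved[0]); i++)
		t->regs[caller_saved[i]].caller_saved = true;
}

/* Update the table for a call; 'ret_type' is NULL for void functions. */
static void handle_call(struct type_table *t, const char *ret_type)
{
	size_t i;

	/* The callee may clobber every caller-saved register. */
	for (i = 0; i < NR_REGS; i++) {
		if (t->regs[i].caller_saved) {
			t->regs[i].ok = false;
			t->regs[i].type = NULL;
		}
	}

	/* The return value, if any, lands in the designated register. */
	if (ret_type) {
		t->regs[RET_REG].ok = true;
		t->regs[RET_REG].type = ret_type;
	}
}
```

      Callee-saved registers such as rbx (DWARF register 3) keep their type
      across the call in this model.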
    • Namhyung Kim's avatar
      perf annotate-data: Handle global variable access · 0a41e5d6
      Namhyung Kim authored
      When updating the instruction states, it also needs to handle global
      variable accesses.  Same as it does for PC-relative addressing, it can
      look up the type by address (if it's defined in the same file), or by
      name after finding the symbol by address (for declarations).
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-13-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0a41e5d6
    • Namhyung Kim's avatar
      perf annotate-data: Add get_global_var_type() · 1ebb5e17
      Namhyung Kim authored
      Global variable accesses will be common when execution is tracked in
      later patches.  Factor out the common code into a function for later
      use.
      
      It adds thread and cpumode to struct data_loc_info to find (global)
      symbols if needed.  Also remove var_name as it's retrieved in the
      helper function.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-12-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1ebb5e17
    • Namhyung Kim's avatar
      perf annotate-data: Add update_insn_state() · 4f903455
      Namhyung Kim authored
      The update_insn_state() function updates the type state table after
      processing each instruction.  For now, it handles the MOV instruction
      (on x86), transferring type info from the source location to the target.

      The location can be a register or a stack slot.  Check carefully when
      a memory reference happens and fetch the type correctly.  It basically
      ignores writes to memory since they don't change the type info.  One
      exception is writes to (new) stack slots for register spilling.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-11-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4f903455
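      The MOV handling described in "perf annotate-data: Add
      update_insn_state()" can be pictured with the simplified model below.
      It is a sketch under assumptions: the layout of struct type_state
      here, the loc encoding and update_mov() are hypothetical, and types
      are plain strings rather than DWARF dies.

```c
#include <stddef.h>

#define NR_REGS  16
#define NR_SLOTS 8

enum loc_kind { LOC_REG, LOC_STACK, LOC_MEM };

struct loc {
	enum loc_kind kind;
	int idx;		/* register number or stack slot index */
};

struct type_state {
	const char *regs[NR_REGS];	/* NULL means "type unknown" */
	const char *stack[NR_SLOTS];
};

static const char *loc_type(const struct type_state *s, struct loc l)
{
	switch (l.kind) {
	case LOC_REG:
		return s->regs[l.idx];
	case LOC_STACK:
		return s->stack[l.idx];
	default:
		return NULL;	/* other memory would need a DWARF lookup */
	}
}

/* mov src, dst: transfer the type from the source to the target. */
static void update_mov(struct type_state *s, struct loc src, struct loc dst)
{
	const char *type = loc_type(s, src);

	switch (dst.kind) {
	case LOC_REG:
		s->regs[dst.idx] = type;	/* may become unknown */
		break;
	case LOC_STACK:
		if (src.kind == LOC_REG)	/* register spill */
			s->stack[dst.idx] = type;
		break;
	case LOC_MEM:
		break;	/* plain memory writes don't change type info */
	}
}
```

      Loading from untracked memory clears the target register's type in
      this sketch, which mirrors the need to "fetch the type correctly"
      before trusting it.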
    • Namhyung Kim's avatar
      perf annotate-data: Maintain variable type info · 06b2ce75
      Namhyung Kim authored
      As it collected basic block and variable information in each scope, it
      now can build a state table to find a matching variable at the location.

      The struct type_state keeps the type info saved in each register
      and stack slot.  The update_var_state() updates the table when it finds
      variables at the current address.  It expects die_collect_vars() to
      have filled a list of variables with type info and starting address.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-10-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      06b2ce75
    • Namhyung Kim's avatar
      perf annotate-data: Add debug messages · 90429524
      Namhyung Kim authored
      Add a new debug option "type-profile" to enable the detailed info during
      the type analysis especially for instruction tracking.  You can use this
      before the command name like 'report' or 'annotate'.
      
        $ perf --debug type-profile annotate --data-type
      
      Committer testing:
      
      First get some memory events:
      
        $ perf mem record ls
      
      Then, without data-type profiling debug:
      
        $ perf annotate --data-type | head
        Annotate type: 'struct rtld_global' in /usr/lib64/ld-linux-x86-64.so.2 (1 samples):
        ============================================================================
            samples     offset       size  field
                  1          0       4336  struct rtld_global	 {
                  0          0          0      struct link_namespaces*	_dl_ns;
                  0       2560          8      size_t	_dl_nns;
                  0       2568         40      __rtld_lock_recursive_t	_dl_load_lock {
                  0       2568         40          pthread_mutex_t	mutex {
                  0       2568         40              struct __pthread_mutex_s	__data {
                  0       2568          4                  int	__lock;
        $
      
      And with data-type profiling debug enabled:
      
        $ perf --debug type-profile annotate --data-type | head
        -----------------------------------------------------------
        find_data_type_die [1e67] for reg13873052 (PC) offset=0x150e2 in dl_main
        CU die offset: 0x29cd3
        found PC-rel by addr=0x34020 offset=0x20
        -----------------------------------------------------------
        find_data_type_die [2e] for reg12 offset=0 in __GI___readdir64
        CU die offset: 0x137a45
        frame base: cfa=1 fbreg=-1
        found "__futex" in scope=2/2 (die: 0x137ad5) 0(reg12) type=int (die:2a)
        -----------------------------------------------------------
        find_data_type_die [52] for reg5 offset=0 in __memmove_avx_unaligned_erms
        CU die offset: 0x1124ed
        no variable found
        Annotate type: 'struct rtld_global' in /usr/lib64/ld-linux-x86-64.so.2 (1 samples):
        ============================================================================
            samples     offset       size  field
                  1          0       4336  struct rtld_global	 {
                  0          0          0      struct link_namespaces*	_dl_ns;
                  0       2560          8      size_t	_dl_nns;
                  0       2568         40      __rtld_lock_recursive_t	_dl_load_lock {
                  0       2568         40          pthread_mutex_t	mutex {
                  0       2568         40              struct __pthread_mutex_s	__data {
                  0       2568          4                  int	__lock;
        $
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-9-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      90429524
    • Namhyung Kim's avatar
      perf annotate: Add annotate_get_basic_blocks() · 5cdd3fd7
      Namhyung Kim authored
      annotate_get_basic_blocks() finds a list of basic blocks from the
      source instruction to the destination instruction in a function.
      
      It'll be used to find variables in a scope.  Use BFS (Breadth First
      Search) to find a shortest path to carry the variable/register state
      minimally.
      
      Also change find_disasm_line() to be used in annotate_get_basic_blocks()
      and add 'allow_update' argument to control if it can update the IP.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-8-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5cdd3fd7
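      The BFS step from "perf annotate: Add annotate_get_basic_blocks()" can
      be sketched on a toy control-flow graph.  Everything here is an
      assumption for illustration: blocks are plain array indexes with at
      most two successors, and bfs_path() is a hypothetical function, not
      the perf implementation.

```c
#define MAX_BLOCKS 16

struct block {
	int succ[2];	/* fall-through and branch target, -1 if absent */
};

/*
 * Fill 'path' with the block indexes of a shortest path from src to dst
 * and return its length, or 0 if dst is unreachable.
 */
static int bfs_path(const struct block *blocks, int nr, int src, int dst,
		    int *path)
{
	int parent[MAX_BLOCKS], queue[MAX_BLOCKS];
	int head = 0, tail = 0, len = 0, cur, i;

	if (nr > MAX_BLOCKS || src < 0 || dst < 0 || src >= nr || dst >= nr)
		return 0;

	for (i = 0; i < nr; i++)
		parent[i] = -1;

	parent[src] = src;	/* mark the start as visited */
	queue[tail++] = src;

	while (head < tail) {
		cur = queue[head++];
		if (cur == dst)
			break;
		for (i = 0; i < 2; i++) {
			int next = blocks[cur].succ[i];

			if (next < 0 || next >= nr || parent[next] != -1)
				continue;	/* absent or already seen */
			parent[next] = cur;
			queue[tail++] = next;
		}
	}

	if (parent[dst] == -1)
		return 0;

	/* Walk back from dst to src, then reverse into 'path'. */
	for (cur = dst; ; cur = parent[cur]) {
		queue[len++] = cur;
		if (cur == src)
			break;
	}
	for (i = 0; i < len; i++)
		path[i] = queue[len - 1 - i];
	return len;
}
```

      Because BFS visits blocks in order of distance, the first time the
      destination is reached the recorded parents give a path with the
      fewest blocks, so the least state has to be carried along it.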
    • Namhyung Kim's avatar
      perf annotate-data: Introduce 'struct data_loc_info' · a3f4d5b5
      Namhyung Kim authored
      find_data_type() needs a lot of information to describe the location
      of the data.  Add the new 'struct data_loc_info' to pass that
      information at once.
      
      No functional changes intended.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-7-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a3f4d5b5
    • Namhyung Kim's avatar
      perf map: Add map__objdump_2rip() · 52a09bc2
      Namhyung Kim authored
      Sometimes we want to convert an address in objdump output to a
      map-relative address to match it with sample data.  Let's add
      map__objdump_2rip() for that.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      52a09bc2
    • Namhyung Kim's avatar
      perf dwarf-aux: Add die_find_func_rettype() · 7a838c2f
      Namhyung Kim authored
      die_find_func_rettype() finds a debug entry for the given function
      name and sets the type information of the return value.  By
      convention, it returns the pointer to the type die (which should be
      the same as the given mem_die argument) if found, or NULL otherwise.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7a838c2f
    • Namhyung Kim's avatar
      perf dwarf-aux: Handle type transfer for memory access · 437683a9
      Namhyung Kim authored
      We want to track type states as instructions are executed.  Each
      instruction can access compound types like struct or union and
      load/store their members to a different location.
      
      die_deref_ptr_type() finds the type of a memory access made through
      a pointer variable.  If the pointer points to a compound type like a
      struct, the target memory is a member of that struct.  The access
      carries an offset indicating which member it refers to.  Let's follow
      the DWARF info to figure out the type of the pointer target.
      
      For example, say we have the following code.
      
        struct foo {
          int a;
          int b;
        };
      
        struct foo *p = malloc(sizeof(*p));
        p->b = 0;
      
      The last pointer access should produce x86 asm like below:
      
        movl  $0x0, 4(%rbx)
      
      We know the %rbx register holds a pointer to struct foo, so offset 4
      should resolve to the debug info of member 'b'.
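
      The offset-to-member lookup described above can be sketched with
      a small table built from offsetof().  The toy_member descriptor
      and toy_member_at() are hypothetical; perf takes this
      information from DWARF rather than from a hand-written table.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
	int a;
	int b;
};

/* Hypothetical member descriptor standing in for DWARF member info. */
struct toy_member {
	const char *name;
	size_t offset;
	size_t size;
};

static const struct toy_member foo_members[] = {
	{ "a", offsetof(struct foo, a), sizeof(int) },
	{ "b", offsetof(struct foo, b), sizeof(int) },
};

/* Return the member whose byte range covers 'offset', or NULL if none. */
static const struct toy_member *toy_member_at(const struct toy_member *members,
					      size_t nr, size_t offset)
{
	assert(members != NULL);

	for (size_t i = 0; i < nr; i++) {
		if (offset >= members[i].offset &&
		    offset < members[i].offset + members[i].size)
			return &members[i];
	}
	return NULL;
}
```

      Matching on a byte range rather than on an exact offset also
      covers an access that lands in the middle of a member.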
      
      Variables of compound types can also be accessed directly, without
      a pointer.  die_get_member_type() handles such a case.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20240319055115.4063940-4-namhyung@kernel.org
      [ Check if die_get_real_type() returned NULL ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      437683a9