1. 26 Jul, 2024 2 commits
  2. 17 Jul, 2024 6 commits
  3. 12 Jul, 2024 9 commits
    • perf trace: Fix iteration of syscall ids in syscalltbl->entries · 7a2fb561
      Howard Chu authored
      This bug was found while implementing pretty-printing for the
      landlock_add_rule system call. I decided to send this patch separately
      because it is a serious bug that should be fixed fast.
      
      I wrote a test program to do landlock_add_rule syscall in a loop,
      yet perf trace -e landlock_add_rule freezes, giving no output.
      
      This bug was introduced by a misunderstanding of the variable "key"
      below:
      ```
      for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
      	struct syscall *sc = trace__syscall_info(trace, NULL, key);
      	...
      }
      ```
      The code above looks right at first glance, but syscalltbl.c contains
      these lines:
      
      ```
      for (i = 0; i <= syscalltbl_native_max_id; ++i)
      	if (syscalltbl_native[i])
      		++nr_entries;
      
      entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
      ...
      
      for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
      	if (syscalltbl_native[i]) {
      		entries[j].name = syscalltbl_native[i];
      		entries[j].id = i;
      		++j;
      	}
      }
      ```
      
      This means that key is merely an index used to traverse the syscall
      table, not the actual syscall id of a particular syscall.

      So if one passes key to trace__syscall_info(trace, NULL, key), only
      ids below trace->sctbl->syscalls.nr_entries are covered, because key
      stops there. On my x86_64 machine this number is 373, so all the
      remaining syscalls are neglected: in my case everything after `rseq`,
      which is the last syscall whose id is lower than 373.

      From tools/perf/arch/x86/include/generated/asm/syscalls_64.c:
      ```
      	...
      	[334] = "rseq",
      	[424] = "pidfd_send_signal",
      	...
      ```
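
To illustrate the mismatch, here is a minimal, self-contained sketch. The names `struct entry`, `native`, `build_entries`, `max_id_by_index` and `max_id_by_entry` are hypothetical stand-ins and not the perf code itself; the tiny table only mimics the sparse layout of syscalls_64.c.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an entry of the compacted table. */
struct entry {
	const char *name;
	int id;			/* the real syscall id */
};

/* Sparse, id-indexed table: unused ids are NULL, as in syscalls_64.c. */
static const char *const native[] = {
	[0] = "read",
	[1] = "write",
	[5] = "fstat",
	[9] = "mmap",
};
#define NATIVE_MAX_ID 9

/* Compact the sparse table into entries[]; returns the number of entries. */
size_t build_entries(struct entry *entries)
{
	size_t j = 0;

	for (int i = 0; i <= NATIVE_MAX_ID; ++i) {
		if (native[i]) {
			entries[j].name = native[i];
			entries[j].id = i;
			++j;
		}
	}
	return j;
}

/* Buggy pattern: the loop counter itself is treated as the syscall id. */
int max_id_by_index(size_t nr_entries)
{
	int max = -1;

	for (size_t key = 0; key < nr_entries; ++key)
		max = (int)key;
	return max;
}

/* Intended pattern: the loop counter is an index, the id comes from the entry. */
int max_id_by_entry(const struct entry *entries, size_t nr_entries)
{
	int max = -1;

	for (size_t idx = 0; idx < nr_entries; ++idx)
		if (entries[idx].id > max)
			max = entries[idx].id;
	return max;
}
```

With this four-entry table, the index-based loop never gets past id 3, while the entry-based loop reaches id 9. This mirrors how ids at or above nr_entries were skipped.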
      
      The reason perf trace still works for the syscalls it does visit,
      even though key is misused, is that trace__syscall_info(trace, NULL, key)
      uses key as the id to look up trace->syscalls.table[id]. The struct
      syscall it returns therefore has an id equal to key, so the later
      bpf_prog matching is still correct.
      
      After fixing this bug, I can do perf trace on 38 more syscalls, and
      because more syscalls are visible, we get 8 more syscalls that can be
      augmented.
      
      before:
      
      perf $ perf trace -vv --max-events=1 |& grep Reusing
      Reusing "open" BPF sys_enter augmenter for "stat"
      Reusing "open" BPF sys_enter augmenter for "lstat"
      Reusing "open" BPF sys_enter augmenter for "access"
      Reusing "connect" BPF sys_enter augmenter for "accept"
      Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
      Reusing "connect" BPF sys_enter augmenter for "bind"
      Reusing "connect" BPF sys_enter augmenter for "getsockname"
      Reusing "connect" BPF sys_enter augmenter for "getpeername"
      Reusing "open" BPF sys_enter augmenter for "execve"
      Reusing "open" BPF sys_enter augmenter for "truncate"
      Reusing "open" BPF sys_enter augmenter for "chdir"
      Reusing "open" BPF sys_enter augmenter for "mkdir"
      Reusing "open" BPF sys_enter augmenter for "rmdir"
      Reusing "open" BPF sys_enter augmenter for "creat"
      Reusing "open" BPF sys_enter augmenter for "link"
      Reusing "open" BPF sys_enter augmenter for "unlink"
      Reusing "open" BPF sys_enter augmenter for "symlink"
      Reusing "open" BPF sys_enter augmenter for "readlink"
      Reusing "open" BPF sys_enter augmenter for "chmod"
      Reusing "open" BPF sys_enter augmenter for "chown"
      Reusing "open" BPF sys_enter augmenter for "lchown"
      Reusing "open" BPF sys_enter augmenter for "mknod"
      Reusing "open" BPF sys_enter augmenter for "statfs"
      Reusing "open" BPF sys_enter augmenter for "pivot_root"
      Reusing "open" BPF sys_enter augmenter for "chroot"
      Reusing "open" BPF sys_enter augmenter for "acct"
      Reusing "open" BPF sys_enter augmenter for "swapon"
      Reusing "open" BPF sys_enter augmenter for "swapoff"
      Reusing "open" BPF sys_enter augmenter for "delete_module"
      Reusing "open" BPF sys_enter augmenter for "setxattr"
      Reusing "open" BPF sys_enter augmenter for "lsetxattr"
      Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
      Reusing "open" BPF sys_enter augmenter for "getxattr"
      Reusing "open" BPF sys_enter augmenter for "lgetxattr"
      Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
      Reusing "open" BPF sys_enter augmenter for "listxattr"
      Reusing "open" BPF sys_enter augmenter for "llistxattr"
      Reusing "open" BPF sys_enter augmenter for "removexattr"
      Reusing "open" BPF sys_enter augmenter for "lremovexattr"
      Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
      Reusing "open" BPF sys_enter augmenter for "mq_open"
      Reusing "open" BPF sys_enter augmenter for "mq_unlink"
      Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
      Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
      Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
      Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
      Reusing "open" BPF sys_enter augmenter for "symlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
      Reusing "connect" BPF sys_enter augmenter for "accept4"
      Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
      Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
      Reusing "open" BPF sys_enter augmenter for "memfd_create"
      Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "statx"
      
      after
      
      perf $ perf trace -vv --max-events=1 |& grep Reusing
      Reusing "open" BPF sys_enter augmenter for "stat"
      Reusing "open" BPF sys_enter augmenter for "lstat"
      Reusing "open" BPF sys_enter augmenter for "access"
      Reusing "connect" BPF sys_enter augmenter for "accept"
      Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
      Reusing "connect" BPF sys_enter augmenter for "bind"
      Reusing "connect" BPF sys_enter augmenter for "getsockname"
      Reusing "connect" BPF sys_enter augmenter for "getpeername"
      Reusing "open" BPF sys_enter augmenter for "execve"
      Reusing "open" BPF sys_enter augmenter for "truncate"
      Reusing "open" BPF sys_enter augmenter for "chdir"
      Reusing "open" BPF sys_enter augmenter for "mkdir"
      Reusing "open" BPF sys_enter augmenter for "rmdir"
      Reusing "open" BPF sys_enter augmenter for "creat"
      Reusing "open" BPF sys_enter augmenter for "link"
      Reusing "open" BPF sys_enter augmenter for "unlink"
      Reusing "open" BPF sys_enter augmenter for "symlink"
      Reusing "open" BPF sys_enter augmenter for "readlink"
      Reusing "open" BPF sys_enter augmenter for "chmod"
      Reusing "open" BPF sys_enter augmenter for "chown"
      Reusing "open" BPF sys_enter augmenter for "lchown"
      Reusing "open" BPF sys_enter augmenter for "mknod"
      Reusing "open" BPF sys_enter augmenter for "statfs"
      Reusing "open" BPF sys_enter augmenter for "pivot_root"
      Reusing "open" BPF sys_enter augmenter for "chroot"
      Reusing "open" BPF sys_enter augmenter for "acct"
      Reusing "open" BPF sys_enter augmenter for "swapon"
      Reusing "open" BPF sys_enter augmenter for "swapoff"
      Reusing "open" BPF sys_enter augmenter for "delete_module"
      Reusing "open" BPF sys_enter augmenter for "setxattr"
      Reusing "open" BPF sys_enter augmenter for "lsetxattr"
      Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
      Reusing "open" BPF sys_enter augmenter for "getxattr"
      Reusing "open" BPF sys_enter augmenter for "lgetxattr"
      Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
      Reusing "open" BPF sys_enter augmenter for "listxattr"
      Reusing "open" BPF sys_enter augmenter for "llistxattr"
      Reusing "open" BPF sys_enter augmenter for "removexattr"
      Reusing "open" BPF sys_enter augmenter for "lremovexattr"
      Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
      Reusing "open" BPF sys_enter augmenter for "mq_open"
      Reusing "open" BPF sys_enter augmenter for "mq_unlink"
      Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
      Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
      Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
      Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
      Reusing "open" BPF sys_enter augmenter for "symlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
      Reusing "connect" BPF sys_enter augmenter for "accept4"
      Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
      Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
      Reusing "open" BPF sys_enter augmenter for "memfd_create"
      Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
      Reusing "fremovexattr" BPF sys_enter augmenter for "statx"
      
      TL;DR:
      
      These are the new syscalls that can be augmented
      Reusing "openat" BPF sys_enter augmenter for "open_tree"
      Reusing "openat" BPF sys_enter augmenter for "openat2"
      Reusing "openat" BPF sys_enter augmenter for "mount_setattr"
      Reusing "openat" BPF sys_enter augmenter for "move_mount"
      Reusing "open" BPF sys_enter augmenter for "fsopen"
      Reusing "openat" BPF sys_enter augmenter for "fspick"
      Reusing "openat" BPF sys_enter augmenter for "faccessat2"
      Reusing "openat" BPF sys_enter augmenter for "fchmodat2"
      
      as for the perf trace output:
      
      before
      
      perf $ perf trace -e faccessat2 --max-events=1
      [no output]
      
      after
      
      perf $ ./perf trace -e faccessat2 --max-events=1
           0.000 ( 0.037 ms): waybar/958 faccessat2(dfd: 40, filename: "uevent")                               = 0
      
      P.S. This bug probably went unnoticed for the past five years because
      it only affects newer syscalls with larger ids, for instance
      faccessat2 (id 439), which not a lot of people care about when using
      perf trace.
      
      [Arnaldo]: notes
      
      That and the fact that the BPF code was hidden before having to use -e,
      that got changed kinda recently when we switched to using BPF skels for
      augmenting syscalls in 'perf trace':
      
      ⬢[acme@toolbox perf-tools-next]$ git log --oneline tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
      a9f4c6c9 perf trace: Collect sys_nanosleep first argument
      29d16de2 perf augmented_raw_syscalls.bpf: Move 'struct timespec64' to vmlinux.h
      5069211e perf trace: Use the right bpf_probe_read(_str) variant for reading user data
      33b725ce perf trace: Avoid compile error wrt redefining bool
      7d964231 perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(augmented_arg->value) is a power of two.
      262b54b6 perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(saddr) is a power of two.
      18364804 perf bpf_skel augmented_raw_syscalls: Cap the socklen parameter using &= sizeof(saddr)
      cd2cece6 perf trace: Tidy comments related to BPF + syscall augmentation
      5e6da6be perf trace: Migrate BPF augmentation to use a skeleton
      ⬢[acme@toolbox perf-tools-next]$
      
      ⬢[acme@toolbox perf-tools-next]$ git show --oneline --pretty=reference 5e6da6be | head -1
      5e6da6be (perf trace: Migrate BPF augmentation to use a skeleton, 2023-08-10)
      ⬢[acme@toolbox perf-tools-next]$
      
      I.e. from August, 2023.
      
      One also had to ask for BUILD_BPF_SKEL=1, which is now the default if
      everything it needs is available on the system.
      
      I simplified the code to not expose the 'struct syscall' outside of
      tools/perf/util/syscalltbl.c, instead providing a function to go from
      the index to the syscall id:
      
        int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);
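
As a rough sketch of what such an index-to-id accessor could look like, assuming simplified, hypothetical types (`struct entry`, `struct table`, `table_id_at_idx`) rather than the real `struct syscalltbl`, and assuming that -1 is an acceptable "out of range" result:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the table kept private to one file. */
struct entry {
	const char *name;
	int id;
};

struct table {
	struct entry *entries;
	int nr_entries;
};

/* Map a table position to the syscall id stored there, or -1 if out of range. */
int table_id_at_idx(const struct table *tbl, int idx)
{
	if (tbl == NULL || idx < 0 || idx >= tbl->nr_entries)
		return -1;
	return tbl->entries[idx].id;
}
```

A caller can then loop over positions 0..nr_entries-1 and ask for the id at each position, without ever seeing the entry layout.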

      Signed-off-by: Howard Chu <howardchu95@gmail.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: https://lore.kernel.org/lkml/ZmhlAxbVcAKoPTg8@x1
      Link: https://lore.kernel.org/r/20240705132059.853205-2-howardchu95@gmail.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf dso: Fix address sanitizer build · 1553419c
      Ian Rogers authored
      Various files had been missed from having accessor functions added for
      the sake of dso reference count checking. Add the function calls and
      missing dso accessor functions.
      
      Fixes: ee756ef7 ("perf dso: Add reference count checking and accessor functions")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@linaro.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Yunseong Kim <yskelg@gmail.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: John Garry <john.g.garry@oracle.com>
      Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf mem: Warn if memory events are not supported on all CPUs · 14b0fffa
      Leo Yan authored
      It is possible that memory events are not supported on all CPUs.
      
      Prints a warning by dumping the enabled CPU maps in this case.

      Signed-off-by: Leo Yan <leo.yan@arm.com>
      Reviewed-by: James Clark <james.clark@linaro.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: coresight@lists.linaro.org
      Link: https://lore.kernel.org/r/20240706152035.86983-3-leo.yan@arm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf arm-spe: Support multiple Arm SPE PMUs · e6b4da67
      Leo Yan authored
      A platform can have more than one Arm SPE PMU. For example, a system
      with multiple clusters may have each cluster enabled with its own Arm
      SPE instance. In such a case, the PMU devices will be named 'arm_spe_0',
      'arm_spe_1', and so on.
      
      Currently, the tool only supports 'arm_spe_0'. This commit extends
      support to multiple Arm SPE PMUs by detecting the substring 'arm_spe_'.
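
The substring check described above could be sketched as follows. This is only an illustration: the helper name `is_arm_spe_pmu` and the macro `ARM_SPE_PMU_SUBSTR` are made up for this example and are not taken from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical macro name; the string itself comes from the commit message. */
#define ARM_SPE_PMU_SUBSTR "arm_spe_"

/* True if the PMU name contains "arm_spe_", e.g. arm_spe_0, arm_spe_1, ... */
bool is_arm_spe_pmu(const char *pmu_name)
{
	return pmu_name != NULL && strstr(pmu_name, ARM_SPE_PMU_SUBSTR) != NULL;
}
```

Matching on the substring instead of the full name 'arm_spe_0' is what lets every numbered instance be recognized.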

      Signed-off-by: Leo Yan <leo.yan@arm.com>
      Reviewed-by: James Clark <james.clark@linaro.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: coresight@lists.linaro.org
      Link: https://lore.kernel.org/r/20240706152035.86983-2-leo.yan@arm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf build x86: Fix SC2034 error in syscalltbl.sh · 759ce73c
      Haoze Xie authored
      Change the unused var in 'arch/x86/entry/syscalls/syscalltbl.sh' to '_'
      when reading from '$sorted_table'. This change allows the script to pass
      tests of ShellCheck before and after version 0.7.2 at the same time.
      
      When building for x86, syscalltbl.sh triggers a ShellCheck warning,
      which causes a build error:
      
          In arch/x86/entry/syscalls/syscalltbl.sh line 27:
          while read nr _abi name entry _compat; do
                        ^-^ SC2034: abi appears unused.
                        Verify use (or export if used externally).
                                        ^----^ SC2034: compat appears unused.
                                     Verify use (or export if used externally).
      
      The script reads the unused parameters abi and compat. It uses the
      '_xxx' format to mark dummy variables, which does not work properly
      with ShellCheck <= 0.7.2.

      According to SC2034, the more general convention is to use '_' directly
      for discarded variables. 'entry' is also replaced by '_' because it only
      happens to be defined in the emit function; keeping it would be
      misleading.
      
      Link: https://www.shellcheck.net/wiki/SC2034
      Signed-off-by: Haoze Xie <royenheart@gmail.com>
      Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
      Link: https://lore.kernel.org/r/2143cab4cd8468c88860f4e5e382d0e6b4d89ac9.1720372178.git.royenheart@gmail.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf record: Fix memset out-of-range error · 6353abd3
      Haoze Xie authored
      Change the target of 'memset' from '&lost.lost' to '&lost' in
      record__read_lost_samples. This allows 'memset' to access memory properly
      without causing out-of-bounds problems.

      The error reported for builtin-record.c is:
      
      In file included from /usr/include/string.h:495,
                       from util/parse-events.h:13,
                       from builtin-record.c:14:
      In function 'memset',
          inlined from 'record__read_lost_samples' at
          builtin-record.c:1958:6,
          inlined from '__cmd_record.constprop' at builtin-record.c:2817:2:
      /usr/include/x86_64-linux-gnu/bits/string_fortified.h:71:10: error:
      '__builtin_memset' offset [17, 64] from the object at 'lost' is out
      of the bounds of referenced subobject 'lost' with type
      'struct perf_record_lost_samples' at offset 0 [-Werror=array-bounds]
      71|return __builtin___memset_chk (__dest,__ch,__len,__bos0 (__dest));
        |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      The error arose because memset was given '&lost.lost' as its target but
      'sizeof(lost)' as its length: 'sizeof(lost)' is 64 bytes, while the
      'lost.lost' member is only 16 bytes.
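
The size mismatch can be reproduced with a small sketch. The layout below is an assumption chosen only to match the 16 and 64 byte figures quoted in the message; `struct lost_event`, `union lost_buf` and `lost_buf_reset` are hypothetical names and not the declarations in builtin-record.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte record: an 8-byte header followed by a 64-bit count. */
struct lost_event {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
	uint64_t lost;
};

/* Hypothetical container that reserves extra room after the record. */
union lost_buf {
	struct lost_event lost;	/* 16 bytes */
	char room[64];		/* 64 bytes in total */
};

/*
 * Clear the whole container. The pointer and the length refer to the same
 * object, so the compiler has nothing to complain about. Passing &buf->lost
 * together with sizeof(*buf) would name a 16-byte member while asking for
 * 64 bytes, which is the pattern -Warray-bounds rejects.
 */
size_t lost_buf_reset(union lost_buf *buf)
{
	memset(buf, 0, sizeof(*buf));
	return sizeof(*buf);
}
```

Keeping the destination and the sizeof operand on the same object is the whole point of the fix.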
      
      Fixes: 6c1785cd ("perf record: Ensure space for lost samples")
      Signed-off-by: Haoze Xie <royenheart@gmail.com>
      Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
      Link: https://lore.kernel.org/r/11e12f171b846577cac698cd3999db3d7f6c4d03.1720372317.git.royenheart@gmail.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf sched map: Add --fuzzy-name option for fuzzy matching in task names · 306f921e
      Madadi Vineeth Reddy authored
      The --fuzzy-name option can be used if fuzzy name matching is required.
      For example, "taskname" can be matched to any string that contains
      "taskname" as its substring.
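
The difference between exact and fuzzy matching can be sketched like this. The function name `task_name_matches` and its `fuzzy` flag are hypothetical and only illustrate the substring rule described above.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Compare a task's name against the requested name.
 * fuzzy == false: the names must be identical.
 * fuzzy == true:  the requested name only has to appear as a substring.
 */
bool task_name_matches(const char *comm, const char *wanted, bool fuzzy)
{
	if (fuzzy)
		return strstr(comm, wanted) != NULL;
	return strcmp(comm, wanted) == 0;
}
```

Under this sketch, "wdav" selects "wdavdaemon" only when fuzzy matching is enabled.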
      
      Sample output for --task-name wdav --fuzzy-name
      =============
       .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
       .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
       .  *-   B0  .   .   .   -   .   131040.641379 secs
      *C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
       C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
       C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
      *-   .   B0  .   D0  .   E0  .   131040.641581 secs

      Suggested-by: Chen Yu <yu.c.chen@intel.com>
      Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240707182716.22054-4-vineethr@linux.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf sched map: Add support for multiple task names using CSV · 9cc0afed
      Madadi Vineeth Reddy authored
      To track the scheduling patterns of multiple tasks simultaneously,
      multiple task names can be specified using a comma separator
      without any whitespace.
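
A minimal sketch of looking up a task name in such a comma-separated list, assuming exact matching per item. `csv_has_name` is a hypothetical helper and not the parsing code from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* True if name equals one of the comma-separated items in csv. */
bool csv_has_name(const char *csv, const char *name)
{
	size_t name_len = strlen(name);

	while (*csv) {
		/* Length of the current item, up to the next ',' or the end. */
		size_t item_len = strcspn(csv, ",");

		if (item_len == name_len && strncmp(csv, name, name_len) == 0)
			return true;

		csv += item_len;
		if (*csv == ',')
			++csv;	/* step over the separator */
	}
	return false;
}
```

Because items are compared byte for byte, a space after the comma would become part of the item and prevent a match, which is one reason to require a list without whitespace.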
      
      Sample output for --task-name perf,wdavdaemon
      =============
       .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
       .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
       .  *-   B0  .   .   .   -   .   131040.641379 secs
      *C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
      
      ...
      
       .  *-   .   .   .   .   .   .   131041.395649 secs
       .   .   .   .   .   .   .  *X2  131041.403969 secs X2 => perf:70211
       .   .   .   .   .   .   .  *-   131041.404006 secs

      Suggested-by: Namhyung Kim <namhyung@kernel.org>
      Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Link: https://lore.kernel.org/r/20240707182716.22054-3-vineethr@linux.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    • perf sched map: Add task-name option to filter the output map · 3116d609
      Madadi Vineeth Reddy authored
      By default, perf sched map prints sched-in events for all tasks,
      which is not always wanted because it prints a lot of symbols and
      rows to the terminal.

      With the --task-name option, one can specify the task name for which
      the map should be shown. This makes it easier to analyze the CPU
      usage patterns of that specific task. Since multiple PIDs might
      share the same task name, filtering by task name is more useful
      for debugging.

      For other tasks, '-' is printed instead of the symbol, and '.' is
      still used to represent idle. '-' is used for other tasks for two
      reasons: it makes the task of interest stand out, and the symbol
      itself would not mean anything because the sched-in that introduces
      it would not be printed (the first sched-in contains the pid and the
      corresponding symbol).

      When using the --task-name option, the sched-out time is represented
      by '*-'. Since not all task sched-in events are printed, the sched-out
      time of the relevant task might otherwise be lost. This representation
      ensures that the sched-out time of the task of interest is not
      overlooked.
      
      6.10.0-rc1
      ==========
      *A0                              131040.639793 secs A0 => migration/0:19
      *.                               131040.639801 secs .  => swapper:0
       .  *B0                          131040.639830 secs B0 => migration/1:24
       .  *.                           131040.639836 secs
       .   .  *C0                      131040.640108 secs C0 => migration/2:30
       .   .  *.                       131040.640163 secs
       .   .   .  *D0                  131040.640386 secs D0 => migration/3:36
       .   .   .  *.                   131040.640395 secs
      
      6.10.0-rc1 + patch (--task-name wdavdaemon)
      =============
       .  *A0  .   .   .   .   -   .   131040.641346 secs A0 => wdavdaemon:62509
       .   A0 *B0  .   .   .   -   .   131040.641378 secs B0 => wdavdaemon:62274
       -  *-   B0  .   .   .   -   .   131040.641379 secs
      *C0  .   B0  .   .   .   .   .   131040.641572 secs C0 => wdavdaemon:62283
       C0  .   B0  .  *D0  .   .   .   131040.641572 secs D0 => wdavdaemon:62277
       C0  .   B0  .   D0  .  *E0  .   131040.641578 secs E0 => wdavdaemon:62270
      *-   .   B0  .   D0  .   E0  .   131040.641581 secs
       .   .   B0  .   D0  .  *-   .   131040.641583 secs

      Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Link: https://lore.kernel.org/r/20240707182716.22054-2-vineethr@linux.ibm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
  4. 11 Jul, 2024 1 commit
  5. 04 Jul, 2024 1 commit
  6. 03 Jul, 2024 4 commits
  7. 02 Jul, 2024 4 commits
  8. 28 Jun, 2024 7 commits
  9. 26 Jun, 2024 6 commits
    • perf test stat_bpf_counter.sh: Stabilize the test results · e8b86f03
      Veronika Molnarova authored
      The test has been failing for some time. It records two separate runs
      of a perf benchmark for the cycles event, once with the --bpf-counters
      option and once without it, and compares the counts. The counts are
      expected to be within a certain range of each other: the allowed
      difference was initially 10% and was later raised to 20%. However, the
      test case keeps failing on certain architectures because recording the
      provided benchmark can produce completely different counts depending
      on the current load of the system.
      
      Sampling two separate runs on intel-eaglestream-spr-13 of "perf stat
      --no-big-num -e cycles -- perf bench sched messaging -g 1 -l 100 -t":
      
       Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
      
               396782898      cycles
      
             0.010051983 seconds time elapsed
      
             0.008664000 seconds user
             0.097058000 seconds sys
      
       Performance counter stats for 'perf bench sched messaging -g 1 -l 100 -t':
      
              1431133032      cycles
      
             0.021803714 seconds time elapsed
      
             0.023377000 seconds user
             0.349918000 seconds sys
      
      The counts range from about 400 million to about 1400 million.
      
      Instead of cycles, record the instructions event, which provides
      more stable values. At the same time, change the tested workload to
      one of the test workloads provided by perf that is not based on the
      scheduler, since scheduling adds another dependency on the current load.

      Sampling the instructions event with the new workload provides much
      more stable results on intel-eaglestream-spr-13 with "perf stat
      --no-big-num -e instructions -- perf test -w brstack":
      
       Performance counter stats for 'perf test -w brstack':
      
                64584494      instructions
      
             0.009173945 seconds time elapsed
      
             0.007262000 seconds user
             0.002071000 seconds sys
      
       Performance counter stats for 'perf test -w brstack':
      
                64672669      instructions
      
             0.008888135 seconds time elapsed
      
             0.005018000 seconds user
             0.004018000 seconds sys
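
To make the tolerance idea concrete, here is a small sketch of a percentage check applied to the counts quoted above. The test itself is a shell script and may compute its bounds differently; `within_percent` is a hypothetical helper written only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if other lies within +/- pct percent of base.
 * The margin is computed as base / 100 * pct to stay clear of overflow,
 * at the cost of rounding the margin down slightly.
 */
bool within_percent(uint64_t base, uint64_t other, unsigned int pct)
{
	uint64_t margin = base / 100 * pct;

	return other >= base - margin && other <= base + margin;
}
```

With a 20% tolerance, the two cycles counts (396782898 and 1431133032) fail this check, while the two instructions counts (64584494 and 64672669) pass it.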

      Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: mpetlan@redhat.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625092001.10909-1-vmolnaro@redhat.com
    • perf python: Clean up build dependencies · e4b19e2c
      Ian Rogers authored
      The python build now depends on libraries and doesn't use
      python-ext-sources except for the util/python.c dependency. Switch to
      just directly depending on that file and util/setup.py. This allows
      the removal of python-ext-sources.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: James Clark <james.clark@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Gary Guo <gary@garyguo.net>
      Cc: Alex Gaynor <alex.gaynor@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Alice Ryhl <aliceryhl@google.com>
      Cc: Andrei Vagin <avagin@google.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Benno Lossin <benno.lossin@proton.me>
      Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
      Cc: Andreas Hindborg <a.hindborg@samsung.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625214117.953777-9-irogers@google.com
    • perf python: Switch module to linking libraries from building source · 9dabf400
      Ian Rogers authored
      setup.py was building most perf sources itself. This made setup.py
      mimic the Makefile logic and required the flex/bison code to be stubbed
      out because it was too complex to build there. By using libraries,
      fewer functions are stubbed out, the build is faster, and the Makefile
      logic is reused, which should simplify updating. The libraries are
      passed through LDFLAGS to avoid complexity in python.
      
      Force the -fPIC flag for libbpf.a to ensure it is suitable for linking
      into the perf python module.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: James Clark <james.clark@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Gary Guo <gary@garyguo.net>
      Cc: Alex Gaynor <alex.gaynor@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Alice Ryhl <aliceryhl@google.com>
      Cc: Andrei Vagin <avagin@google.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Benno Lossin <benno.lossin@proton.me>
      Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
      Cc: Andreas Hindborg <a.hindborg@samsung.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625214117.953777-8-irogers@google.com
    • perf util: Make util its own library · e467705a
      Ian Rogers authored
      Make the util directory into its own library. This is done to avoid
      compiling code twice, once for the perf tool and once for the perf
      python module. For convenience:
        arch/common.c
        scripts/perl/Perf-Trace-Util/Context.c
        scripts/python/Perf-Trace-Util/Context.c
      are made part of this library.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: James Clark <james.clark@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Gary Guo <gary@garyguo.net>
      Cc: Alex Gaynor <alex.gaynor@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Alice Ryhl <aliceryhl@google.com>
      Cc: Andrei Vagin <avagin@google.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Benno Lossin <benno.lossin@proton.me>
      Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
      Cc: Andreas Hindborg <a.hindborg@samsung.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
    • perf bench: Make bench its own library · 21cc3bc0
      Ian Rogers authored
      Make the benchmark code into a library so it may be linked against
      things like the python module to avoid compiling code twice.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: James Clark <james.clark@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Gary Guo <gary@garyguo.net>
      Cc: Alex Gaynor <alex.gaynor@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Alice Ryhl <aliceryhl@google.com>
      Cc: Andrei Vagin <avagin@google.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Benno Lossin <benno.lossin@proton.me>
      Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
      Cc: Andreas Hindborg <a.hindborg@samsung.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625214117.953777-6-irogers@google.com
    • perf test: Make tests its own library · 1dad99af
      Ian Rogers authored
      Make the tests code its own library. This is done to avoid compiling
      code twice, once for the perf tool and once for the perf python
      module.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: James Clark <james.clark@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Gary Guo <gary@garyguo.net>
      Cc: Alex Gaynor <alex.gaynor@gmail.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Alice Ryhl <aliceryhl@google.com>
      Cc: Andrei Vagin <avagin@google.com>
      Cc: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Leo Yan <leo.yan@linux.dev>
      Cc: Oliver Upton <oliver.upton@linux.dev>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Benno Lossin <benno.lossin@proton.me>
      Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
      Cc: Andreas Hindborg <a.hindborg@samsung.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240625214117.953777-5-irogers@google.com