1. 25 Jun, 2024 12 commits
    • perf symbol: Simplify kernel module checking · e988a5b5
      Namhyung Kim authored
      In dso__load(), the code checks whether the dso is a kernel module by
      looking at the symtab type.  However, the dso has an 'is_kmod' field that
      makes this check easy, and dso__set_module_info() sets both the symtab
      type and the is_kmod bit.  So checking the is_kmod bit should give the
      same result.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240621170528.608772-3-namhyung@kernel.org
    • perf report: Fix condition in sort__sym_cmp() · cb39d05e
      Namhyung Kim authored
      Both hist entries are expected to be in the same hists when two are
      compared.  But the current code in the function checks one for the
      absence of the dso sort key and the other for its presence.  This makes
      the condition true in every case.
      
      I guess the intention of the original commit was to add '!' for the
      right side too.  But as it should be the same, let's just remove it.
      
      Fixes: 69849fc5 ("perf hists: Move sort__has_dso into struct perf_hpp_list")
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240621170528.608772-2-namhyung@kernel.org
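The always-true condition described in this commit can be illustrated with a minimal sketch. The struct and function names below are hypothetical stand-ins, not the perf source; the only assumption taken from the message is that both entries belong to the same hists, so both sides of the condition see the same flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "does this hists have the dso sort key?" */
struct hists_sketch {
	bool has_dso;
};

/* Shape of the condition before the fix: one side negated, the other not. */
bool cond_before(const struct hists_sketch *left,
		 const struct hists_sketch *right)
{
	return !left->has_dso || right->has_dso;
}

/* Shape after the fix: only one side is consulted. */
bool cond_after(const struct hists_sketch *left)
{
	return !left->has_dso;
}
```

When `left` and `right` point at the same hists, `cond_before()` reduces to `!x || x`, which holds for either value of the flag, while `cond_after()` still depends on it.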
    • perf pmus: Fixes always false when compare duplicates aliases · dd9a426e
      Junhao He authored
      In the previous loop iteration, all the members of aliases[j-1] have
      been freed and set to NULL. But in this iteration, the function
      pmu_alias_is_duplicate() compares aliases[j] with aliases[j-1], which has
      already been disposed of, so the function always returns false and
      duplicate aliases are never discarded.
      
      If duplicate aliases are found, the zfree of aliases[j] is skipped,
      which leaks memory.
      
      Use the next entry, aliases[j+1], to check for duplicate aliases, which
      fixes the NULL pointer dereference on aliases, then goto the zfree code
      to release the current entry.
      
      After patch testing:
       $ perf list --unit=hisi_sicl,cpa pmu
      
       uncore cpa:
         cpa_p0_rd_dat_32b
              [Number of read ops transmitted by the P0 port which size is 32 bytes.
               Unit: hisi_sicl,cpa]
         cpa_p0_rd_dat_64b
              [Number of read ops transmitted by the P0 port which size is 64 bytes.
               Unit: hisi_sicl,cpa]
      
      Fixes: c3245d20 ("perf pmu: Abstract alias/event struct")
      Signed-off-by: Junhao He <hejunhao3@huawei.com>
      Cc: ravi.bangoria@amd.com
      Cc: james.clark@arm.com
      Cc: prime.zeng@hisilicon.com
      Cc: cuigaosheng1@huawei.com
      Cc: jonathan.cameron@huawei.com
      Cc: linuxarm@huawei.com
      Cc: yangyicong@huawei.com
      Cc: robh@kernel.org
      Cc: renyu.zj@linux.alibaba.com
      Cc: kjain@linux.ibm.com
      Cc: john.g.garry@oracle.com
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240614094318.11607-1-hejunhao3@huawei.com
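The look-ahead approach described in this commit can be sketched roughly as follows. This is not the perf code: the struct, the comparison helper, and the function name are hypothetical, and the sketch assumes the array is already sorted so that duplicates sit next to each other.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical alias entry; name is heap-allocated, NULL once released. */
struct alias_sketch {
	char *name;
};

static bool same_alias(const struct alias_sketch *a,
		       const struct alias_sketch *b)
{
	return a->name && b->name && strcmp(a->name, b->name) == 0;
}

/*
 * Walk a sorted array, "emit" each distinct name once, and release every
 * entry. Returns the number of distinct names emitted.
 */
int emit_unique_and_free(struct alias_sketch *aliases, int len)
{
	int emitted = 0;

	for (int j = 0; j < len; j++) {
		/* Look ahead: aliases[j + 1] has not been released yet. */
		bool dup = j + 1 < len &&
			   same_alias(&aliases[j], &aliases[j + 1]);

		if (!dup)
			emitted++;	/* a real tool would print the entry here */

		/* Always release, duplicate or not, so nothing leaks. */
		free(aliases[j].name);
		aliases[j].name = NULL;
	}
	return emitted;
}
```

Comparing against the previous entry instead would always see a NULL name, because that entry has already been released by the time the next iteration runs.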
    • perf unwind-libunwind: Add malloc() failure handling · 83da316a
      Yunseong Kim authored
      Add malloc() failure handling in read_unwind_spec_debug_frame().
      This lets the caller, find_proc_info(), work correctly when the
      allocation fails.
      Signed-off-by: Yunseong Kim <yskelg@gmail.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Austin Kim <austindh.kim@gmail.com>
      Cc: shjy180909@gmail.com
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240619204211.6438-2-yskelg@gmail.com
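The general pattern of reporting an allocation failure to the caller can be sketched as below. This is an illustration only, not the code touched by the commit: `copy_name()`, `alloc_fn`, and `failing_alloc()` are hypothetical names, and the injectable allocator exists only so the error path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void *(*alloc_fn)(size_t);

/* Stand-in allocator that always fails, to exercise the error path. */
void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Duplicate src into *out; return 0 on success or -ENOMEM on failure. */
int copy_name(const char *src, char **out, alloc_fn alloc)
{
	size_t len = strlen(src) + 1;
	char *buf = alloc(len);

	if (!buf)
		return -ENOMEM;	/* leave *out untouched, let the caller recover */

	memcpy(buf, src, len);
	*out = buf;
	return 0;
}
```

A caller can then test the return value and bail out early instead of dereferencing a NULL buffer.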
    • util: constant -1 with expression of type char · e9ffa312
      Yunseong Kim authored
      This patch resolves the following warning:
      
        tools/perf/util/evsel.c:1620:9: error: result of comparison of constant
         -1 with expression of type 'char' is always false
         -Werror,-Wtautological-constant-out-of-range-compare
         1620 |                 if (c == -1)
              |                     ~ ^  ~~
      Signed-off-by: Yunseong Kim <yskelg@gmail.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Austin Kim <austindh.kim@gmail.com>
      Cc: shjy180909@gmail.com
      Cc: Ze Gao <zegao2021@gmail.com>
      Cc: Leo Yan <leo.yan@linux.dev>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240619203428.6330-2-yskelg@gmail.com
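The warning arises on targets where plain `char` is unsigned, since such a variable can never hold -1. One common way to avoid this class of problem is to keep a value that may carry a -1 sentinel in an `int`. The sketch below is a generic illustration with hypothetical helpers (`next_byte()`, `count_bytes()`); it does not reproduce the code in evsel.c, and it uses `unsigned char` explicitly to mimic an unsigned-char target.

```c
#include <limits.h>

/* Hypothetical reader: next byte of s, or -1 once the string is exhausted. */
int next_byte(const char *s, int *pos)
{
	if (s[*pos] == '\0')
		return -1;
	return (unsigned char)s[(*pos)++];
}

/* Count bytes until the -1 sentinel; an int variable preserves it. */
int count_bytes(const char *s)
{
	int pos = 0, n = 0;
	int c;	/* int, not char */

	while ((c = next_byte(s, &pos)) != -1)
		n++;
	return n;
}

/* What the sentinel turns into when squeezed through an unsigned char. */
int sentinel_as_unsigned_char(void)
{
	unsigned char c = (unsigned char)-1;

	return c;	/* promotes to UCHAR_MAX, which never equals -1 */
}
```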
    • perf: Timehist account sch delay for scheduled out running · d363c2a8
      Fernand Sieber authored
      When using perf timehist, sch delay is only computed for a waking task,
      not for a preempted task. This patch changes sch delay to account for
      both. This makes sense because testing a scheduling policy needs to
      consider the effect of scheduling delay globally, not only for waking
      tasks.
      
      Example of a `perf timehist` report before the patch, for `stress` tasks
      competing with each other.
      
      Of the three numeric columns after the task name, the first is wait
      time, the second is sch delay, and the third is runtime.
      
      1.492060 [0000]  s    stress[81]                          1.999      0.000      2.000      R  next: stress[83]
      1.494060 [0000]  s    stress[83]                          2.000      0.000      2.000      R  next: stress[81]
      1.496060 [0000]  s    stress[81]                          2.000      0.000      2.000      R  next: stress[83]
      1.498060 [0000]  s    stress[83]                          2.000      0.000      1.999      R  next: stress[81]
      
      After the patch, it looks like this (note that sch delay is no longer
      zero):
      
      1.492060 [0000]  s    stress[81]                          1.999      1.999      2.000      R  next: stress[83]
      1.494060 [0000]  s    stress[83]                          2.000      2.000      2.000      R  next: stress[81]
      1.496060 [0000]  s    stress[81]                          2.000      2.000      2.000      R  next: stress[83]
      1.498060 [0000]  s    stress[83]                          2.000      2.000      1.999      R  next: stress[81]
      Signed-off-by: Fernand Sieber <sieberf@amazon.com>
      Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240618090339.87482-1-sieberf@amazon.com
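The idea of counting delay for preempted tasks as well as woken tasks can be sketched as follows. This is a simplified model, not the timehist implementation: the struct, the three hook functions, and the use of microsecond integers are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task bookkeeping; times are in microseconds. */
struct task_sketch {
	uint64_t ready_to_run;	/* when the task became runnable, 0 if unknown */
};

/* The task was woken up: it is runnable from now on. */
void on_wakeup(struct task_sketch *t, uint64_t now)
{
	t->ready_to_run = now;
}

/*
 * The task is switched out. If it is still runnable (state R) it was
 * preempted, so it starts waiting for a CPU immediately.
 */
void on_sched_out(struct task_sketch *t, uint64_t now, bool still_runnable)
{
	if (still_runnable)
		t->ready_to_run = now;
}

/* The task is switched in: the delay is the time spent runnable but off CPU. */
uint64_t on_sched_in(struct task_sketch *t, uint64_t now)
{
	uint64_t delay = 0;

	if (t->ready_to_run && now > t->ready_to_run)
		delay = now - t->ready_to_run;
	t->ready_to_run = 0;
	return delay;
}
```

With the timestamps from the example report, a task switched out in state R at 1.492060 s and switched back in at 1.494060 s would be charged 2000 microseconds, which lines up with the 2.000 shown in the sch delay column after the patch.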
    • perf tests: Add APX and other new instructions to x86 instruction decoder test · fcd094e5
      Adrian Hunter authored
      Add samples of APX and other new instructions to the 'x86 instruction
      decoder - new instructions' test.
      
      Note the test is only available if the perf tool has been built with
      EXTRA_TESTS=1.
      
      Example:
      
        $ make EXTRA_TESTS=1 -C tools/perf
        $ tools/perf/perf test -F -v 'new ins' |& grep -i 'jmpabs\|popp\|pushp'
        Decoded ok: d5 00 a1 ef cd ab 90 78 56 34 12    jmpabs $0x1234567890abcdef
        Decoded ok: d5 08 53                    pushp  %rbx
        Decoded ok: d5 18 50                    pushp  %r16
        Decoded ok: d5 19 57                    pushp  %r31
        Decoded ok: d5 19 5f                    popp   %r31
        Decoded ok: d5 18 58                    popp   %r16
        Decoded ok: d5 08 5b                    popp   %rbx
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Nikolay Borisov <nik.borisov@suse.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-11-adrian.hunter@intel.com
    • perf intel pt: Add new JMPABS instruction to the Intel PT instruction decoder · a44abd2c
      Adrian Hunter authored
      JMPABS is a 64-bit absolute direct jump instruction, encoded with a mandatory
      REX2 prefix. JMPABS is designed to be used in the procedure linkage table
      (PLT) to replace indirect jumps, because it has better performance. In that
      case the jump target will be amended at run time. To enable Intel PT to
      follow the code, a TIP packet is always emitted when JMPABS is traced under
      Intel PT.
      
      Refer to the Intel Advanced Performance Extensions (Intel APX) Architecture
      Specification for details.
      
      Decode JMPABS as an indirect jump, because it has an associated TIP packet
      the same as an indirect jump and the control flow should follow the TIP
      packet payload, and not assume it is the same as the on-file object code
      JMPABS target address.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chang S. Bae <chang.seok.bae@intel.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Nikolay Borisov <nik.borisov@suse.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: x86@kernel.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240502105853.5338-10-adrian.hunter@intel.com
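To make the encoding concrete, here is a small sketch that recognizes the exact byte pattern shown in the decoder test output for the previous commit (`d5 00 a1` followed by eight immediate bytes) and extracts the little-endian target. The function name is hypothetical, and matching a zero REX2 payload byte is an assumption based only on that sample; this is not the perf instruction decoder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Recognize "d5 00 a1 <imm64>" and return the on-file target address.
 * Only the sample byte pattern is handled.
 */
bool decode_jmpabs_sketch(const uint8_t *buf, size_t len, uint64_t *target)
{
	uint64_t t = 0;

	if (len < 11 || buf[0] != 0xd5 || buf[1] != 0x00 || buf[2] != 0xa1)
		return false;

	/* The 64-bit immediate is stored least significant byte first. */
	for (int i = 7; i >= 0; i--)
		t = (t << 8) | buf[3 + i];

	*target = t;
	return true;
}
```

The value extracted this way is only the target stored in the object file. As the commit message explains, the Intel PT decoder treats JMPABS like an indirect jump and follows the TIP packet payload instead, because the target may be amended at run time.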
    • perf test: Check output of the probe ... --funcs command · abc0f0c4
      Chaitanya S Prakash authored
      Test "perf probe of function from different CU" only checks if the perf
      command has failed and doesn't test the --funcs output. In the issue
      reported in the previous commit, the garbage output of the --funcs
      command was being ignored by the test when it could have been caught.
      
      The script first uses the --funcs option of the perf probe command to
      check that the function "foo" exists in the testfile before adding a
      probe to it in the next command. The output of the probe ... --funcs
      command goes to stdout, so add '| grep "foo"' to validate the result.
      Signed-off-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: anshuman.khandual@arm.com
      Cc: james.clark@arm.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240601125946.1741414-11-ChaitanyaS.Prakash@arm.com
    • tools/perf: Fix parallel-perf python script to replace new python syntax ":=" usage · 7d49ced8
      Athira Rajeev authored
      perf test "perf script tests" fails as below on systems
      with Python 3.6:
      
      	File "/home/athira/linux/tools/perf/tests/shell/../../scripts/python/parallel-perf.py", line 442
      	if line := p.stdout.readline():
                   ^
      	SyntaxError: invalid syntax
      	--- Cleaning up ---
      	---- end(-1) ----
      	92: perf script tests: FAILED!
      
      This happens because ":=" is new syntax that assigns a value
      to a variable as part of a larger expression. It was introduced
      in Python 3.8 and hence fails on setups with Python 3.6.
      Address this by splitting the expression and checking the
      value in two steps:
      Previous line: if line := p.stdout.readline():
      Current change:
      	line = p.stdout.readline()
      	if line:
      
      With the patch:
      
      	./perf test "perf script tests"
      	 93: perf script tests:  Ok
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: akanksha@linux.ibm.com
      Cc: kjain@linux.ibm.com
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240623064850.83720-3-atrajeev@linux.vnet.ibm.com
    • tools/perf: Use is_perf_pid_map_name helper function to check dso's of pattern /tmp/perf-%d.map · b9241f15
      Athira Rajeev authored
      Commit 80d496be ("perf report: Add support for profiling JIT
      generated code") added support for profiling JIT generated code.
      That patch handles dsos of the form "/tmp/perf-$PID.map".
      
      Some of the references don't check for exactly that pattern;
      some use "if (!strncmp(dso_name, "/tmp/perf-", 10))". Fix
      this by using the helper functions perf_pid_map_tid and
      is_perf_pid_map_name, which look for the proper pattern of the
      form "/tmp/perf-$PID.map", for these checks.
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: akanksha@linux.ibm.com
      Cc: kjain@linux.ibm.com
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240623064850.83720-2-atrajeev@linux.vnet.ibm.com
    • tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__load · b0979f00
      Athira Rajeev authored
      Perf test for perf probe of function from different CU fails
      as below:
      
      	./perf test -vv "test perf probe of function from different CU"
      	116: test perf probe of function from different CU:
      	--- start ---
      	test child forked, pid 2679
      	Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.Msa7iy89bx/testfile
      	  Error: Failed to add events.
      	--- Cleaning up ---
      	"foo" does not hit any event.
      	  Error: Failed to delete events.
      	---- end(-1) ----
      	116: test perf probe of function from different CU                   : FAILED!
      
      The test does the following to probe function "foo":
      
      	# gcc -g -Og -flto -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.c
      	-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
      	# gcc -g -Og -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.c
      	-o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o
      	# gcc -g -Og -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
      	/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o
      	/tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o
      
      	# ./perf probe -x /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile foo
      	Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile
      	   Error: Failed to add events.
      
      Perf probe fails to find symbol foo in the executable placed in
      /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7
      
      Simple reproducer:
      
       # mktemp -d /tmp/perf-checkXXXXXXXXXX
         /tmp/perf-checkcWpuLRQI8j
      
       # gcc -g -o test test.c
       # cp test /tmp/perf-checkcWpuLRQI8j/
       # nm /tmp/perf-checkcWpuLRQI8j/test | grep foo
         00000000100006bc T foo
      
       # ./perf probe -x /tmp/perf-checkcWpuLRQI8j/test foo
         Failed to find symbol foo in /tmp/perf-checkcWpuLRQI8j/test
            Error: Failed to add events.
      
      But it works with other paths such as /tmp/perf/test. It fails only
      for paths starting with "/tmp/perf-".
      
      Further debugging shows that commit 80d496be ("perf report: Add
      support for profiling JIT generated code") added support for
      profiling JIT generated code. That patch handles dsos of the form
      "/tmp/perf-$PID.map".
      
      The check used "if (strncmp(self->name, "/tmp/perf-", 10) == 0)"
      to match "/tmp/perf-$PID.map". With that commit, any dso whose
      path starts with "/tmp/perf-" is handled separately (not only
      JIT-created map files). Fix this by changing the string pattern
      to check for "/tmp/perf-%d.map". Add a helper function
      is_perf_pid_map_name to do this check. In "struct dso",
      dso->long_name holds the long name of the dso file. Since the
      /tmp/perf-$PID.map check uses the complete name, use dso__long_name
      for the string name.
      
      With the fix,
      	# ./perf test "test perf probe of function from different CU"
      	117: test perf probe of function from different CU                   : Ok
      
      Fixes: 56cbeacf ("perf probe: Add test for regression introduced by switch to die_get_decl_file()")
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Reviewed-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: akanksha@linux.ibm.com
      Cc: kjain@linux.ibm.com
      Cc: maddy@linux.ibm.com
      Cc: disgoel@linux.vnet.ibm.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20240623064850.83720-1-atrajeev@linux.vnet.ibm.com
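The difference between the prefix-only check quoted in this commit and a full "/tmp/perf-$PID.map" match can be sketched as below. Both function names are hypothetical, and the stricter matcher is hand-written for illustration; the helper is_perf_pid_map_name in perf may well be implemented differently.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* The loose check quoted in the commit message: prefix only. */
bool prefix_only_match(const char *name)
{
	return strncmp(name, "/tmp/perf-", 10) == 0;
}

/* Stricter sketch: "/tmp/perf-", one or more digits, then exactly ".map". */
bool pid_map_name_match(const char *name)
{
	const char *p;

	if (strncmp(name, "/tmp/perf-", 10) != 0)
		return false;

	p = name + 10;
	if (!isdigit((unsigned char)*p))
		return false;
	while (isdigit((unsigned char)*p))
		p++;

	return strcmp(p, ".map") == 0;
}
```

With the reproducer path from the message, "/tmp/perf-checkcWpuLRQI8j/test" passes the prefix-only check but not the stricter one, which is why an ordinary binary under such a directory was being treated like a JIT map file.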
  2. 24 Jun, 2024 1 commit
  3. 21 Jun, 2024 4 commits
  4. 20 Jun, 2024 23 commits