Commits · 3cdd98b42d212160d7aae746a97960a4595cbfc2 · Kirill Smelkov / linux

02 May, 2024 10 commits

perf maps: Remove check_invariants() from maps__lock() · 3cdd98b4

Namhyung Kim authored Apr 29, 2024

I found that the debug build was a slowed down a lot by the maps lock
code since it checks the invariants whenever it gets the pointer to the
lock.  This means it checks twice the invariants before and after the
access.

Instead, let's move the checking code within the lock area but after any
modification and remove it from the read paths.  This would remove (more
than) half of the maps lock overhead.

The time for perf report with a huge data file (200k+ of MMAP2 events).

  Non-debug     Before      After
  ---------   --------   --------
     2m 43s     6m 45s     4m 21s
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240429225738.1491791-1-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

3cdd98b4

perf cs-etm: Improve version detection and error reporting · e3123079

James Clark authored May 01, 2024

When the config validation functions are warning about ETMv3, they do it
based on "not ETMv4". If the drivers aren't all loaded or the hardware
doesn't support Coresight it will appear as "not ETMv4" and then Perf
will print the error message "... not supported in ETMv3 ..." which is
wrong and confusing.

cs_etm_is_etmv4() is also misnamed because it also returns true for
ETE because ETE has a superset of the ETMv4 metadata files. Although
this was always done in the correct order so it wasn't a bug.

Improve all this by making a single get version function which also
handles not present as a separate case. Change the ETMv3 error message
to only print when ETMv3 is detected, and add a new error message for
the not present case.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Leo Yan <leo.yan@linux.dev>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-4-james.clark@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

e3123079

perf cs-etm: Remove repeated fetches of the ETM PMU · bc5e0e1b

James Clark authored May 01, 2024

Most functions already have cs_etm_pmu, so it's a bit neater to pass
it through rather than itr only to convert it again.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-3-james.clark@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

bc5e0e1b

perf cs-etm: Use struct perf_cpu as much as possible · cbaf2c4f

James Clark authored May 01, 2024

The perf_cpu struct makes some iterators simpler and avoids some
mistakes with interchanging CPU IDs with indexes etc. At the moment in
this file the conversion to an integer is done somewhere in the middle
of the call tree. Change it to delay the conversion to an int until the
leaf functions.

Some of the usage patterns are duplicated, so instead of changing them
all, make cs_etm_get_ro() more reusable and use that everywhere.
cs_etm_get_ro() didn't return an error before, but return one now so
that it can also be used where an error is needed. Continue to ignore
the error where it was already ignored.

Use cs_etm_pmu_path_exists() instead of cs_etm_get_ro() in
cs_etm_is_etmv4() because cs_etm_get_ro() prints a warning, but path
exists is sufficient for this use case.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-2-james.clark@arm.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

cbaf2c4f

perf annotate-data: Check kind of stack variables · b7d4aacf

Namhyung Kim authored May 01, 2024

I sometimes see ("unknown type") in the result and it was because it
didn't check the type of stack variables properly during the instruction
tracking.  The stack can carry constant values (without type info) and
if the target instruction is accessing the stack location, it resulted
in the "unknown type".

Maybe we could pick one of integer types for the constant, but it
doesn't really mean anything useful.  Let's just drop the stack slot if
it doesn't have a valid type info.

Here's an example how it got the unknown type.
Note that 0xffffff48 = -0xb8.
  -----------------------------------------------------------
  find data type for 0xffffff48(reg6) at ...
  CU for ...
  frame base: cfa=0 fbreg=6
  scope: [2/2] (die:11cb97f)
  bb: [37 - 3a]
  var [37] reg15 type='int' size=0x4 (die:0x1180633)
  bb: [40 - 4b]
  mov [40] imm=0x1 -> reg13
  var [45] reg8 type='sigset_t*' size=0x8 (die:0x11a39ee)
  mov [45] imm=0x1 -> reg2                     <---  here reg2 has a constant
  bb: [215 - 237]
  mov [218] reg2 -> -0xb8(stack) constant      <---  and save it to the stack
  mov [225] reg13 -> -0xc4(stack) constant
  call [22f] find_task_by_vgpid
  call [22f] return -> reg0 type='struct task_struct*' size=0x8 (die:0x11881e8)
  bb: [5c8 - 5cf]
  bb: [2fb - 302]
  mov [2fb] -0xc4(stack) -> reg13 constant
  bb: [13b - 14d]
  mov [143] 0xd50(reg3) -> reg5 type='struct task_struct*' size=0x8 (die:0xa31f3c)
  bb: [153 - 153]
  chk [153] reg6 offset=0xffffff48 ok=0 kind=0 fbreg    <--- access here
  found by insn track: 0xffffff48(reg6) type-offset=0
   type='G<EF>^K<F6><AF>U' size=0 (die:0xffffffffffffffff)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-7-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

b7d4aacf

perf annotate-data: Handle multi regs in find_data_type_block() · af89e8f2

Namhyung Kim authored May 01, 2024

The instruction tracking should be the same for the both registers.

Just do it once and compare the result with multi regs as with the
previous patches.

Then we don't need to call find_data_type_block() separately for each
reg.

Let's remove the 'reg' argument from the relevant functions.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-6-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

af89e8f2

perf annotate-data: Check memory access with two registers · eba1f853

Namhyung Kim authored May 01, 2024

The following instruction pattern is used to access a global variable.

  mov     $0x231c0, %rax
  movsql  %edi, %rcx
  mov     -0x7dc94ae0(,%rcx,8), %rcx
  cmpl    $0x0, 0xa60(%rcx,%rax,1)     <<<--- here

The first instruction set the address of the per-cpu variable (here, it
is 'runqueues' of type 'struct rq').  The second instruction seems like
a cpu number of the per-cpu base.  The third instruction get the base
offset of per-cpu area for that cpu.  The last instruction compares the
value of the per-cpu variable at the offset of 0xa60.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-5-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

eba1f853

perf annotate-data: Handle direct global variable access · 4449c904

Namhyung Kim authored May 01, 2024

Like per-cpu base offset array, sometimes it accesses the global
variable directly using the offset.  Allow this type of instructions as
long as it finds a global variable for the address.

  movslq  %edi, %rcx
  mov     -0x7dc94ae0(,%rcx,8), %rcx   <<<--- here

As %rcx has a valid type (i.e. array index) from the first instruction,
it will be checked by the first case in check_matching_type().  But as
it's not a pointer type, the match will fail.  But in this case, it
should check if it accesses the kernel global array variable.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-4-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

4449c904

perf annotate-data: Collect global variables in advance · c1da8411

Namhyung Kim authored May 01, 2024

Currently it looks up global variables from the current CU using address
and name.  But it sometimes fails to find a variable as the variable can
come from a different CU - but it's still strange it failed to find a
declaration for some reason.

Anyway, it can collect all global variables from all CU once and then
lookup them later on.  This slightly improves the success rate of my
test data set.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-3-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

c1da8411

perf dwarf-aux: Add die_collect_global_vars() · d7b60803

Namhyung Kim authored May 01, 2024

This function is to search all global variables in the CU.  We want to
have the list of global variables at once and match them later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240502060011.1838090-2-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

d7b60803

27 Apr, 2024 30 commits

perf test: Reintroduce -p/--parallel and make -S/--sequential the default · 8c618b58

Arnaldo Carvalho de Melo authored Apr 26, 2024

We can't default to doing parallel tests as there are tests that compete
for the same resources and thus clash, for instance tests that put in
place 'perf probe' probes, that clean the probes without regard to other
tests needs, ARM64 coresight tests, Intel PT ones, etc.

So reintroduce --p/--parallel and make -S/--sequential the default.

We need to come up with infrastructure that state which tests can't run
in parallel because they need exclusive access to some resource,
something as simple as "probes" that would then avoid 'perf probe' tests
from running while other such test is running, or make the tests more
resilient, till then we can't use parallel mode as default.

While at it, document all these options in the 'perf test' man page.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reported-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Ziwm18BqIn_vc1vn@x1Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8c618b58

tools headers: Synchronize linux/bits.h with the kernel sources · 450f941e

Arnaldo Carvalho de Melo authored Apr 26, 2024

To pick up the changes in this cset:

   3c7a8e19 ("uapi: introduce uapi-friendly macros for GENMASK")

That just causes perf to rebuild. Its just some macros going to an uapi
header that we now have to grab a copy into tools/ as well.

This addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/linux/bits.h include/linux/bits.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lore.kernel.org/lkml/ZiwJsFOBez0MS4r9@x1Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

450f941e

tools headers x86 cpufeatures: Sync with the kernel sources to pick BHI mitigation changes · 8f211643

Arnaldo Carvalho de Melo authored Apr 25, 2024

To pick the changes from:

  95a6ccbd ("x86/bhi: Mitigate KVM by default")
  ec9404e4 ("x86/bhi: Add BHI mitigation knob")
  be482ff9 ("x86/bhi: Enumerate Branch History Injection (BHI) bug")
  0f4a8376 ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S")
  7390db8a ("x86/bhi: Add support for clearing branch history at syscall entry")

This causes these perf files to be rebuilt and brings some X86_FEATURE
that will be used when updating the copies of
tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:

      CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
      CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/ZirIx4kPtJwGFZS0@x1Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8f211643

perf annotate: Fix data type profiling on stdio · 2b87383c

Namhyung Kim authored Apr 22, 2024

The loop in hists__find_annotations() never set the 'nd' pointer to NULL
and it makes stdio output repeating the last element forever.  I think
it doesn't set to NULL for TUI to prevent it from exiting unexpectedly.
But it should just set on stdio mode.

Fixes: d001c7a7 ("perf annotate-data: Add hist_entry__annotate_data_tui()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240423020643.740029-1-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

2b87383c

perf build: Pretend scandirat is missing with msan · 8524d71c

Ian Rogers authored Mar 20, 2024

Memory sanitizer lacks an interceptor for scandirat, reporting all
memory it allocates as uninitialized. Memory sanitizer has a scandir
interceptor so use the fallback function in this case. This allows
'perf test' to run under memory sanitizer.

Additional notes from Ian on running in this mode:

Note, as msan needs to instrument memory allocations libraries need to
be compiled with it. I lacked the msan built libraries and so built
with:
```
$ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g
-fno-omit-frame-pointer -fsanitize=memory
-fsanitize-memory-track-origins" CC=clang CXX=clang++ HOSTCC=clang
NO_LIBTRACEEVENT=1 NO_LIBELF=1 BUILD_BPF_SKEL=0 NO_LIBPFM=1
```
oh, I disabled libbpf here as the bpf system call also lacks msan interceptors.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240320163244.1287780-1-irogers@google.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8524d71c

perf intel-pt: Fix unassigned instruction op (discovered by MemorySanitizer) · e101a05f

Adrian Hunter authored Mar 26, 2024

MemorySanitizer discovered instances where the instruction op value was
not assigned.:

  WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x5581c00a76b3 in intel_pt_sample_flags tools/perf/util/intel-pt.c:1527:17
  Uninitialized value was stored to memory at
    #0 0x5581c005ddf8 in intel_pt_walk_insn tools/perf/util/intel-pt-decoder/intel-pt-decoder.c:1256:25

The op value is used to set branch flags for branch instructions
encountered when walking the code, so fix by setting op to
INTEL_PT_OP_OTHER in other cases.

Fixes: 4c761d80 ("perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/20240320162619.1272015-1-irogers@google.com/
Link: https://lore.kernel.org/r/20240326083223.10883-1-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

e101a05f

perf record: Fix comment misspellings · 7cc72090

Howard Chu authored Apr 25, 2024

Fix comment misspellings
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240425060427.1800663-1-howardchu95@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

7cc72090

perf annotate: Update DSO binary type when trying build-id · 8f3ec810

Namhyung Kim authored Apr 24, 2024

dso__disassemble_filename() tries to get the filename for objdump (or
capstone) using build-id.  But I found sometimes it didn't disassemble
some functions.

It turned out that those functions belong to a DSO which has no binary
type set.  It seems it sets the binary type for some special files only
- like kernel (kallsyms or kcore) or BPF images.  And there's a logic to
skip dso with DSO_BINARY_TYPE__NOT_FOUND.

As it's checked the build-id cache link, it should set the binary type
as DSO_BINARY_TYPE__BUILD_ID_CACHE.

Fixes: 873a8373 ("perf annotate: Skip DSOs not found")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

8f3ec810

perf annotate: Fallback disassemble to objdump when capstone fails · f35847de

Namhyung Kim authored Apr 24, 2024

I found some cases that capstone failed to disassemble.  Probably my
capstone is an old version but anyway there's a chance it can fail.  And
then it silently stopped in the middle.  In my case, it didn't
understand "RDPKRU" instruction.

Let's check if the capstone disassemble reached the end of the function
and fallback to objdump if not.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

f35847de

perf annotate-data: Check if 'struct annotation_source' was allocated on 'perf report' TUI · 47557db9

Namhyung Kim authored Apr 24, 2024

As it removed the sample accounting for code when no symbol sort key is
given for 'perf report' TUI, it might not have allocated the
'struct annotated_source' yet.  Let's check if it's NULL first.

Fixes: 6cdd977e ("perf report: Do not collect sample histogram unnecessarily")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240424230015.1054013-1-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

47557db9

perf test: Add a new test for 'perf annotate' · 281bf8f6

Namhyung Kim authored Apr 23, 2024

Add a basic 'perf annotate' test:

  $ ./perf test annotate -vv
   76: perf annotate basic tests:
  --- start ---
  test child forked, pid 846989
   fbcd0-fbd55 l noploop
  perf does have symbol 'noploop'
  Basic perf annotate test
           : 0     0xfbcd0 <noploop>:
      0.00 :   fbcd0:       pushq   %rbp
      0.00 :   fbcd1:       movq    %rsp, %rbp
      0.00 :   fbcd4:       pushq   %r12
      0.00 :   fbcd6:       pushq   %rbx
      0.00 :   fbcd7:       movl    $1, %ebx
      0.00 :   fbcdc:       subq    $0x10, %rsp
      0.00 :   fbce0:       movq    %fs:0x28, %rax
      0.00 :   fbce9:       movq    %rax, -0x18(%rbp)
      0.00 :   fbced:       xorl    %eax, %eax
      0.00 :   fbcef:       testl   %edi, %edi
      0.00 :   fbcf1:       jle     0xfbd04
      0.00 :   fbcf3:       movq    (%rsi), %rdi
      0.00 :   fbcf6:       movl    $0xa, %edx
      0.00 :   fbcfb:       xorl    %esi, %esi
      0.00 :   fbcfd:       callq   0x41920
      0.00 :   fbd02:       movl    %eax, %ebx
      0.00 :   fbd04:       leaq    -0x7b(%rip), %r12	# fbc90 <sighandler>
      0.00 :   fbd0b:       movl    $2, %edi
      0.00 :   fbd10:       movq    %r12, %rsi
      0.00 :   fbd13:       callq   0x40a00
      0.00 :   fbd18:       movl    $0xe, %edi
      0.00 :   fbd1d:       movq    %r12, %rsi
      0.00 :   fbd20:       callq   0x40a00
      0.00 :   fbd25:       movl    %ebx, %edi
      0.00 :   fbd27:       callq   0x407c0
      0.10 :   fbd2c:       movl    0x89785e(%rip), %eax	# 993590 <done>
      0.00 :   fbd32:       testl   %eax, %eax
     99.90 :   fbd34:       je      0xfbd2c
      0.00 :   fbd36:       movq    -0x18(%rbp), %rax
      0.00 :   fbd3a:       subq    %fs:0x28, %rax
      0.00 :   fbd43:       jne     0xfbd50
      0.00 :   fbd45:       addq    $0x10, %rsp
      0.00 :   fbd49:       xorl    %eax, %eax
      0.00 :   fbd4b:       popq    %rbx
      0.00 :   fbd4c:       popq    %r12
      0.00 :   fbd4e:       popq    %rbp
      0.00 :   fbd4f:       retq
      0.00 :   fbd50:       callq   0x407e0
      0.00 :   fbcd0:       pushq   %rbp
      0.00 :   fbcd1:       movq    %rsp, %rbp
      0.00 :   fbcd4:       pushq   %r12
      0.00 :   fbcd0:  push   %rbp
      0.00 :   fbcd1:  mov    %rsp,%rbp
      0.00 :   fbcd4:  push   %r12
  Basic annotate test [Success]
  ---- end(0) ----
   76: perf annotate basic tests                                       : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240424001231.849972-1-namhyung@kernel.org
[ Improved a bit the error messages ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

281bf8f6

perf parse-events: Tidy the setting of the default event name · bb65ff78