1. 04 Apr, 2023 4 commits
    • perf hist: Improve srcfile sort key performance (really) · 6094c774
      Namhyung Kim authored
      The earlier commit f0cdde28 ("perf hist: Improve srcfile sort
      key performance") updated the srcfile logic but did not change the
      ->cmp() callback, which is called for every sample.
      
      It should use the same logic as srcline to speed up processing,
      because the lookup returns the same information every time for a
      given address.  The real processing is done in
      sort__srcfile_collapse().
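The split between a cheap per-sample compare and a deferred collapse step can be sketched as follows. This is a minimal illustration, not perf's code: `struct entry`, `resolve_srcfile()`, `entry_cmp()` and `entry_collapse()` are hypothetical names, and the resolver is a stand-in for the expensive debug-info lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a histogram entry. */
struct entry {
	uint64_t addr;       /* sampled address */
	const char *srcfile; /* resolved lazily, NULL until needed */
};

/* Counts how often the (pretend) expensive lookup runs. */
static int lookups;

/* Hypothetical resolver standing in for the debug-info lookup. */
static const char *resolve_srcfile(uint64_t addr)
{
	lookups++;
	return addr < 0x2000 ? "a.c" : "b.c";
}

/* Called for every sample: compare addresses only, no lookup. */
static int64_t entry_cmp(const struct entry *l, const struct entry *r)
{
	assert(l && r);
	return (l->addr > r->addr) - (l->addr < r->addr);
}

/* Called when merging entries: resolve once per entry, then compare names. */
static int64_t entry_collapse(struct entry *l, struct entry *r)
{
	assert(l && r);
	if (!l->srcfile)
		l->srcfile = resolve_srcfile(l->addr);
	if (!r->srcfile)
		r->srcfile = resolve_srcfile(r->addr);
	return strcmp(l->srcfile, r->srcfile);
}
```

In this sketch two entries with different addresses compare as different in `entry_cmp()` and are merged later in `entry_collapse()` if they resolve to the same file, so the lookup runs at most once per entry rather than once per sample.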
      
      Fixes: f0cdde28 ("perf hist: Improve srcfile sort key performance")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230323025005.191239-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix wrong size expectation for 'Setup struct perf_event_attr' · 30df88a8
      Thomas Richter authored
      The test case "perf test 'Setup struct perf_event_attr'" is failing.
      
      On s390 this output is observed:
      
       # ./perf test -Fvvvv 17
       17: Setup struct perf_event_attr                                    :
       --- start ---
       running './tests/attr/test-stat-C0'
       Using CPUID IBM,8561,703,T01,3.6,002f
       .....
       Event event:base-stat
            fd = 1
            group_fd = -1
            flags = 0|8
            cpu = *
            type = 0
            size = 128     <<<--- wrong, specified in file base-stat
            config = 0
            sample_period = 0
            sample_type = 65536
            ...
       'PERF_TEST_ATTR=/tmp/tmpgw574wvg ./perf stat -o \
      	/tmp/tmpgw574wvg/perf.data -e cycles -C 0 kill >/dev/null \
      	2>&1 ret '1', expected '1'
        loading result events
          Event event-0-0-4
            fd = 4
            group_fd = -1
            cpu = 0
            pid = -1
            flags = 8
            type = 0
            size = 136     <<<--- actual size used in system call
            .....
        compare
          matching [event-0-0-4]
            to [event:base-stat]
            [cpu] 0 *
            [flags] 8 0|8
            [type] 0 0
            [size] 136 128
          ->FAIL
          match: [event-0-0-4] matches []
        expected size=136, got 128
        FAILED './tests/attr/test-stat-C0' - match failure
      
      This mismatch is caused by
      commit 09519ec3 ("perf: Add perf_event_attr::config3")
      which enlarges the structure perf_event_attr by 8 bytes.
      
      Fix this by adjusting the expected value of size.
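The matching rule visible in the output above (`*` accepts anything, `0|8` accepts either value, and `128` does not accept `136`) can be sketched as below. This is only an illustration inferred from the log; `field_matches()` is a hypothetical helper and the real test harness is implemented differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'actual' satisfies 'expected', where 'expected' is
 * either "*" (anything) or a '|'-separated list of accepted values.
 */
static bool field_matches(const char *expected, const char *actual)
{
	size_t alen;
	const char *p = expected;

	assert(expected && actual);
	alen = strlen(actual);

	if (strcmp(expected, "*") == 0)
		return true;

	while (*p) {
		/* Length of the current alternative, up to '|' or end. */
		size_t len = strcspn(p, "|");

		if (len == alen && strncmp(p, actual, len) == 0)
			return true;
		p += len;
		if (*p == '|')
			p++;
	}
	return false;
}
```

With the old expectation, `field_matches("128", "136")` is false, which corresponds to the `[size] 136 128` failure; once the expected value is 136 the field matches.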
      
      Output after:
       # ./perf test -Fvvvv 17
       17: Setup struct perf_event_attr                                    :
       --- start ---
       running './tests/attr/test-stat-C0'
       Using CPUID IBM,8561,703,T01,3.6,002f
       ...
        matched
        compare
          matching [event-0-0-4]
            to [event:base-stat]
            [cpu] 0 *
            [flags] 8 0|8
            [type] 0 0
            [size] 136 136
            ....
         ->OK
         match: [event-0-0-4] matches ['event:base-stat']
       matched
      
      Fixes: 09519ec3 ("perf: Add perf_event_attr::config3")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230322094731.1768281-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Add warning for when vmlinux.h generation fails · 1d796654
      Ian Rogers authored
      The warning advises on the NO_BPF_SKEL=1 option.

      Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230322183108.1380882-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Append inlines to non-DWARF callchains · 46d21ec0
      Artem Savkov authored
      Append information about inlined functions to FP and LBR callchains from
      DWARF debuginfo when available. Do so by calling append_inlines() from
      add_callchain_ip().
      
      Testing it:
      
      Frame-pointer mode recorded with 'perf record --call-graph=fp --freq=max -- ./a.out'
      
        #include <stdio.h>
        #include <stdint.h>
      
        static __attribute__((noinline)) uint32_t func5(uint32_t i)
        {
                return i + 10;
        }
      
        static uint32_t func4(uint32_t i)
        {
                return func5(i + 5);
        }
      
        static inline uint32_t func3(uint32_t i)
        {
                return func4(i + 4);
        }
      
        static __attribute__((noinline)) uint32_t func2(uint32_t i)
        {
                return func3(i + 3);
        }
      
        static uint32_t func1(uint32_t i)
        {
                return func2(i + 2);
        }
      
        __attribute__((noinline)) uint64_t entry(void)
        {
                uint64_t ret = 0;
                uint32_t i = 0;
                for (i = 0; i < 1000000; i++) {
                        ret += func1(i);
                        ret -= func2(i);
                        ret += func3(i);
                        ret += func4(i);
                        ret -= func5(i);
                }
                return ret;
        }
      
        int main(int argc, char **argv)
        {
                printf("%s\n", __func__);
                return entry();
        }
        ======
      
      Here is the output I get with '--call-graph callee --no-children'
      
        ======
        # To display the perf.data header info, please use --header/--header-only options.
        #
        #
        # Total Lost Samples: 0
        #
        # Samples: 250  of event 'cycles:u'
        # Event count (approx.): 26819859
        #
        # Overhead  Command  Shared Object         Symbol
        # ........  .......  ....................  .....................................
        #
            43.58%  a.out    a.out                 [.] func5
                    |
                    |--28.93%--entry
                    |          main
                    |          __libc_start_call_main
                    |
                     --14.65%--func4 (inlined)
                               |
                               |--10.45%--entry
                               |          main
                               |          __libc_start_call_main
                               |
                                --4.20%--func3 (inlined)
                                          entry
                                          main
                                          __libc_start_call_main
      
            38.80%  a.out    a.out                 [.] entry
                    |
                    |--23.27%--func4 (inlined)
                    |          |
                    |          |--20.28%--func3 (inlined)
                    |          |          func2
                    |          |          main
                    |          |          __libc_start_call_main
                    |          |
                    |           --2.99%--entry
                    |                     main
                    |                     __libc_start_call_main
                    |
                    |--8.17%--func5
                    |          main
                    |          __libc_start_call_main
                    |
                    |--3.89%--func1 (inlined)
                    |          entry
                    |          main
                    |          __libc_start_call_main
                    |
                     --3.48%--entry
                               main
                               __libc_start_call_main
      
            13.07%  a.out    a.out                 [.] func2
                    |
                    ---func5
                       main
                       __libc_start_call_main
      
             1.54%  a.out    [unknown]             [k] 0xffffffff81e011b7
             1.16%  a.out    [unknown]             [k] 0xffffffff81e00193
                    |
                     --0.57%--__mmap64 (inlined)
                               __mmap64 (inlined)
      
             0.34%  a.out    ld-linux-x86-64.so.2  [.] __tunable_get_val
             0.34%  a.out    ld-linux-x86-64.so.2  [.] strcmp
             0.32%  a.out    libc.so.6             [.] strchr
             0.31%  a.out    ld-linux-x86-64.so.2  [.] _dl_relocate_object
             0.22%  a.out    ld-linux-x86-64.so.2  [.] _dl_init_paths
             0.18%  a.out    ld-linux-x86-64.so.2  [.] get_common_cache_info.constprop.0
             0.14%  a.out    ld-linux-x86-64.so.2  [.] __GI___tunables_init
      
        #
        # (Tip: Show individual samples with: perf script)
        #
        ======
      
        It does not seem to be out of order, or at least it is consistent with
        what I get with dwarf unwinders.
      
      Committer notes:
      
      Adrian Hunter pointed out that this breaks --branch-history, so it is
      not done for branches; see the second Link below.

      Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Signed-off-by: Artem Savkov <asavkov@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230316133557.868731-2-asavkov@redhat.com
      Link: https://lore.kernel.org/r/54129783-2960-84e1-05e9-97ac70ffb432@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 21 Mar, 2023 5 commits
  3. 20 Mar, 2023 11 commits
    • perf report: Add 'simd' sort field · ea15483e
      German Gomez authored
      Add 'simd' sort field to visualize SIMD ops in 'perf report'.
      
      Rows are labeled with the SIMD ISA, and the type of predicate (if any):
      
        - [p] partial predicate
        - [e] empty predicate (no elements in the vector being used)
      
      Example with Arm SPE and SVE (Scalable Vector Extension):
      
        #include <arm_sve.h>
      
        double src[1025], dst[1025];
      
        int main(void) {
          svfloat64_t vc = svdup_f64(1);
          for(;;)
            for(int i = 0; i < 1025; i += svcntd())
            {
              svbool_t pg = svwhilelt_b64(i, 1025);
              svfloat64_t vsrc = svld1(pg, &src[i]);
              svfloat64_t vdst = svadd_x(pg, vsrc, vc);
              svst1(pg, &dst[i], vdst);
            }
          return 0;
        }
      
        ... compiled using "gcc-11 -march=armv8-a+sve -O3"
      
      Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1:
      
        $ perf record -e arm_spe_0// -- ./a.out
        $ perf report --itrace=i1i -s overhead,pid,simd,sym
      
        Overhead      Pid:Command   Simd     Symbol
        ........  ................  .......  ......................
      
          53.76%    10758:program            [.] main
          46.14%    10758:program   [.] SVE  [.] main
           0.09%    10758:program   [p] SVE  [.] main
      
      The report shows 0.09% of the sampled SVE operations use partial
      predicates due to src and dst arrays not being multiples of the vector
      register lengths.
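The arithmetic behind the partial predicate can be sketched as below, assuming the loop advances by a fixed number of 64-bit lanes per iteration (what `svcntd()` returns). `struct loop_shape` and `shape()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct loop_shape {
	size_t iterations; /* number of vector iterations */
	size_t tail_lanes; /* active lanes in a partial last iteration, 0 if none */
};

/* Split n elements into vector iterations of 'lanes' elements each. */
static struct loop_shape shape(size_t n, size_t lanes)
{
	struct loop_shape s;

	assert(lanes > 0);
	s.tail_lanes = n % lanes;
	s.iterations = n / lanes + (s.tail_lanes ? 1 : 0);
	return s;
}
```

For the 1025-element arrays above and, say, 4 lanes (a 256-bit vector), this gives 257 iterations with one active lane in the last one; an array of 1024 elements would leave no partial iteration.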
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman.Khandual@arm.com
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Add SVE flags to the SPE samples · 03a6c16e
      German Gomez authored
      Add flags from the Scalable Vector Extension (SVE) to the SPE samples
      which are available from Armv8.3 (FEAT_SPEv1p1).
      
      These will be displayed in a new SIMD sort field in a later commit.
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
      Cc: Anshuman.Khandual@arm.com
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-perf-users@vger.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf arm-spe: Refactor arm-spe to support operation packet type · 0066015a
      German Gomez authored
      Extend the decoder of Arm SPE records to support more fields from the
      operation packet type.
      
      Not all fields are decoded by this commit, only those needed to
      support the SVE load/store/other operations use case.

      Suggested-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman.Khandual@arm.com
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf event: Add 'simd_flags' field to 'struct perf_sample' · f43cc1a9
      German Gomez authored
      Add new field to 'struct perf_sample' to store flags related to SIMD
      ops.
      
      It will be used to store SIMD information from SVE and NEON when
      profiling using ARM SPE.
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman.Khandual@arm.com
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
      Signed-off-by: James Clark <james.clark@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Add support for new branch instructions ERETS and ERETU · 052072f6
      Adrian Hunter authored
      Intel Flexible Return and Event Delivery (FRED) adds instructions ERETS
      (return to supervisor) and ERETU (return to user). The Intel PT
      instruction decoder needs to know about these instructions because they
      are branch instructions. As with IRET instructions, when the decoder
      encounters one of these instructions it matches it to a TIP (target
      instruction pointer) packet that gives the branch destination.
      
      The existing "x86 instruction decoder - new instructions" test can be
      used to test the result e.g.
      
        $ perf test -v ins |& grep eret
        Decoded ok: f2 0f 01 ca         erets
        Decoded ok: f3 0f 01 ca         eretu
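The two encodings in the test output above differ only in the first byte. A minimal sketch that recognises exactly those byte sequences is shown below; `classify_eret()` and `enum eret_kind` are hypothetical names, and a real instruction decoder handles far more than this exact-match check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum eret_kind { NOT_ERET, ERETS, ERETU };

/* Match the exact encodings shown in the test output above. */
static enum eret_kind classify_eret(const uint8_t *buf, size_t len)
{
	static const uint8_t erets[] = { 0xf2, 0x0f, 0x01, 0xca };
	static const uint8_t eretu[] = { 0xf3, 0x0f, 0x01, 0xca };

	assert(buf != NULL);
	if (len < sizeof(erets))
		return NOT_ERET;
	if (memcmp(buf, erets, sizeof(erets)) == 0)
		return ERETS;
	if (memcmp(buf, eretu, sizeof(eretu)) == 0)
		return ERETU;
	return NOT_ERET;
}
```

A caller would treat either result as a branch whose destination comes from the following TIP packet, as described above.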
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Add event type names UINTR and UIRET · 34f576c9
      Adrian Hunter authored
      UINTR and UIRET are listed in table 32-50 "CFE Packet Type and Vector
      Fields Details" in the Intel Processor Trace chapter of The Intel SDM
      Volume 3 version 078.
      
      The codes are for "User interrupt delivered" and "Exiting from user
      interrupt routine" respectively.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbol: Sort names under write lock · ec9640f7
      Ian Rogers authored
      If a lookup by name finds that the sorted names do not exist yet, they
      are allocated and sorted. This shouldn't be done under a read lock as
      another reader may access them. Release the read lock and acquire the
      write lock, then release the write lock and reacquire the read lock.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix memory leak in symbols · 82c6d83b
      Ian Rogers authored
      machine__delete() doesn't delete threads. Add a call to delete the
      threads before deleting the machine.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests: Add common error route for code-reading · 9bb5e1f6
      Ian Rogers authored
      A later change will enforce that the map is put on this path
      regardless of success or error.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bpf_counter: Use public cpumap accessors · 39b5e434
      Ian Rogers authored
      Avoid the use of internal APIs by going through the cpumap accessor
      functions.

      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbol: Avoid memory leak from abi::__cxa_demangle · c9602aa0
      Ian Rogers authored
      Rather than allocate memory up front, let abi::__cxa_demangle do
      that. This avoids a problem where, on error, NULL was returned and
      the memory that had been allocated was leaked.
      
      Fixes: 3b4e4efe ("perf symbol: Add abi::__cxa_demangle C++ demangling support")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 15 Mar, 2023 20 commits
    • perf kvm: Update documentation to reflect new changes · 96d54169
      Leo Yan authored
      Update documentation for new sorting and option '--stdio'.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf kvm: Add TUI mode for stat report · 984f16cd
      Leo Yan authored
      Now that the tool supports the histograms list and has its dimensions
      prepared, this patch adds a TUI mode for the stat report.  It also shows
      UI progress while sorting, for a better user experience.
      
      Committer notes:
      
      kvm_display() is only used by functions enclosed in:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So do it with this new function as well.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      984f16cd
    • Leo Yan's avatar
      perf kvm: Add dimensions for percentages · 32a5c2b8
      Leo Yan authored
      Add dimensions for count and time percentages; these are useful for
      users who want to review percentage statistics.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      32a5c2b8
    • Leo Yan's avatar
      perf kvm: Support printing attributions for dimensions · fbb70bd3
      Leo Yan authored
      This patch adds a header, an entry callback and a width for every
      dimension, so that in TUI mode the tool can print items with the defined
      attributions.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fbb70bd3
    • Leo Yan's avatar
      perf kvm: Polish sorting key · c695d48a
      Leo Yan authored
      Since histograms support sorting, the tool no longer needs to maintain
      the mapping between the sorting keys and the corresponding comparison
      callbacks; this patch therefore removes the structure kvm_event_key.
      
      We still need to validate the sorting key, so this patch keeps the
      sorting keys in an array and renames the function select_key() to
      is_valid_key(), which validates the sorting key passed by the user.
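The array-based validation described above could look roughly like the following C sketch. The key names, the array name `sort_keys_sketch` and the function signature are assumptions made for illustration; the commit message only says that an array of sorting keys is checked by is_valid_key().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of accepted keys; the real list is not given above. */
static const char *const sort_keys_sketch[] = { "sample", "time" };

/*
 * Return true if 'key' matches one of the known sorting keys.
 * A NULL key is treated as invalid.
 */
static bool is_valid_key_sketch(const char *key)
{
	size_t nr = sizeof(sort_keys_sketch) / sizeof(sort_keys_sketch[0]);

	if (key == NULL)
		return false;

	for (size_t i = 0; i < nr; i++) {
		if (strcmp(key, sort_keys_sketch[i]) == 0)
			return true;
	}
	return false;
}
```

With the comparison callbacks owned by the histograms code, a check like this only has to answer yes or no; it no longer needs to return a callback.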
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c695d48a
    • Leo Yan's avatar
      perf kvm: Use histograms list to replace cached list · f57a6414
      Leo Yan authored
      The perf kvm tool defines its own cached list, which is managed with an
      RB tree; histograms also provide an RB tree to manage data entries.  Now
      that histograms have been introduced in the tool, the self-defined list
      is no longer necessary and we can use the histograms list directly to
      manage KVM events.
      
      This patch switches to the histograms list to track KVM events and
      invokes the common function hists__output_resort_cb() to sort the
      results; this also gives us the flexibility to add more sorting keys
      easily.
      
      With the histograms list in place, the cached list is redundant, so
      remove the relevant code for it.
      
      Committer notes:
      
      kvm_hists__reinit() is only used by functions enclosed in:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So do it with this new function as well.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f57a6414
    • Leo Yan's avatar
      perf kvm: Add dimensions for KVM event statistics · 41f1138e
      Leo Yan authored
      To support KVM event statistics, this patch first registers histograms
      columns and sorting fields.  Every column or field has its own format
      structure; the format structure is dereferenced to access the dimension,
      and the dimension in turn provides the comparison callback for sorting
      the results.
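The chain from format structure to dimension to comparison callback can be sketched in C as below. All type and function names here (`fmt_sketch`, `kvm_fmt_sketch`, `dimension_sketch`, `fmt_cmp_sketch`) are hypothetical stand-ins, and the container_of-style lookup is an assumption about how the format structure is dereferenced, not something the commit message states.

```c
#include <stddef.h>
#include <stdint.h>

struct event_sketch {
	uint64_t count;
};

/* A dimension bundles a name with its comparison callback. */
struct dimension_sketch {
	const char *name;
	int64_t (*cmp)(const struct event_sketch *a,
		       const struct event_sketch *b);
};

/* Stand-in for the generic per-column format structure. */
struct fmt_sketch {
	const char *header;
};

/* Tool-specific wrapper: embeds the generic format and points at a dimension. */
struct kvm_fmt_sketch {
	struct fmt_sketch fmt;
	struct dimension_sketch *dim;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int64_t cmp_count_sketch(const struct event_sketch *a,
				const struct event_sketch *b)
{
	return (int64_t)a->count - (int64_t)b->count;
}

static struct dimension_sketch dim_count_sketch = {
	.name = "count",
	.cmp  = cmp_count_sketch,
};

/* Generic entry point: only sees the format, finds the dimension behind it. */
static int64_t fmt_cmp_sketch(struct fmt_sketch *fmt,
			      const struct event_sketch *a,
			      const struct event_sketch *b)
{
	struct kvm_fmt_sketch *kvm_fmt =
		container_of_sketch(fmt, struct kvm_fmt_sketch, fmt);

	return kvm_fmt->dim->cmp(a, b);
}
```

The generic sorting code only ever handles the embedded format structure, while each column still gets its own comparison logic through the dimension.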
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41f1138e
    • Leo Yan's avatar
      perf hist: Add 'kvm_info' field in histograms entry · ebf39d29
      Leo Yan authored
      __hists__add_entry() creates a temporary entry and compares it with the
      existing histograms entries; if an existing entry equals the temporary
      entry, it skips the allocation to avoid duplication.
      
      The problem with supporting KVM events in histograms is that an entry
      contains no info that identifies the KVM event and that can be used to
      compare entries.
      
      This patch adds a 'kvm_info' field to the histograms entry, which
      contains the KVM event's key; this identifier will be used for comparing
      histograms entries in a later change.
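The role of the identifier can be illustrated with a small find-or-add routine in C. This is only a sketch: it uses a fixed-size linear array in place of the RB tree that histograms actually use, and every name in it (`entry_sketch`, `table_sketch`, `find_or_add_sketch`) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES_SKETCH 16

/* Identifier for a KVM event, standing in for the 'kvm_info' field. */
struct kvm_info_sketch {
	uint64_t key;
};

struct entry_sketch {
	struct kvm_info_sketch kvm_info;
	uint64_t count;
};

struct table_sketch {
	struct entry_sketch entries[MAX_ENTRIES_SKETCH];
	size_t nr;
};

/*
 * Return the entry whose identifier equals 'key'; allocate a new one only
 * when no existing entry matches.  Returns NULL if the table is full.
 */
static struct entry_sketch *find_or_add_sketch(struct table_sketch *t,
					       uint64_t key)
{
	struct entry_sketch *e;

	for (size_t i = 0; i < t->nr; i++) {
		if (t->entries[i].kvm_info.key == key)
			return &t->entries[i];
	}

	if (t->nr == MAX_ENTRIES_SKETCH)
		return NULL;

	e = &t->entries[t->nr++];
	e->kvm_info.key = key;
	e->count = 0;
	return e;
}
```

Without a per-entry identifier, the equality test in the loop would have nothing to compare, and every sample would end up in a new entry.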
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ebf39d29
    • Leo Yan's avatar
      perf kvm: Parse address location for samples · 001b08f4
      Leo Yan authored
      Parse the address location for samples and save it into the structure
      'perf_kvm_stat'; it will be used by the histograms entry.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      001b08f4
    • Leo Yan's avatar
      perf kvm: Pass argument 'sample' to kvm_alloc_init_event() · 730651f7
      Leo Yan authored
      This patch adds a 'sample' argument to kvm_alloc_init_event() and
      updates its callers to pass down the 'sample' pointer.
      
      This is a preparatory change that allows a later patch to create
      histograms entries for kvm events; there are no functional changes.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      730651f7
    • Leo Yan's avatar
      perf kvm: Introduce histograms data structures · 2d08124b
      Leo Yan authored
      This is preparation for supporting histograms in the perf kvm tool.  As a
      first step, this patch defines the histograms data structures and
      initializes them.
      
      Committer notes:
      
      Those are only used by functions enclosed in:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So do this for these new functions and struct as well.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2d08124b
    • Leo Yan's avatar
      perf kvm: Use macro to replace variable 'decode_str_len' · 2d31e0bf
      Leo Yan authored
      The variable 'decode_str_len' defines the string length for the KVM
      event name, and every arch defines its own value.
      
      This introduces complexity because the variable definitions are spread
      across multiple source files under the arch folder.  This patch refactors
      the code to use a macro, KVM_EVENT_NAME_LEN, to define the event name
      length and thus removes the definitions in the arch files.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2d31e0bf
    • Leo Yan's avatar
      perf kvm: Use subtraction for comparison metrics · dd787ae4
      Leo Yan authored
      Currently the metrics comparison uses the greater-than operator (>),
      which returns a boolean value (0 or 1).
      
      This patch changes it to use subtraction as the comparison result, which
      can be used by histograms sorting.  Since the subtraction result is of
      u64 type, we change key_cmp_fun's return type to int64_t to avoid
      overflow.
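The difference between the two comparison styles can be sketched in C as follows. The function names are hypothetical, and the signed conversion relies on the two's-complement wraparound that GCC and Clang provide; this is an illustration of the idea, not the code from the patch.

```c
#include <stdint.h>

/* Old style: only reports "greater" (1) or "not greater" (0). */
static int cmp_greater_sketch(uint64_t a, uint64_t b)
{
	return a > b;
}

/*
 * New style: the sign of the result gives a three-way ordering
 * (negative, zero, positive), which is what a sort routine expects.
 * Assumes the true difference fits in int64_t.
 */
static int64_t cmp_subtract_sketch(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b);
}

/*
 * What could go wrong with a 32-bit return type: only the low 32 bits of
 * the difference survive, so a large difference may look like "equal".
 */
static int cmp_subtract_narrow_sketch(uint64_t a, uint64_t b)
{
	return (int)(uint32_t)(a - b);
}
```

The boolean form cannot distinguish "less than" from "equal", and the narrow form loses large differences such as 1ULL << 32, which is why a 64-bit signed return type is the safer choice here.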
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dd787ae4
    • Leo Yan's avatar
      perf kvm: Move up metrics helpers · f098376d
      Leo Yan authored
      This patch moves the helper functions for the event's metrics up so that
      later code can call them.
      
      There are no functional changes, apart from renaming the functions from
      compare_kvm_event_{metric}() to cmp_event_{metric}().
      
      Committer notes:
      
      Those helper functions are only used if this is true:
      
        #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)
      
      So keep them enclosed with that.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f098376d
    • Leo Yan's avatar
      perf kvm: Add pointer to 'perf_kvm_stat' in kvm event · a7d451a8
      Leo Yan authored
      Sometimes, handling kvm events needs to rely on global variables, e.g.
      when reading event counts we need to know the target vcpu ID; the global
      variables are stored in the structure perf_kvm_stat.
      
      This patch adds a 'perf_kvm_stat' pointer to the kvm event structure;
      it will be used by a later refactoring.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a7d451a8
    • Leo Yan's avatar
      perf kvm: Refactor overall statistics · 9c3aa1f4
      Leo Yan authored
      Currently the tool computes the overall statistics when sorting the
      results.  This patch refactors the code to compute the overall statistics
      during event processing, so the function update_total_count() is no
      longer needed; an extra benefit is that the statistics code is decoupled
      from the sorting code.
      
      This patch is not expected to cause any functional changes.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9c3aa1f4
    • Namhyung Kim's avatar
      perf record: Update documentation for BPF filters · c46bf3bd
      Namhyung Kim authored
      Add more description and examples.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c46bf3bd
    • Namhyung Kim's avatar
      perf bpf filter: Show warning for missing sample flags · 4310551b
      Namhyung Kim authored
      For a BPF filter to work properly, users need to provide appropriate
      options to enable the sample types.  Otherwise the BPF program would
      see an invalid value (i.e. always 0) and the filter won't work properly.
      
      Show a warning message if sample types are missing, as shown below.
      
        $ sudo ./perf record -e cycles --filter 'addr < 100' true
        Error: cycles event does not have PERF_SAMPLE_ADDR
         Hint: please add -d option to perf record.
        failed to set filter "BPF" on event cycles with 22 (Invalid argument)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4310551b
    • Namhyung Kim's avatar
      perf bpf filter: Add logical OR operator · 46996dd7
      Namhyung Kim authored
      This adds support for two or more expressions connected as a group; the
      group result is considered true when any one of them returns true.  The
      new group operators (GROUP_BEGIN and GROUP_END) are added to set up and
      check the condition.  As nested groups are not allowed, the condition is
      saved in local variables.
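The group evaluation described above can be sketched in plain C as below. This is not the BPF program from the patch: the structure layout, the field indices, the numeric value used for "l2" and the function names are assumptions for illustration, and expressions outside a group are assumed to be AND-ed together.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum op_sketch { OP_EQ, OP_GT, OP_GROUP_BEGIN, OP_GROUP_END };

struct expr_sketch {
	enum op_sketch op;
	int field;       /* index into the sample's field array */
	uint64_t value;  /* constant to compare against */
};

static bool eval_one_sketch(const struct expr_sketch *e, const uint64_t *sample)
{
	uint64_t v = sample[e->field];

	return e->op == OP_EQ ? v == e->value : v > e->value;
}

/* Return true if the sample passes the whole (flat, non-nested) filter. */
static bool sample_passes_sketch(const struct expr_sketch *exprs, size_t nr,
				 const uint64_t *sample)
{
	/* No nesting, so two local variables are enough to track a group. */
	bool in_group = false;
	bool group_result = false;

	for (size_t i = 0; i < nr; i++) {
		const struct expr_sketch *e = &exprs[i];

		if (e->op == OP_GROUP_BEGIN) {
			in_group = true;
			group_result = false;
		} else if (e->op == OP_GROUP_END) {
			if (!group_result)
				return false;
			in_group = false;
		} else if (in_group) {
			/* Inside a group: one true expression is enough. */
			group_result = group_result || eval_one_sketch(e, sample);
		} else if (!eval_one_sketch(e, sample)) {
			/* Outside a group: every expression must hold. */
			return false;
		}
	}
	return true;
}
```

A filter such as 'mem_lvl == l2 || weight > 30' would then be the four-element sequence GROUP_BEGIN, the equality test, the greater-than test, GROUP_END.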
      
      For example, the following records samples only if the data source
      memory level is L2 cache or the weight value is greater than 30.
      
        $ sudo ./perf record -adW -e cpu/mem-loads/pp \
        > --filter 'mem_lvl == l2 || weight > 30' -- sleep 1
      
        $ sudo ./perf script -F data_src,weight
           10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                47
           11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A      57
           10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                56
           10650100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L2 miss|LCK No|BLK  N/A                    144
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                16
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                20
           11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A     189
           1026a100142 |OP LOAD|LVL L1 or L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes|BLK  N/A              193
           10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                18
           ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      46996dd7
    • Namhyung Kim's avatar
      perf bpf filter: Add data_src sample data support · ff612055
      Namhyung Kim authored
      The data_src has many entries to express memory behaviors.  Add each
      term separately so that users can combine them for their purpose.
      
      I didn't add a prefix to the constants, for simplicity, as they are
      mostly distinguishable, but I had to use l1_miss and l2_hit for mem_dtlb
      since mem_lvl has different values for the same names.  Note that I
      decided to make mem_lvl an alias of mem_lvlnum, as mem_lvl is deprecated
      now.  According to the comment in the UAPI header, users should use a mix
      of mem_lvlnum, mem_remote and mem_snoop.  Also, the SNOOPX bits are
      concatenated to mem_snoop for simplicity.
      
      The following terms are used for data_src and the corresponding perf
      sample data fields:
      
       * mem_op: { load, store, pfetch, exec }
       * mem_lvl: { l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem }
       * mem_snoop: { none, hit, miss, hitm, fwd, peer }
       * mem_remote: { remote }
       * mem_lock: { locked }
       * mem_dtlb: { l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault }
       * mem_blk: { by_data, by_addr }
       * mem_hops: { hops0, hops1, hops2, hops3 }
      
      We can now use a filter expression like below:
      
        'mem_op == load, mem_lvl <= l2, mem_dtlb == l1_hit'
        'mem_dtlb == l2_miss, mem_hops > hops1'
        'mem_lvl == ram, mem_remote == 1'
      
      Note that 'na' is shared among the terms as it has the same value except
      for mem_lvl.  I don't have a good idea to handle that for now.
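To make the relation between these terms and the raw data_src value concrete, here is a small C sketch that splits a 64-bit data_src into its fields. The shifts and widths are written out by hand from the bitfield order of union perf_mem_data_src in the UAPI header as I understand it (little-endian layout); treat them, the structure name and the helper names as assumptions to verify against your kernel headers.

```c
#include <stdint.h>

/* Decoded subset of a data_src value; member names follow the filter terms. */
struct data_src_sketch {
	unsigned mem_op;
	unsigned mem_lvl;
	unsigned mem_snoop;
	unsigned mem_lock;
	unsigned mem_dtlb;
	unsigned mem_lvlnum;
	unsigned mem_remote;
	unsigned mem_blk;
	unsigned mem_hops;
};

/* Extract 'width' bits starting at bit 'shift'. */
static unsigned extract_bits_sketch(uint64_t value, unsigned shift,
				    unsigned width)
{
	return (unsigned)((value >> shift) & ((1ULL << width) - 1));
}

static struct data_src_sketch decode_data_src_sketch(uint64_t v)
{
	struct data_src_sketch d;

	/* Assumed layout: op:5, lvl:14, snoop:5, lock:2, dtlb:7,
	 * lvl_num:4, remote:1, snoopx:2, blk:3, hops:3. */
	d.mem_op     = extract_bits_sketch(v, 0, 5);
	d.mem_lvl    = extract_bits_sketch(v, 5, 14);
	d.mem_snoop  = extract_bits_sketch(v, 19, 5);
	d.mem_lock   = extract_bits_sketch(v, 24, 2);
	d.mem_dtlb   = extract_bits_sketch(v, 26, 7);
	d.mem_lvlnum = extract_bits_sketch(v, 33, 4);
	d.mem_remote = extract_bits_sketch(v, 37, 1);
	d.mem_blk    = extract_bits_sketch(v, 40, 3);
	d.mem_hops   = extract_bits_sketch(v, 43, 3);
	return d;
}
```

For instance, decoding the value 0x10468100442, which perf script prints as "OP LOAD|LVL L2 or L2 hit|...|LCK No" in the logical OR commit on this page, gives mem_op 2, mem_lvlnum 2 and mem_lock 0 under this layout, while 0x1026a100142 ("LVL L1 or L1 hit ... LCK Yes") gives mem_lvlnum 1 and mem_lock 2.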
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ff612055