1. 16 Jun, 2023 4 commits
    • perf stat,jevents: Introduce Default tags for the default mode · b0a9e8f8
      Kan Liang authored
      Introduce a new metricgroup, Default, to tag all the metric groups which
      will be collected in the default mode.
      
      Add a new field, DefaultMetricgroupName, in the JSON file to indicate
      the real metric group name. It will be printed in the default output
      to replace the event names.
      
      The output format is unchanged.
      
      On SPR, both TopdownL1 and TopdownL2 are displayed in the default
      output.
      
      On ARM, Intel ICL and later platforms (before SPR), only TopdownL1 is
      displayed in the default output.
      Suggested-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230615135315.3662428-4-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metric: JSON flag to default metric group · 969a4661
      Kan Liang authored
      For the default output, the default metric group can vary across
      platforms. For example, on SPR, the TopdownL1 and TopdownL2 metrics
      should be displayed in the default mode. On ICL, only the TopdownL1
      metrics should be displayed.
      
      Add a flag so we can tag the default metric group for different
      platforms rather than hacking the perf code.
      
      The flag is added to the Intel TopdownL1 metrics since ICL and ADL,
      and to the TopdownL2 metrics since SPR.
      
      Add a new field, DefaultMetricgroupName, in the JSON file to indicate
      the real metric group name.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20230615135315.3662428-3-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
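The tagging scheme described in the two commits above can be sketched roughly as follows. This is a simplified illustration, not the perf/jevents implementation (which is C plus generated tables): the metric entries, the `MetricName`/`MetricGroup` layout as used here, and the helper `default_groups` are assumptions made for the example. Only the `Default` tag and the `DefaultMetricgroupName` field are taken from the commit messages.

```python
import json
from collections import defaultdict

# Hypothetical, heavily simplified metric entries. Real perf JSON files carry
# many more fields; the metric names are illustrative only.
SPR_METRICS = json.loads("""
[
  {"MetricName": "tma_retiring", "MetricGroup": "Default;TopdownL1",
   "DefaultMetricgroupName": "TopdownL1"},
  {"MetricName": "tma_fetch_latency", "MetricGroup": "Default;TopdownL2",
   "DefaultMetricgroupName": "TopdownL2"}
]
""")

# On an ICL-like platform only the TopdownL1 metric carries the Default tag.
ICL_METRICS = json.loads("""
[
  {"MetricName": "tma_retiring", "MetricGroup": "Default;TopdownL1",
   "DefaultMetricgroupName": "TopdownL1"},
  {"MetricName": "tma_fetch_latency", "MetricGroup": "TopdownL2"}
]
""")


def default_groups(metrics):
    """Group the metrics tagged Default under the name to be displayed."""
    groups = defaultdict(list)
    for metric in metrics:
        tags = metric.get("MetricGroup", "").split(";")
        if "Default" not in tags:
            continue  # not part of the default output on this platform
        # Falling back to the metric's own name is an assumption of this sketch.
        shown_as = metric.get("DefaultMetricgroupName", metric["MetricName"])
        groups[shown_as].append(metric["MetricName"])
    return dict(groups)


if __name__ == "__main__":
    for platform, metrics in (("SPR", SPR_METRICS), ("ICL", ICL_METRICS)):
        print(platform, default_groups(metrics))
```

The point of the design is that the per-platform difference lives entirely in the data: the selection code is the same for both tables, and only the tags differ.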
    • perf evsel: Fix the annotation for hardware events on hybrid · e15e4a3d
      Kan Liang authored
      The annotation for hardware events is wrong on hybrid. For example,
      
       # ./perf stat -a sleep 1
      
       Performance counter stats for 'system wide':
      
               32,148.85 msec cpu-clock                        #   32.000 CPUs utilized
                     374      context-switches                 #   11.633 /sec
                      33      cpu-migrations                   #    1.026 /sec
                     295      page-faults                      #    9.176 /sec
              18,979,960      cpu_core/cycles/                 #  590.378 K/sec
             261,230,783      cpu_atom/cycles/                 #    8.126 M/sec                       (54.21%)
              17,019,732      cpu_core/instructions/           #  529.404 K/sec
              38,020,470      cpu_atom/instructions/           #    1.183 M/sec                       (63.36%)
               3,296,743      cpu_core/branches/               #  102.546 K/sec
               6,692,338      cpu_atom/branches/               #  208.167 K/sec                       (63.40%)
                  96,421      cpu_core/branch-misses/          #    2.999 K/sec
               1,016,336      cpu_atom/branch-misses/          #   31.613 K/sec                       (63.38%)
      
      On hybrid, hardware events carry an extended type in their config,
      but evsel__match() doesn't take it into account.
      
      Filter the config on hybrid before checking.
      
      With the patch,
      
       # ./perf stat -a sleep 1
      
       Performance counter stats for 'system wide':
      
               32,139.90 msec cpu-clock                        #   32.003 CPUs utilized
                     343      context-switches                 #   10.672 /sec
                      32      cpu-migrations                   #    0.996 /sec
                      73      page-faults                      #    2.271 /sec
              13,712,841      cpu_core/cycles/                 #    0.000 GHz
             258,301,691      cpu_atom/cycles/                 #    0.008 GHz                         (54.20%)
              12,428,163      cpu_core/instructions/           #    0.91  insn per cycle
              37,786,557      cpu_atom/instructions/           #    2.76  insn per cycle              (63.35%)
               2,418,826      cpu_core/branches/               #   75.259 K/sec
               6,965,962      cpu_atom/branches/               #  216.739 K/sec                       (63.38%)
                  72,150      cpu_core/branch-misses/          #    2.98% of all branches
               1,032,746      cpu_atom/branch-misses/          #   42.70% of all branches             (63.35%)
      Suggested-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ahmad Yasin <ahmad.yasin@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20230615135315.3662428-2-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
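The mismatch described in the commit above can be modelled with a small sketch. It is not the perf code (which is C): the functions `hybrid_config`, `match_naive` and `match_filtered` are hypothetical names for this example, and the PMU type IDs are made-up values. The numeric constants mirror those in include/uapi/linux/perf_event.h, where the extended PMU type sits in the upper 32 bits of the config of a hardware event.

```python
# Constants mirroring include/uapi/linux/perf_event.h.
PERF_TYPE_HARDWARE = 0
PERF_COUNT_HW_CPU_CYCLES = 0
PERF_COUNT_HW_INSTRUCTIONS = 1
PERF_PMU_TYPE_SHIFT = 32
PERF_HW_EVENT_MASK = 0xFFFFFFFF

# Made-up PMU type IDs for the two hybrid PMUs (assumption for this sketch).
CPU_CORE_TYPE = 4
CPU_ATOM_TYPE = 10


def hybrid_config(pmu_type, hw_id):
    """Encode a hardware event with the PMU type in the upper 32 bits."""
    return (pmu_type << PERF_PMU_TYPE_SHIFT) | hw_id


def match_naive(attr_type, attr_config, want_type, want_config):
    """Compare type and config as-is, ignoring the extended type."""
    return attr_type == want_type and attr_config == want_config


def match_filtered(attr_type, attr_config, want_type, want_config):
    """Strip the extended type from hardware configs before comparing."""
    if attr_type != want_type:
        return False
    if want_type == PERF_TYPE_HARDWARE:
        # Masking is harmless on non-hybrid configs: the upper bits are zero.
        attr_config &= PERF_HW_EVENT_MASK
    return attr_config == want_config


if __name__ == "__main__":
    cfg = hybrid_config(CPU_ATOM_TYPE, PERF_COUNT_HW_INSTRUCTIONS)
    print(hex(cfg))
    print(match_naive(PERF_TYPE_HARDWARE, cfg,
                      PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS))
    print(match_filtered(PERF_TYPE_HARDWARE, cfg,
                         PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS))
```

With the naive comparison, a cpu_atom or cpu_core event is never recognised as cycles, instructions, or branch-misses, which is consistent with the first output above falling back to generic per-second rates.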
    • perf srcline: Fix handling of inline functions · e90208e9
      Ian Rogers authored
      We write an address then a ',' to addr2line. With inline data we
      generally get back (// are my comments):
      0x1234    // address
      foo       // function name
      foo.c:123 // filename:line
      bar       // function name
      bar.c:123 // filename:line
      0x000000000000000 // sentinel address created by ','
      ??        // unknown function name
      ??:0      // unknown filename:line
      
      The code was assuming the inline data also had the address, which is
      incorrect. This means the first inline function name (bar above) needs
      to be checked to see if it is the sentinel, and otherwise treated as
      a function name. The regression was caused by the addition of
      addresses, made because the kernel reports a symbol at address 0 (the
      address GNU binutils uses when it interprets ',').
      
      Committer testing:
      
      Using:
      
        # perf trace --call-graph=dwarf -e lock:contention_*
        <SNIP>
        1244.615 TaskCon~ller #/2645281 lock:contention_begin(lock_addr: 0xffff8e6748da5ab0, flags: 2)
                                             __preempt_count_dec_and_test (inlined)
                                             trace_contention_begin (inlined)
                                             trace_contention_begin (inlined)
                                             rwsem_down_read_slowpath ([kernel.kallsyms])
                                             __preempt_count_dec_and_test (inlined)
                                             trace_contention_begin (inlined)
                                             trace_contention_begin (inlined)
                                             rwsem_down_read_slowpath ([kernel.kallsyms])
                                             __down_read_common (inlined)
                                             __down_read (inlined)
                                             down_read ([kernel.kallsyms])
                                             arch_static_branch (inlined)
                                             static_key_false (inlined)
                                             __mmap_lock_trace_acquire_returned (inlined)
                                             mmap_read_lock (inlined)
                                             do_user_addr_fault ([kernel.kallsyms])
                                             arch_local_irq_disable (inlined)
                                             handle_page_fault (inlined)
                                             exc_page_fault ([kernel.kallsyms])
                                             asm_exc_page_fault ([kernel.kallsyms])
                                             [0x4def008] (/usr/lib64/firefox/libxul.so)
        1244.619 TaskCon~ller #/2645281 lock:contention_end(lock_addr: 0xffff8e6748da5ab0)
                                             __preempt_count_dec_and_test (inlined)
                                             trace_contention_end (inlined)
                                             trace_contention_end (inlined)
                                             rwsem_down_read_slowpath ([kernel.kallsyms])
                                             __preempt_count_dec_and_test (inlined)
                                             trace_contention_end (inlined)
                                             trace_contention_end (inlined)
                                             rwsem_down_read_slowpath ([kernel.kallsyms])
                                             __down_read_common (inlined)
                                             __down_read (inlined)
                                             down_read ([kernel.kallsyms])
                                             arch_static_branch (inlined)
                                             static_key_false (inlined)
                                             __mmap_lock_trace_acquire_returned (inlined)
                                             mmap_read_lock (inlined)
                                             do_user_addr_fault ([kernel.kallsyms])
                                             arch_local_irq_disable (inlined)
                                             handle_page_fault (inlined)
                                             exc_page_fault ([kernel.kallsyms])
                                             asm_exc_page_fault ([kernel.kallsyms])
        <SNIP>
      
      Fixes: 8dc26b6f ("perf srcline: Make sentinel reading for binutils addr2line more robust")
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: llvm@lists.linux.dev
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Rix <trix@redhat.com>
      Link: https://lore.kernel.org/r/20230615025041.1982072-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
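The record layout described in the commit above (one address line, then function and filename:line pairs with no further address lines, then the sentinel) can be read with a loop like the one sketched below. This is an illustrative model, not the perf srcline code, which is C and also handles other addr2line variants. The helpers `is_sentinel` and `parse_record` are hypothetical names, and treating any all-zero hex address as the sentinel is an assumption of the sketch.

```python
def is_sentinel(line):
    """True if the line is an all-zero hex address such as 0x0000000000000000."""
    if not line.startswith("0x"):
        return False
    try:
        return int(line, 16) == 0
    except ValueError:
        return False


def parse_record(lines):
    """Parse one reply; returns (address, [(function, location), ...])."""
    it = iter(lines)
    address = int(next(it), 16)  # only the first entry has an address line
    frames = []
    for line in it:
        if is_sentinel(line):
            # Consume the sentinel's "??" and "??:0" lines, then stop.
            next(it, None)
            next(it, None)
            break
        # Inline entries have no address line, so this is a function name
        # and the following line is its filename:line.
        location = next(it)
        frames.append((line, location))
    return address, frames


if __name__ == "__main__":
    reply = [
        "0x1234",
        "foo",
        "foo.c:123",
        "bar",
        "bar.c:123",
        "0x000000000000000",
        "??",
        "??:0",
    ]
    print(parse_record(reply))
```

The check that matters is applied to the line where a function name is expected: it is either the sentinel address or a real name, never an address for the inline entry.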
  2. 14 Jun, 2023 30 commits
  3. 13 Jun, 2023 2 commits
  4. 12 Jun, 2023 4 commits