1. 10 Nov, 2022 1 commit
  2. 06 Oct, 2022 1 commit
  3. 04 Oct, 2022 3 commits
  4. 19 Aug, 2022 1 commit
  5. 29 Jul, 2022 1 commit
  6. 20 Jul, 2022 2 commits
  7. 28 Jun, 2022 1 commit
    • perf offcpu: Accept allowed sample types only · 49c692b7
      Namhyung Kim authored
      As the offcpu-time event is synthesized at the end, it cannot get all
      the sample info.  Define OFFCPU_SAMPLE_TYPES for the allowed ones and
      mask out the others in evsel__config() to prevent parse errors.
      
      Because perf sample parsing assumes a specific ordering of the
      sample types, setting an unsupported one (for example via
      perf record -d/--data) would make it fail to read the data.
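The masking described above can be sketched in a few lines of C. This is a minimal illustration, not the code from the patch: the `DEMO_SAMPLE_*` bit values, the contents of `DEMO_OFFCPU_SAMPLE_TYPES`, and the helper `demo_mask_sample_type()` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for sample type bits; the values are illustrative. */
enum {
    DEMO_SAMPLE_IP     = 1u << 0,
    DEMO_SAMPLE_TID    = 1u << 1,
    DEMO_SAMPLE_TIME   = 1u << 2,
    DEMO_SAMPLE_ADDR   = 1u << 3,
    DEMO_SAMPLE_PERIOD = 1u << 4,
};

/* Assumed allow-list: only bits that a synthesized sample can fill in. */
#define DEMO_OFFCPU_SAMPLE_TYPES \
    (DEMO_SAMPLE_IP | DEMO_SAMPLE_TID | DEMO_SAMPLE_TIME | DEMO_SAMPLE_PERIOD)

/* Compile-time check that the address bit is not on the allow-list. */
static_assert((DEMO_OFFCPU_SAMPLE_TYPES & DEMO_SAMPLE_ADDR) == 0,
              "ADDR must not be allowed for the synthesized event");

/* Drop every requested bit that is not on the allow-list. */
uint64_t demo_mask_sample_type(uint64_t requested)
{
    return requested & DEMO_OFFCPU_SAMPLE_TYPES;
}
```

With this helper, a request for `DEMO_SAMPLE_IP | DEMO_SAMPLE_ADDR` is reduced to `DEMO_SAMPLE_IP`, so a parser that walks the fields in a fixed order never looks for data the synthesized sample does not carry.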
      
      Fixes: edc41a10 ("perf record: Enable off-cpu analysis with BPF")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20220624231313.367909-3-namhyung@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 24 Jun, 2022 1 commit
    • perf record ibs: Warn about sampling period skew · 9ab95b0b
      Ravi Bangoria authored
      
      Samples without an L3 miss are discarded and the counter is reset with a
      random value (between 1-15 for the fetch PMU and 1-127 for the op PMU) when
      IBS L3 miss filtering is enabled. This causes a sampling period skew, and
      there is no way to reconstruct the aggregated sampling period. So print a
      warning at perf record time if the user sets l3missonly=1.
      
      Ex:
      
        # perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
        WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
        and tagged operation does not cause L3 Miss. This causes sampling period skew.
      
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <rrichter@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: like.xu.linux@gmail.com
      Cc: x86@kernel.org
      Link: http://lore.kernel.org/lkml/20220604044519.594-2-ravi.bangoria@amd.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  9. 26 May, 2022 3 commits
    • perf unwind: Use dynamic register set for DWARF unwind · 72105204
      James Clark authored
      
      Architectures can detect availability of extra registers at runtime so
      use this more complete set for unwinding. This will include the VG
      register on arm64 in a later commit.
      
      If the function isn't implemented then PERF_REGS_MASK is returned and
      there is no change.
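The fallback behaviour described above can be illustrated with a weak default definition, which is one common way to express "use the static mask unless the architecture provides something better" in C. This is only a sketch under stated assumptions: `DEMO_REGS_MASK`, `demo_arch_user_reg_mask()` and `demo_unwind_reg_mask()` are hypothetical names, and the `weak` attribute assumes GCC or Clang on an ELF target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical static register mask, standing in for PERF_REGS_MASK. */
#define DEMO_REGS_MASK 0xffffULL

/*
 * Weak default: an architecture that can probe extra registers at runtime
 * would provide a strong definition of the same symbol in another file,
 * and the linker would pick that one instead.
 */
__attribute__((weak)) uint64_t demo_arch_user_reg_mask(void)
{
    return DEMO_REGS_MASK;
}

/* Callers always go through the function, never the macro directly. */
uint64_t demo_unwind_reg_mask(void)
{
    uint64_t mask = demo_arch_user_reg_mask();

    assert(mask != 0); /* an empty register set would make unwinding useless */
    return mask;
}
```

When no architecture override is linked in, `demo_unwind_reg_mask()` returns `DEMO_REGS_MASK`, which matches the "no change" case in the message.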
      
      Committer notes:
      
      Added util/perf_regs.c to tools/perf/util/python-ext-sources so that
      'perf test python' passes, i.e. the perf python binding has all the
      symbols it needs, addressing:
      
        $ perf test -v python
         19: 'import perf' in python                                         :
        --- start ---
        test child forked, pid 2037817
        python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ImportError: /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so: undefined symbol: arch__user_reg_mask
        test child finished with -1
        ---- end ----
        'import perf' in python: FAILED!
        $
      
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Mark Brown <broonie@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20220525154114.718321-4-james.clark@arm.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Do not extend sample type of bpf-output event · 303ead45
      Namhyung Kim authored
      
      Currently evsel__new_idx() sets more sample_type bits when it finds a
      BPF-output event.  But it should honor what's recorded in the perf
      data file rather than blindly setting the bits.  Otherwise it could lead
      to a parse error when the data was recorded with a modified sample_type.
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20220518224725.742882-2-namhyung@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Add requires_cpu flag for uncore · d3345fec
      Adrian Hunter authored
      
      Uncore events require a CPU, i.e. the CPU cannot be -1.
      
      The evsel system_wide flag is intended for events that should be on every
      CPU, which does not make sense for uncore events because uncore events do
      not map one-to-one with CPUs.
      
      These 2 requirements are not exactly the same, so introduce a new flag
      'requires_cpu' for the uncore case.
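The distinction between the two flags can be sketched as follows. This is an illustration only: `struct demo_evsel` and `demo_evsel_needs_cpu()` are hypothetical, and treating either flag as ruling out CPU -1 is an assumption of the sketch, not something the message states about the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_evsel {
    bool system_wide;   /* event should be opened on every CPU */
    bool requires_cpu;  /* event needs a concrete CPU, never -1 */
};

/* Return true when opening this event with cpu == -1 is not acceptable. */
bool demo_evsel_needs_cpu(const struct demo_evsel *evsel)
{
    assert(evsel != NULL);
    return evsel->system_wide || evsel->requires_cpu;
}
```

An uncore event would set only `requires_cpu`, since it needs some CPU but does not map one-to-one onto every CPU, while a plain per-thread event would set neither flag.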
      
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Link: https://lore.kernel.org/r/20220524075436.29144-13-adrian.hunter@intel.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  10. 20 May, 2022 1 commit
    • perf stat: Always keep perf metrics topdown events in a group · e8f4f794
      Kan Liang authored
      If any member in a group has a different cpu mask than the other
      members, the current perf stat disables the group. When the perf metrics
      topdown events are part of the group, the <not supported> error below
      is triggered.
      
        $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
        WARNING: grouped events cpus do not match, disabling group:
          anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }
      
         Performance counter stats for 'system wide':
      
               141,465,174      slots
           <not supported>      topdown-retiring
             1,605,330,334      uncore_imc_free_running_0/dclk/
      
      The perf metrics topdown events must always be grouped with a slots
      event as leader.
      
      Factor out evsel__remove_from_group() to only remove the regular events
      from the group.
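The idea of removing only the regular events from a group can be sketched as below. It is a simplified model, not the perf implementation: `struct demo_evsel`, its `must_stay_grouped` flag, and `demo_break_group()` are hypothetical, and groups are modelled as leader indices into a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_evsel {
    const char *name;
    bool must_stay_grouped; /* e.g. a topdown event that needs its slots leader */
    size_t leader;          /* index of this event's group leader in the array */
};

/*
 * Break up the group led by evsels[leader_idx]. Regular members become
 * their own leaders; members flagged must_stay_grouped keep pointing at
 * the original leader. Returns the number of events removed from the group.
 */
size_t demo_break_group(struct demo_evsel *evsels, size_t n, size_t leader_idx)
{
    size_t removed = 0;

    assert(evsels != NULL && leader_idx < n);
    for (size_t i = 0; i < n; i++) {
        if (i == leader_idx || evsels[i].leader != leader_idx)
            continue;           /* the leader itself, or not in this group */
        if (evsels[i].must_stay_grouped)
            continue;           /* keep topdown-style members attached */
        evsels[i].leader = i;   /* regular event: detach from the group */
        removed++;
    }
    return removed;
}
```

For a group of slots, topdown-retiring and an uncore event, only the uncore event is detached, so topdown-retiring still has slots as its leader.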
      
      Remove evsel__must_be_in_group(), since nothing uses it anymore.
      
      With the patch, the topdown events are no longer split off from the
      group.
      
        $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1
        WARNING: grouped events cpus do not match, disabling group:
          anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ }
      
         Performance counter stats for 'system wide':
      
               346,110,588      slots
               124,608,256      topdown-retiring
             1,606,869,976      uncore_imc_free_running_0/dclk/
      
               1.003877592 seconds time elapsed
      
      Fixes: a9a17902 ("perf stat: Ensure group is defined on top of the same cpu mask")
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220518143900.1493980-3-kan.liang@linux.intel.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  11. 17 May, 2022 1 commit
    • perf evlist: Keep topdown counters in weak group · d98079c0
      Ian Rogers authored
      
      On Intel Icelake, topdown events must always be grouped with a slots
      event as leader. When a metric is parsed, a weak group is formed and
      retried if perf_event_open fails. The retried events aren't grouped,
      breaking the slots leader requirement. This change modifies the weak
      group "reset" behavior so that topdown events aren't broken from the
      group for the retry.
      
      Before:
      
        $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
      
         Performance counter stats for 'system wide':
      
          47,867,188,483      slots                                                         (92.27%)
         <not supported>      topdown-bad-spec
         <not supported>      topdown-be-bound
         <not supported>      topdown-fe-bound
         <not supported>      topdown-retiring
           2,173,346,937      branch-instructions                                           (92.27%)
              10,540,253      branch-misses             #    0.48% of all branches          (92.29%)
              96,291,140      bus-cycles                                                    (92.29%)
               6,214,202      cache-misses              #   20.120 % of all cache refs      (92.29%)
              30,886,082      cache-references                                              (76.91%)
          11,773,726,641      cpu-cycles                                                    (84.62%)
          11,807,585,307      instructions              #    1.00  insn per cycle           (92.31%)
                       0      mem-loads                                                     (92.32%)
           2,212,928,573      mem-stores                                                    (84.69%)
          10,024,403,118      ref-cycles                                                    (92.35%)
              16,232,978      baclears.any                                                  (92.35%)
              23,832,633      ARITH.DIVIDER_ACTIVE                                          (84.59%)
      
             0.981070734 seconds time elapsed
      
      After:
      
        $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1
      
         Performance counter stats for 'system wide':
      
             31040189283      slots                                                         (92.27%)
              8997514811      topdown-bad-spec          #     28.2% bad speculation         (92.27%)
             10997536028      topdown-be-bound          #     34.5% backend bound           (92.27%)
              4778060526      topdown-fe-bound          #     15.0% frontend bound          (92.27%)
              7086628768      topdown-retiring          #     22.2% retiring                (92.27%)
              1417611942      branch-instructions                                           (92.26%)
                 5285529      branch-misses             #    0.37% of all branches          (92.28%)
                62922469      bus-cycles                                                    (92.29%)
                 1440708      cache-misses              #    8.292 % of all cache refs      (92.30%)
                17374098      cache-references                                              (76.94%)
              8040889520      cpu-cycles                                                    (84.63%)
              7709992319      instructions              #    0.96  insn per cycle           (92.32%)
                       0      mem-loads                                                     (92.32%)
              1515669558      mem-stores                                                    (84.68%)
              6542411177      ref-cycles                                                    (92.35%)
                 4154149      baclears.any                                                  (92.35%)
                20556152      ARITH.DIVIDER_ACTIVE                                          (84.59%)
      
             1.010799593 seconds time elapsed
      
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  12. 09 May, 2022 2 commits
    • perf evsel: Add tool event helpers · 79932d16
      Ian Rogers authored
      Add helpers to convert a tool event to and from a string. Fix
      evsel__tool_name() as the array is off-by-1.  Support more than just
      duration_time as a metric-id.
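A pair of to-string and from-string helpers over an enum-indexed table might look like the sketch below. It does not reproduce the original bug or the actual fix. The `demo_*` names are hypothetical, and the event names other than duration_time are assumptions for the example. Indexing the table by the enum value itself, with a reserved slot for the "none" value, is one way to keep the enum and the array from drifting apart by one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_tool_event {
    DEMO_TOOL_NONE = 0,
    DEMO_TOOL_DURATION_TIME,
    DEMO_TOOL_USER_TIME,
    DEMO_TOOL_SYSTEM_TIME,
    DEMO_TOOL_MAX,
};

/* Indexed directly by the enum value, so slot 0 (NONE) is left NULL. */
static const char *const demo_tool_names[DEMO_TOOL_MAX] = {
    [DEMO_TOOL_DURATION_TIME] = "duration_time",
    [DEMO_TOOL_USER_TIME]     = "user_time",
    [DEMO_TOOL_SYSTEM_TIME]   = "system_time",
};

/* Enum to string; returns NULL for NONE or out-of-range values. */
const char *demo_tool_event_to_str(enum demo_tool_event ev)
{
    if (ev <= DEMO_TOOL_NONE || ev >= DEMO_TOOL_MAX)
        return NULL;
    return demo_tool_names[ev];
}

/* String to enum; returns DEMO_TOOL_NONE when the name is unknown. */
enum demo_tool_event demo_tool_event_from_str(const char *str)
{
    assert(str != NULL);
    for (int ev = DEMO_TOOL_NONE + 1; ev < DEMO_TOOL_MAX; ev++) {
        if (strcmp(str, demo_tool_names[ev]) == 0)
            return (enum demo_tool_event)ev;
    }
    return DEMO_TOOL_NONE;
}
```

Both helpers share one table, so adding a new tool event means adding one enum value and one designated initializer.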
      
      Fixes: 75eafc97 ("perf list: Print all available tool events")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220507053410.3798748-4-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Constify a few arrays · 545a96c9
      Ian Rogers authored
      
      Remove public definition of evsel__tool_names(). Not used outside
      util/evsel.c.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220507053410.3798748-3-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  13. 20 Apr, 2022 1 commit
  14. 26 Mar, 2022 1 commit
    • perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages · ab0809af
      Kim Phillips authored
      
      Improve the error message returned on failed perf_event_open() on AMD
      systems when using IBS (Instruction-Based Sampling).
      
      Output of executing 'perf record -e ibs_op// true' as a non-root user
      BEFORE this patch (perf will add the 'u' modifier at the end to exclude
      kernel/hypervisor sampling):
      
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//u).
        /bin/dmesg | grep -i perf may provide additional information.
      
      Output after:
      
        AMD IBS can't exclude kernel events.  Try running at a higher privilege level.
      
      Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:
      
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
        /bin/dmesg | grep -i perf may provide additional information.
      
      Output after:
      
        Error:
        Invalid event (ibs_op//) in per-thread mode, enable system wide with '-a'.
      
      Following the suggestion:
      
        $ sudo perf record -a -e ibs_op// true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.664 MB perf.data (194 samples) ]
        $
      
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: João Martins <joao.m.martins@oracle.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20220322221517.2510440-12-eranian@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  15. 22 Mar, 2022 1 commit
    • perf evsel: Make evsel__env() always return a valid env · 7b830875
      Kim Phillips authored
      
      It's possible to have an evsel and evsel->evlist populated without
      an evsel->evlist->env, when, e.g., cmd_record is in its error path.
      
      Future patches will add support for evsel__open_strerror to be able
      to customize error messaging based on perf_env__{arch,cpuid}, so
      let's have evsel__env return &perf_env instead of NULL in that case.
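The "never return NULL" accessor pattern described above can be sketched as follows. The structures and names (`demo_env`, `demo_evlist`, `demo_evsel`, `demo_default_env`) are hypothetical simplifications. Only the idea of falling back to a process-wide environment object comes from the message.

```c
#include <assert.h>
#include <stddef.h>

struct demo_env    { const char *arch; };
struct demo_evlist { struct demo_env *env; };
struct demo_evsel  { struct demo_evlist *evlist; };

/* Process-wide fallback, playing the role the message gives to perf_env. */
struct demo_env demo_default_env = { .arch = "unknown" };

/* Return the evlist's env when every link is populated, else the fallback. */
struct demo_env *demo_evsel_env(const struct demo_evsel *evsel)
{
    if (evsel && evsel->evlist && evsel->evlist->env)
        return evsel->evlist->env;
    return &demo_default_env;
}

/* Callers can dereference the result without a NULL check of their own. */
const char *demo_evsel_arch(const struct demo_evsel *evsel)
{
    struct demo_env *env = demo_evsel_env(evsel);

    assert(env != NULL);
    return env->arch;
}
```

Because the accessor always yields a valid pointer, error-path code that formats messages based on the architecture does not need to special-case a missing environment.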
      
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Signed-off-by: Kim Phillips <kim.phillips@amd.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Joao Martins <joao.m.martins@oracle.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211004214114.188477-1-kim.phillips@amd.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  16. 07 Mar, 2022 1 commit
    • perf evsel: Add error message for unsupported branch stack cases · 8f431a28
      James Clark authored
      
      EOPNOTSUPP is a possible return value when branch stacks are requested
      but they aren't enabled in the kernel or hardware. It's also returned if
      they aren't supported on the specific event type. The currently printed
      error message about sampling/overflow-interrupts is not correct in this
      case.
      
      Add a check for branch stacks before sample_period is checked because
      sample_period is also set (to the default value) when using branch
      stacks.
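Why the order of the two checks matters can be shown with a small sketch. `struct demo_attr` and `demo_open_strerror()` are hypothetical and much simpler than the real error handling. The two message strings for the EOPNOTSUPP case are taken from the commit message, and the remaining strings are placeholders.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_attr {
    bool     branch_stack;  /* branch stack sampling was requested */
    uint64_t sample_period; /* non-zero when sampling; also set with branch stacks */
};

static const char demo_msg_branch[] =
    "PMU Hardware or event type doesn't support branch stack sampling.";
static const char demo_msg_sampling[] =
    "PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'";
static const char demo_msg_other[] = "unhandled error";

const char *demo_open_strerror(int err, const struct demo_attr *attr)
{
    assert(attr != NULL);
    if (err != EOPNOTSUPP)
        return demo_msg_other;
    /* Must come first: sample_period is non-zero in the branch stack case too. */
    if (attr->branch_stack)
        return demo_msg_branch;
    if (attr->sample_period != 0)
        return demo_msg_sampling;
    return demo_msg_other;
}
```

If the two checks were swapped, a request with `branch_stack` set would always hit the sampling message, because its `sample_period` carries the default value.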
      
      Before this change (when branch stacks aren't supported):
      
        perf record -j any
        Error:
        cycles: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
      
      After this change:
      
        perf record -j any
        Error:
        cycles: PMU Hardware or event type doesn't support branch stack sampling.
      
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20220307171917.2555829-2-james.clark@arm.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  17. 22 Jan, 2022 2 commits
    • perf evsel: Override attr->sample_period for non-libpfm4 events · 3606c0e1
      German Gomez authored
      A previous patch preventing "attr->sample_period" values from being
      overridden in pfm events changed a related behaviour in arm-spe.
      
      Before said patch:
      
        perf record -c 10000 -e arm_spe_0// -- sleep 1
      
      Would yield an SPE event with period=10000. After the patch, the period
      in "-c 10000" was being ignored because the arm-spe code initializes
      sample_period to a non-zero value.
      
      This patch restores the previous behaviour for non-libpfm4 events.
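The restored behaviour can be sketched as a small decision function. This is a simplified model, not the logic from the patch: `struct demo_evsel`, `demo_apply_user_period()`, and the convention that a user period of 0 means "-c was not given" are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_evsel {
    bool     is_libpfm_event; /* event was parsed by libpfm4 */
    uint64_t sample_period;   /* may be pre-initialized by PMU-specific code */
};

/* user_period == 0 means the user did not pass -c. */
void demo_apply_user_period(struct demo_evsel *evsel, uint64_t user_period)
{
    assert(evsel != NULL);
    if (evsel->is_libpfm_event) {
        /* Keep a period that libpfm4 already chose. */
        if (evsel->sample_period == 0 && user_period != 0)
            evsel->sample_period = user_period;
        return;
    }
    /* Other events: an explicit -c wins over any pre-initialized value. */
    if (user_period != 0)
        evsel->sample_period = user_period;
}
```

In this model an event whose setup code pre-initializes the period to a non-zero value still ends up with 10000 when the user passes `-c 10000`, while a libpfm4 event keeps the period it was parsed with.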
      
      Fixes: ae5dcc8a (“perf record: Prevent override of attr->sample_period for libpfm4 events”)
      Reported-by: Chase Conklin <chase.conklin@arm.com>
      Signed-off-by: German Gomez <german.gomez@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: bpf@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20220118144054.2541-1-german.gomez@arm.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cpumap: Migrate to libperf cpumap api · 44028699
      Ian Rogers authored
      
      Switch from directly accessing the perf_cpu_map to using the appropriate
      libperf API when possible. Using the API simplifies the job of
      refactoring use of perf_cpu_map.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: André Almeida <andrealmeid@collabora.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: http://lore.kernel.org/lkml/20220122045811.3402706-3-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  18. 12 Jan, 2022 7 commits
    • perf cpumap: Give CPUs their own type · 6d18804b
      Ian Rogers authored
      
      A common problem is confusing CPU map indices with the CPU itself.
      Wrapping the CPU in a struct avoids this. The approach is similar to
      atomic_t.
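The benefit of a wrapper type can be seen in a small sketch. `struct demo_cpu`, `struct demo_cpu_map` and the two lookup helpers are hypothetical names for illustration and do not claim to match the libperf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* A CPU number wrapped in a struct, so it is a distinct type from an index. */
struct demo_cpu {
    int cpu;
};

struct demo_cpu_map {
    size_t nr;
    const struct demo_cpu *cpus;
};

/* Takes a map index, returns the CPU stored at that position. */
struct demo_cpu demo_cpu_map_cpu(const struct demo_cpu_map *map, size_t idx)
{
    assert(map != NULL && idx < map->nr);
    return map->cpus[idx];
}

/* Takes a CPU, returns its index in the map, or -1 if it is not present. */
int demo_cpu_map_idx(const struct demo_cpu_map *map, struct demo_cpu cpu)
{
    assert(map != NULL);
    for (size_t i = 0; i < map->nr; i++) {
        if (map->cpus[i].cpu == cpu.cpu)
            return (int)i;
    }
    return -1;
}
```

Passing a `struct demo_cpu` where an index is expected, or a bare integer where a `struct demo_cpu` is expected, is rejected by the compiler, which is the class of mix-up the wrapper is meant to catch.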
      
      Committer notes:
      
      To make it build with BUILD_BPF_SKEL=1 these files needed the
      conversions to 'struct perf_cpu' usage:
      
        tools/perf/util/bpf_counter.c
        tools/perf/util/bpf_counter_cgroup.c
        tools/perf/util/bpf_ftrace.c
      
      Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
      cpu_map__build_map to cpu function".
      
      Additionally these needed to be fixed for the ARM builds to complete:
      
        tools/perf/arch/arm/util/cs-etm.c
        tools/perf/arch/arm64/util/pmu.c
      
      Suggested-by: John Garry <john.garry@huawei.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Rename variable cpu to index · 6f844b1f
      Ian Rogers authored
      
      Make naming less error prone.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-40-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Reduce scope of evsel__ignore_missing_thread · 1fa497d4
      Ian Rogers authored
      
      Move to being static.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-39-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1fa497d4
    • perf evsel: Rename CPU around get_group_fd · 2daa08c4
      Ian Rogers authored
      
      CPU is really a cpu map index; change names to make the code more
      intention-revealing.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-38-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2daa08c4
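The distinction behind this rename can be illustrated with a small sketch. It is not perf's code: `cpu_map`, `fds`, and both helper functions are hypothetical stand-ins, assuming only that a CPU map is an ordered list of CPU numbers and that per-event state is stored by position in that list.

```python
# Hypothetical CPU map: the event is opened only on CPUs 4 and 6.
cpu_map = [4, 6]

# Per-event state is stored by position in the map, not by CPU number.
fds = [f"fd-for-cpu{cpu}" for cpu in cpu_map]


def fd_for_index(cpu_map_idx):
    """Look up state by cpu map index (0 .. len(cpu_map) - 1)."""
    return fds[cpu_map_idx]


def index_of_cpu(cpu):
    """Translate a CPU number into its cpu map index, or -1 if absent."""
    return cpu_map.index(cpu) if cpu in cpu_map else -1


# CPU 6 lives at index 1; using the CPU number 6 directly as an index
# would run past the end of the two-entry table.
print(fd_for_index(index_of_cpu(6)))
```

Naming the parameter after what it really is (an index) makes it harder to pass a raw CPU number where a position in the map is expected.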
    • perf stat: Correct variable name for read counter · da8c94c0
      Ian Rogers authored
      
      Switch from cpu to cpu_map_idx to reduce confusion.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-37-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      da8c94c0
    • perf evsel: Derive CPUs and threads in alloc_counts · 2ca0a371
      Ian Rogers authored
      
      Passing the number of CPUs and threads allows an evsel's counts to be
      mismatched with its cpu map. To avoid this, always derive the counts
      size from the cpu map. Change openat-syscall-all-cpus to set the cpus
      so that this works.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Vineet Singh <vineet.singh@intel.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: zhengjun.xing@intel.com
      Link: https://lore.kernel.org/r/20220105061351.120843-27-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2ca0a371
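A minimal sketch of the idea in this commit, under the assumption that an evsel carries its own CPU and thread maps. The `Evsel` class and this `alloc_counts` are illustrative stand-ins, not perf's structures or signatures.

```python
class Evsel:
    """Hypothetical stand-in for an event selector with its own maps."""

    def __init__(self, cpus, threads):
        self.cpus = cpus        # list of CPU numbers
        self.threads = threads  # list of thread ids
        self.counts = None


def alloc_counts(evsel):
    # The table size is derived from the evsel's own maps, so the caller
    # has no opportunity to pass dimensions that disagree with them.
    evsel.counts = [[0] * len(evsel.threads) for _ in evsel.cpus]


evsel = Evsel(cpus=[0, 1, 2, 3], threads=[1234])
alloc_counts(evsel)
print(len(evsel.counts), len(evsel.counts[0]))
```

The design choice is to remove the redundant size parameters: a value that can always be computed from the object cannot drift out of sync with it.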
    • perf evsel: Improve error message for uncore events · dcffc5eb
      Ian Rogers authored
      
      When a group has multiple events and the leader fails it can yield
      errors like:
      
        $ perf stat -e '{uncore_imc/cas_count_read/},instructions' /bin/true
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_imc/cas_count_read/).
        /bin/dmesg | grep -i perf may provide additional information.
      
      However, when the failing event is not the group leader, <not supported> is given:
      
        $ perf stat -e '{instructions,uncore_imc/cas_count_read/}' /bin/true
        ...
                 1,619,057      instructions
           <not supported> MiB  uncore_imc/cas_count_read/
      
      This is necessary because get_group_fd will fail if the leader fails and
      is the direct result of the check on line 750 of builtin-stat.c in
      stat_handle_error that returns COUNTER_SKIP for the latter case.
      
      This patch improves the error message to:
      
        $ perf stat -e '{uncore_imc/cas_count_read/},instructions' /bin/true
        Error:
        Invalid event (uncore_imc/cas_count_read/) in per-thread mode, enable system wide with '-a'.
      
      v2. Changed the test to use !target__has_cpu as suggested by Namhyung Kim.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20211223183948.3423989-2-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dcffc5eb
  19. 08 Dec, 2021 1 commit
  20. 06 Dec, 2021 1 commit
  21. 18 Nov, 2021 1 commit
  22. 13 Nov, 2021 1 commit
    • perf expr: Add source_count for aggregating events · 9aba0ada
      Ian Rogers authored
      
      Events like uncore_imc/cas_count_read/ on Skylake open multiple events
      and then aggregate in the metric leader. To determine the average value
      per event the number of these events is needed. Add a source_count
      function that returns this value by counting the number of events with
      the given metric leader. For most events the value is 1 but for
      uncore_imc/cas_count_read/ it can yield values like 6.
      
      Add a generic test; the function was also manually tested with a test
      metric that uses it.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul A . Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Wan Jiabing <wanjiabing@vivo.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20211111002109.194172-9-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9aba0ada
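The counting described in this commit can be sketched as follows. This is an illustration only: the `Event` class, the event names, and the convention that a standalone event is its own metric leader are assumptions made for the example, not perf's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare events by identity, not by field values
class Event:
    name: str
    metric_leader: Optional["Event"] = None


def source_count(leader, events):
    """Count the events whose metric leader is the given event."""
    return sum(1 for e in events if e.metric_leader is leader)


# Six hypothetical memory-controller events aggregated into the first one.
imc = [Event(f"uncore_imc_{i}/cas_count_read/") for i in range(6)]
for e in imc:
    e.metric_leader = imc[0]

# A plain event, assumed here to lead only itself.
instructions = Event("instructions")
instructions.metric_leader = instructions

events = imc + [instructions]
print(source_count(imc[0], events), source_count(instructions, events))
```

With the count available, a metric can divide the aggregated value held by the leader by `source_count` to obtain the average per underlying event.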
  23. 07 Nov, 2021 1 commit
    • perf evsel: Don't set exclude_guest by default · eb39bf32
      Ravi Bangoria authored
      
      The perf tool sets exclude_guest by default when calling
      perf_event_open(). Because IBS does not have filtering capability, such
      an event is always rejected by the IBS PMU driver, and perf falls back
      to non-precise sampling. Fix it by not setting exclude_guest by default
      on AMD.
      
      Before:
        $ sudo ./perf record -C 0 -vvv true |& grep precise
          precise_ip                       3
        decreasing precise_ip by one (2)
          precise_ip                       2
        decreasing precise_ip by one (1)
          precise_ip                       1
        decreasing precise_ip by one (0)
      
      After:
        $ sudo ./perf record -C 0 -vvv true |& grep precise
          precise_ip                       3
        decreasing precise_ip by one (2)
          precise_ip                       2
      
      Committer notes:
      
      Fixup init to zero for perf_env in older compilers:
      
        arch/x86/util/evsel.c:15:26: error: missing field 'os_release' initializer [-Werror,-Wmissing-field-initializers]
                struct perf_env env = {0};
                                        ^
      
      Committer notes:
      
      Namhyung remarked:
      
        It'd be nice if it can cover explicit "-e cycles:pp" as well.
      
      Ravi clarified:
      
        For explicit :pp modifier, evsel->precise_max does not get set and thus perf
        does not try with different attr->precise_ip values while exclude_guest set.
        So no issue with explicit :pp:
      
          $ sudo ./perf record -C 0 -e cycles:pp -vvv |& grep "precise_ip\|exclude_guest"
            precise_ip                       2
            exclude_guest                    1
            precise_ip                       2
            exclude_guest                    1
          switching off exclude_guest, exclude_host
            precise_ip                       2
          ^C
      
        Also, with :P modifier, evsel->precise_max gets set but exclude_guest does
        not and thus :P also works fine:
      
          $ sudo ./perf record -C 0 -e cycles:P -vvv |& grep "precise_ip\|exclude_guest"
            precise_ip                       3
          decreasing precise_ip by one (2)
            precise_ip                       2
          ^C
      
      Reported-by: Kim Phillips <kim.phillips@amd.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20211103072112.32312-1-ravi.bangoria@amd.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eb39bf32
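The "decreasing precise_ip by one" lines in the logs above suggest a step-down retry loop. A rough sketch of that behaviour is given below; `open_with_fallback` and the `try_open` callback are hypothetical names for illustration, and the acceptance rules in the two example drivers are assumptions chosen to mirror the Before and After traces.

```python
def open_with_fallback(precise_max, try_open):
    """Try precise_ip from precise_max down to 0.

    Returns the first level that try_open accepts, or None if even
    precise_ip == 0 is rejected.
    """
    precise_ip = precise_max
    while True:
        if try_open(precise_ip):
            return precise_ip
        if precise_ip == 0:
            return None
        precise_ip -= 1  # "decreasing precise_ip by one"


# Assumed "Before" behaviour: every precise request is rejected, so the
# loop ends at non-precise sampling (precise_ip == 0).
before = open_with_fallback(3, lambda level: level == 0)

# Assumed "After" behaviour: precise requests up to level 2 are accepted.
after = open_with_fallback(3, lambda level: level <= 2)

print(before, after)
```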
  24. 06 Nov, 2021 1 commit
    • perf evsel: Fix missing exclude_{host,guest} setting · 3500eeeb
      Namhyung Kim authored
      
      The current logic for the perf missing feature has a bug that it can
      wrongly clear some modifiers like G or H.  Actually some PMUs don't
      support any filtering or exclusion while others do.  But we check it as
      a global feature.
      
      For example, the cycles event can have 'G' modifier to enable it only in
      the guest mode on x86.  When you don't run any VMs it'll return 0.
      
        # perf stat -a -e cycles:G sleep 1
      
          Performance counter stats for 'system wide':
      
                          0      cycles:G
      
                1.000721670 seconds time elapsed
      
      But when it's used with other pmu events that don't support G modifier,
      it'll be reset and return non-zero values.
      
        # perf stat -a -e cycles:G,msr/tsc/ sleep 1
      
          Performance counter stats for 'system wide':
      
                538,029,960      cycles:G
             16,924,010,738      msr/tsc/
      
                1.001815327 seconds time elapsed
      
      This is because of the missing feature detection logic being global.
      Add a hashmap to set pmu-specific exclude_host/guest features.
      
      Committer notes:
      
      Fix 'perf test python' by adding a stub for evsel__find_pmu() in
      tools/perf/util/python.c, document that it is used so far only for the
      above reasons so that if anybody needs this in the python binding
      usecases, we can revisit this.
      
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Link: http://lore.kernel.org/lkml/20211105205847.120950-1-namhyung@kernel.org
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3500eeeb
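The move from a global flag to per-PMU state can be sketched like this. The dictionary, the `configure` helper, and the event layout are all hypothetical; the sketch only assumes that the G modifier corresponds to exclude_host being set and that the msr PMU does not support exclusion, as the example in the commit message implies.

```python
# PMU name -> True if that PMU was found not to support
# exclude_host/exclude_guest. Keyed per PMU instead of one global flag.
missing_exclude = {}


def configure(event):
    """Clear the exclusion bits only for PMUs known not to support them.

    `event` is a dict with the hypothetical keys 'pmu', 'exclude_host'
    and 'exclude_guest'.
    """
    if missing_exclude.get(event["pmu"], False):
        event["exclude_host"] = False
        event["exclude_guest"] = False
    return event


# Suppose opening an msr event revealed that this PMU lacks exclusion.
missing_exclude["msr"] = True

# cycles:G (guest only) keeps its modifier because its PMU is unaffected.
cycles_g = configure({"pmu": "cpu", "exclude_host": True, "exclude_guest": False})
tsc = configure({"pmu": "msr", "exclude_host": True, "exclude_guest": False})

print(cycles_g["exclude_host"], tsc["exclude_host"])
```

With a single global flag, discovering the limitation on one PMU would have cleared the bits on `cycles_g` as well, which is the non-zero `cycles:G` count shown above.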
  25. 28 Oct, 2021 1 commit
  26. 20 Oct, 2021 1 commit
    • perf parse-events: Add new "metric-id" term · 2b62b3a6
      Ian Rogers authored
      
      Add a new "metric-id" term to events so that metric parsing can set an
      ID that can be reliably looked up.
      
      Metric parsing currently will turn a metric like "instructions/cycles"
      into a parse events string of "{instructions,cycles}:W".
      
      However, parse-events may change "instructions" into "instructions:u" if
      perf_event_paranoid=2.
      
      When this happens expr__resolve_id currently fails as stat-shadow adds
      the ID "instructions:u" to match with the counter value and the metric
      tries to look up the ID just "instructions".
      
      A later patch will use the new term.
      
      An example of the current problem:
      
        $ echo -1 > /proc/sys/kernel/perf_event_paranoid
        $ perf stat -M IPC /bin/true
         Performance counter stats for '/bin/true':
      
                 1,217,161      inst_retired.any          #     0.97 IPC
                 1,250,389      cpu_clk_unhalted.thread
      
               0.002064773 seconds time elapsed
      
               0.002378000 seconds user
               0.000000000 seconds sys
      
        $ echo 2 > /proc/sys/kernel/perf_event_paranoid
        $ perf stat -M IPC /bin/true
         Performance counter stats for '/bin/true':
      
                   150,298      inst_retired.any:u        #      nan IPC
                   187,095      cpu_clk_unhalted.thread:u
      
               0.002042731 seconds time elapsed
      
               0.000000000 seconds user
               0.002377000 seconds sys
      
      Note: nan IPC is printed as an effect of "perf metric: Use NAN for
      missing event IDs." but earlier versions of perf just fail with a parse
      error and display no value.
      
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrew Kilroy <andrew.kilroy@arm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Changbin Du <changbin.du@intel.com>
      Cc: Denys Zagorui <dzagorui@cisco.com>
      Cc: Fabian Hemmer <copy@copy.sh>
      Cc: Felix Fietkau <nbd@nbd.name>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jacob Keller <jacob.e.keller@intel.com>
      Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kees Kook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nicholas Fraser <nfraser@codeweavers.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Sami Tolvanen <samitolvanen@google.com>
      Cc: ShihCheng Tu <mrtoastcheng@gmail.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Wan Jiabing <wanjiabing@vivo.com>
      Cc: Zhen Lei <thunder.leizhen@huawei.com>
      Link: https://lore.kernel.org/r/20211015172132.1162559-15-irogers@google.com
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2b62b3a6
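The lookup failure described in this commit, and how a stable ID avoids it, can be sketched as below. The commit only adds the term and leaves its use to a later patch, so the `record` helper, the dictionaries, and the event names here are assumptions made purely for illustration; the counter values are taken from the paranoid=2 run above.

```python
# Keyed by the event name as rewritten by the parser (":u" appended).
by_event_name = {"instructions:u": 150298, "cycles:u": 187095}

# The metric expression still refers to the unmodified name, so a lookup
# by event name misses.
lookup_by_name_works = "instructions" in by_event_name

# Keyed by a stable metric-id that is carried alongside the event and is
# not affected by modifiers added during parsing.
by_metric_id = {}


def record(event_name, metric_id, value):
    # event_name is kept only for display; the lookup key is metric_id.
    by_metric_id[metric_id] = value


record("instructions:u", "instructions", 150298)
record("cycles:u", "cycles", 187095)


def ipc():
    return by_metric_id["instructions"] / by_metric_id["cycles"]


print(lookup_by_name_works, ipc())
```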
  27. 11 Sep, 2021 1 commit