1. 07 Apr, 2023 1 commit
    • perf pmu: Make parser reentrant · 3d88aec0
      Ian Rogers authored
      By default bison uses global state for compatibility with yacc. Make
      the parser reentrant so that it may be used in asynchronous and
      multithreaded situations.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Link: https://lore.kernel.org/r/20230406065224.2553640-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3d88aec0
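Editor's sketch for the commit above: reentrancy means the parser keeps its position in caller-owned state instead of in globals. The block below is an analogy in plain C using POSIX strtok_r(), not the bison-generated perf parser; the function name count_pairs and the sample input are assumptions made for illustration. Two tokenizers run nested, which only works because neither relies on hidden global state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/*
 * Count "key=value" pairs across comma-separated terms. The outer and inner
 * tokenizers each keep their position in a caller-owned save pointer, so they
 * can be interleaved safely. The input string is modified in place.
 */
static int count_pairs(char *spec)
{
	char *outer_state, *inner_state;
	int pairs = 0;

	for (char *term = strtok_r(spec, ",", &outer_state); term;
	     term = strtok_r(NULL, ",", &outer_state)) {
		char *key = strtok_r(term, "=", &inner_state);
		char *val = strtok_r(NULL, "=", &inner_state);

		if (key && val)
			pairs++;
	}
	return pairs;
}
```

With the non-reentrant strtok(), the inner call would overwrite the outer tokenizer's saved position; passing the state explicitly is the same design choice a reentrant parser makes.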
  2. 04 Apr, 2023 39 commits
    • perf map: Add accessor for start and end · e5116f46
      Ian Rogers authored
      Later changes will add reference count checking for struct map; start
      and end are frequently accessed variables. Add an accessor so that the
      reference count check is only necessary in one place.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5116f46
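Editor's sketch for the commit above: routing field reads through accessors leaves one place where a reference count check can later be added. This is a minimal illustration, not perf's actual code. The struct is a simplified stand-in, the accessor names map__start() and map__end() are assumed by analogy with the map__dso() accessor mentioned in the next commit, and map__checked() and the plain int counter are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for perf's struct map; the real struct has more fields. */
struct map {
	uint64_t start;
	uint64_t end;
	int refcnt;	/* hypothetical plain counter, for illustration only */
};

/* Hypothetical single choke point where a reference count check can live. */
static inline const struct map *map__checked(const struct map *map)
{
	assert(map->refcnt > 0);
	return map;
}

static inline uint64_t map__start(const struct map *map)
{
	return map__checked(map)->start;
}

static inline uint64_t map__end(const struct map *map)
{
	return map__checked(map)->end;
}

/* Callers build on the accessors instead of touching the fields directly. */
static inline uint64_t map__size(const struct map *map)
{
	return map__end(map) - map__start(map);
}
```

Callers that previously read map->start directly would call map__start(map) instead, so a later change to how the map is stored or checked touches only the accessors.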
    • perf map: Add accessor for dso · 63df0e4b
      Ian Rogers authored
      Later changes will add reference count checking for struct map, with
      dso being the most frequently accessed variable. Add an accessor so
      that the reference count check is only necessary in one place.
      
      Additional changes:
       - add a dso variable to avoid repeated map__dso calls.
       - in builtin-mem.c dump_raw_samples, code only partially tested for
         dso == NULL. Make the possibility of NULL consistent.
       - in thread.c thread__memcpy fix use of spaces and use tabs.
      
      Committer notes:
      
      Did missing conversions on these files:
      
         tools/perf/arch/powerpc/util/skip-callchain-idx.c
         tools/perf/arch/powerpc/util/sym-handling.c
         tools/perf/ui/browsers/hists.c
         tools/perf/ui/gtk/annotate.c
         tools/perf/util/cs-etm.c
         tools/perf/util/thread.c
         tools/perf/util/unwind-libunwind-local.c
         tools/perf/util/unwind-libunwind.c
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      63df0e4b
    • perf maps: Add functions to access maps · 5ab6d715
      Ian Rogers authored
      Introduce functions to access struct maps. These functions reduce the
      number of places reference counting is necessary. While tidying APIs, do
      some small const-ification, in particular to unwind_libunwind_ops.
      
      Committer notes:
      
      Fixed up tools/perf/util/unwind-libunwind.c:
      
      -               return ops->get_entries(cb, arg, thread, data, max_stack);
      +               return ops->get_entries(cb, arg, thread, data, max_stack, best_effort);
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ab6d715
    • perf maps: Remove rb_node from struct map · ff583dc4
      Ian Rogers authored
      struct map is reference counted; having it also be a node in a
      red-black tree complicates the reference counting. Switch to having a
      map_rb_node which is a red-black tree node but points at the reference
      counted struct map. This reference is responsible for a single reference
      count.
      
      Committer notes:
      
      Fixed up tools/perf/util/unwind-libunwind-local.c to use map_rb_node as
      well.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ff583dc4
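Editor's sketch for the commit above: a separate tree node owns exactly one reference to the map it points at. This is a simplified illustration rather than perf's implementation. The name map_rb_node comes from the commit message, while struct rb_node_stub, the plain int counter and the helper bodies are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for perf's struct map. */
struct map {
	int refcnt;	/* plain counter for illustration only */
};

/* Hypothetical placeholder for the tree linkage. */
struct rb_node_stub {
	struct rb_node_stub *left, *right, *parent;
};

/* The tree node owns exactly one reference to the map it points at. */
struct map_rb_node {
	struct rb_node_stub rb_node;
	struct map *map;
};

static struct map *map__new(void)
{
	struct map *map = calloc(1, sizeof(*map));

	if (map)
		map->refcnt = 1;	/* the creator's reference */
	return map;
}

static struct map *map__get(struct map *map)
{
	if (map)
		map->refcnt++;
	return map;
}

/* Drop one reference; returns 1 if the map was freed (for the demo only). */
static int map__put(struct map *map)
{
	if (map && --map->refcnt == 0) {
		free(map);
		return 1;
	}
	return 0;
}

static struct map_rb_node *map_rb_node__new(struct map *map)
{
	struct map_rb_node *node = calloc(1, sizeof(*node));

	if (node)
		node->map = map__get(map);	/* the node's single reference */
	return node;
}

static void map_rb_node__delete(struct map_rb_node *node)
{
	if (!node)
		return;
	map__put(node->map);	/* give the node's reference back */
	free(node);
}
```

Because the node is a separate allocation, removing a map from the tree is just deleting the node, and the map's lifetime is governed only by its count.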
    • perf map: Move map list node into symbol · 83720209
      Ian Rogers authored
      Using a perf map as a list node is only done in symbol. Move the
      list_node struct into symbol as a single pointer to the map. This makes
      reference count behavior more obvious and easy to check.
      
      Committer notes:
      
      Some changes to reduce the number of lines touched by keeping, for
      instance, the 'new_map' variable and setting it to new_node->map, so
      that we keep more of the project history in place and keep as much
      as possible the value of the 'git blame' tool.
      
      Also use map__zput() when putting struct members, so that after we free
      the container struct, use-after-free errors can sometimes show up as
      NULL pointer derefs.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miaoqian Lin <linmq006@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
      Cc: Song Liu <song@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Yury Norov <yury.norov@gmail.com>
      Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      83720209
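Editor's sketch of the "zput" idea in the committer notes above: drop the reference and clear the owner's pointer in one step, so a stale access dereferences NULL instead of freed memory. This illustrates the pattern only. map__zput_sketch() is a hypothetical function, not perf's map__zput() definition, and the structs are simplified stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for perf's struct map. */
struct map {
	int refcnt;	/* plain counter for illustration only */
};

/* Hypothetical container that holds one reference to a map. */
struct container {
	struct map *map;
};

static void map__put(struct map *map)
{
	if (map && --map->refcnt == 0)
		free(map);
}

/* Put the reference and NULL the member so later use fails loudly. */
static void map__zput_sketch(struct map **pmap)
{
	map__put(*pmap);
	*pmap = NULL;
}
```

A caller would write map__zput_sketch(&c->map) instead of map__put(c->map), trading one extra store for a much easier-to-diagnose crash if the member is touched afterwards.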
    • perf jit: Fix a few memory leaks · dc67c783
      Ian Rogers authored
      As reported by leak sanitizer.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Brian Robbins <brianrob@linux.microsoft.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Yuan Can <yuancan@huawei.com>
      Link: https://lore.kernel.org/r/20230403203545.1872196-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dc67c783
    • perf build: Allow C++ demangle without libelf · 3ad45105
      Ian Rogers authored
      The cxa demangle support isn't dependent on libelf and so we no longer
      need to disable demangling if libelf isn't present.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230403211021.1892231-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3ad45105
    • perf srcline: Avoid addr2line SIGPIPEs · 75a616c6
      Ian Rogers authored
      Ignore SIGPIPEs when addr2line is configured.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      75a616c6
    • perf srcline: Support for llvm-addr2line · 2c4b9280
      Ian Rogers authored
      The sentinel value differs for llvm-addr2line. Configure this once and
      then detect when reading records.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2c4b9280
    • perf srcline: Simplify addr2line subprocess · b3801e79
      Ian Rogers authored
      Don't wrap stdin and stdout of subprocess with streams, use the api/io
      library for buffering.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b3801e79
    • tools api: Add io__getline · c9dc580c
      Ian Rogers authored
      Reads a line to allocated memory up to a newline following the getline
      API.
      
      Committer notes:
      
      It also adds this new function to the 'api io' 'perf test' entry:
      
        $ perf test "api io"
         64: Test api io                                                     : Ok
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230403184033.1836023-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c9dc580c
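Editor's sketch for the commit above: the message says io__getline follows the getline API, so the block below shows the calling convention of POSIX getline(3) itself. It does not call io__getline, whose exact signature is not given here, and count_lines() is a hypothetical helper. The key points are that the callee allocates and grows the buffer, the caller reuses it across calls, and it is freed once at the end.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Count the lines in a stream using getline(3) and record the longest line
 * length (including the trailing newline). Returns -1 on a read error.
 */
static int count_lines(FILE *fp, size_t *longest)
{
	char *line = NULL;	/* getline() allocates and grows this buffer */
	size_t cap = 0;		/* current capacity of the buffer */
	ssize_t len;
	int n = 0;

	*longest = 0;
	while ((len = getline(&line, &cap, fp)) != -1) {
		if ((size_t)len > *longest)
			*longest = (size_t)len;
		n++;
	}
	free(line);		/* one free for the reused buffer */
	return ferror(fp) ? -1 : n;
}
```

Passing the same line/cap pair on every call lets the buffer be reused, which is the main reason to prefer this convention over fixed-size reads.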
    • perf intel-pt: Use perf_pmu__scan_file_at() if possible · 98b7ce0e
      Namhyung Kim authored
      Intel-PT calls perf_pmu__scan_file() a lot, so let's use relative paths
      when it accesses multiple files in one place.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      98b7ce0e
    • perf pmu: Add perf_pmu__{open,scan}_file_at() · 3a69672e
      Namhyung Kim authored
      These two helpers will also use openat() to reduce the overhead with
      relative pathnames.  Convert other functions in pmu_lookup() to use
      the new helpers.
      
      Committer testing:
      
      Before:
      
        ⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan
        # Running 'internals/pmu-scan' benchmark:
        Computing performance of sysfs PMU event scan for 100 times
          Average PMU scanning took: 2729.040 usec (+- 7.117 usec)
        ⬢[acme@toolbox perf-tools-next]$
      
      After:
      
        ⬢[acme@toolbox perf-tools-next]$ perf bench internals pmu-scan
        # Running 'internals/pmu-scan' benchmark:
        Computing performance of sysfs PMU event scan for 100 times
          Average PMU scanning took: 2419.870 usec (+- 9.057 usec)
        ⬢[acme@toolbox perf-tools-next]$
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3a69672e
    • perf pmu: Use relative path in setup_pmu_alias_list() · 46378665
      Namhyung Kim authored
      Likewise, x86 needs to traverse the PMU list to build alias.
      Let's use the new helpers to use relative paths.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      46378665
    • perf pmu: Use relative path in perf_pmu__caps_parse() · b39094d3
      Namhyung Kim authored
      Likewise, it needs to traverse the pmu/caps directory, let's use
      openat() with the dirfd instead of open() using the absolute path.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: linux-perf-users@vger.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b39094d3
    • perf pmu: Use relative path for sysfs scan · e293a5e8
      Namhyung Kim authored
      The PMU information is in the kernel sysfs so it needs to scan the
      directory to get the whole information like event aliases, formats and
      so on.  During the traversal, it opens a lot of files and directories
      like below:
      
        dir = opendir("/sys/bus/event_source/devices");
        while (dentry = readdir(dir)) {
          char buf[PATH_MAX];
      
          snprintf(buf, sizeof(buf), "%s/%s",
                   "/sys/bus/event_source/devices", dentry->d_name);
          fd = open(buf, O_RDONLY);
          ...
        }
      
      But this is not good since it needs to copy the string to build the
      absolute pathname, and it repeats the pathname walk (from /sys)
      unnecessarily.  We can use openat(2) to open the file in the given
      directory.  While it's not usually a problem, it can be one when
      the kernel has contention on sysfs.
      
      Add a couple of new helpers that return the file descriptor of the PMU
      directory so that it can be used with relative paths.
      
       * perf_pmu__event_source_devices_fd()
         - returns a fd for the PMU root ("/sys/bus/event_source/devices")
      
       * perf_pmu__pathname_fd()
         - returns a fd for "<pmu>/<file>" under the PMU root
      
      Now the above code can be converted to something like below:
      
        dirfd = perf_pmu__event_source_devices_fd();
        dir = fdopendir(dirfd);
        while (dentry = readdir(dir)) {
          fd = openat(dirfd, dentry->d_name, O_RDONLY);
          ...
        }
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e293a5e8
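Editor's sketch for the commit above: a self-contained version of the directory-relative pattern described in the message, using only POSIX calls (open, fdopendir, readdir, openat). It takes an arbitrary directory path instead of the perf_pmu__event_source_devices_fd() helper, and count_openable_entries() is a hypothetical name, so treat it as an illustration of the technique rather than the perf code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Count the entries in `path` that can be opened read-only, resolving each
 * name relative to the directory fd instead of building an absolute path.
 * Returns -1 if the directory itself cannot be opened.
 */
static int count_openable_entries(const char *path)
{
	int dfd = open(path, O_RDONLY | O_DIRECTORY);
	struct dirent *dentry;
	int count = 0;
	DIR *dir;

	if (dfd < 0)
		return -1;

	dir = fdopendir(dfd);	/* on success, dir takes ownership of dfd */
	if (!dir) {
		close(dfd);
		return -1;
	}

	while ((dentry = readdir(dir)) != NULL) {
		int fd;

		if (!strcmp(dentry->d_name, ".") || !strcmp(dentry->d_name, ".."))
			continue;

		/* No snprintf() of a full path and no walk from the root. */
		fd = openat(dfd, dentry->d_name, O_RDONLY);
		if (fd < 0)
			continue;
		close(fd);
		count++;
	}

	closedir(dir);	/* also closes dfd */
	return count;
}
```

The directory descriptor stays valid until closedir(), so it can serve both as the iteration handle and as the base for every openat() call.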
    • perf bench: Add pmu-scan benchmark · f6a7bbbf
      Namhyung Kim authored
      The pmu-scan benchmark will repeatedly scan the sysfs to get the
      available PMU information.
      
        $ ./perf bench internals pmu-scan
        # Running 'internals/pmu-scan' benchmark:
        Computing performance of sysfs PMU event scan for 100 times
          Average PMU scanning took: 6850.990 usec (+- 48.445 usec)
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      f6a7bbbf
    • perf pmu: Add perf_pmu__destroy() function · eec11310
      Namhyung Kim authored
      It seems there's no function to delete the perf pmu struct.  Add the
      perf_pmu__destroy() to do the job.  While at it, add some more helper
      functions to delete pmu aliases and caps.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eec11310
    • perf tools: Fix a asan issue in parse_events_multi_pmu_add() · 66c9598b
      Namhyung Kim authored
      In the parse_events_multi_pmu_add() it passes the 'config' variable
      twice to parse_events_term__num() - one for config and another for
      loc_term.  I'm not sure about the second one as it's converted to
      YYLTYPE variable.  Asan reports it like below:
      
        In function ‘parse_events_term__num’,
            inlined from ‘parse_events_multi_pmu_add’ at util/parse-events.c:1602:6:
        util/parse-events.c:2653:64: error: array subscript ‘YYLTYPE[0]’ is partly outside
                                            array bounds of ‘char[8]’ [-Werror=array-bounds]
         2653 |                 .err_term  = loc_term ? loc_term->first_column : 0,
              |                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~
        util/parse-events.c: In function ‘parse_events_multi_pmu_add’:
        util/parse-events.c:1587:15: note: object ‘config’ of size 8
         1587 |         char *config;
              |               ^~~~~~
        cc1: all warnings being treated as errors
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      66c9598b
    • perf list: Use relative path for tracepoint scan · 00462d8e
      Namhyung Kim authored
      Committer notes:
      
      Added missing #include <unistd.h> for the close() prototype to fix this
      on Alma Linux 8:
      
         1    21.54 almalinux:8                   : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC)
          util/print-events.c: In function 'print_tracepoint_events':
          util/print-events.c:103:4: error: implicit declaration of function 'close'; did you mean 'clone'? [-Werror=implicit-function-declaration]
              close(evt_fd);
              ^~~~~
              clone
      
      Also use the newly added scandirat feature test to check if that
      function is available, providing a HAVE_SCANDIRAT_SUPPORT conditional
      warning to the user if it isn't available.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230331202949.810326-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      00462d8e
    • tools build: Add a feature test for scandirat(), that is not implemented so far in musl and uclibc · 9e03608e
      Arnaldo Carvalho de Melo authored
      We use it just when listing tracepoint events, and for root, so just
      emit a warning about it to get users to ask the library maintainers to
      implement it, as suggested in this systemd ticket:
      
       https://github.com/systemd/casync/issues/129
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/ZCwv4z5Dh%2FdHUMG6@kernel.org/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9e03608e
    • Adrian Hunter's avatar
      perf intel-pt: Fix CYC timestamps after standalone CBR · 430635a0
      Adrian Hunter authored
      After a standalone CBR (not associated with TSC), update the cycles
      reference timestamp and reset the cycle count, so that CYC timestamps
      are calculated relative to that point with the new frequency.
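The idea can be sketched as follows. This is a simplified model under stated assumptions, not the decoder's actual code: the struct and function names are hypothetical, and the frequency is modelled as a plain "timestamp ticks per cycle" factor.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoder state; field names are illustrative only. */
struct cyc_state {
	uint64_t ref_timestamp;	/* timestamp at which cycle_cnt was last reset */
	uint64_t cycle_cnt;	/* cycles counted since ref_timestamp */
	double cyc_to_tsc;	/* timestamp ticks per cycle at the current frequency */
};

/* Timestamp estimated from the reference point plus the cycles seen since. */
uint64_t cyc_timestamp(const struct cyc_state *s)
{
	assert(s);
	return s->ref_timestamp + (uint64_t)(s->cycle_cnt * s->cyc_to_tsc);
}

/* A CYC packet adds to the running cycle count. */
void on_cyc(struct cyc_state *s, uint64_t cycles)
{
	assert(s);
	s->cycle_cnt += cycles;
}

/*
 * A standalone CBR changes the frequency: bank the time elapsed at the old
 * frequency into the reference timestamp, reset the cycle count, and use
 * the new factor from here on.
 */
void on_standalone_cbr(struct cyc_state *s, double new_cyc_to_tsc)
{
	assert(s);
	s->ref_timestamp = cyc_timestamp(s);
	s->cycle_cnt = 0;
	s->cyc_to_tsc = new_cyc_to_tsc;
}
```

Without the reset, cycles counted before the frequency change would be rescaled with the new factor, skewing later timestamps.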
      
      Fixes: cc336186 ("perf tools: Add Intel PT support for decoding CYC packets")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      430635a0
    • Adrian Hunter's avatar
      perf auxtrace: Fix address filter entire kernel size · 1f9f33cc
      Adrian Hunter authored
      kallsyms is not completely in address order.
      
      In find_entire_kern_cb(), calculate the kernel end from the maximum
      address not the last symbol.
      
      Example:
      
       Before:
      
          $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1
          ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530    [bpf]
          $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1
          ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm]
          $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter
          Address filter: filter 0xffffffff93200000/0x2ceba000
      
       After:
      
          $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter
          Address filter: filter 0xffffffff93200000/0x2e3e2000
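A minimal sketch of the "maximum address, not last symbol" idea, assuming a hypothetical struct sym and helper name rather than the real find_entire_kern_cb() callback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; not perf's data structure. */
struct sym {
	uint64_t start;
	uint64_t size;
};

/*
 * Return the highest end address over all symbols. The list may be in any
 * order, so the last entry is not assumed to be the highest one.
 */
uint64_t kernel_end(const struct sym *syms, size_t n)
{
	uint64_t end = 0;

	assert(syms || n == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t sym_end = syms[i].start + syms[i].size;

		if (sym_end > end)
			end = sym_end;
	}
	return end;
}
```

With an out-of-order list such as { 0x1000, 0x3000, 0x2000 }, taking the last symbol would stop at 0x2000, while the maximum covers the 0x3000 entry as well.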
      
      Fixes: 1b36c03e ("perf record: Add support for using symbols in address filters")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1f9f33cc
    • Rob Herring's avatar
      perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store · 34fb6040
      Rob Herring authored
      Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging
      Extension (MTE) and memory operations (MOPS). The memory operations
      are memcpy and memset. Add support for decoding these new subclasses in
      the raw decoding.
      
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20230327162057.4057188-1-robh@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      34fb6040
    • Mike Leach's avatar
      perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packet · b6521ea2
      Mike Leach authored
      When using dynamically assigned CoreSight trace IDs the drivers can output
      the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet.
      
      Update cs-etm decoder to handle this packet by setting the CPU/Trace ID
      mapping.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Mike Leach <mike.leach@linaro.org>
      Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Darren Hart <darren@os.amperecomputing.com>
      Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b6521ea2
    • Mike Leach's avatar
      perf cs-etm: Update record event to use new Trace ID protocol · e5fa5b41
      Mike Leach authored
      Trace IDs are now dynamically allocated.
      
      Previously the static association algorithm 'cpu * 2 + seed' was
      used. It did not scale and was broken for systems with high core
      counts (>46), so it is no longer used.
      
      Trace ID will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record.
      
      Legacy ID algorithm renamed and retained for limited backward
      compatibility use.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Mike Leach <mike.leach@linaro.org>
      Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Darren Hart <darren@os.amperecomputing.com>
      Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5fa5b41
    • Mike Leach's avatar
      perf cs-etm: Move mapping of Trace ID and cpu into helper function · 09277295
      Mike Leach authored
      The information to associate Trace ID and CPU will be changing.
      
      Drivers will start outputting this as a hardware ID packet in the data
      file which if present will be used in preference to the AUXINFO values.
      
      To prepare for this we provide helper functions: one to do the
      individual ID mapping, and one to extract the IDs from the completed
      metadata blocks.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Mike Leach <mike.leach@linaro.org>
      Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Darren Hart <darren@os.amperecomputing.com>
      Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230331055645.26918-2-mike.leach@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      09277295
    • Namhyung Kim's avatar
      perf lock contention: Show detail failure reason for BPF · 84c3a2bb
      Namhyung Kim authored
      It can fail to collect lock stat from BPF for various reasons.  For
      example, I've got a report that sometimes time calculation seems wrong
      in case of contended spinlocks.  I suspect the time delta went negative
      for some reason.
      
      Count them separately and show in the output like below:
      
      $ sudo perf lock contention -abE5 sleep 10
       contended   total wait     max wait     avg wait         type   caller
      
              13    785.61 us     79.36 us     60.43 us     spinlock   remove_wait_queue+0x14
              10    469.02 us     87.51 us     46.90 us     spinlock   prepare_to_wait+0x27
               9    289.09 us     69.08 us     32.12 us     spinlock   finish_wait+0x36
             114    251.05 us      8.56 us      2.20 us     spinlock   try_to_wake_up+0x1f5
             132    188.63 us      5.01 us      1.43 us     spinlock   __wake_up_common_lock+0x62
      
      === output for debug ===
      
      bad: 1, total: 279
      bad rate: 0.36 %
      histogram of failure reasons
             task: 1
            stack: 0
             time: 0
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      84c3a2bb
    • Namhyung Kim's avatar
      perf lock contention: Fix debug stat if no contention · 35bf007e
      Namhyung Kim authored
      It should not divide if the total number is 0.  Otherwise it'd show
      NaN in the bad rate output.  Also add a whitespace in the "output
      for debug" message.
      
        $ sudo perf lock contention -abv true
        Looking at the vmlinux_path (8 entries long)
        symsrc__init: cannot get elf header.
        Using /proc/kcore for kernel data
        Using /proc/kallsyms for symbols
         contended   total wait     max wait     avg wait         type   caller
      
        === output for debug===
      
        bad: 0, total: 0
        bad rate: -nan %     <-------------------------  (here)
        histogram of events caused bad sequence
            acquire: 0
           acquired: 0
          contended: 0
            release: 0
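The guard described above (skip the division when the total is 0) could look roughly like this. The function name is hypothetical, and returning 0 for the empty case is an assumption about the desired output, not a quote of the actual fix.

```c
#include <assert.h>

/*
 * Percentage of bad sequences. With total == 0 the floating-point division
 * 0.0 / 0 would yield NaN, so report 0 instead.
 */
double bad_rate_percent(unsigned int bad, unsigned int total)
{
	assert(bad <= total);

	if (total == 0)
		return 0.0;

	return 100.0 * bad / total;
}
```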
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230327225711.245738-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      35bf007e
    • Ian Rogers's avatar
      perf vendor events intel: Update ivybridge and ivytown · 31959321
      Ian Rogers authored
      Update to versions 24 and 23 respectively. Adds the event
      BR_MISP_EXEC.INDIRECT.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230328234142.1080045-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      31959321
    • Andreas Herrmann's avatar
      perf bench numa: Fix type of loop iterator in do_work, it should be 'long' · 337fa2db
      Andreas Herrmann authored
      'j' is of type int and start/end are of type 'long'. Thus 'j' might become
      negative and cause segfault in access_data(). Fix it by using 'long' for
      'j' as well.
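To illustrate why the iterator type matters, here is a small sketch with a hypothetical helper (not the do_work() code itself). It assumes a platform where 'long' is 64 bits wide, so the bounds can exceed INT_MAX.

```c
#include <assert.h>
#include <limits.h>

/*
 * Count the iterations of a loop over [start, end). The iterator has the
 * same type as the bounds; an 'int' iterator could not represent values
 * above INT_MAX and would overflow before reaching 'end'.
 */
long count_steps(long start, long end)
{
	long j, steps = 0;

	assert(start <= end);

	for (j = start; j < end; j++)
		steps++;

	return steps;
}
```

For example, count_steps((long)INT_MAX - 2, (long)INT_MAX + 3) walks across the int boundary without the index going negative.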
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Andreas Herrmann <aherrmann@suse.de>
      Link: https://lore.kernel.org/r/20230330074202.14052-1-aherrmann@suse.de
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      337fa2db
    • Adrian Hunter's avatar
      perf symbol: Remove unused branch_callstack · 5a892c3d
      Adrian Hunter authored
      branch_callstack was added by commit 8b7bad58 ("perf callchain: Support
      handling complete branch stacks as histograms") but never used.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5a892c3d
    • Adrian Hunter's avatar
      perf top: Add --branch-history option · 5ef50613
      Adrian Hunter authored
      Add --branch-history option, to act the same as that option does for
      perf report.
      
      Example:
      
        $ cat tcallf.c
        volatile a = 10000, b = 100000, c;
      
        __attribute__((noinline)) f2()
        {
                c = a / b;
        }
      
        __attribute__((noinline)) f1()
        {
                f2();
                f2();
        }
        main()
        {
                while (1)
                        f1();
        }
        $ gcc -w -g -o tcallf tcallf.c
        $ ./tcallf &
        [1] 29409
        $ perf top -e cycles:u  -t $(pidof tcallf) --stdio --no-children --branch-history
           PerfTop:    3819 irqs/sec  kernel: 0.0%  exact:  0.0% lost: 0/0 drop: 0/0 [4000Hz cycles:u],  (target_tid: 29409)
        --------------------------------------------------------------------------------------------------------------------
      
            49.01%  tcallf.c:5   [.] f2    tcallf
                    |
                    |--24.91%--f2 tcallf.c:4
                    |          |
                    |          |--17.14%--f1 tcallf.c:11 (cycles:1)
                    |          |          f1 tcallf.c:11
                    |          |          f2 tcallf.c:6 (cycles:3)
                    |          |          f2 tcallf.c:4
                    |          |          f1 tcallf.c:10 (cycles:2)
                    |          |          f1 tcallf.c:9
                    |          |          main tcallf.c:16 (cycles:1)
                    |          |          main tcallf.c:16
                    |          |          main tcallf.c:16 (cycles:1)
                    |          |          main tcallf.c:16
                    |          |          f1 tcallf.c:12 (cycles:1)
                    |          |          f1 tcallf.c:12
                    |          |          f2 tcallf.c:6 (cycles:3)
                    |          |          f2 tcallf.c:4
                    |          |          f1 tcallf.c:11 (cycles:1 iter:1 avg_cycles:12)
                    |          |          f1 tcallf.c:11
                    |          |          f2 tcallf.c:6 (cycles:3 iter:1 avg_cycles:12)
                    |          |          f2 tcallf.c:4
                    |          |          f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12)
                    |          |
                    |           --7.78%--f1 tcallf.c:10 (cycles:2)
                    |                     f1 tcallf.c:9
                    |                     main tcallf.c:16 (cycles:1)
                    |                     main tcallf.c:16
                    |                     main tcallf.c:16 (cycles:1)
                    |                     main tcallf.c:16
                    |                     f1 tcallf.c:12 (cycles:1)
                    |                     f1 tcallf.c:12
                    |                     f2 tcallf.c:6 (cycles:3)
                    |                     f2 tcallf.c:4
                    |                     f1 tcallf.c:11 (cycles:1)
                    |                     f1 tcallf.c:11
                    |                     f2 tcallf.c:6 (cycles:3)
                    |                     f2 tcallf.c:4
                    |                     f1 tcallf.c:10 (cycles:2 iter:1 avg_cycles:12)
                    |                     f1 tcallf.c:9
                    |                     main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12)
                    |                     main tcallf.c:16
                    |                     main tcallf.c:16 (cycles:1 iter:1 avg_cycles:12)
        ...
      
        $ pkill tcallf
        [1]+  Terminated              ./tcallf
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20230330131833.12864-2-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5ef50613
    • Ian Rogers's avatar
      perf build: Conditionally define NDEBUG · 616b14b4
      Ian Rogers authored
      When a build is done without DEBUG=1 then define NDEBUG. This will
      compile out asserts and other debug code.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      616b14b4
    • Ian Rogers's avatar
      perf block-range: Move debug code behind ifndef NDEBUG · 984a785f
      Ian Rogers authored
      Make good on a comment and avoid an unused-but-set-variable warning.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      984a785f
    • Ian Rogers's avatar
      perf bench: Avoid NDEBUG warning · d1babea9
      Ian Rogers authored
      With NDEBUG set the asserts are compiled out. This yields
      "unused-but-set-variable" warnings. Move these variables behind
      NDEBUG to avoid the warning.
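The pattern can be sketched as below. The function and variable names are hypothetical and not taken from the perf bench sources; the point is only that a variable read solely by assert() is declared and updated under the same condition that keeps the assert alive.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array; the 'visited' counter exists only for the debug check. */
long sum_array(const int *vals, size_t n)
{
#ifndef NDEBUG
	size_t visited = 0;	/* only read by the assert below */
#endif
	long sum = 0;

	for (size_t i = 0; i < n; i++) {
		sum += vals[i];
#ifndef NDEBUG
		visited++;
#endif
	}

	/* Expands to nothing when NDEBUG is defined. */
	assert(visited == n);
	return sum;
}
```

Built with -DNDEBUG, neither the variable nor the check is compiled, so the compiler has nothing to warn about.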
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Link: https://lore.kernel.org/r/20230330183827.1412303-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d1babea9
    • Ian Rogers's avatar
      perf vendor events: Update Alderlake for E-Core TMA v2.3 · 0372358a
      Ian Rogers authored
      https://github.com/intel/perfmon/pull/65
      Generated by:
      https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py
      
      The PR notes state:
       - E-Core TMA version 2.3.
         - FP_UOPS changed to FPDIV_Uops
         - Added BR_MISP breakdown stats
         - Frontend_Bandwidth/Latency changed to Fetch_Bandwidth/Latency
         - Load_Store_Bound changed to Memory_Bound
         - Icache changed to ICache_Misses
         - ITLB changed to ITLB_Misses
         - Store_Fwd changed to Store_Fwd_Blk
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Edward Baker <edward.baker@intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230329162318.1227114-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0372358a
    • Ian Rogers's avatar
      perf symbol: Add command line support for addr2line path · 57594454
      Ian Rogers authored
      Allow addr2line to be set either on the command line or via the
      perfconfig file. This doesn't currently work with llvm-addr2line as
      the addr2line code emits two things:
      1) the address to decode,
      2) a bogus ',' value.
      The expectation is the bogus value will generate:
      ??
      ??:0
      that terminates the addr2line reading. However, the output from
      llvm-addr2line is a single line with just the input ',' locking up the
      addr2line reading that is expecting a second line.
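The sentinel handshake described above can be modelled on an in-memory buffer. This is a simplified sketch with a hypothetical function name: the real reader blocks on a pipe, whereas here a missing sentinel is reported as -1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count "function" / "file:line" record pairs in addr2line-style output
 * until the sentinel pair "??" / "??:0" is seen. Returns -1 if the output
 * ends before the sentinel, e.g. when only a single line comes back for
 * the bogus ',' input.
 */
int count_records_until_sentinel(const char *out)
{
	const char *p = out;
	int records = 0;

	assert(out);

	while (*p) {
		const char *nl1 = strchr(p, '\n');
		const char *q, *nl2;

		if (!nl1)
			return -1;	/* incomplete first line */
		q = nl1 + 1;
		nl2 = strchr(q, '\n');
		if (!nl2)
			return -1;	/* second line never arrives */

		if ((size_t)(nl1 - p) == 2 && !strncmp(p, "??", 2) &&
		    (size_t)(nl2 - q) == 4 && !strncmp(q, "??:0", 4))
			return records;	/* sentinel terminates the reply */

		records++;
		p = nl2 + 1;
	}
	return -1;
}
```

A reply such as "f1\ntcallf.c:11\n??\n??:0\n" yields one record, while a lone ",\n" never produces the second line the reader is waiting for.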
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andres Freund <andres@anarazel.de>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      57594454
    • Ian Rogers's avatar
      perf annotate: Allow objdump to be set in perfconfig · 0b02b47e
      Ian Rogers authored
      Allow the setting of the objdump command in the perfconfig. Update man
      page for this new option.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andres Freund <andres@anarazel.de>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20230328235543.1082207-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b02b47e