1. 14 Dec, 2022 36 commits
    • perf test: Add ability to test exit code for attr tests · a8f26192
      James Clark authored
      Currently the return value is used to skip the test, but sometimes it
      can be useful to test if a certain command should return a certain exit
      code.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221213114739.2312862-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
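The idea of asserting on a command's exit code, rather than only using it to decide whether to skip, can be sketched as follows. This is a minimal illustration and not the code from the perf attr test harness; the helper name `check_exit_code` is hypothetical.

```python
import subprocess
import sys

def check_exit_code(cmd, expected):
    """Run cmd (a list of arguments) and report whether it exited with `expected`."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == expected

# A child process that deliberately exits with status 3.
failing_cmd = [sys.executable, "-c", "raise SystemExit(3)"]
matches = check_exit_code(failing_cmd, 3)
```

A test can then treat a mismatch as a failure instead of a skip.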
    • perf test: add new task-analyzer tests · e8478b84
      Petar Gligoric authored
      Provide task-analyzer test cases for all possible arguments and a subset of possible
      combinations.
      
      12 Tests in total.
      
      test_basic:
       - cmd:"perf script report task-analyzer"
       - Fundamental test of script without arguments.
       - Check for standard output.
      
      test_ns_rename:
       - cmd:"perf script report task-analyzer --ns --rename-comms-by-tids 0:random"
       - Standard task with timestamps in nanoseconds and comm renamed.
       - Check for standard output.
      
      test_ms_filtertasks_highlight:
       - cmd:"perf script report task-analyzer --ms --filter-tasks perf --highlight-tasks perf"
       - Standard task with timestamps in milliseconds, task filtered out and highlighted.
       - Check for standard output.
      
      test_extended_times_timelimit_limittasks:
       - cmd:"perf script report task-analyzer --extended-times --time-limit :99999"
       - Standard task with additional schedule out/in info and time limit active at 99999.
       - Check for extended table output.
      
      test_summary:
       - cmd:"perf script report task-analyzer --summary"
       - Standard task with additional summary output.
       - Check for summary print.
      
      test_summary_extended:
       - cmd:"perf script report task-analyzer --summary-extended"
       - Standard task with summary and additional schedule in/out info.
       - Check for extended table print.
      
      test_summaryonly:
       - cmd:"perf script report task-analyzer --summary-only"
       - Only summary should be printed.
       - Check for summary print.
      
      test_extended_times_summary_ns:
       - cmd:"perf script report task-analyzer --extended-times --summary --ns"
       - Standard task with extended schedule in/out information and summary in ns.
       - Check for extended table and summary.
      
      test_csv:
       - cmd:"perf script report task-analyzer --csv csv"
       - Print standard task to csv file in csv format.
       - Check for csv format.
      
      test_csv_extended_times:
       - cmd:"perf script report task-analyzer --csv csv --extended-times"
       - Print standard task to csv file in csv format with additional schedule in/out
         information.
       - Check for additional information and csv format.
      
      test_csvsummary:
       - cmd:"perf script report task-analyzer --csv-summary csvsummary"
       - Print summary to csvsummary file in csv format.
       - Check for csv format.
      
      test_csvsummary_extended:
       - cmd:"perf script report task-analyzer --csv-summary csvsummary --summary-extended"
       - Print summary to csvsummary file in csv format with additional schedule in/out
         information.
       - Check for additional information and csv format.
      Suggested-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20221206154406.41941-4-petar.gligor@gmail.com
      Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: task-analyzer add csv support · fdd0f81f
      Petar Gligoric authored
      This patch adds the ability to write the trace and the summary as CSV to
      user-specified files. This format simplifies further data processing.
      It uses ";" as the separator instead of spaces and emits exactly one
      header per file.
      
      Other parameters are honored as in normal use of the script. Colors are
      turned off for CSV output, so the highlight option is ignored.
      
      Usage:
      
      Write standard task to csv file:
      
        $ perf script report task-analyzer --csv <file>
      
      Write limited output to csv file in nanoseconds:
      
        $ perf script report task-analyzer --csv <file> --ns --limit-to-tasks 1337
      
      Write summary to a csv file:
      
        $ perf script report task-analyzer --csv-summary <file>
      
      Write summary to csv file with additional schedule information:
      
        $ perf script report task-analyzer --csv-summary <file> --summary-extended
      
      Write both summary and standard task to a csv file:
      
        $ perf script report task-analyzer --csv --csv-summary
      
      The following examples illustrate what is possible with the CSV output.  The
      first command sequence will record all scheduler switch events for 10 seconds,
      the task-analyzer calculates task information like runtimes as CSV.  A small
      python snippet using pandas and matplotlib will visualize the most frequent
      task (e.g. kworker/1:1) runtimes - each runtime as a bar in a bar chart:
      
        $ perf record -e sched:sched_switch -a -- sleep 10
        $ perf script report task-analyzer --ns --csv tasks.csv
        $ cat << EOF > /tmp/freq-comm-runtimes-bar.py
          import pandas as pd
          import matplotlib.pyplot as plt
      
          df = pd.read_csv("tasks.csv", sep=';')
          most_freq_comm = df["COMM"].value_counts().idxmax()
          most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"]
          plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds")
          plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes)
          plt.show()
        EOF
        $ python3 /tmp/freq-comm-runtimes-bar.py
      
      As a second example, the following script generates a pie chart of the
      accumulated runtimes of all tasks over 10 seconds of system recording:
      
        $ perf record -e sched:sched_switch -a -- sleep 10
        $ perf script report task-analyzer --csv-summary task-summary.csv
        $ cat << EOF > /tmp/accumulated-task-pie.py
          import pandas as pd
          from matplotlib.pyplot import pie, axis, show
      
          df = pd.read_csv("task-summary.csv", sep=';')
          sums = df.groupby(df["Comm"])["Accumulated"].sum()
          axis("equal")
          pie(sums, labels=sums.index);
          show()
        EOF
        $ python3 /tmp/accumulated-task-pie.py
      
      A variety of other visualizations are possible in matplotlib and other
      environments. Of course, pandas, numpy and co. also allow easy
      statistical analysis of the data!
      Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com
      Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
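Where pandas is not available, the same ";"-separated output can also be processed with the Python standard library alone. The sketch below sums runtimes per task name. The column names `Comm` and `Runtime` are assumptions based on the report header shown in this log, the helper name is hypothetical, and the inline sample data is made up for illustration.

```python
import csv
import io
from collections import defaultdict

def accumulate_runtimes(csv_file, comm_col="Comm", runtime_col="Runtime"):
    """Sum the runtime column per task name from a ';'-separated file object."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file, delimiter=";"):
        totals[row[comm_col]] += int(row[runtime_col])
    return dict(totals)

# Made-up sample standing in for a file written with --csv.
sample = io.StringIO("Comm;Runtime\nperf;118\nsh;36\nperf;20\n")
totals = accumulate_runtimes(sample)
```

For a real file, pass `open("tasks.csv", newline="")` instead of the in-memory sample and adjust the column names to the actual header.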
    • perf script: Introduce task analyzer python script · e76aff05
      Hagen Paul Pfeifer authored
      Introduce a new 'perf script' to analyze task scheduling behavior.
      
      Task analysis routinely needs data that goes beyond the plain times at
      which a task (process/thread) is switched in and out, for example the
      runtime of a process or how often the process ran. This script
      simplifies that recurring analysis by immediately reporting the
      characteristic runtime information of each task.
      
      Usage:
      
      Recording can be done in two ways:
      
        $ perf script record task-analyzer -- sleep 10
        $ perf record -e sched:sched_switch -a -- sleep 10
      
      The script can parse any perf.data file. Only sched:sched_switch events
      are mandatory; other events are ignored.
      
      The simplest report use case is to call the script without arguments:
      
        $ perf script report task-analyzer
            Switched-In      Switched-Out CPU      PID      TID             Comm    Runtime     Time Out-In
        15576.658891407   15576.659156086   4     2412     2428            gdbus        265            1949
        15576.659111320   15576.659455410   0     2412     2412      gnome-shell        344            2267
        15576.659491326   15576.659506173   2       74       74      kworker/2:1         15           13145
        15576.659506173   15576.659825748   2     2858     2858  gnome-terminal-        320           63263
        15576.659871270   15576.659902872   6    20932    20932    kworker/u16:0         32         2314582
        15576.659909951   15576.659945501   3    27264    27264               sh         36              -1
        15576.659853285   15576.659971052   7    27265    27265             perf        118         5050741
        [...]
      
      Not shown here are the ANSI color sequences. For example, if a task
      consists of only one thread, the TID is grayed out.
      
      Runtime is the time the task was running on the CPU. Time Out-In is the
      time between the process being scheduled *out* and scheduled back *in*,
      i.e. the time span between its last two executions. If -1 is printed,
      the task ran for the first time in the measurement, so an Out-In delta
      could not be calculated.
      
      In addition to the chronological representation, there is a summary on
      task level. This output can be additionally switched on via the
      --summary option and provides information such as max, min & average
      runtime per process. The maximum runtime is often important for
      debugging. The call looks like this:
      
        $ perf script report task-analyzer --summary
        Summary
             Task Information                       Runtime Information
          PID   TID            Comm Runs Accumulated    Mean  Median  Min   Max          Max At
           14    14     ksoftirqd/0   13         334      26      15    9   127 15571.621211956
           15    15     rcu_preempt  133        1778      13      13    2    33 15572.581176024
           16    16     migration/0    3          49      16      13   12    24 15571.608915425
           20    20     migration/1    3          34      11      13    8    13 15571.639101555
           25    25     migration/2    3          32      11      12    9    12 15575.639239896
        [...]
      
      Besides these two options, there are a number of other options that change the
      output and behavior. This can be queried via --help. Options worth mentioning include:
      
      - filter-tasks         - filter out unneeded tasks, --filter-task 1337,/sbin/init
      - highlight-tasks      - more pleasant focusing, --highlight-tasks 1:red,mutt:yellow
      - extended-times       - show combinations of elapsed times between schedule in/schedule out
      - summary-extended     - summary with additional information, like maximum delta time statistics
      - rename-comms-by-tids - handy for inexpressive process names like python, --rename 1337:my-python-app
      - ms                   - show timestamps in milliseconds, nanoseconds is also possible (--ns)
      - time-limit           - limit the analyzer to a time range, --time-limit 15576.0:15576.1
      
      Script is tested and prime time ready for python2 & python3:
      
      - make PYTHON=python3 prefix=/usr/local install
      - make PYTHON=python2 prefix=/usr/local install
      Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/r/20221206154406.41941-2-petar.gligor@gmail.com
      Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
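The Runtime and Time Out-In columns described above can be illustrated with a small sketch. It is not taken from the task-analyzer script; the function name, the tuple layout, and the integer timestamps are assumptions made for the example.

```python
def analyze_switches(events):
    """Compute (tid, runtime, out_in) per scheduling slice.

    events: iterable of (tid, switched_in, switched_out), ordered by switched_in.
    out_in is -1 for the first slice seen for a task, since no earlier
    switch-out is known for it.
    """
    last_out = {}
    rows = []
    for tid, t_in, t_out in events:
        runtime = t_out - t_in
        out_in = t_in - last_out[tid] if tid in last_out else -1
        rows.append((tid, runtime, out_in))
        last_out[tid] = t_out
    return rows

# Made-up timestamps; task 74 runs twice, task 2858 once.
events = [(74, 100, 115), (2858, 115, 435), (74, 500, 520)]
rows = analyze_switches(events)
```

The second slice of task 74 gets an Out-In delta of 385 because it was switched out at 115 and back in at 500.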
    • perf cs-etm: Print auxtrace info even if OpenCSD isn't linked · 55c1de99
      James Clark authored
      Printing the info doesn't have any dependency on OpenCSD, and neither
      does recording Coresight data. Because it's sometimes useful to look at
      the info for debugging, it makes sense to be able to see it on the same
      platform that the recording was made on.
      
      So pull the auxtrace info printing parts into a new file that is always
      compiled into Perf.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20221212155513.2259623-6-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Cleanup cs_etm__process_auxtrace_info() · fd63091f
      James Clark authored
      hdr is a copy of 3 values of ptr and doesn't need to be long lived. So
      just use ptr instead which means the malloc and the extra error path can
      be removed to simplify things.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20221212155513.2259623-5-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Tidy up auxtrace info header printing · b00204f5
      James Clark authored
      cs_etm__print_auxtrace_info() is called twice in case there is an error
      somewhere in cs_etm__process_auxtrace_info(), but all the info is
      already available at the beginning so just print it there instead.
      
      Also use u64 and the already cast ptr variable to make it more
      consistent with the rest of the etm code.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20221212155513.2259623-4-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Remove unused stub methods · fe55ba18
      James Clark authored
      These aren't used outside of cs-etm so don't need stubs. Leave
      cs_etm__process_auxtrace_info() which is used externally, and add an
      error message so that it's obvious to users why it causes errors.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20221212155513.2259623-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf cs-etm: Print unknown header version as an error · ab6bd55e
      James Clark authored
      This is an error rather than just for the raw trace dump so always print
      it as an error. Also remove the duplicate header version check.
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Al Grant <Al.Grant@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20221212155513.2259623-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Update perf lock contention test · 22ddcb6b
      Namhyung Kim authored
      Add test cases for the task and addr aggregation modes.
      
        $ sudo ./perf test -v contention
         86: kernel lock contention analysis test                            :
        --- start ---
        test child forked, pid 680006
        Testing perf lock record and perf lock contention
        Testing perf lock contention --use-bpf
        Testing perf lock record and perf lock contention at the same time
        Testing perf lock contention --threads
        Testing perf lock contention --lock-addr
        test child finished with 0
        ---- end ----
        kernel lock contention analysis test: Ok
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221209190727.759804-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Add -l/--lock-addr option · 688d2e8d
      Namhyung Kim authored
      The -l/--lock-addr option implements per-lock-instance contention
      stats using LOCK_AGGR_ADDR.  It displays the lock address and, if one
      exists, the symbol name.
      
        $ sudo ./perf lock con -abl sleep 1
         contended   total wait     max wait     avg wait            address   symbol
      
                 1     36.28 us     36.28 us     36.28 us   ffff92615d6448b8
                 9     10.91 us      1.84 us      1.21 us   ffffffffbaed50c0   rcu_state
                 1     10.49 us     10.49 us     10.49 us   ffff9262ac4f0c80
                 8      4.68 us      1.67 us       585 ns   ffffffffbae07a40   jiffies_lock
                 3      3.03 us      1.45 us      1.01 us   ffff9262277861e0
                 1       924 ns       924 ns       924 ns   ffff926095ba9d20
                 1       436 ns       436 ns       436 ns   ffff9260bfda4f60
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
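The columns in the output above (contended, total wait, max wait, avg wait) amount to a group-by over contention samples keyed by lock address. The following user-space sketch only illustrates that aggregation; it is not the perf or BPF implementation, and the function name, sample layout, and values are made up.

```python
from collections import defaultdict

def aggregate_contention(samples, key):
    """Group samples by `key` and compute count, total, max and average wait."""
    stats = defaultdict(lambda: {"contended": 0, "total": 0, "max": 0})
    for sample in samples:
        entry = stats[sample[key]]
        entry["contended"] += 1
        entry["total"] += sample["wait_ns"]
        entry["max"] = max(entry["max"], sample["wait_ns"])
    # Integer average in nanoseconds, derived once all samples are counted.
    return {k: {**v, "avg": v["total"] // v["contended"]} for k, v in stats.items()}

# Hypothetical samples: two waits on one lock address, one on another.
samples = [
    {"addr": 0x1000, "pid": 0, "wait_ns": 1670},
    {"addr": 0x1000, "pid": 0, "wait_ns": 330},
    {"addr": 0x2000, "pid": 1950, "wait_ns": 924},
]
by_addr = aggregate_contention(samples, "addr")
```

Passing `"pid"` as the key instead groups the same samples per task.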
    • perf lock contention: Implement -t/--threads option for BPF · eca949b2
      Namhyung Kim authored
      The BPF mode didn't show the per-thread stat properly.  Use the task's thread id (PID)
      as a key instead of stack_id and add a task_data map to save task comm names.
      
        $ sudo ./perf lock con -abt -E 5 sleep 1
         contended   total wait     max wait     avg wait          pid   comm
      
                 1    740.66 ms    740.66 ms    740.66 ms         1950   nv_queue
                 3    305.50 ms    298.19 ms    101.83 ms         1884   nvidia-modeset/
                 1     25.14 us     25.14 us     25.14 us      2725038   EventManager_De
                12     23.09 us      9.30 us      1.92 us            0   swapper
                 1     20.18 us     20.18 us     20.18 us      2725033   EventManager_De
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221209190727.759804-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Add lock_data.h for common data · fd507d3e
      Namhyung Kim authored
      Accessing BPF maps should use the same data types.  Add bpf_skel/lock_data.h
      to define the common data structures.  No functional changes.
      
      Committer notes:
      
      Fixed contention_key.stack_id missing rename to contention_key.stack_or_task_id.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221209190727.759804-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf python: Account for multiple words in CC · 3cad53a6
      Khem Raj authored
      Build systems sometimes append options such as --sysroot to the CC
      variable, especially in cross-compile environments like the Yocto Project,
      where the CC variable is composed of the cross-compiler name and the
      options it needs to work in a relocatable environment.
      
      Therefore separate the compiler name from the rest of the options in CC,
      then add the options via the second argument to the Popen() API.
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Khem Raj <raj.khem@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Keeping <john@metanate.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sedat Dilek <sedat.dilek@gmail.com>
      Link: https://lore.kernel.org/r/20221205025534.150006-1-raj.khem@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
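A minimal sketch of the splitting described above, assuming CC holds a whitespace-separated compiler name followed by options. The helper names and the example CC value are hypothetical, and this is not the code from the perf Python build script.

```python
import subprocess

def split_cc(cc_value):
    """Split a CC string into the compiler name and its list of options."""
    parts = cc_value.split()
    return parts[0], parts[1:]

def run_compiler(cc_value, extra_args):
    """Start the compiler with the options from CC plus extra arguments."""
    cc, cc_options = split_cc(cc_value)
    return subprocess.Popen([cc] + cc_options + list(extra_args),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# Hypothetical cross-compile CC value with appended options.
cc, opts = split_cc("arm-poky-linux-gcc --sysroot=/opt/sysroot -O2")
```

Plain `str.split()` does not handle quoted paths containing spaces; `shlex.split()` would be one way to cover that case.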
    • perf off_cpu: Fix a typo in BTF tracepoint name, it should be 'btf_trace_sched_switch' · 167b266b
      Namhyung Kim authored
      In BTF, tracepoint definitions have the "btf_trace_" prefix.  The
      off-cpu profiler needs to check the signature of the sched_switch event
      using that definition.  But there's a typo (s/bpf/btf/), so the check
      always failed.
      
      Fixes: b36888f7 ("perf record: Handle argument change in sched_switch")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: bpf@vger.kernel.org
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20221208182636.524139-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Update event group check for support of uncore event · 232b82d2
      Athira Rajeev authored
      The event group test checks group creation for combinations of hw, sw
      and uncore PMU events. Some uncore PMUs may require additional
      permissions to access the counters.
      
      For example, in the case of hv_24x7, the partition needs permission to
      access the hv_24x7 PMU counters. If it lacks that permission, event_open
      will fail. Hence add a sanity check to see if event_open succeeds before
      proceeding with the test.
      
      Fixes: 9d9b22be ("perf test: Add event group test for events in multiple PMUs")
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Link: https://lore.kernel.org/r/20221207165815.774-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Check if libtraceevent has TEP_FIELD_IS_RELATIVE · b9a49f8c
      Arnaldo Carvalho de Melo authored
      Some distros have older versions of libtraceevent where
      TEP_FIELD_IS_RELATIVE and its associated semantics are not present, so
      we need to check whether the installed version has it. It was introduced
      in libtraceevent 1.5.0.
      Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
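The version gate described above comes down to comparing a dotted version against 1.5.0 numerically. The sketch below only illustrates that comparison; the real check lives in the perf build system and C code, and the function name here is hypothetical.

```python
def has_field_is_relative(version, minimum=(1, 5, 0)):
    """Return True if a dotted version string is at least `minimum`.

    Components are compared as integers, so "1.10.0" correctly counts as
    newer than "1.5.0", which a plain string comparison would get wrong.
    """
    parsed = tuple(int(part) for part in version.split("."))
    return parsed >= minimum

supported = has_field_is_relative("1.5.3")
```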
    • tools lib traceevent: Remove libtraceevent · 4171925a
      Ian Rogers authored
      libtraceevent is now out-of-date and it is better to depend on the
      system version. Remove this code that is no longer depended upon by
      any builds.
      
      Committer notes:
      
      Dropped the now-deleted tools/lib/traceevent/ from tools/perf/MANIFEST, so
      that 'make perf-tar-src-pkg' works.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20221130062935.2219247-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Use libtraceevent from the system · 378ef0f5
      Ian Rogers authored
      Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
      line variables.
      
      If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
      build, don't compile in libtraceevent and libtracefs support.
      
      This also disables CONFIG_TRACE that controls "perf trace".
      
      CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
      HAVE_LIBTRACEEVENT is used in C code.
      
      Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
      commands kmem, kwork, lock, sched and timechart are removed.  The
      majority of commands continue to work including "perf test".
      
      Committer notes:
      
      Fixed up a tools/perf/util/Build reject and added:
      
        #include <traceevent/event-parse.h>
      
      to tools/perf/util/scripting-engines/trace-event-perl.c.
      
      Committer testing:
      
        $ rpm -qi libtraceevent-devel
        Name        : libtraceevent-devel
        Version     : 1.5.3
        Release     : 2.fc36
        Architecture: x86_64
        Install Date: Mon 25 Jul 2022 03:20:19 PM -03
        Group       : Unspecified
        Size        : 27728
        License     : LGPLv2+ and GPLv2+
        Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
        Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
        Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
        Build Host  : buildvm-x86-05.iad2.fedoraproject.org
        Packager    : Fedora Project
        Vendor      : Fedora Project
        URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
        Bug URL     : https://bugz.fedoraproject.org/libtraceevent
        Summary     : Development headers of libtraceevent
        Description :
        Development headers of libtraceevent-libs
        $
      
      Default build:
      
        $ ldd ~/bin/perf | grep tracee
        	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
        $
      
        # perf trace -e sched:* --max-events 10
             0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
             0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
             0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
             1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
             1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
             0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
             0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
             0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
             1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
             1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
        #
      
      Had to tweak tools/perf/util/setup.py to make sure the python binding
      shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
      present in CFLAGS.
      
      Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
      
      - Make building of data-convert-bt.c conditional on CONFIG_LIBTRACEEVENT=y
      
      - perf-$(CONFIG_LIBTRACEEVENT) += scripts/
      
      - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y
      
      - The python binding needed some fixups and util/trace-event.c can't be
        built and linked with the python binding shared object, so remove it
        in tools/perf/util/setup.py and exclude it from the list of
        dependencies in the python/perf.so Makefile.perf target.
      
      Building without libtraceevent-devel installed uncovered more build
      failures:
      
      - The python binding tools/perf/util/python.c was assuming that
        traceevent/parse-events.h was always available, which was the case
        when we defaulted to using the in-kernel tools/lib/traceevent/ files,
        now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
        the other parts of it that deal with tracepoints.
      
      - We have to ifdef the rules in the Build files with
        CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
        tools/perf/trace/beauty/, as we only ifdef setting CONFIG_TRACE=y when
        NO_LIBTRACEEVENT=1 is set in the make command line, not when we don't
        detect libtraceevent-devel installed in the system. Simplifying this,
        so that there is a single way of disabling builtin-trace.c and
        CONFIG_TRACE=y is never set when libtraceevent-devel isn't installed,
        would be the clean way.
      
      From Athira:
      
      <quote>
      tools/perf/arch/powerpc/util/Build
      -perf-y += kvm-stat.o
      +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
      </quote>
      
      Then, ditto for arm64 and s390, detected by container cross build tests.
      
      - s/390 uses test__checkevent_tracepoint() that is now only available if
        HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifdef HAVE_LIBTRACEEVENT.
      
      Also from Athira:
      
      <quote>
      With this change, I could successfully compile in these environment:
      - Without libtraceevent-devel installed
      - With libtraceevent-devel installed
      - With “make NO_LIBTRACEEVENT=1”
      </quote>
      
      Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
      consistency with other libraries detected in tools/perf/.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      378ef0f5
    • Ian Rogers's avatar
      perf jevents: Parse metrics during conversion · 40769665
      Ian Rogers authored
      Currently the 'MetricExpr' json value is passed unchanged from the
      json file to pmu-events.c. This change introduces an expression
      tree into which the value is parsed. The parsing is done largely by
      using operator overloading and Python's 'eval' function. Two
      advantages in doing this are:
      
      1) Broken metrics fail at compile time rather than relying on
         `perf test` to detect. `perf test` remains relevant for checking
         event encoding and actual metric use.
      
      2) The conversion to a string from the tree can minimize the metric's
         string size, for example, preferring 1e6 over 1000000, avoiding
         multiplication by 1 and removing unnecessary whitespace. On x86
         this reduces the string size by 2,930 bytes (0.07%).
      
      In future changes it would be possible to programmatically
      generate the json expressions (a single line of text and so a
      pain to write manually) for an architecture using the expression
      tree. This could avoid copy-pasting metrics for all architecture
      variants.
      
      v4. Doesn't simplify "0*SLOTS" to 0, as the pattern is used to fix
          Intel metrics with topdown events.
      v3. Avoids generic types on standard types like set that aren't
          supported until Python 3.9, fixing an issue with Python 3.6
          reported by John Garry. v3 also fixes minor pylint issues and adds
          a call to Simplify on the read expression tree.
      v2. Improvements to type information.
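
      A minimal sketch of the approach described above, not the code from
      this commit: operator overloading turns a metric string into a tree
      via eval, a malformed string fails while parsing, and printing the
      tree yields a shorter string. The names Expr, Event, Constant, BinOp
      and parse_metric are hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


def to_expr(value):
    """Wrap plain numbers so that every operand is a tree node."""
    return value if isinstance(value, Expr) else Constant(value)


class Expr:
    """Base node; arithmetic operators build a tree instead of computing."""

    def __add__(self, other):
        return BinOp("+", self, to_expr(other))

    def __radd__(self, other):
        return BinOp("+", to_expr(other), self)

    def __mul__(self, other):
        return BinOp("*", self, to_expr(other))

    def __rmul__(self, other):
        return BinOp("*", to_expr(other), self)

    def __truediv__(self, other):
        return BinOp("/", self, to_expr(other))

    def __rtruediv__(self, other):
        return BinOp("/", to_expr(other), self)


@dataclass
class Event(Expr):
    name: str

    def simplify(self):
        return self

    def __str__(self):
        return self.name


@dataclass
class Constant(Expr):
    value: float

    def simplify(self):
        return self

    def __str__(self):
        if self.value != int(self.value):
            return repr(self.value)
        digits = str(int(self.value))
        stripped = digits.rstrip("0")
        if not stripped or stripped == "-":
            return "0"
        exponent = len(digits) - len(stripped)
        short = f"{stripped}e{exponent}"
        # Use the exponent spelling only when it is actually shorter.
        return short if exponent and len(short) < len(digits) else digits


@dataclass
class BinOp(Expr):
    op: str
    lhs: Expr
    rhs: Expr

    def simplify(self):
        lhs, rhs = self.lhs.simplify(), self.rhs.simplify()
        # Multiplying by 1 is dropped; "0 * X" is deliberately left alone.
        if self.op == "*":
            if isinstance(lhs, Constant) and lhs.value == 1:
                return rhs
            if isinstance(rhs, Constant) and rhs.value == 1:
                return lhs
        return BinOp(self.op, lhs, rhs)

    def __str__(self):
        return f"({self.lhs}{self.op}{self.rhs})"


def parse_metric(text):
    """Rewrite identifiers as Event("...") and let eval build the tree."""
    py = re.sub(r"(?<![\w.])[A-Za-z_][A-Za-z0-9_.]*",
                lambda m: f'Event("{m.group(0)}")', text)
    return eval(py, {"__builtins__": {}, "Event": Event})


print(parse_metric("A * 1000000 / (1 * B)").simplify())  # ((A*1e6)/B)
print(parse_metric("0 * SLOTS").simplify())              # (0*SLOTS)
```

      In this sketch a string such as "A + * B" raises SyntaxError inside
      parse_metric, which is the kind of early failure the first advantage
      refers to.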
      
      Committer notes:
      
      Added one-line fixer from Ian, see first Link: tag below.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Reviewed-by: John Garry <john.g.garry@oracle.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/CAP-5=fWa=zNK_ecpWGoGggHCQx7z-oW0eGMQf19Maywg0QK=4g@mail.gmail.com
      Link: https://lore.kernel.org/r/20221207055908.1385448-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      40769665
    • Namhyung Kim's avatar
      perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events · b8976135
      Namhyung Kim authored
      
      print_counter_aggrdata() skips events that have no aggregate count.
      That check is really meant for system-wide per-thread mode and for
      merged uncore and hybrid events.
      
      Let's update the condition to check them explicitly.
      
      Fixes: 91f85f98 ("perf stat: Display event stats using aggr counts")
      Reported-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221206175804.391387-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b8976135
    • Ian Rogers's avatar
      perf build: Fixes for LIBTRACEEVENT_DYNAMIC · 616aa32d
      Ian Rogers authored
      If LIBTRACEEVENT_DYNAMIC is enabled then avoid the install step for
      the plugins. If disabled, correct DESTDIR so that the plugins are
      installed under <lib>/traceevent/plugins.
      
      Fixes: ef019df0 ("perf build: Install libtraceevent locally when building")
      Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Gordeev <agordeev@linux.ibm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20221205225940.3079667-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      616aa32d
    • Arnaldo Carvalho de Melo's avatar
      machine: Adopt is_lock_function() from builtin-lock.c · cc2367ee
      Arnaldo Carvalho de Melo authored
      It is used in bpf_lock_contention.c and builtin-lock.c will be made
      CONFIG_LIBTRACEEVENT=y conditional, so move it to machine.c, that is
      always available.
      
      This also moves the 4 global variables holding the sched and lock text
      start and end into 'struct machine', as conceivably we can have that
      info for several machine instances, say in some 'perf diff'-like tool.
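
      As an illustration only, the benefit of keeping this state per
      machine rather than in globals can be sketched as below. The class
      and field names are hypothetical, and the assumption that the check
      is a simple address-range test against the sched and lock text
      sections is ours, not something this message states.

```python
from dataclasses import dataclass, field


@dataclass
class TextRange:
    """Start (inclusive) and end (exclusive) address of a text section."""
    start: int = 0
    end: int = 0

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


@dataclass
class Machine:
    """Each machine instance carries its own sched and lock text ranges."""
    sched: TextRange = field(default_factory=TextRange)
    lock: TextRange = field(default_factory=TextRange)

    def is_lock_function(self, addr: int) -> bool:
        # Assumed rule: an address counts as a lock function when it falls
        # inside either of the two text ranges of this machine.
        return self.sched.contains(addr) or self.lock.contains(addr)


host = Machine(TextRange(0x1000, 0x2000), TextRange(0x3000, 0x3800))
guest = Machine(TextRange(0x8000, 0x9000), TextRange(0xA000, 0xA800))

print(host.is_lock_function(0x1800))   # True: inside the host sched range
print(guest.is_lock_function(0x1800))  # False: outside both guest ranges
```

      With globals, the second lookup would have reused the first
      machine's ranges.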
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      cc2367ee
    • Ravi Bangoria's avatar
      perf test: Add event group test for events in multiple PMUs · 9d9b22be
      Ravi Bangoria authored
      Multiple events in a group can belong to one or more PMUs, however
      there are some limitations.
      
      One of the limitations is that perf doesn't allow creating a group of
      events from different hw PMUs.
      
      Write a simple test to create various combinations of hw, sw and uncore
      PMU events and verify group creation succeeds or fails as expected.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Carsten Haitzler <carsten.haitzler@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20221206043237.12159-3-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9d9b22be
    • Ravi Bangoria's avatar
      perf tool: Move pmus list variable to a new file · 336b92da
      Ravi Bangoria authored
      The 'pmus' list variable is defined as a static variable in pmu.c.
      
      Introduce a new pmus.c file and migrate this variable to it. Also make
      it non-static so that it can be accessed from outside.
      Suggested-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Acked-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: carsten.haitzler@arm.com
      Link: https://lore.kernel.org/r/20221206043237.12159-2-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      336b92da
    • Ian Rogers's avatar
      perf util: Add host_is_bigendian to util.h · 5b7a29fb
      Ian Rogers authored
      Avoid libtraceevent dependency for tep_is_bigendian or trace-event.h
      dependency for bigendian. Add a new host_is_bigendian to util.h, using
      the compiler defined __BYTE_ORDER__ when available.
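
      The commit itself is C and relies on the compiler-defined
      __BYTE_ORDER__. The Python sketch below only illustrates the same
      two-step idea (use what the environment already knows, otherwise
      probe at run time). The function body is an assumption for
      illustration and is not the code added to util.h.

```python
import struct
import sys


def host_is_bigendian() -> bool:
    """Report whether the host stores the most significant byte first."""
    # Preferred path: the runtime already knows the byte order, much like
    # a compiler-provided macro would.
    byteorder = getattr(sys, "byteorder", None)
    if byteorder is not None:
        return byteorder == "big"
    # Fallback probe: pack the 16-bit value 1 in native order and look at
    # the first byte. It is 0 only on a big-endian host.
    return struct.pack("=H", 1)[0] == 0


print("big endian" if host_is_bigendian() else "little endian")
```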
      
      Committer notes:
      
      Added:
      
       #else  /* !__BYTE_ORDER__ */
      
      On that nested #ifdef block, as per Namhyung's suggestion.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/r/20221130062935.2219247-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5b7a29fb
    • Ian Rogers's avatar
      perf util: Make header guard consistent with tool · fce9a619
      Ian Rogers authored
      Remove git reference by changing GIT_COMPAT_UTIL_H to __PERF_UTIL_H.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Link: https://lore.kernel.org/r/20221130062935.2219247-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fce9a619
    • James Clark's avatar
      perf stat: Fix invalid output handle · 3f81f72d
      James Clark authored
      In this context, 'os' is already a pointer so the extra dereference
      isn't required. This fixes the following test failure on aarch64:
      
        $ ./perf test "json output" -vvv
        92: perf stat JSON output linter                                    :
        --- start ---
        Checking json output: no args Test failed for input:
        ...
        Fatal error: glibc detected an invalid stdio handle
        ---- end ----
        perf stat JSON output linter: FAILED!
      
      Fixes: e7f4da31 ("perf stat: Pass struct outstate to printout()")
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20221130111521.334152-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3f81f72d
    • Namhyung Kim's avatar
      perf stat: Fix multi-line metric output in JSON · 117195d9
      Namhyung Kim authored
      When a metric produces more than one value, the opening bracket was
      not printed.
      
      Fixes: ab6baaae ("perf stat: Fix JSON output in metric-only mode")
      Reported-by: Weilin Wang <weilin.wang@intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Weilin Wang <weilin.wang@intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221202190447.1588680-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      117195d9
    • Ian Rogers's avatar
      tools lib symbol: Add dependency test to install_headers · 113bb396
      Ian Rogers authored
      Compute the headers to be installed from their source headers and make
      each have its own build target to install it. Using dependencies
      avoids headers being reinstalled and getting a new timestamp which
      then causes files that depend on the header to be rebuilt.
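
      The dependency idea can be sketched outside of make as follows. This
      is only an illustration of timestamp-based installation; the helper
      name install_header is hypothetical and the real change is expressed
      as per-header make targets.

```python
import os
import shutil


def install_header(src: str, dst: str) -> bool:
    """Copy src to dst only when dst is missing or older than src.

    Returns True when a copy happened. Skipping the copy keeps the
    timestamp of dst unchanged, so nothing that depends on dst is
    considered out of date.
    """
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    shutil.copyfile(src, dst)
    return True
```

      Calling install_header() twice in a row copies the file only the
      first time, which is the behaviour the per-header targets give make.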
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20221202045743.2639466-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      113bb396
    • Ian Rogers's avatar
      tools lib subcmd: Add dependency test to install_headers · 5d890591
      Ian Rogers authored
      Compute the headers to be installed from their source headers and make
      each have its own build target to install it. Using dependencies
      avoids headers being reinstalled and getting a new timestamp which
      then causes files that depend on the header to be rebuilt.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20221202045743.2639466-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5d890591
    • Ian Rogers's avatar
      tools lib perf: Add dependency test to install_headers · 47e02b94
      Ian Rogers authored
      Compute the headers to be installed from their source headers and make
      each have its own build target to install it. Using dependencies
      avoids headers being reinstalled and getting a new timestamp which
      then causes files that depend on the header to be rebuilt.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20221202045743.2639466-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      47e02b94
    • Ian Rogers's avatar
      tools lib api: Add dependency test to install_headers · 1849f9f0
      Ian Rogers authored
      Compute the headers to be installed from their source headers and make
      each have its own build target to install it. Using dependencies
      avoids headers being reinstalled and getting a new timestamp which
      then causes files that depend on the header to be rebuilt.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: bpf@vger.kernel.org
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20221202045743.2639466-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1849f9f0
    • Athira Rajeev's avatar
      perf stat: Fix printing field separator in CSV metrics output · 8f4b1e3c
      Athira Rajeev authored
      With the CSV output option of 'perf stat', the number of fields in
      the metrics output lines does not match the number of fields in the
      other event output lines.
      
      Sample output below after applying patch to fix printing os->prefix.
      
      	# ./perf stat -x, --per-socket -a -C 1 ls
      	S0,1,82.11,msec,cpu-clock,82111626,100.00,1.000,CPUs utilized
      	S0,1,2,,context-switches,82109314,100.00,24.358,/sec
      	------
      ====>	S0,1,,,,,,,1.71,stalled cycles per insn
      
      The above command line sets the field separator to "," via the "-x,"
      option, and the per-socket option displays the socket value as the
      first field. But the last line, for "stalled cycles per insn", has too
      many separators. Each CSV output line is expected to have 8 field
      separators (for the 9 fields), whereas the last line has 9 "," in the
      result. The patch fixes this issue.
      
      The counter stats are displayed by function
      "perf_stat__print_shadow_stats" in code "util/stat-shadow.c". While
      printing the stats info for "stalled cycles per insn", function
      "new_line_csv" is used as new_line callback.
      
      The fields printed in each line are: "Socket_id,aggr
      nr,Avg,unit,event_name,run,enable_percent,ratio,unit"
      
      The metric output prints Socket_id, aggr nr, ratio and unit. It has to
      skip the remaining five fields, i.e.
      Avg,unit,event_name,run,enable_percent. The csv line callback uses
      "os->nfields" to know the number of fields to skip to match the other
      lines.
      
      Currently it is set as:
      
      	os.nfields = 3 + aggr_fields[config->aggr_mode] + (counter->cgrp ? 1 : 0);
      
      But with the aggregation modes, csv_sep is already printed along with
      each field (function "aggr_printout" in util/stat-display.c), so
      aggr_fields can be removed from nfields. The fixed number of
      separators to emit then has to be 4, which skips the five fields "avg,
      unit, event name, run, enable_percent".
      
      The patch removes aggr_fields and uses 4 as the fixed value of
      os->nfields.
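
      The separator arithmetic can be checked with a small model. This is
      a simplified sketch, not perf's printing code: metric_line is a
      hypothetical helper, and the old per-socket value of 3 + 2 assumes
      that aggr_fields contributes 2 in this mode, which is inferred from
      the 9-separator output shown earlier.

```python
SEP = ","


def metric_line(prefix_fields, nfields, ratio, unit):
    """Build a metric-only CSV line under the simplified model."""
    # Aggregation prefix: every field is followed by its own separator.
    line = "".join(f"{value}{SEP}" for value in prefix_fields)
    # Separators that stand in for the skipped columns.
    line += SEP * nfields
    # The metric value and its unit, each preceded by a separator.
    line += f"{SEP}{ratio}{SEP}{unit}"
    return line


before = metric_line(["S0", "1"], 3 + 2, "1.71", "stalled cycles per insn")
after = metric_line(["S0", "1"], 4, "0.81", "stalled cycles per insn")

print(before.count(SEP))  # 9 separators: one too many
print(after.count(SEP))   # 8 separators, i.e. 9 fields
```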
      
      After the patch:
      
      	# ./perf stat -x, --per-socket -a -C 1 ls
      	S0,1,79.08,msec,cpu-clock,79085956,100.00,1.000,CPUs utilized
      	S0,1,7,,context-switches,79084176,100.00,88.514,/sec
      	------
      ====>	S0,1,,,,,,0.81,stalled cycles per insn
      
      Fixes: 92a61f64 ("perf stat: Implement CSV metrics output")
      Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20221205042852.83382-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8f4b1e3c
    • Anshuman Khandual's avatar
      perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index" · 955f6def
      Anshuman Khandual authored
      This adds all remaining branch filters, i.e. "no_cycles", "no_flags"
      and "hw_index". While here, also update the documentation.
      Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lore.kernel.org/lkml/20221205064443.533587-1-anshuman.khandual@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      955f6def
    • Ian Rogers's avatar
      perf stat: Check existence of os->prefix, fixing a segfault · 3c97d25c
      Ian Rogers authored
      We need to check that os->prefix is set, otherwise we hit a segfault
      when printing metrics, which I'm now seeing in Arnaldo's tree:
      
        $ gdb --args perf stat -M Backend true
        ...
        Performance counter stats for 'true':
      
                4,712,355      TOPDOWN.SLOTS                    #     17.3 % tma_core_bound
      
        Program received signal SIGSEGV, Segmentation fault.
        __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
        77      ../sysdeps/x86_64/multiarch/strlen-evex.S: No such file or directory.
        (gdb) bt
        #0  __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
        #1  0x00007ffff74749a5 in __GI__IO_fputs (str=0x0, fp=0x7ffff75f5680 <_IO_2_1_stderr_>)
        #2  0x0000555555779f28 in do_new_line_std (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10) at util/stat-display.c:356
        #3  0x000055555577a081 in print_metric_std (config=0x555555e077c0 <stat_config>, ctx=0x7fffffffbf10, color=0x0, fmt=0x5555558b77b5 "%8.1f", unit=0x7fffffffbb10 "%  tma_memory_bound", val=13.165355724442199) at util/stat-display.c:380
        #4  0x00005555557768b6 in generic_metric (config=0x555555e077c0 <stat_config>, metric_expr=0x55555593d5b7 "((CYCLE_ACTIVITY.STALLS_MEM_ANY + EXE_ACTIVITY.BOUND_ON_STORES) / (CYCLE_ACTIVITY.STALLS_TOTAL + (EXE_ACTIVITY.1_PORTS_UTIL + tma_retiring * EXE_ACTIVITY.2_PORTS_UTIL) + EXE_ACTIVITY.BOUND_ON_STORES))"..., metric_events=0x555555f334e0, metric_refs=0x555555ec81d0, name=0x555555f32e80 "TOPDOWN.SLOTS", metric_name=0x555555f26c80 "tma_memory_bound", metric_unit=0x55555593d5b1 "100%", runtime=0, map_idx=0, out=0x7fffffffbd90, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:934
        #5  0x0000555555778cac in perf_stat__print_shadow_stats (config=0x555555e077c0 <stat_config>, evsel=0x555555f289d0, avg=4712355, map_idx=0, out=0x7fffffffbd90, metric_events=0x555555e078e8 <stat_config+296>, st=0x555555e9e620 <rt_stat>) at util/stat-shadow.c:1329
        #6  0x000055555577b6a0 in printout (config=0x555555e077c0 <stat_config>, os=0x7fffffffbf10, uval=4712355, run=325322, ena=325322, noise=4712355, map_idx=0) at util/stat-display.c:741
        #7  0x000055555577bc74 in print_counter_aggrdata (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, s=0, os=0x7fffffffbf10) at util/stat-display.c:838
        #8  0x000055555577c1d8 in print_counter (config=0x555555e077c0 <stat_config>, counter=0x555555f289d0, os=0x7fffffffbf10) at util/stat-display.c:957
        #9  0x000055555577dba0 in evlist__print_counters (evlist=0x555555ec3610, config=0x555555e077c0 <stat_config>, _target=0x555555e01c80 <target>, ts=0x0, argc=1, argv=0x7fffffffe450) at util/stat-display.c:1413
        #10 0x00005555555fc821 in print_counters (ts=0x0, argc=1, argv=0x7fffffffe450) at builtin-stat.c:1040
        #11 0x000055555560091a in cmd_stat (argc=1, argv=0x7fffffffe450) at builtin-stat.c:2665
        #12 0x00005555556b1eea in run_builtin (p=0x555555e11f70 <commands+336>, argc=4, argv=0x7fffffffe450) at perf.c:322
        #13 0x00005555556b2181 in handle_internal_command (argc=4, argv=0x7fffffffe450) at perf.c:376
        #14 0x00005555556b22d7 in run_argv (argcp=0x7fffffffe27c, argv=0x7fffffffe270) at perf.c:420
        #15 0x00005555556b26ef in main (argc=4, argv=0x7fffffffe450) at perf.c:550
        (gdb)
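
      Frame #1 of the backtrace shows fputs() being called with str=0x0,
      which is the case the missing check lets through. A minimal sketch of
      such a guard follows. It assumes a simplified output-state struct;
      outstate_sketch and new_line_sketch are hypothetical names used only
      for illustration and are not the perf code itself:

```c
#include <stdio.h>

/*
 * Hypothetical, simplified stand-in for the output state handed to the
 * display code. Only the two fields needed for the illustration exist.
 */
struct outstate_sketch {
	FILE *fh;
	const char *prefix;	/* may be NULL when no prefix is configured */
};

/*
 * Start a new output line and emit the prefix only when one is set.
 * Passing NULL to fputs() is undefined behaviour, so it is checked first.
 * Returns 1 if a prefix was written, 0 if it was skipped.
 */
static int new_line_sketch(const struct outstate_sketch *os)
{
	fputc('\n', os->fh);
	if (os->prefix == NULL)
		return 0;
	fputs(os->prefix, os->fh);
	return 1;
}
```

      With the check in place, a NULL prefix is skipped rather than handed
      to fputs().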
      
      Fixes: f123b2d8 ("perf stat: Remove prefix argument in print_metric_headers()")
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: http://lore.kernel.org/lkml/CAP-5=fUOjSM5HajU9TCD6prY39LbX4OQbkEbtKPPGRBPBN=_VQ@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3c97d25c
  2. 05 Dec, 2022 4 commits