1. 02 Mar, 2023 1 commit
    • tools headers svm: Sync svm headers with the kernel sources · a98c0710
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        8c29f016 ("x86/sev: Add SEV-SNP guest feature negotiation support")
      
      That triggers:
      
        CC      /tmp/build/perf-tools/arch/x86/util/kvm-stat.o
        CC      /tmp/build/perf-tools/util/header.o
        LD      /tmp/build/perf-tools/arch/x86/util/perf-in.o
        LD      /tmp/build/perf-tools/arch/x86/perf-in.o
        LD      /tmp/build/perf-tools/arch/perf-in.o
        LD      /tmp/build/perf-tools/util/perf-in.o
        LD      /tmp/build/perf-tools/perf-in.o
        LINK    /tmp/build/perf-tools/perf
      
      This time it causes no change in tooling results, as the newly introduced
      SVM_VMGEXIT_TERM_REQUEST exit reason wasn't added to SVM_EXIT_REASONS,
      which is what kvm-stat.c uses.
      
      It also addresses this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/svm.h' differs from latest version at 'arch/x86/include/uapi/asm/svm.h'
        diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nikunj A Dadhania <nikunj@amd.com>
      Link: http://lore.kernel.org/lkml/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 23 Feb, 2023 3 commits
    • perf test: Avoid counting commas in json linter · 3de34f85
      Ian Rogers authored
      Commas may appear in events like:
      
        cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/
      
      which causes the count of commas to see more items than expected. Switch
      to counting the entries in the dictionary, which is 1 more than the
      number of commas.
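
A minimal Python sketch of the difference between the two counting strategies. The JSON line is invented for illustration (only the event name comes from the text above), and the variable names are hypothetical rather than taken from the linter itself:

```python
import json

# Hypothetical JSON output line; the field values are made up.
line = (
    '{"counter-value": "1000.0", "unit": "", '
    '"event": "cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/"}'
)

# Naive approach: assume the number of fields is commas + 1.
# The two commas inside the event name inflate the result.
fields_by_commas = line.count(",") + 1

# Approach described above: parse the line and count dictionary entries.
fields_by_parsing = len(json.loads(line))

print(fields_by_commas, fields_by_parsing)
```

Counting parsed entries is independent of which characters appear inside string values, which is why it stays correct for events with modifiers.
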
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Claire Jensen <cjense@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230223071818.329671-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tests stat+csv_output: Switch CSV separator to @ · d3e104bb
      Ian Rogers authored
      Commas may appear in events like:
      
        cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/
      
      which causes the comma checker to see more fields than expected. Use @ as
      the CSV separator to avoid this.
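
A small Python sketch of why the separator matters. The row contents are invented and `field_count` is a hypothetical helper, not part of the test script:

```python
# Hypothetical row of five fields; only the event name comes from the text above.
event = "cpu/INT_MISC.RECOVERY_CYCLES,cmask=1,edge/"
row = ["1000", "", event, "500", "100.00"]


def field_count(fields, sep):
    """Join the fields with sep, then count what a naive split on sep sees."""
    return len(sep.join(fields).split(sep))


with_comma = field_count(row, ",")  # commas inside the event add extra fields
with_at = field_count(row, "@")     # '@' does not occur in the event name

print(with_comma, with_at)
```

This only holds as long as the chosen separator never appears inside a field, which is the assumption behind picking @.
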
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Claire Jensen <cjense@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20230223071818.329671-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Fix --buildid-all not to eat up MMAP2 · ce9f1c05
      Namhyung Kim authored
      When MMAP2 has the PERF_RECORD_MISC_MMAP_BUILD_ID flag, it means the
      record already has the build-id info.  So perf inject marks the DSO as
      hit, so that the same DSO is skipped if a later record for it happens to
      miss the build-id.
      
      But it failed to copy the MMAP2 record itself, so it would fail to
      symbolize samples for those regions.
      
      For example, the following generates 249 MMAP2 events.
      
        $ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.8%)
      
      Adding perf inject should not change the number of events, like this:
      
        $ perf record --buildid-mmap -o- true | perf inject -b | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.5%)
      
      But when --buildid-all is used, it eats most of the MMAP2 events.
      
        $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:          1  ( 2.5%)
      
      With this patch, it shows the original number now.
      
        $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \
        > perf report --stat -i- | grep MMAP2
                 MMAP2 events:        249  (86.5%)
      
      Committer testing:
      
      Before:
      
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (36.2%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (36.2%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
                 MMAP2 events:          2  ( 1.9%)
        $
      
      After:
      
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (29.3%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (34.3%)
        $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2
                 MMAP2 events:         58  (38.4%)
        $
      
      Fixes: f7fc0d1c ("perf inject: Do not inject BUILD_ID record if MMAP2 has it")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 22 Feb, 2023 2 commits
  4. 17 Feb, 2023 3 commits
    • perf tests stat_all_metrics: Change true workload to sleep workload for system wide check · f9fa0778
      Kajol Jain authored
      The stat_all_metrics.sh test case fails on powerpc:
      
        98: perf all metrics test : FAILED!
      
      Logs with verbose:
      
        [command]# ./perf test 98 -vv
         98: perf all metrics test                                           :
         --- start ---
        test child forked, pid 13262
        Testing BRU_STALL_CPI
        Testing COMPLETION_STALL_CPI
         ----
        Testing TOTAL_LOCAL_NODE_PUMPS_P23
        Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in:
        Error:
        Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable system wide with '-a'.
        Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01
        Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in:
        Error:
        Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, enable system wide with '-a'.
         ----
      
      Based on the above logs, some of the hv-24x7 metric events fail, and the
      logs suggest running the metric event with the -a option.  This change
      happened after commit a4b8cfca ("perf stat: Delay metric parsing"), which
      delayed the metric parsing phase; the perf tool now identifies whether
      the target is system-wide before the metric parsing phase. With this
      change, perf_event_open fails with workload monitoring for uncore
      events, as expected.
      
      The perf all metrics test case fails because some of the hv-24x7 metric
      events may need a bigger workload with system-wide monitoring to get the
      data. Fix this by changing the current system-wide check from the true
      workload to a sleep 0.01 workload.
      
      Result with the patch changes on powerpc:
      
        98: perf all metrics test : Ok
      
      Fixes: a4b8cfca ("perf stat: Delay metric parsing")
      Suggested-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.ibm.com>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230215093827.124921-1-kjain@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events power10: Add JSON metric events to present CPI stall cycles in powerpc · cf26e043
      Athira Rajeev authored
      The Power10 Performance Monitoring Unit (PMU) provides events to
      understand stall cycles of different pipeline stages.  These events, along
      with completed instructions, provide useful metrics for application tuning.
      
      This patch implements the JSON changes to collect counter statistics to
      present the high-level CPI stall breakdown metrics. The new metric group
      is named "CPI_STALL_RATIO" and presents these stall metrics:
      
      - DISPATCHED_CPI ( Dispatch stall cycles per insn )
      - ISSUE_STALL_CPI ( Issue stall cycles per insn )
      - EXECUTION_STALL_CPI ( Execution stall cycles per insn )
      - COMPLETION_STALL_CPI ( Completion stall cycles per insn )
      
      To avoid multiplexing of events, the PM_RUN_INST_CMPL event has been
      modified to use PMC5 (performance monitoring counter 5) instead of PMC4.
      This change is needed since the completion stall event uses PMC4.
      
      Usage example:
      
       ./perf stat --metric-no-group -M CPI_STALL_RATIO <workload>
      
       Performance counter stats for 'workload':
      
          63,056,817,982      PM_CMPL_STALL                    #     0.28 COMPLETION_STALL_CPI
       1,743,988,038,896      PM_ISSUE_STALL                   #     7.73 ISSUE_STALL_CPI
         225,597,495,030      PM_RUN_INST_CMPL                 #     6.18 DISPATCHED_CPI
                                                        #    37.48 EXECUTION_STALL_CPI
       1,393,916,546,654      PM_DISP_STALL_CYC
       8,455,376,836,463      PM_EXEC_STALL
      
      "--metric-no-group" is used for forcing PM_RUN_INST_CMPL to be scheduled
      in all groups for more accuracy.
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230216061240.18067-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Synthesize cycle events · 7e55b956
      Steinar H. Gunderson authored
      There is no good reason why we cannot synthesize "cycle" events from
      Intel PT just as we can synthesize "instruction" events, in particular
      when CYC packets are available. This enables using PT to get much
      more accurate cycle profiles than regular sampling (record -e cycles)
      when the work lasts for very short periods (<10 ms).  Thus, add support
      for this, based off of the existing IPC calculation framework. The new
      option to --itrace is "y" (for cYcles), as c was taken for calls. Cycle
      and instruction events can be synthesized together, and are by default.
      
      The only real caveat is that CYC packets are only emitted whenever some
      other packet is, which in practice is when a branch instruction is
      encountered (and not even all branches). Thus, even at no subsampling
      (e.g. --itrace=y0ns), it is impossible to get more accuracy than a
      single basic block, and all cycles spent executing that block will get
      attributed to the branch instruction that ends the packet.  Thus, one
      cannot know whether the cycles came from e.g. a specific load, a
      mispredicted branch, or something else. When subsampling (which is the
      default), the cycle events will get smeared out even more, but will
      still be generally useful to attribute cycle counts to functions.
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Steinar H. Gunderson <sesse@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20220322082452.1429091-1-sesse@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 16 Feb, 2023 1 commit
    • perf c2c: Add report option to show false sharing in adjacent cachelines · 1470a108
      Feng Tang authored
      Many platforms have an adjacent cacheline prefetch feature. When it is
      enabled, data in RAM is handled at a granularity of 2 cachelines (2N and
      2N+1): if one is fetched into the cache, the other one is likely to be
      fetched too. This effectively doubles the cacheline size, so false
      sharing can happen between adjacent cachelines.
      
      0Day has captured performance changes related to this [1], and some
      commercial software explicitly aligns its hot global variables to 128
      bytes (2 cache lines) to avoid this kind of extended false sharing.
      
      So add an option "--double-cl" for 'perf c2c report' to show false
      sharing in double cache line granularity, which acts just like the
      cacheline size is doubled. There is no change to c2c record. The
      hardware events of shared cacheline are still per cacheline, and this
      option just changes the granularity of how events are grouped and
      displayed.
      
      In the 'perf c2c report' output below (will-it-scale's 'pagefault2' case
      on old kernel):
      
        ----------------------------------------------------------------------
           26       31        2        0        0        0  0xffff888103ec6000
        ----------------------------------------------------------------------
         35.48%   50.00%    0.00%    0.00%    0.00%   0x10     0       1  0xffffffff8133148b   1153   66    971   3748   74  [k] get_mem_cgroup_from_mm
          6.45%    0.00%    0.00%    0.00%    0.00%   0x10     0       1  0xffffffff813396e4    570    0   1531    879   75  [k] mem_cgroup_charge
         25.81%   50.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81331472    949   70    593   3359   74  [k] get_mem_cgroup_from_mm
         19.35%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81339686   1352    0   1073   1022   74  [k] mem_cgroup_charge
          9.68%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff813396d6   1401    0    863    768   74  [k] mem_cgroup_charge
          3.23%    0.00%    0.00%    0.00%    0.00%   0x54     0       1  0xffffffff81333106    618    0    804     11    9  [k] uncharge_batch
      
      Offsets 0x10 and 0x54 used to be displayed in 2 groups, and now they are
      listed together to give users a hint of extended false sharing.
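
A minimal Python sketch of the grouping arithmetic, assuming a 64-byte cacheline. The function name and constant are hypothetical and do not come from the perf sources; the base address and offsets are taken from the report output above:

```python
CACHELINE_SIZE = 64  # assumed cacheline size in bytes


def cacheline_base(addr, double_cl=False):
    """Return the start address of the (possibly doubled) cacheline holding addr."""
    size = CACHELINE_SIZE * 2 if double_cl else CACHELINE_SIZE
    return addr & ~(size - 1)


base = 0xFFFF888103EC6000
accesses = [base + 0x10, base + 0x54]

# Per-cacheline grouping: the two offsets fall into different 64-byte lines.
single = {cacheline_base(a) for a in accesses}

# Double-cacheline grouping: both offsets share one 128-byte region.
double = {cacheline_base(a, double_cl=True) for a in accesses}

print(len(single), len(double))
```

Only the grouping key changes between the two modes; the recorded addresses themselves stay the same, which matches the note that c2c record is unaffected.
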
      
      [1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/
      
      Committer notes:
      
      Link: https://lore.kernel.org/r/Y+wvVNWqXb70l4uy@feng-clx
      
      Removed -a, leaving just --double-cl, as this is probably not used so
      frequently and perhaps will even be auto-detected if we manage to record
      the MSR where this is configured.
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Joe Mario <jmario@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230214075823.246414-1-feng.tang@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  6. 15 Feb, 2023 1 commit
    • perf record: Fix segfault with --overwrite and --max-size · 91621be6
      Yang Jihong authored
      When --overwrite and --max-size options of perf record are used
      together, a segmentation fault occurs. The following is an example:
      
        # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1
        [ perf record: Woken up 1 times to write data ]
        perf: Segmentation fault
        Obtained 12 stack frames.
        ./perf/perf(+0x197673) [0x55f99710b673]
        /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f]
        ./perf/perf(+0x8eb40) [0x55f997002b40]
        ./perf/perf(+0x1f6882) [0x55f99716a882]
        ./perf/perf(+0x794c2) [0x55f996fed4c2]
        ./perf/perf(+0x7b7c7) [0x55f996fef7c7]
        ./perf/perf(+0x9074b) [0x55f99700474b]
        ./perf/perf(+0x12e23c) [0x55f9970a223c]
        ./perf/perf(+0x12e54a) [0x55f9970a254a]
        ./perf/perf(+0x7db60) [0x55f996ff1b60]
        /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86]
        ./perf/perf(+0x7dfe9) [0x55f996ff1fe9]
        Segmentation fault (core dumped)
      
      The backtrace of the core file is as follows:
      
        (gdb) bt
        #0  record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234
        #1  record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242
        #2  record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263
        #3  process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618
        #4  0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658,
            from=from@entry=0) at util/synthetic-events.c:1895
        #5  0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658)
            at util/synthetic-events.c:1905
        #6  0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997
        #7  0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802
        #8  0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258
        #9  0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330
        #10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384
        #11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428
        #12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562
      
      The reason is that record__bytes_written accesses the freed memory
      rec->thread_data. The call sequence is as follows:
        __cmd_record
          -> record__free_thread_data
            -> zfree(&rec->thread_data)         // free rec->thread_data
          -> record__synthesize
            -> perf_event__synthesize_id_index
              -> process_synthesized_event
                -> record__write
                  -> record__bytes_written      // access rec->thread_data
      
      Fix this by adding a member variable "thread_bytes_written" to the struct
      "record" to save the amount of data written by the threads.
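
A rough Python model of the failure and the fix, not the actual C code. The class and method names are hypothetical; only thread_data and thread_bytes_written mirror names from the text, and a TypeError stands in for the segfault:

```python
class Record:
    """Toy stand-in for perf's struct record (illustrative only)."""

    def __init__(self, nr_threads):
        self.bytes_written = 0               # bytes written outside worker threads
        self.thread_bytes_written = 0        # running total kept in the record itself
        self.thread_data = [0] * nr_threads  # per-thread counters, freed later

    def write(self, size, thread=None):
        if thread is not None:
            self.thread_data[thread] += size
            self.thread_bytes_written += size
        else:
            self.bytes_written += size

    def free_thread_data(self):
        self.thread_data = None  # models zfree(&rec->thread_data)

    def total_old(self):
        # Pre-fix shape: sums the per-thread data, which may already be freed.
        return self.bytes_written + sum(self.thread_data)

    def total_new(self):
        # Post-fix shape: never touches thread_data.
        return self.bytes_written + self.thread_bytes_written


rec = Record(nr_threads=2)
rec.write(100, thread=0)
rec.write(50, thread=1)
rec.free_thread_data()
rec.write(12816)  # synthesized event written after the free

try:
    rec.total_old()
    old_path_ok = True
except TypeError:
    old_path_ok = False

total = rec.total_new()
print(old_path_ok, total)
```

Keeping the running total in the long-lived object means the size check no longer depends on the lifetime of the per-thread data.
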
      
      Fixes: 6d575816 ("perf record: Add support for limit perf output file size")
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiwei Sun <jiwei.sun@windriver.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  7. 09 Feb, 2023 1 commit
    • perf stat: Avoid merging/aggregating metric counts twice · 37f322cd
      Ian Rogers authored
      The added perf_stat_merge_counters combines uncore counters. When
      metrics are enabled, the counts are merged into a metric_leader via the
      stat-shadow saved_value logic. As the leader now is passed an aggregated
      count, it leads to all counters being added together twice and counts
      appearing approximately doubled in metrics.
      
      This change disables the saved_value merging of counts for evsels that
      are merged. It is recommended that later changes remove the saved_value
      entirely as the two layers of aggregation in the code are confusing.
      
      Fixes: 942c5593 ("perf stat: Add perf_stat_merge_counters()")
      Reported-by: Perry Taylor <perry.taylor@intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Florian Fischer <florian.fischer@muhq.space>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230209064447.83733-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  8. 08 Feb, 2023 5 commits
    • perf tools: Fix perf tool build error in util/pfm.c · 6a5558f1
      Thomas Richter authored
      I downloaded linux-next and built the perf tool using
      
        # make LIBPFM4=1
      
      to have libpfm4 support built into perf. The build fails:
      
       # make LIBPFM4=1
      ....
      INSTALL libbpf_headers
        CC      util/pfm.o
      util/pfm.c: In function ‘print_libpfm_event’:
      util/pfm.c:189:9: error: too many arguments to function ‘print_cb->print_event’
        189 |         print_cb->print_event(print_state,
            |         ^~~~~~~~
      util/pfm.c:220:25: error: too many arguments to function ‘print_cb->print_event’
        220 |                         print_cb->print_event(print_state,
      
      The build error is caused by commit d9dc8874 ("perf pmu-events:
      Remove now unused event and metric variables") which changes the
      function prototype of
      
        struct print_callbacks {
            ...
            void (*print_event)(...);  --> last two parameters removed.
        };
      
      but does not adjust the usage of this function prototype in util/pfm.c.
      In util/pfm.c, print_event() is still invoked with 13 parameters
      instead of 11, so the compile fails.
      
      When I adjust util/pfm.c as in this patch, the build works fine.
      Please check this patch for correctness, as I have only fixed the compile
      issue.
      
      Fixes: d9dc8874 ("perf pmu-events: Remove now unused event and metric variables")
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Ian Rogers <irogers@google.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: egorenar@linux.ibm.com
      Cc: linux-kernel-next@vger.kernel.org
      Link: https://lore.kernel.org/r/20230207140447.1827741-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix auto-complete on aarch64 · ffd1240e
      Yicong Yang authored
      On aarch64, CPU-related events are not under event_source/devices/cpu/events;
      they're under event_source/devices/armv8_pmuv3_0/events on my machine.
      Using the current auto-complete script generates the error below:
      
        [root@localhost bin]# perf stat -e
        ls: cannot access '/sys/bus/event_source/devices/cpu/events': No such file or directory
      
      Fix this by not testing /sys/bus/event_source/devices/cpu/events on
      aarch64 machines.
      
      Fixes: 74cd5815 ("perf tool: Improve bash command line auto-complete for multiple events with comma")
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linuxarm@huawei.com
      Cc: prime.zeng@hisilicon.com
      Link: https://lore.kernel.org/r/20230207035057.43394-1-yangyicong@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Support old rw_semaphore type · 1bece135
      Namhyung Kim authored
      Old kernels have a different type for the owner field in rwsem.  We can
      check it using the bpf_core_type_matches() builtin in clang, but that also
      needs its own version check since it's only available in recent versions.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230207002403.63590-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Add -o/--lock-owner option · 3477f079
      Namhyung Kim authored
      When there're many lock contentions in the system, people sometimes want
      to know who caused the contention, IOW who's the owner of the locks.
      
      The -o/--lock-owner option tries to follow the lock owners for the
      contended mutexes and rwsems from BPF, and then attributes the
      contention time to the owner instead of the waiter.  It's a best effort
      approach to get the owner info at the time of the contention and doesn't
      guarantee to have the precise tracking of owners if it's changing over
      time.
      
      Currently it only handles mutex and rwsem that have owner field in their
      struct and it basically points to a task_struct that owns the lock at
      the moment.
      
      Technically its type is atomic_long_t and it comes with some LSB bits
      used for other meanings.  So it needs to clear them when casting it to a
      pointer to task_struct.
      
      Also the atomic_long_t is a typedef of the atomic 32 or 64 bit types
      depending on arch which is a wrapper struct for the counter value.  I'm
      not aware of proper ways to access those kernel atomic types from BPF so
      I just read the internal counter value directly.  Please let me know if
      there's a better way.
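
A minimal Python sketch of clearing the flag bits before treating the owner value as a task_struct pointer. The mask of three low bits and the example address are assumptions made for illustration; the text above only says that some LSB bits carry other meanings:

```python
OWNER_FLAG_MASK = 0x7  # assumed: the three least significant bits hold flags


def owner_task_pointer(owner_counter):
    """Strip the low flag bits from the raw owner counter value."""
    return owner_counter & ~OWNER_FLAG_MASK


task_addr = 0xFFFF888012345680  # hypothetical, 8-byte-aligned task_struct address
raw_owner = task_addr | 0x1     # same address with one flag bit set

print(hex(owner_task_pointer(raw_owner)))
```

This works because an aligned pointer always has zero in its low bits, so masking them off recovers the address regardless of which flags were set.
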
      
      When -o/--lock-owner option is used, it goes to the task aggregation
      mode like -t/--threads option does.  However it cannot get the owner for
      other lock types like spinlock and sometimes even for mutex.
      
        $ sudo ./perf lock con -abo -- ./perf bench sched pipe
        # Running 'sched/pipe' benchmark:
        # Executed 1000000 pipe operations between two processes
      
             Total time: 4.766 [sec]
      
               4.766540 usecs/op
                 209795 ops/sec
         contended   total wait     max wait     avg wait          pid   owner
      
               403    565.32 us     26.81 us      1.40 us           -1   Unknown
                 4     27.99 us      8.57 us      7.00 us      1583145   sched-pipe
                 1      8.25 us      8.25 us      8.25 us      1583144   sched-pipe
                 1      2.03 us      2.03 us      2.03 us         5068   chrome
      
      As you can see, the owner is unknown in most cases.  But if we
      filter only for the mutex locks, it is more likely to get the owners.
      
        $ sudo ./perf lock con -abo -Y mutex -- ./perf bench sched pipe
        # Running 'sched/pipe' benchmark:
        # Executed 1000000 pipe operations between two processes
      
             Total time: 4.910 [sec]
      
               4.910435 usecs/op
                 203647 ops/sec
         contended   total wait     max wait     avg wait          pid   owner
      
                 2     15.50 us      8.29 us      7.75 us      1582852   sched-pipe
                 7      7.20 us      2.47 us      1.03 us           -1   Unknown
                 1      6.74 us      6.74 us      6.74 us      1582851   sched-pipe

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230207002403.63590-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3477f079
    • Namhyung Kim's avatar
      perf lock contention: Fix to save callstack for the default modified · 55e39185
      Namhyung Kim authored
      The previous change failed to set con->save_callstack for the
      LOCK_AGGR_CALLER mode, resulting in no caller information.
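
      A rough sketch of the rule this fix restores: callstacks are saved
      when aggregating by caller.  The enum, struct, and function below are
      simplified stand-ins with hypothetical names, not the perf source.

```c
#include <stdbool.h>

/* Simplified stand-ins for the aggregation modes. */
enum sketch_aggr_mode {
	SKETCH_AGGR_ADDR,
	SKETCH_AGGR_TASK,
	SKETCH_AGGR_CALLER,
};

/* Simplified stand-in for the contention state. */
struct sketch_contention {
	enum sketch_aggr_mode aggr_mode;
	bool save_callstack;
};

/* Caller aggregation needs the callstack, so request it in that mode. */
static inline void sketch_contention_init(struct sketch_contention *con,
					  enum sketch_aggr_mode mode)
{
	con->aggr_mode = mode;
	con->save_callstack = (mode == SKETCH_AGGR_CALLER);
}
```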
      
      Fixes: ebab2916 ("perf lock contention: Support filters for different aggregation")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230207002403.63590-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55e39185
  9. 06 Feb, 2023 8 commits
  10. 05 Feb, 2023 8 commits
    • Linus Torvalds's avatar
      Linux 6.2-rc7 · 4ec5183e
      Linus Torvalds authored
      4ec5183e
    • Linus Torvalds's avatar
      Merge tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c608f6b5
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes that resolve some reported problems.
        These include:
      
         - gadget driver fixes
      
         - dwc3 driver fix
      
         - typec driver fix
      
         - MAINTAINERS file update.
      
        All of these have been in linux-next with no reported problems"
      
      * tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: typec: ucsi: Don't attempt to resume the ports before they exist
        usb: gadget: udc: do not clear gadget driver.bus
        usb: gadget: f_uac2: Fix incorrect increment of bNumEndpoints
        usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait
        usb: dwc3: qcom: enable vbus override when in OTG dr-mode
        MAINTAINERS: Add myself as UVC Gadget Maintainer
      c608f6b5
    • Linus Torvalds's avatar
      Merge tag 'tty-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · dc0ce181
      Linus Torvalds authored
      Pull tty/serial driver fixes from Greg KH:
       "Here are some small serial and vt fixes. These include:
      
         - 8250 driver fixes relating to dma issues
      
         - stm32 serial driver fix for threaded irqs
      
         - vc_screen bugfix for reported problems.
      
        All have been in linux-next for a while with no reported problems"
      
      * tag 'tty-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        vc_screen: move load of struct vc_data pointer in vcs_read() to avoid UAF
        serial: 8250_dma: Fix DMA Rx rearm race
        serial: 8250_dma: Fix DMA Rx completion race
        serial: stm32: Merge hard IRQ and threaded IRQ handling into single IRQ handler
      dc0ce181
    • Linus Torvalds's avatar
      Merge tag 'char-misc-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · d3feaff4
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a number of small char/misc/whatever driver fixes. They
        include:
      
         - IIO driver fixes for some reported problems
      
         - nvmem driver fixes
      
         - fpga driver fixes
      
         - debugfs memory leak fix in the hv_balloon and irqdomain code
           (irqdomain change was acked by the maintainer)
      
        All have been in linux-next with no reported problems"
      
      * tag 'char-misc-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (33 commits)
        kernel/irq/irqdomain.c: fix memory leak with using debugfs_lookup()
        HV: hv_balloon: fix memory leak with using debugfs_lookup()
        nvmem: qcom-spmi-sdam: fix module autoloading
        nvmem: core: fix return value
        nvmem: core: fix cell removal on error
        nvmem: core: fix device node refcounting
        nvmem: core: fix registration vs use race
        nvmem: core: fix cleanup after dev_set_name()
        nvmem: core: remove nvmem_config wp_gpio
        nvmem: core: initialise nvmem->id early
        nvmem: sunxi_sid: Always use 32-bit MMIO reads
        nvmem: brcm_nvram: Add check for kzalloc
        iio: imu: fxos8700: fix MAGN sensor scale and unit
        iio: imu: fxos8700: remove definition FXOS8700_CTRL_ODR_MIN
        iio: imu: fxos8700: fix failed initialization ODR mode assignment
        iio: imu: fxos8700: fix incorrect ODR mode readback
        iio: light: cm32181: Fix PM support on system with 2 I2C resources
        iio: hid: fix the retval in gyro_3d_capture_sample
        iio: hid: fix the retval in accel_3d_capture_sample
        iio: imu: st_lsm6dsx: fix build when CONFIG_IIO_TRIGGERED_BUFFER=m
        ...
      d3feaff4
    • Linus Torvalds's avatar
      Merge tag 'fbdev-for-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev · 870c3a9a
      Linus Torvalds authored
      Pull fbdev fixes from Helge Deller:
      
       - fix fbcon to prevent fonts bigger than 32x32 pixels to avoid
         overflows reported by syzbot
      
       - switch omapfb to use kstrtobool()
      
       - switch some fbdev drivers to use the backlight helpers
      
      * tag 'fbdev-for-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
        fbcon: Check font dimension limits
        fbdev: omapfb: Use kstrtobool() instead of strtobool()
        fbdev: fbmon: fix function name in kernel-doc
        fbdev: atmel_lcdfb: Rework backlight status updates
        fbdev: riva: Use backlight helper
        fbdev: omapfb: panel-dsi-cm: Use backlight helper
        fbdev: nvidia: Use backlight helper
        fbdev: mx3fb: Use backlight helper
        fbdev: radeon: Use backlight helper
        fbdev: atyfb: Use backlight helper
        fbdev: aty128fb: Use backlight helper
      870c3a9a
    • Linus Torvalds's avatar
      Merge tag 'x86_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9e482602
      Linus Torvalds authored
      Pull x86 fix from Borislav Petkov:
      
       - Prevent the compiler from reordering accesses to debug regs which
         could cause a #VC exception in SEV-ES guests at the wrong place in
         the NMI handling path
      
      * tag 'x86_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/debug: Fix stack recursion caused by wrongly ordered DR7 accesses
      9e482602
    • Linus Torvalds's avatar
      Merge tag 'perf_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · de506eec
      Linus Torvalds authored
      Pull perf fix from Borislav Petkov:
      
       - Lock the proper critical section when dealing with perf event context
      
      * tag 'perf_urgent_for_v6.2_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf: Fix perf_event_pmu_context serialization
      de506eec
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 837c07cf
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "It's a bit of a big batch for rc6, but just because I didn't send any
        fixes the last week or two while I was on vacation, next week should
        be quieter:
      
         - Fix a few objtool warnings since we recently enabled objtool.
      
         - Fix a deadlock with the hash MMU vs perf record.
      
         - Fix perf profiling of asynchronous interrupt handlers.
      
         - Revert the IMC PMU nest_init_lock to being a mutex.
      
         - Two commits fixing problems with the kexec_file FDT size
           estimation.
      
         - Two commits fixing problems with strict RWX vs kernels running at
           non-zero.
      
         - Reconnect tlb_flush() to hash__tlb_flush()
      
        Thanks to Kajol Jain, Nicholas Piggin, Sachin Sant, Sathvika
        Vasireddy, and Sourabh Jain"
      
      * tag 'powerpc-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()
        powerpc/kexec_file: Count hot-pluggable memory in FDT estimate
        powerpc/64s/radix: Fix RWX mapping with relocated kernel
        powerpc/64s/radix: Fix crash with unaligned relocated kernel
        powerpc/kexec_file: Fix division by zero in extra size estimation
        powerpc/imc-pmu: Revert nest_init_lock to being a mutex
        powerpc/64: Fix perf profiling asynchronous interrupt handlers
        powerpc/64s: Fix local irq disable when PMIs are disabled
        powerpc/kvm: Fix unannotated intra-function call warning
        powerpc/85xx: Fix unannotated intra-function call warning
      837c07cf
  11. 04 Feb, 2023 7 commits