1. 14 Dec, 2023 5 commits
      perf top: Use evsel's cpus to replace user_requested_cpus · 5fa695e7
      Kan Liang authored
      perf top errors out on a hybrid machine:
       $ perf top
      
       Error:
       The cycles:P event is not supported.
      
      perf top expects "cycles" to be collected on all CPUs in the
      system. But on a hybrid machine there is no single "cycles" event that
      covers all CPUs. Perf has to split it into two cycles events, e.g.,
      cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
      If an event is opened on an unsupported CPU, the open fails. That is
      the reason for the error above.
      
      Perf should only open the cycles event on the corresponding CPUs. The
      commit ef91871c ("perf evlist: Propagate user CPU maps intersecting
      core PMU maps") intersects the requested CPU map with the CPU map of the
      PMU. Use the evsel's cpus to replace user_requested_cpus.
      
      The evlist's threads are also propagated to the evsel's threads in
      __perf_evlist__propagate_maps(). For a system-wide event, perf appends
      a dummy event and assigns it to the evsel's threads. For a per-thread
      event, the evlist's thread_map is assigned to the evsel's threads. As
      in the other tools, e.g., perf record, use the evsel's threads when
      opening an event.
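To illustrate why each event needs its own CPU mask, the sketch below models CPU maps as 64-bit bitmasks and intersects a requested CPU set with the set a PMU supports. The `toy_*` names and the bitmask representation are assumptions made for this example only; perf itself uses `struct perf_cpu_map` objects rather than bitmasks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): bit N set means CPU N is in the map. */
typedef uint64_t toy_cpu_mask;

struct toy_evsel {
	const char *name;      /* e.g. "cpu_core/cycles/" */
	toy_cpu_mask pmu_cpus; /* CPUs covered by the event's PMU */
};

/* CPUs on which this event should be opened: PMU CPUs intersected with the request. */
static toy_cpu_mask toy_evsel_cpus(const struct toy_evsel *evsel, toy_cpu_mask requested)
{
	return evsel->pmu_cpus & requested;
}

/*
 * Opening an event on a CPU outside its PMU's mask is the failing case
 * described above. 'cpu' must be in the range 0..63 in this toy model.
 */
static bool toy_open_would_fail(const struct toy_evsel *evsel, int cpu)
{
	return !(evsel->pmu_cpus & (UINT64_C(1) << cpu));
}
```

With a hypothetical core PMU on CPUs 0-7 and an atom PMU on CPUs 8-15, a request for all 16 CPUs yields two disjoint masks, one per event, instead of one mask applied to both.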
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Hector Martin <marcan@marcan.st>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Closes: https://lore.kernel.org/linux-perf-users/ZXNnDrGKXbEELMXV@kernel.org/
      Link: https://lore.kernel.org/r/20231214144612.1092028-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf unwind-libunwind: Fix base address for .eh_frame · 4fb54994
      Namhyung Kim authored
      The base address of a DSO mapping should start at the start of the file.
      Usually DSOs are mapped from pgoff 0, so it doesn't matter when the
      start of the map address is used.
      
      But DSOs generated for JIT code don't start from pgoff 0, so the offset
      should be subtracted to calculate the .eh_frame table offsets correctly.
      
      Fixes: dc2cf4ca ("perf unwind: Fix segbase for ld.lld linked objects")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Pablo Galindo <pablogsal@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231212070547.612536-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf unwind-libdw: Handle JIT-generated DSOs properly · c966d23a
      Namhyung Kim authored
      Usually DSOs are mapped from the beginning of the file, so the base
      address of the DSO can be calculated by map->start - map->pgoff.
      
      However, for JIT DSOs generated by `perf inject -j`, only the code
      segment is mapped.  This confuses the unwind-libdw code, which then
      refuses to process unwinds in the JIT DSOs.  It should use the map
      start address as the base for them to fix the confusion.
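The two cases described above can be sketched as a small helper. The `struct toy_map` type and the `is_jit` flag are hypothetical stand-ins for this example; how perf actually recognises a JIT DSO is not stated here and is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of perf's 'struct map' used here. */
struct toy_map {
	uint64_t start;  /* address where the mapping begins */
	uint64_t pgoff;  /* file offset of the mapping */
	bool     is_jit; /* assumption: set for DSOs produced by 'perf inject -j' */
};

/* Base address to report for the DSO backing this map. */
static uint64_t toy_dso_base(const struct toy_map *map)
{
	if (map->is_jit)
		return map->start;          /* only the code segment is mapped */
	return map->start - map->pgoff;     /* regular DSO: rewind to file offset 0 */
}
```

For a regular DSO mapped at 0x7f0000201000 with pgoff 0x1000 the helper returns 0x7f0000200000, while a JIT DSO keeps its map start as the base.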
      
      Fixes: 1fe627da ("perf unwind: Take pgoff into account when reporting elf to libdwfl")
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Pablo Galindo <pablogsal@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231212070547.612536-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf genelf: Set ELF program header addresses properly · 1af47890
      Namhyung Kim authored
      The text section starts after the ELF headers so PHDR.p_vaddr and
      others should have the correct addresses.
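A minimal sketch of what "correct addresses" could look like for a single loadable text segment is shown below. It assumes a 64-bit ELF file with one program header, text placed right after the headers and rounded up to 16 bytes, and the glibc `<elf.h>` definitions. `TOY_TEXT_OFFSET`, `toy_fill_text_phdr()` and the alignment value are illustrative choices, not the values used by perf's genelf code.

```c
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Assumption: text follows the ELF header and one program header, 16-byte aligned. */
#define TOY_TEXT_OFFSET \
	((sizeof(Elf64_Ehdr) + sizeof(Elf64_Phdr) + 15) & ~(uint64_t)15)

/* Describe the text segment so that file offset and addresses agree. */
static void toy_fill_text_phdr(Elf64_Phdr *phdr, uint64_t code_size)
{
	memset(phdr, 0, sizeof(*phdr));
	phdr->p_type   = PT_LOAD;
	phdr->p_flags  = PF_R | PF_X;
	phdr->p_offset = TOY_TEXT_OFFSET; /* where the code sits in the file */
	phdr->p_vaddr  = TOY_TEXT_OFFSET; /* not 0: the headers come first */
	phdr->p_paddr  = TOY_TEXT_OFFSET;
	phdr->p_filesz = code_size;
	phdr->p_memsz  = code_size;
	phdr->p_align  = 8;               /* illustrative alignment */
}
```

The point of the sketch is only that `p_offset`, `p_vaddr` and `p_paddr` all refer to the position after the headers rather than to 0.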
      
      Fixes: babd0438 ("perf jit: Include program header in ELF files")
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Lieven Hey <lieven.hey@kdab.com>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Pablo Galindo <pablogsal@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231212070547.612536-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf stat: Combine the -A/--no-aggr and --no-merge options · 6f33e6fa
      Ian Rogers authored
      The -A or --no-aggr option disables aggregation of core events:
      
        $ perf stat -A -e cycles,data_total -a true
      
         Performance counter stats for 'system wide':
      
        CPU0            1,287,665      cycles
        CPU1            1,831,681      cycles
        CPU2           27,345,998      cycles
        CPU3            1,964,799      cycles
        CPU4              236,174      cycles
        CPU5            3,302,825      cycles
        CPU6            9,201,446      cycles
        CPU7            1,403,043      cycles
        CPU0               110.90 MiB  data_total
      
               0.008961761 seconds time elapsed
      
      The --no-merge option disables the aggregation of uncore events:
      
        $ perf stat --no-merge -e cycles,data_total -a true
      
         Performance counter stats for 'system wide':
      
                38,482,778      cycles
                     15.04 MiB  data_total [uncore_imc_free_running_1]
                     15.00 MiB  data_total [uncore_imc_free_running_0]
      
               0.005915155 seconds time elapsed
      
      Having two options confuses users who generally don't appreciate the
      difference in PMUs. Keep all the options but make them all
      disable aggregation of both core and uncore events:
      
        $ perf stat -A -e cycles,data_total -a true
      
         Performance counter stats for 'system wide':
      
        CPU0               85,878      cycles
        CPU1               88,179      cycles
        CPU2               60,872      cycles
        CPU3            3,265,567      cycles
        CPU4               82,357      cycles
        CPU5               83,383      cycles
        CPU6               84,156      cycles
        CPU7              220,803      cycles
        CPU0                 2.38 MiB  data_total [uncore_imc_free_running_0]
        CPU0                 2.38 MiB  data_total [uncore_imc_free_running_1]
      
               0.001397205 seconds time elapsed
      
      Update the relevant 'perf stat' man page information.
      Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kaige Ye <ye@kaige.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231214060256.2094017-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 13 Dec, 2023 2 commits
  3. 12 Dec, 2023 5 commits
      libperf cpumap: Add for_each_cpu() that skips the "any CPU" case · 5805c825
      Ian Rogers authored
      When iterating CPUs in a CPU map it is often desirable to skip the "any
      CPU" (aka dummy) case. Add a helper for this and use it in builtin-record.
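The idea of such a helper can be sketched with a plain array of CPU numbers in which -1 marks the "any CPU" entry. The macro and function names below are hypothetical and only model the behaviour; they are not libperf's actual helper.

```c
#include <stddef.h>

#define TOY_ANY_CPU (-1) /* hypothetical marker for the "any CPU" (dummy) entry */

/* Iterate over 'cpus' (an int array of length 'nr'), skipping "any CPU" entries. */
#define toy_for_each_cpu_skip_any(cpu, idx, cpus, nr)          \
	for ((idx) = 0; (idx) < (nr); (idx)++)                 \
		if (((cpu) = (cpus)[(idx)]) == TOY_ANY_CPU)    \
			continue;                              \
		else

/* Copy the real CPUs of a map into 'out' and return how many were copied. */
static size_t toy_collect_real_cpus(const int *cpus, size_t nr, int *out)
{
	size_t idx, count = 0;
	int cpu;

	toy_for_each_cpu_skip_any(cpu, idx, cpus, nr) {
		out[count++] = cpu;
	}
	return count;
}
```

Callers that only care about real CPUs no longer need an explicit check for the dummy entry inside every loop body.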
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      libperf cpumap: Replace usage of perf_cpu_map__new(NULL) with perf_cpu_map__new_online_cpus() · effe957c
      Ian Rogers authored
      Passing NULL to perf_cpu_map__new() performs
      perf_cpu_map__new_online_cpus(). Call
      perf_cpu_map__new_online_cpus() directly to be more intention revealing.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      libperf cpumap: Rename perf_cpu_map__empty() to perf_cpu_map__has_any_cpu_or_is_empty() · 923ca62a
      Ian Rogers authored
      The name perf_cpu_map__empty() is misleading as true is also returned
      when the map contains an "any" CPU (aka dummy) map.
      
      Rename to perf_cpu_map__has_any_cpu_or_is_empty(); later changes will
      (re)introduce perf_cpu_map__empty() and perf_cpu_map__has_any_cpu().
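The difference between the three predicates named above can be sketched on a toy CPU map. The `struct toy_cpu_map` type, the -1 marker and the function bodies are assumptions for illustration; libperf's implementations may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy CPU map (assumption): an array of CPU numbers where -1 stands for "any CPU". */
struct toy_cpu_map {
	const int *cpus;
	size_t nr;
};

/* True only when there are no entries at all. */
static bool toy_map_is_empty(const struct toy_cpu_map *map)
{
	return map == NULL || map->nr == 0;
}

/* True when the map contains the "any CPU" entry. */
static bool toy_map_has_any_cpu(const struct toy_cpu_map *map)
{
	if (toy_map_is_empty(map))
		return false;
	for (size_t i = 0; i < map->nr; i++)
		if (map->cpus[i] == -1)
			return true;
	return false;
}

/* The combined condition that the old "empty" name was hiding. */
static bool toy_map_has_any_cpu_or_is_empty(const struct toy_cpu_map *map)
{
	return toy_map_is_empty(map) || toy_map_has_any_cpu(map);
}
```

A map holding only the dummy entry is not empty, yet the combined predicate returns true for it, which is why the longer name is more accurate.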
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      libperf cpumap: Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() and prefer sysfs · 8f60f870
      Ian Rogers authored
      Rename perf_cpu_map__default_new() to perf_cpu_map__new_online_cpus() to
      better indicate what the implementation does.
      
      Read the online CPUs from /sys/devices/system/cpu/online first before
      using sysconf(), as sysconf() can't accurately describe holes in the
      CPU map.
      
      If sysconf() is used, warn when the configured and online processors
      disagree.
      
      When reading from a file, if the read doesn't yield a CPU map then
      return an empty map rather than the default online map. This avoids
      recursion and also makes failures easier to detect.
      
      Add more comments.
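The sysfs file holds a CPU list such as "0-3,6-7", which can express gaps that a plain processor count cannot. The sketch below parses such a list into a 64-bit mask and returns an empty mask on malformed input, mirroring the "empty map on failure" behaviour described above. The function name and the 64-CPU limit are assumptions of this toy example, not libperf code.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a CPU list such as "0-3,6-7" into a bitmask (bit N = CPU N).
 * Returns 0 (an empty map) on malformed input or for CPUs >= 64 (toy limit).
 */
static uint64_t toy_parse_cpu_list(const char *s)
{
	uint64_t mask = 0;

	while (*s && *s != '\n') {
		char *end;
		long first = strtol(s, &end, 10);
		long last = first;

		if (end == s)
			return 0;               /* no number where one was expected */
		if (*end == '-') {              /* a range "first-last" */
			s = end + 1;
			last = strtol(s, &end, 10);
			if (end == s)
				return 0;
		}
		if (first < 0 || last < first || last >= 64)
			return 0;
		for (long cpu = first; cpu <= last; cpu++)
			mask |= UINT64_C(1) << cpu;
		s = end;
		if (*s == ',')
			s++;                    /* move on to the next entry */
		else if (*s && *s != '\n')
			return 0;               /* unexpected trailing character */
	}
	return mask;
}
```

Parsing "0-3,6-7" sets bits 0 to 3 and 6 to 7 and leaves CPUs 4 and 5 clear, which is the kind of hole a count of online processors alone cannot represent.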
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-3-irogers@google.com
      [ s/syfs/sysfs/g typo ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      libperf cpumap: Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() · 48219b08
      Ian Rogers authored
      Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() to
      better indicate this is creating a CPU map for the perf_event_open "any"
      CPU case.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
      Cc: Andrew Jones <ajones@ventanamicro.com>
      Cc: André Almeida <andrealmeid@igalia.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Atish Patra <atishp@rivosinc.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Paran Lee <p4ranlee@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yang Li <yang.lee@linux.alibaba.com>
      Cc: Yanteng Si <siyanteng@loongson.cn>
      Cc: bpf@vger.kernel.org
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20231129060211.1890454-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 11 Dec, 2023 1 commit
  5. 07 Dec, 2023 8 commits
  6. 06 Dec, 2023 8 commits
      perf stat: Exit perf stat if parse groups fails · 0713ab3b
      Ian Rogers authored
      Metrics were added by a callback but commit a4b8cfca ("perf
      stat: Delay metric parsing") postponed this to allow optimizations based
      on the CPU configuration.
      
      In doing so it stopped errors in metric parsing from causing 'perf stat'
      termination.
      
      This change adds the termination for bad metric names back in.
      
      Fixes: a4b8cfca ("perf stat: Delay metric parsing")
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Closes: https://lore.kernel.org/lkml/ZXByT1K6enTh2EHT@kernel.org/
      Link: https://lore.kernel.org/r/20231206183533.972028-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf thread: Add missing RC_CHK_EQUAL · 01261d8a
      Ian Rogers authored
      Comparing pointers without RC_CHK_ACCESS means the indirect object
      will be compared rather than the underlying maps when REFCNT_CHECKING
      is enabled. Fix by adding missing RC_CHK_EQUAL.
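The problem can be modelled with a small wrapper type: under reference count checking, each reference is a separate wrapper pointing at the shared object, so comparing wrappers says nothing about whether the underlying objects match. The types and function names below are hypothetical and only mimic the idea; they are not perf's macros.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_maps {
	int nr_maps;
};

/* Simplified model: each reference gets its own wrapper around the shared object. */
struct toy_maps_handle {
	struct toy_maps *orig;
};

/* Misleading under checking: compares the wrappers, which differ per reference. */
static bool toy_handles_same_pointer(const struct toy_maps_handle *a,
				     const struct toy_maps_handle *b)
{
	return a == b;
}

/* Intended comparison: look through the wrappers at the underlying objects. */
static bool toy_handles_equal(const struct toy_maps_handle *a,
			      const struct toy_maps_handle *b)
{
	return a == b || (a && b && a->orig == b->orig);
}
```

Two handles to the same `toy_maps` compare unequal as pointers but equal through `toy_handles_equal()`, which is the distinction the fix restores.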
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-15-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf maps: Move symbol maps functions to maps.c · 0f6ab6a3
      Ian Rogers authored
      Move the find and certain other symbol maps__* functions to maps.c for
      better abstraction.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-14-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      perf map: Simplify map_ip/unmap_ip and make 'struct map' smaller · 9fa688ea
      Ian Rogers authored
      When mapping an IP it is either an identity mapping or a DSO relative
      mapping, so a single bit is required in the struct to identify
      this.
      
      The current code uses function pointers, adding 2 pointers per map and
      also pushing the size of a map beyond 1 cache line.
      
      Switch to using a byte to identify the mapping type (as well as priv and
      erange_warned), to avoid any masking.
      
      Change struct map's layout to avoid holes.
      
      Before:
      ```
      struct map {
              u64                        start;                /*     0     8 */
              u64                        end;                  /*     8     8 */
              _Bool                      erange_warned:1;      /*    16: 0  1 */
              _Bool                      priv:1;               /*    16: 1  1 */
      
              /* XXX 6 bits hole, try to pack */
              /* XXX 3 bytes hole, try to pack */
      
              u32                        prot;                 /*    20     4 */
              u64                        pgoff;                /*    24     8 */
              u64                        reloc;                /*    32     8 */
              u64                        (*map_ip)(const struct map  *, u64); /*    40     8 */
              u64                        (*unmap_ip)(const struct map  *, u64); /*    48     8 */
              struct dso *               dso;                  /*    56     8 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              refcount_t                 refcnt;               /*    64     4 */
              u32                        flags;                /*    68     4 */
      
              /* size: 72, cachelines: 2, members: 12 */
              /* sum members: 68, holes: 1, sum holes: 3 */
              /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
              /* last cacheline: 8 bytes */
      };
      ```
      
      After:
      ```
      struct map {
              u64                        start;                /*     0     8 */
              u64                        end;                  /*     8     8 */
              u64                        pgoff;                /*    16     8 */
              u64                        reloc;                /*    24     8 */
              struct dso *               dso;                  /*    32     8 */
              refcount_t                 refcnt;               /*    40     4 */
              u32                        prot;                 /*    44     4 */
              u32                        flags;                /*    48     4 */
              enum mapping_type          mapping_type:8;       /*    52: 0  4 */
      
              /* Bitfield combined with next fields */
      
              _Bool                      erange_warned;        /*    53     1 */
              _Bool                      priv;                 /*    54     1 */
      
              /* size: 56, cachelines: 1, members: 11 */
              /* padding: 1 */
              /* last cacheline: 56 bytes */
      };
      ```
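The effect of the reordering can be reproduced with stand-in types. The sketch below mirrors the two layouts using standard integer types, with `int` standing in for `refcount_t` and hypothetical enumerator names. Exact sizes depend on the ABI and compiler (enum bit-fields are implementation-defined in ISO C, though GCC and Clang accept them), so the code prints the sizes rather than assuming them.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct toy_dso; /* opaque stand-in for 'struct dso' */

/* Layout before the change: bit-fields up front and two function pointers. */
struct toy_map_before {
	uint64_t start;
	uint64_t end;
	bool erange_warned:1;
	bool priv:1;
	uint32_t prot;
	uint64_t pgoff;
	uint64_t reloc;
	uint64_t (*map_ip)(const struct toy_map_before *, uint64_t);
	uint64_t (*unmap_ip)(const struct toy_map_before *, uint64_t);
	struct toy_dso *dso;
	int refcnt; /* stands in for refcount_t */
	uint32_t flags;
};

/* Hypothetical enumerator names for the two kinds of mapping. */
enum toy_mapping_type { TOY_MAPPING_IDENTITY, TOY_MAPPING_DSO };

/* Layout after the change: large members first, one byte for the mapping type. */
struct toy_map_after {
	uint64_t start;
	uint64_t end;
	uint64_t pgoff;
	uint64_t reloc;
	struct toy_dso *dso;
	int refcnt; /* stands in for refcount_t */
	uint32_t prot;
	uint32_t flags;
	enum toy_mapping_type mapping_type:8;
	bool erange_warned;
	bool priv;
};

/* Print both sizes so the difference can be checked on the local ABI. */
static void toy_print_sizes(void)
{
	printf("before: %zu bytes, after: %zu bytes\n",
	       sizeof(struct toy_map_before), sizeof(struct toy_map_after));
}
```

On a typical 64-bit Linux build this should line up with the pahole output above, where the structure drops from two cache lines to one.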
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-13-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test shell diff: Skip test if test_loop symbol is missing in the perf binary · 407a3898
      Ian Rogers authored
      The diff test depends on finding the symbol test_loop in perf and will
      fail if perf has been stripped and no debug object is available. In that
      case, skip the test instead.
      Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231205164924.835682-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf symbols: Parse NOTE segments until the build id is found · d0acce68
      Chengen Du authored
      An ELF file may contain multiple NOTE segments.
      To locate the build id, keep parsing NOTE segments
      until the build id is found.
      Signed-off-by: Chengen Du <chengen.du@canonical.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231130135723.17562-1-chengen.du@canonical.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
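The idea in the commit above can be sketched as follows. This is a minimal illustration, not perf's actual code: `struct note_segment`, `scan_one_segment()` and `find_build_id()` are hypothetical names, and each NOTE segment is assumed to be already read into memory in native byte order. The point is only that a segment without a build id makes the search move on to the next segment rather than stop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_TYPE_GNU_BUILD_ID 3 /* same value as NT_GNU_BUILD_ID in <elf.h> */
#define NOTE_ALIGN(x) (((x) + 3u) & ~(size_t)3u) /* notes are 4-byte aligned */

/* Layout of an ELF note header (same fields as Elf32_Nhdr/Elf64_Nhdr). */
struct note_hdr {
	uint32_t namesz;
	uint32_t descsz;
	uint32_t type;
};

/* Hypothetical view of one NOTE segment already loaded into memory. */
struct note_segment {
	const unsigned char *data;
	size_t len;
};

/* Scan one segment; return the number of build-id bytes copied, or -1. */
static int scan_one_segment(const struct note_segment *seg,
			    unsigned char *bf, size_t bf_size)
{
	size_t off = 0;

	while (seg->len - off >= sizeof(struct note_hdr)) {
		struct note_hdr nhdr;
		size_t namesz, descsz;
		const unsigned char *name, *desc;

		memcpy(&nhdr, seg->data + off, sizeof(nhdr));
		off += sizeof(nhdr);

		namesz = NOTE_ALIGN((size_t)nhdr.namesz);
		descsz = NOTE_ALIGN((size_t)nhdr.descsz);
		if (namesz > seg->len - off || descsz > seg->len - off - namesz)
			return -1; /* truncated note, give up on this segment */

		name = seg->data + off;
		desc = name + namesz;

		if (nhdr.type == NOTE_TYPE_GNU_BUILD_ID &&
		    nhdr.namesz == sizeof("GNU") &&
		    memcmp(name, "GNU", sizeof("GNU")) == 0) {
			size_t n = nhdr.descsz < bf_size ? nhdr.descsz : bf_size;

			memcpy(bf, desc, n);
			return (int)n;
		}
		off += namesz + descsz; /* skip to the next note */
	}
	return -1;
}

/* Try every NOTE segment in turn until one of them yields a build id. */
static int find_build_id(const struct note_segment *segs, size_t nr_segs,
			 unsigned char *bf, size_t bf_size)
{
	for (size_t i = 0; i < nr_segs; i++) {
		int n = scan_one_segment(&segs[i], bf, bf_size);

		if (n >= 0)
			return n;
		/* Not in this segment: continue with the next one. */
	}
	return -1;
}
```

A caller would pass all NOTE segments of the file at once; stopping after the first segment is exactly the behaviour the commit describes as insufficient.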
    • perf record: Be lazier in allocating lost samples buffer · 030ac3ca
      Ian Rogers authored
      Wait until a lost sample occurs before allocating the lost samples
      buffer, since the buffer often isn't necessary. This saves a 64KB
      allocation and 5.3KB of peak memory consumption.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Guilherme Amadio <amadio@gentoo.org>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231127220902.1315692-9-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
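The lazy-allocation pattern described in the commit above might look roughly like this. It is a sketch under assumptions rather than the perf implementation: `struct recorder`, `record_lost()` and `recorder_exit()` are hypothetical names, and `LOST_BUF_SIZE` simply mirrors the 64KB figure quoted in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: mirrors the 64KB quoted in the commit message. */
#define LOST_BUF_SIZE (64 * 1024)

/* Hypothetical recorder state; lost_buf stays NULL until first needed. */
struct recorder {
	void *lost_buf;
	uint64_t nr_lost_records;
};

/* Account for lost samples, allocating the buffer only on first use. */
static int record_lost(struct recorder *rec, uint64_t lost_count)
{
	if (lost_count == 0)
		return 0; /* common case: nothing lost, nothing allocated */

	if (!rec->lost_buf) {
		rec->lost_buf = calloc(1, LOST_BUF_SIZE);
		if (!rec->lost_buf)
			return -1;
	}

	/* A real tool would build the lost-samples record in lost_buf here. */
	rec->nr_lost_records++;
	return 0;
}

/* Release the buffer if it was ever allocated; free(NULL) is a no-op. */
static void recorder_exit(struct recorder *rec)
{
	free(rec->lost_buf);
	rec->lost_buf = NULL;
}
```

Sessions that never lose a sample never pay for the buffer, and the allocation is reused once it exists.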
    • perf evsel: Fallback to "task-clock" when not system wide · eb2eac0c
      Ian Rogers authored
      When the "cycles" event isn't available, evsel falls back to the
      "cpu-clock" software event.
      
      "task-clock" is similar to "cpu-clock" but only runs when the process is
      running.
      
      Falling back to "cpu-clock" when not system wide leads to confusion.
      Falling back to "task-clock" instead should reduce that confusion.
      
      Pass the target to determine if "task-clock" is more appropriate.
      
      Update a nearby comment and debug string for the change.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ajay Kaher <akaher@vmware.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Makhalov <amakhalov@vmware.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231121000420.368075-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
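The selection rule from the commit above can be illustrated with a small sketch. The names `struct target_sketch` and `fallback_event()` are hypothetical and only model the decision; the real change passes perf's own target structure into the evsel fallback path, and the exact condition used there is not shown in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the profiling target passed to the fallback. */
struct target_sketch {
	bool system_wide; /* true when profiling all CPUs, not one task */
};

/*
 * Pick the software event to use when "cycles" cannot be opened.
 * Assumption: only the system-wide flag decides; "task-clock" advances
 * only while the profiled process runs, so it suits per-task targets.
 */
static const char *fallback_event(const struct target_sketch *target)
{
	return target->system_wide ? "cpu-clock" : "task-clock";
}
```

With this rule, a system-wide session still falls back to "cpu-clock", while a per-process session gets "task-clock".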
  7. 05 Dec, 2023 9 commits
  8. 04 Dec, 2023 2 commits