Commit 6f33e6fa authored by Ian Rogers, committed by Arnaldo Carvalho de Melo

perf stat: Combine the -A/--no-aggr and --no-merge options

The -A or --no-aggr option disables aggregation of core events:

  $ perf stat -A -e cycles,data_total -a true

   Performance counter stats for 'system wide':

  CPU0            1,287,665      cycles
  CPU1            1,831,681      cycles
  CPU2           27,345,998      cycles
  CPU3            1,964,799      cycles
  CPU4              236,174      cycles
  CPU5            3,302,825      cycles
  CPU6            9,201,446      cycles
  CPU7            1,403,043      cycles
  CPU0               110.90 MiB  data_total

         0.008961761 seconds time elapsed

The --no-merge option disables the aggregation of uncore events:

  $ perf stat --no-merge -e cycles,data_total -a true

   Performance counter stats for 'system wide':

          38,482,778      cycles
               15.04 MiB  data_total [uncore_imc_free_running_1]
               15.00 MiB  data_total [uncore_imc_free_running_0]

         0.005915155 seconds time elapsed

Having two options confuses users, who generally don't appreciate the
difference between core and uncore PMUs. Keep all the options, but make
each of them disable aggregation of both core and uncore events:

  $ perf stat -A -e cycles,data_total -a true

   Performance counter stats for 'system wide':

  CPU0               85,878      cycles
  CPU1               88,179      cycles
  CPU2               60,872      cycles
  CPU3            3,265,567      cycles
  CPU4               82,357      cycles
  CPU5               83,383      cycles
  CPU6               84,156      cycles
  CPU7              220,803      cycles
  CPU0                 2.38 MiB  data_total [uncore_imc_free_running_0]
  CPU0                 2.38 MiB  data_total [uncore_imc_free_running_1]

         0.001397205 seconds time elapsed

Update the relevant 'perf stat' man page information.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kaige Ye <ye@kaige.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231214060256.2094017-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent 1bc479d6
@@ -422,7 +422,34 @@ See perf list output for the possible metrics and metricgroups.
 -A::
 --no-aggr::
-Do not aggregate counts across all monitored CPUs.
+--no-merge::
+Do not aggregate/merge counts across monitored CPUs or PMUs.
+When multiple events are created from a single event specification,
+stat will, by default, aggregate the event counts and show the result
+in a single row. This option disables that behavior and shows the
+individual events and counts.
+Multiple events are created from a single event specification when:
+1. PID monitoring isn't requested and the system has more than one
+   CPU. For example, a system with 8 SMT threads will have one event
+   opened on each thread and aggregation is performed across them.
+2. Prefix or glob wildcard matching is used for the PMU name. For
+   example, multiple memory controller PMUs may exist typically with a
+   suffix of _0, _1, etc. By default the event counts will all be
+   combined if the PMU is specified without the suffix such as
+   uncore_imc rather than uncore_imc_0.
+3. Aliases, which are listed immediately after the Kernel PMU events
+   by perf list, are used.
+--hybrid-merge::
+Merge core event counts from all core PMUs. In hybrid or big.LITTLE
+systems by default each core PMU will report its count
+separately. This option forces core PMU counts to be combined to give
+a behavior closer to having a single CPU type in the system.
 --topdown::
 Print top-down metrics supported by the CPU. This allows to determine
@@ -475,29 +502,6 @@ highlight 'tma_frontend_bound'. This metric may be drilled into with
 Error out if the input is higher than the supported max level.
---no-merge::
-Do not merge results from same PMUs.
-When multiple events are created from a single event specification,
-stat will, by default, aggregate the event counts and show the result
-in a single row. This option disables that behavior and shows
-the individual events and counts.
-Multiple events are created from a single event specification when:
-1. Prefix or glob matching is used for the PMU name.
-2. Aliases, which are listed immediately after the Kernel PMU events
-   by perf list, are used.
---hybrid-merge::
-Merge the hybrid event counts from all PMUs.
-For hybrid events, by default, the stat aggregates and reports the event
-counts per PMU. But sometimes, it's also useful to aggregate event counts
-from all PMUs. This option enables that behavior and reports the counts
-without PMUs.
-For non-hybrid events, it should be no effect.
 --smi-cost::
 Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.
......
@@ -1204,8 +1204,9 @@ static struct option stat_options[] = {
 	OPT_STRING('C', "cpu", &target.cpu_list, "cpu",
 		    "list of cpus to monitor in system-wide"),
 	OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
-		    "disable CPU count aggregation", AGGR_NONE),
-	OPT_BOOLEAN(0, "no-merge", &stat_config.no_merge, "Do not merge identical named events"),
+		    "disable aggregation across CPUs or PMUs", AGGR_NONE),
+	OPT_SET_UINT(0, "no-merge", &stat_config.aggr_mode,
+		    "disable aggregation the same as -A or -no-aggr", AGGR_NONE),
 	OPT_BOOLEAN(0, "hybrid-merge", &stat_config.hybrid_merge,
 		    "Merge identical named hybrid events"),
 	OPT_STRING('x', "field-separator", &stat_config.csv_sep, "separator",
......
@@ -898,7 +898,7 @@ static bool hybrid_uniquify(struct evsel *evsel, struct perf_stat_config *config
 static void uniquify_counter(struct perf_stat_config *config, struct evsel *counter)
 {
-	if (config->no_merge || hybrid_uniquify(counter, config))
+	if (config->aggr_mode == AGGR_NONE || hybrid_uniquify(counter, config))
 		uniquify_event_name(counter);
 }
......
@@ -592,7 +592,7 @@ void perf_stat_merge_counters(struct perf_stat_config *config, struct evlist *ev
 {
 	struct evsel *evsel;
-	if (config->no_merge)
+	if (config->aggr_mode == AGGR_NONE)
 		return;
 	evlist__for_each_entry(evlist, evsel)
......
@@ -76,7 +76,6 @@ struct perf_stat_config {
 	bool null_run;
 	bool ru_display;
 	bool big_num;
-	bool no_merge;
 	bool hybrid_merge;
 	bool walltime_run_table;
 	bool all_kernel;
......
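The net effect of the hunks above can be sketched in isolation: all three spellings of the option now write the same aggregation mode, and the merge path checks that one field instead of a separate boolean. The following is a minimal stand-alone model, not perf's actual code. The names `stat_config_sketch`, `parse_flag`, and `should_merge` are hypothetical, and only two aggregation modes are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for perf's aggregation modes (only two are modelled). */
enum aggr_mode_sketch { AGGR_GLOBAL, AGGR_NONE };

/* Hypothetical, reduced version of the stat configuration: no no_merge flag. */
struct stat_config_sketch {
	enum aggr_mode_sketch aggr_mode;
};

/*
 * Hypothetical option handler: -A, --no-aggr and --no-merge all select
 * AGGR_NONE. Returns true if the argument was one of those options.
 */
bool parse_flag(struct stat_config_sketch *cfg, const char *arg)
{
	assert(cfg != NULL && arg != NULL);

	if (!strcmp(arg, "-A") || !strcmp(arg, "--no-aggr") ||
	    !strcmp(arg, "--no-merge")) {
		cfg->aggr_mode = AGGR_NONE;
		return true;
	}
	return false;
}

/* Mirrors the early return in the merge path: no merging under AGGR_NONE. */
bool should_merge(const struct stat_config_sketch *cfg)
{
	assert(cfg != NULL);
	return cfg->aggr_mode != AGGR_NONE;
}
```

In this sketch, calling `parse_flag()` with any of the three option strings leaves `should_merge()` returning false, which corresponds to the per-CPU and per-PMU rows shown in the last example of the commit message.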