perf stat: Add no-aggregation mode to -a
This patch adds a new -A option to perf stat. If specified then perf stat does not aggregate counts across all monitored CPUs in system-wide mode, i.e., when using -a. This option is not supported in per-thread mode. Being able to get a per-cpu breakdown is useful to detect imbalances between CPUs when running a uniform workload than spans all monitored CPUs. The second version corrects the missing cpumap[] support, so that it works when the -C option is used. The third version fixes a missing cpumap[] in print_counter() and removes a stray patch in builtin-trace.c. Examples on a 4-way system: # perf stat -a -e cycles,instructions -- sleep 1 Performance counter stats for 'sleep 1': 9592808135 cycles 3490380006 instructions # 0.364 IPC 1.001584632 seconds time elapsed # perf stat -a -A -e cycles,instructions -- sleep 1 Performance counter stats for 'sleep 1': CPU0 2398163767 cycles CPU1 2398180817 cycles CPU2 2398217115 cycles CPU3 2398247483 cycles CPU0 872282046 instructions # 0.364 IPC CPU1 873481776 instructions # 0.364 IPC CPU2 872638127 instructions # 0.364 IPC CPU3 872437789 instructions # 0.364 IPC 1.001556052 seconds time elapsed Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@amd.com> LKML-Reference: <4ce257b5.1e07e30a.7b6b.3aa9@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Showing
Please register or sign in to comment