- 23 Oct, 2013 14 commits
-
-
Adrian Hunter authored
In the absence of s DEBUG variable definition on the command line perf tools was building without optimization. Fix by assigning DEBUG if it is not defined. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-2-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Amend perf_evlist__parse_mmap_pages() to check that the mmap_pages entered via the --mmap_pages/-m option is not too big. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-15-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
parse_tag_value() accepts an "unsigned long" and multiplies it according to a tag character. Do not accept the value if the multiplication overflows. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-14-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
perf.data files contain the attributes separately, do not put them in the event stream as well. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-6-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Change perf_script from being global to being local. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-4-git-send-email-adrian.hunter@intel.com [ Made the minor consistency changes suggested by David Ahern ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
builtin-sched.c took a log time to build with -O6 optimization. This turned out to be caused by: .curr_pid = { [0 ... MAX_CPUS - 1] = -1 }, Fix by initializing curr_pid programmatically. This addresses the problem cured in f36f83f9 using a smaller hammer. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-13-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Change "struct perf_sched sched" from being global to being local. The build slowdown cured by f36f83f9 is dealt with in the following patch, by programatically setting perf_sched.curr_pid. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382427258-17495-12-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Before this patch, looking at 'perf bench sched pipe' behavior over 'top' only told us that something related to perf is running: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19934 mingo 20 0 54836 1296 952 R 18.6 0.0 0:00.56 perf 19935 mingo 20 0 54836 384 36 S 18.6 0.0 0:00.56 perf After the patch it's clearly visible what's going on: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19744 mingo 20 0 125m 3536 2644 R 68.2 0.0 0:01.12 sched-pipe 19745 mingo 20 0 125m 1172 276 R 68.2 0.0 0:01.12 sched-pipe The benchmark-subsystem name is concatenated with the individual testcase name. Unfortunately 'perf top' does not show the reconfigured name, possibly because it caches ->comm[] values and does not recognize changes to them? Also clean up a few bits in builtin-bench.c while at it and reorganize the code and the output strings to be consistent. Use iterators to access the various arrays. Rename 'suites' concept to 'benchmark collection' and the 'bench_suite' to 'benchmark/bench'. The many repetitions of 'suite' made the code harder to read and understand. The new output is: comet:~/tip/tools/perf> ./perf bench Usage: perf bench [<common options>] <collection> <benchmark> [<options>] # List of all available benchmark collections: sched: Scheduler and IPC benchmarks mem: Memory access benchmarks numa: NUMA scheduling and MM benchmarks all: All benchmarks comet:~/tip/tools/perf> ./perf bench sched # List of available benchmarks for collection 'sched': messaging: Benchmark for scheduling and IPC pipe: Benchmark for pipe() between two processes all: Test all scheduler benchmarks comet:~/tip/tools/perf> ./perf bench mem # List of available benchmarks for collection 'mem': memcpy: Benchmark for memcpy() memset: Benchmark for memset() tests all: Test all memory benchmarks comet:~/tip/tools/perf> ./perf bench numa # List of available benchmarks for collection 'numa': mem: Benchmark for NUMA workloads all: Test all NUMA benchmarks Individual benchmark modules were not touched. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Hitoshi Mitake <h.mitake@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20131023123756.GA17871@gmail.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
At this point, --fentry (mcount function entry) option for gcc fuzzes the debuginfo variable locations by skipping the mcount instruction offset (on x86, this is a 5 byte call instruction). This makes variable searching fail at the entry of functions which are mcount'ed. e.g.) Available variables at vfs_read @<vfs_read+0> (No matched variables) This patch adds additional location search at the function entry point to solve this issue, which tries to find the earliest address for the variable location. Note that this only works with function parameters (formal parameters) because any local variables should not exist on the function entry address (those are not initialized yet). With this patch, perf probe shows correct parameters if possible; # perf probe --vars vfs_read Available variables at vfs_read @<vfs_read+0> char* buf loff_t* pos size_t count struct file* file Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20131011071025.15557.13275.stgit@udc4-manage.rcp.hitachi.co.jpSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Support "$vars" meta argument syntax for tracing all local variables at probe point. Now you can trace all available local variables (including function parameters) at the probe point by passing $vars. # perf probe --add foo $vars This automatically finds all local variables at foo() and adds it as probe arguments. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20131011071023.15557.51770.stgit@udc4-manage.rcp.hitachi.co.jpSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
As suggested by tglx, 'self' should be replaced by something that is more useful. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-fmblhc6tbb99tk1q8vowtsbj@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
[root@sandy ~]# perf test -v 22 22: Test sample parsing : --- start --- sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating ---- end ---- Test sample parsing: FAILED! [root@sandy ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cx83wuzz30m10m4s1xt0ocyq@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Before: [root@sandy ~]# perf test -v 22 22: Test sample parsing : --- start --- sample format has changed - test needs updating ---- end ---- Test sample parsing: FAILED! [root@sandy ~]# After: [root@sandy ~]# perf test -v 22 22: Test sample parsing : --- start --- sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating ---- end ---- Test sample parsing: FAILED! [root@sandy ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-8cazc2fpmk70jcbww8c0cobx@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: * Convert callchain children list to rbtree, greatly reducing the time taken for callchain processing, from Namhyung Kim. * Add --max-stack option to limit callchain stack scan in 'top' and 'report', improving callchain processing when reducing the stack depth is an option, from Waiman Long. * Compare dso's also when comparing symbols, to avoid grouping together symbols with the same name but on different DSOs, fix from Namhyung Kim. * 'perf trace' now can can use a 'perf probe' wannabe tracepoint to hook into the userspace -> kernel pathname copy so that it can map fds to pathnames without reading /proc/pid/fd/ symlinks. From Arnaldo Carvalho de Melo. * 'perf trace' now emits hints as to why tracing is not possible, helping the user to setup the system to allow tracing in the desired permission granularity, telling if the problem is due to debugfs not being mounted or with not enough permission for !root, /proc/sys/kernel/perf_event_paranoit value, etc. From Arnaldo Carvalho de Melo. * Add missing 'mmap2' in evsel debug print, from Adrian Hunter. * Add missing decrement in id sample parsing, not a fix per se, just to avoid a problem whem somebody adds another field, from Adrian Hunter. * Improve write_output error message in 'perf record', from Adrian Hunter. * Add missing sample flush for piped events, fix from Adrian Hunter. * Add missing members to perf_event__attr_swap(), fix from Adrian Hunter. * Assorted fixes for 32-bit build, from Adrian Hunter * Print addr by default for BTS in 'perf script', from Adrian Juntmer * Separating data file properties from session, code reorganization from Jiri Olsa. * Show error in 'perf list' if tracepoints not available, from Pekka Enberg. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 21 Oct, 2013 18 commits
-
-
Waiman Long authored
When the callgraph function is enabled (-G), it may take a long time to scan all the stack data and merge them accordingly. This patch adds a new --max-stack option to perf-top to limit the depth of callchain stack data to look at to reduce the time it takes for perf-top to finish its processing. It reduces the amount of information provided to the user in exchange for faster speed. Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Davidlohr Bueso <davidlohr@hp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Scott J Norton <scott.norton@hp.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382107129-2010-5-git-send-email-Waiman.Long@hp.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Waiman Long authored
When callgraph data was included in the perf data file, it may take a long time to scan all those data and merge them together especially if the stored callchains are long and the perf data file itself is large, like a Gbyte or so. The callchain stack is currently limited to PERF_MAX_STACK_DEPTH (127). This is a large value. Usually the callgraph data that developers are most interested in are the first few levels, the rests are usually not looked at. This patch adds a new --max-stack option to perf-report to limit the depth of callchain stack data to look at to reduce the time it takes for perf-report to finish its processing. It trades the presence of trailing stack information with faster speed. The following table shows the elapsed time of doing perf-report on a perf.data file of size 985,531,828 bytes. --max_stack Elapsed Time Output data size ----------- ------------ ---------------- not set 88.0s 124,422,651 64 87.5s 116,303,213 32 87.2s 112,023,804 16 86.6s 94,326,380 8 59.9s 33,697,248 4 40.7s 10,116,637 -g none 27.1s 2,555,810 Signed-off-by: Waiman Long <Waiman.Long@hp.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Aswin Chandramouleeswaran <aswin@hp.com> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Scott J Norton <scott.norton@hp.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382107129-2010-4-git-send-email-Waiman.Long@hp.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Removing 'fd, fd_pipe, filename, size' from struct perf_session and replacing them with struct perf_data_file object. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-4-git-send-email-jolsa@redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
Adding perf_data_file__open interface to data object to open the perf.data file for both read and write. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-3-git-send-email-jolsa@redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Jiri Olsa authored
This patch is adding 'struct perf_data_file' object as a placeholder for all attributes regarding perf.data file handling. Changing perf_session__new to take it as an argument. The rest of the functionality will be added later to keep this change simple enough, because all the places using perf_session are changed now. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381847254-28809-2-git-send-email-jolsa@redhat.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Linus reported that sometimes 'perf report -s symbol' exits without any message on TUI. David and Jiri found that it's because it failed to add a hist entry due to an invalid symbol length. It turns out that sorting by symbol (address) was broken since it only compares symbol addresses. The symbol address is a relative address within a dso thus just checking its address can result in merging unrelated symbols together. Fix it by checking dso before comparing symbol address. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381802517-18812-1-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
Current collapse stage has a scalability problem which can be reproduced easily with a parallel kernel build. This is because it needs to traverse every children of callchains linearly during the collapse/merge stage. Converting it to a rbtree reduced the overhead significantly. On my 400MB perf.data file which recorded with make -j32 kernel build: $ time perf --no-pager report --stdio > /dev/null before: real 6m22.073s user 6m18.683s sys 0m0.706s after: real 0m20.780s user 0m19.962s sys 0m0.689s During the perf report the overhead on append_chain_children went down from 96.69% to 18.16%: - 18.16% perf perf [.] append_chain_children - append_chain_children - 77.48% append_chain_children + 69.79% merge_chain_branch - 22.96% append_chain_children + 67.44% merge_chain_branch + 30.15% append_chain_children + 2.41% callchain_append + 7.25% callchain_append + 12.26% callchain_append + 10.22% merge_chain_branch + 11.58% perf perf [.] dso__find_symbol + 8.02% perf perf [.] sort__comm_cmp + 5.48% perf libc-2.17.so [.] malloc_consolidate Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1381468543-25334-2-git-send-email-namhyung@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Pekka Enberg authored
Tracepoints are not visible in "perf list" on Fedora 19 because regular users have no permission to /sys/kernel/debug by default. Show an error message so that the user knows about it instead of assuming that tracepoints are not supported on the system. Signed-off-by: Pekka Enberg <penberg@kernel.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Link: http://lkml.kernel.org/r/1381867647-8594-1-git-send-email-penberg@kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
The addr field is not displayed by default for hardware events, however for branch events it is the target of the branch so for BTS display it by default if it was recorded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-18-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
The same code is used in perf_evlist__mmap_per_cpu() and perf_evlist__mmap_per_thread(). Factor it out into a separate function perf_evlist__mmap_per_evsel(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-17-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Put the comments into the correct kernel-doc format and correct reference to perf_evlist__read_on_cpu() which should be perf_evlist__mmap_read(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-16-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
bench/numa.c: In function 'worker_thread': bench/numa.c:1123:20: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] bench/numa.c:1171:6: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' [-Werror=format] cc1: all warnings being treated as errors Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-13-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
builtin-record.c:42:12: error: static declaration of 'on_exit' follows non-static declaration In file included from util/util.h:51:0, from builtin.h:4, from builtin-record.c:8: /usr/include/stdlib.h:536:12: note: previous declaration of 'on_exit' was here Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-12-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
util/evlist.c: In function 'perf_evlist__mmap': util/evlist.c:772:2: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t' [-Werror=format] cc1: all warnings being treated as errors Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-11-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
The perf_event__attr_swap() method needs to swap all members of struct perf_event_attr. Add missing ones. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-7-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Piped events can be sorted so a final flush is needed. Add that and remove a redundant 'err = 0'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-6-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
Improve the error message from write_output() to say what failed to write and give the error number. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-4-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Adrian Hunter authored
The final array decrement in id sample parsing is missing, which may trip up the next person adding a sample format, so add it in. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-5-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 18 Oct, 2013 1 commit
-
-
Adrian Hunter authored
The struct perf_event_attr now has a 'mmap2' member. Add it to perf_event_attr__fprintf(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1382099356-4918-2-git-send-email-adrian.hunter@intel.comSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 17 Oct, 2013 3 commits
-
-
Arnaldo Carvalho de Melo authored
kernel/events/core.c has: /* * perf event paranoia level: * -1 - not paranoid at all * 0 - disallow raw tracepoint access for unpriv * 1 - disallow cpu events for unpriv * 2 - disallow kernel profiling for unpriv */ int sysctl_perf_event_paranoid __read_mostly = 1; So, with the default being 1, a non-root user can trace his stuff: [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid 1 [acme@zoo ~]$ yes > /dev/null & [1] 15338 [acme@zoo ~]$ trace -p 15338 | head -5 0.005 ( 0.005 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096 0.045 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096 0.085 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096 0.125 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096 0.165 ( 0.001 ms): write(fd: 1</dev/null>, buf: 0x7fe6db765000, count: 4096 ) = 4096 [acme@zoo ~]$ [acme@zoo ~]$ trace --duration 1 sleep 1 1002.148 (1001.218 ms): nanosleep(rqtp: 0x7fff46c79250 ) = 0 [acme@zoo ~]$ [acme@zoo ~]$ trace -- usleep 1 | tail -5 0.905 ( 0.002 ms): brk( ) = 0x1c82000 0.910 ( 0.003 ms): brk(brk: 0x1ca3000 ) = 0x1ca3000 0.913 ( 0.001 ms): brk( ) = 0x1ca3000 0.990 ( 0.059 ms): nanosleep(rqtp: 0x7fffe31a3280 ) = 0 0.995 ( 0.000 ms): exit_group( [acme@zoo ~]$ But can't do system wide tracing: [acme@zoo ~]$ trace Error: Operation not permitted. Hint: Check /proc/sys/kernel/perf_event_paranoid setting. Hint: For system wide tracing it needs to be set to -1. Hint: The current value is 1. [acme@zoo ~]$ [acme@zoo ~]$ trace --cpu 0 Error: Operation not permitted. Hint: Check /proc/sys/kernel/perf_event_paranoid setting. Hint: For system wide tracing it needs to be set to -1. Hint: The current value is 1. [acme@zoo ~]$ If the paranoid level is >= 2, i.e. turn this perf stuff off for !root users: [acme@zoo ~]$ sudo sh -c 'echo 2 > /proc/sys/kernel/perf_event_paranoid' [acme@zoo ~]$ cat /proc/sys/kernel/perf_event_paranoid 2 [acme@zoo ~]$ [acme@zoo ~]$ trace usleep 1 Error: Permission denied. Hint: Check /proc/sys/kernel/perf_event_paranoid setting. Hint: For your workloads it needs to be <= 1 Hint: For system wide tracing it needs to be set to -1. Hint: The current value is 2. [acme@zoo ~]$ [acme@zoo ~]$ trace Error: Permission denied. Hint: Check /proc/sys/kernel/perf_event_paranoid setting. Hint: For your workloads it needs to be <= 1 Hint: For system wide tracing it needs to be set to -1. Hint: The current value is 2. [acme@zoo ~]$ [acme@zoo ~]$ trace --cpu 1 Error: Permission denied. Hint: Check /proc/sys/kernel/perf_event_paranoid setting. Hint: For your workloads it needs to be <= 1 Hint: For system wide tracing it needs to be set to -1. Hint: The current value is 2. [acme@zoo ~]$ If the user manages to get what he/she wants, convincing root not to be paranoid at all... [root@zoo ~]# echo -1 > /proc/sys/kernel/perf_event_paranoid [root@zoo ~]# cat /proc/sys/kernel/perf_event_paranoid -1 [root@zoo ~]# [acme@zoo ~]$ ps -eo user,pid,comm | grep Xorg root 729 Xorg [acme@zoo ~]$ [acme@zoo ~]$ trace -a --duration 0.001 -e \!select,ioctl,writev | grep Xorg | head -5 23.143 ( 0.003 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0 23.152 ( 0.004 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096 ) = 8 23.161 ( 0.002 ms): Xorg/729 read(fd: 31, buf: 0x2544af03, count: 4096 ) = -1 EAGAIN Resource temporarily unavailable 23.175 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0 23.235 ( 0.002 ms): Xorg/729 setitimer(which: REAL, value: 0x7fffaadf16e0 ) = 0 [acme@zoo ~]$ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-di28olfwd28rvkox7v3hqhu1@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Just opens a file and calls atoi() in at most its first 64 bytes. To read things like /proc/sys/kernel/perf_event_paranoid. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-669q04c5tou5pnt8jtiz6y2r@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Out of 'perf trace', should be used by other tools that uses tracepoints. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ramkumar Ramachandra <artagnon@gmail.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-lyvtxhchz4ga8fwht15x8wou@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 16 Oct, 2013 4 commits
-
-
Arnaldo Carvalho de Melo authored
We need to differentiate SIGCHLD from SIGINT, the later should cause as immediate as possible exit, while the former should wait to process the events that may be perceived in the ring buffer after the SIGCHLD is handled. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vf6n57ewm3mjy2sz6r491hus@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Initially it tries to find a probe:vfs_getname that should be setup with: perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string' or with slight changes to cope with code flux in the getname_flags code. In the future, if a "vfs:getname" tracepoint becomes available, then it will be preferred. This is not strictly required and more expensive method of reading the /proc/pid/fd/ symlink will be used when the fd->path array entry is not populated by a previous vfs_getname + open syscall ret sequence. As with any other 'perf probe' probe the setup must be done just once and the probe will be left inactive, waiting for users, be it 'perf trace' of any other tool. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
So that the part that grows the array as needed is untied from the code that reads the /proc/pid/fd symlink and can be used for the vfs_getname hook that will set the fd -> path translation too, when available. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-ydo5rumyv9hdc1vsfmqamugs@git.kernel.orgSigned-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Peter Zijlstra authored
There's been reports of high NMI handler overhead, highlighted by such kernel messages: [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000 [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs Don Zickus analyzed the source of the overhead and reported: > While there are a few places that are causing latencies, for now I focused on > the longest one first. It seems to be 'copy_user_from_nmi' > > intel_pmu_handle_irq -> > intel_pmu_drain_pebs_nhm -> > __intel_pmu_drain_pebs_nhm -> > __intel_pmu_pebs_event -> > intel_pmu_pebs_fixup_ip -> > copy_from_user_nmi > > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles > (there are some cases where only 10 iterations are needed to go that high > too, but in generall over 50 or so). At this point copy_user_from_nmi > seems to account for over 90% of the nmi latency. The solution to that is to avoid having to call copy_from_user_nmi() for every instruction. Since we already limit the max basic block size, we can easily pre-allocate a piece of memory to copy the entire thing into in one go. Don reported this test result: > Your patch made a huge difference in improvement. The > copy_from_user_nmi() no longer hits the million of cycles. I still > have a batch of 100,000-300,000 cycles. My longest NMI paths used > to be dominated by copy_from_user_nmi, now it is not (I have to dig > up the new hot path). Reported-and-tested-by: Don Zickus <dzickus@redhat.com> Cc: jmario@redhat.com Cc: acme@infradead.org Cc: dave.hansen@linux.intel.com Cc: eranian@google.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131016105755.GX10651@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org>
-