- 15 May, 2015 4 commits
-
-
Arnaldo Carvalho de Melo authored
Now that we have atomic.h, we should convert all of the existing refcounts to use it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-qhpv2etncj3hfofgj1aitkyv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Use atomic_read(&counter) instead. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-k3hvfvpaut8wp02lzq27muhb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Now that we have atomic.h, we should convert all of the existing refcounts to use it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-onm5u3pioba1hqqhjs8on03e@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
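The refcount conversions in the commits above can be pictured with a minimal sketch. It uses C11 <stdatomic.h> as a stand-in for the tools atomic.h; the demo_ names are hypothetical and only mirror the shape of the kernel-style atomic_set()/atomic_inc()/atomic_dec_and_test()/atomic_read() helpers, they are not the perf code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter type standing in for a kernel-style atomic_t. */
struct demo_refcnt {
	atomic_int counter;
};

/* Start with one reference, owned by whoever created the object. */
static void demo_refcnt_init(struct demo_refcnt *r)
{
	atomic_init(&r->counter, 1);
}

/* Take one more reference. */
static void demo_refcnt_inc(struct demo_refcnt *r)
{
	atomic_fetch_add(&r->counter, 1);
}

/* Drop one reference; true means this was the last one. */
static bool demo_refcnt_dec_and_test(struct demo_refcnt *r)
{
	return atomic_fetch_sub(&r->counter, 1) == 1;
}

/* Read through an accessor rather than poking at the field directly. */
static int demo_refcnt_read(struct demo_refcnt *r)
{
	return atomic_load(&r->counter);
}
```

Reading the value through an accessor keeps callers independent of how the counter is represented, which is presumably the point of preferring atomic_read(&counter) over direct field access.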
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Add --range option to show a variable's location range in 'perf probe', helping in collecting variables in probes when there is a mismatch between assembly and source code (He Kuang)
  - Show better error message when failed to find variable in 'perf probe' (He Kuang)
  - Fix 'perf report --thread' handling and document it better (Namhyung Kim)

Infrastructure changes:

  - Fix to get negative exit codes in 'perf test' test routines (He Kuang)
  - Make flex/bison calls honour V=1 (Jiri Olsa)
  - Ignore tail calls to probed functions in 'perf probe' (Naveen N. Rao)
  - Fix refcount expectations in map_group share 'perf test' (Arnaldo Carvalho de Melo)

Build Fixes:

  - Fix 'perf kmem' build due to compiler thinking uninitialized var is being accessed (Arnaldo Carvalho de Melo)
  - Provide le16toh if not defined, to fix the libtraceevent build on older distros (Arnaldo Carvalho de Melo)
  - Fix 'perf trace' build on older distros by providing some CLOEXEC, NONBLOCK defines (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 14 May, 2015 5 commits
-
-
Jiri Olsa authored
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-dnc2ggwhffdpuvijwq4rkic9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Such as RHEL5, where the CLOEXEC and NONBLOCK flags are not present; use an ifdef+define approach instead so that it builds on all distros. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Vinson Lee <vlee@twitter.com> Link: http://lkml.kernel.org/n/tip-pioazikk9d9oz5qdeor3eldu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
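A minimal sketch of the ifdef+define approach follows. Which flags the actual patch covers is not stated here, so SOCK_CLOEXEC and SOCK_NONBLOCK are picked as illustrations; the fallback values are the x86 Linux ones and are an assumption (some other architectures use different values). The demo_sock_flags_str() helper is hypothetical.

```c
#include <sys/socket.h>

/* Fallbacks for old libc headers; values assumed from x86 Linux. */
#ifndef SOCK_CLOEXEC
# define SOCK_CLOEXEC 02000000
#endif
#ifndef SOCK_NONBLOCK
# define SOCK_NONBLOCK 04000
#endif

/* Hypothetical decoder: name the flag bits set in a socket() type argument. */
static const char *demo_sock_flags_str(int type)
{
	int cloexec = (type & SOCK_CLOEXEC) != 0;
	int nonblock = (type & SOCK_NONBLOCK) != 0;

	if (cloexec && nonblock)
		return "CLOEXEC|NONBLOCK";
	if (cloexec)
		return "CLOEXEC";
	if (nonblock)
		return "NONBLOCK";
	return "";
}
```

Because each define is guarded by #ifndef, newer distros keep using the definitions from their own headers and only older ones pick up the fallback.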
-
Arnaldo Carvalho de Melo authored
Where such a macro is not present, just copy its definition from glibc's endian.h and define it if it is not already defined. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-4j90i2na07ppidt0z6cbuxr7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
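As a rough sketch of what such a fallback can look like (modelled on the usual glibc pattern, not copied from the patch): le16toh is defined only when the headers did not already provide it. The demo_read_le16() helper is hypothetical and exists only to exercise the macro.

```c
#include <byteswap.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Provide le16toh only when the libc headers lack it. */
#ifndef le16toh
# if __BYTE_ORDER == __LITTLE_ENDIAN
#  define le16toh(x) (x)
# else
#  define le16toh(x) bswap_16(x)
# endif
#endif

/* Hypothetical helper: decode a little-endian 16-bit value from a buffer. */
static uint16_t demo_read_le16(const unsigned char *buf)
{
	uint16_t raw;

	memcpy(&raw, buf, sizeof(raw));	/* avoids unaligned access */
	return le16toh(raw);
}
```

For example, the bytes 0x34 0x12 decode to 0x1234 regardless of host byte order.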
-
Namhyung Kim authored
There's a bug where perf report sometimes ignores some options on --stdio output. The bug is triggered only if a related config variable is set. For example, assume we have the following config file:

  $ cat ~/.perfconfig
  [call-graph]
  print-type = graph
  [hist]
  percentage = absolute

Then the following perf commands will not honor some options:

  $ perf record -ag sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.199 MB perf.data (77 samples) ]
  $ perf report -g none --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  # Samples: 77 of event 'cycles'
  # Event count (approx.): 25425383
  #
  # Overhead Command Shared Object Symbol
  # ........ ............... ....................... ..............
  #
  16.34% swapper [kernel.vmlinux] [k] intel_idle
  |
  ---intel_idle
  cpuidle_enter_state
  cpuidle_enter
  cpu_startup_entry
  ...

With the '-g none' option it should not show callchains, but it still does. However, it works as expected on --tui output. Similarly, the '--percentage relative' option does not work and still shows absolute percentage values.

Looking at the source, I found that those settings were overwritten by config variables when setup_pager() is called. setup_pager() starts a pager process so that it can manage long lines of output in stdio mode. But as it calls perf_config() after parsing arguments, the settings were overwritten regardless of command line options.

The reason it calls perf_config() is to find the 'pager_program' which might be set by a config variable, I guess. However, the current perf code does not provide a config variable for it, so the call is just meaningless IMHO. Eliminating the call makes the options work as expected.

Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/1431529406-6762-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Naveen N. Rao authored
perf probe currently errors out if there are any tail calls to probed functions:

  [root@rhel71be]# perf probe do_fork
  Failed to find probe point in any functions.
  Error: Failed to add events.

Fix this by teaching perf to ignore tail calls.

Without patch:

  [root@rhel71be perf]# ./perf probe -v do_fork
  probe-definition(0): do_fork
  symbol:do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
  0 arguments
  Looking at the vmlinux_path (7 entries long)
  symsrc__init: build id mismatch for /boot/vmlinux.
  Using /usr/lib/debug/lib/modules/3.10.0-201.el7.ppc64/vmlinux for symbols
  Open Debuginfo file: /usr/lib/debug/lib/modules/3.10.0-201.el7.ppc64/vmlinux
  Try to find probe point from debuginfo.
  found inline addr: 0xc0000000000bb9b0
  Probe point found: do_fork+0
  found inline addr: 0xc0000000000bbe20
  Probe point found: kernel_thread+48
  found inline addr: 0xc0000000000bbe5c
  Probe point found: sys_fork+28
  found inline addr: 0xc0000000000bbfac
  Probe point found: sys_vfork+44
  found inline addr: 0xc0000000000bc27c
  Failed to find probe point in any functions.
  An error occurred in debuginfo analysis (-2).
  Error: Failed to add events. Reason: No such file or directory (Code: -2)

With patch:

  [root@rhel71be perf]# ./perf probe -v do_fork
  probe-definition(0): do_fork
  symbol:do_fork file:(null) line:0 offset:0 return:0 lazy:(null)
  0 arguments
  Looking at the vmlinux_path (7 entries long)
  symsrc__init: build id mismatch for /boot/vmlinux.
  Using /usr/lib/debug/lib/modules/3.10.0-201.el7.ppc64/vmlinux for symbols
  Open Debuginfo file: /usr/lib/debug/lib/modules/3.10.0-201.el7.ppc64/vmlinux
  Try to find probe point from debuginfo.
  found inline addr: 0xc0000000000bb9b0
  Probe point found: do_fork+0
  found inline addr: 0xc0000000000bbe20
  Probe point found: kernel_thread+48
  found inline addr: 0xc0000000000bbe5c
  Probe point found: sys_fork+28
  found inline addr: 0xc0000000000bbfac
  Probe point found: sys_vfork+44
  found inline addr: 0xc0000000000bc27c
  Ignoring tail call from SyS_clone
  Found 4 probe_trace_events.
  Opening /sys/kernel/debug/tracing/kprobe_events write=1
  No kprobe blacklist support, ignored
  Added new events:
  Writing event: p:probe/do_fork _text+768432
  Failed to write event: Invalid argument
  Error: Failed to add events. Reason: Invalid argument (Code: -22)

[Ignore the error about failure to write event - this kernel is missing a patch to resolve _text properly]

The reason to ignore tail calls is that the address does not belong to any function frame. In the example above, the address in SyS_clone is 0xc0000000000bc27c, but looking at the debug-info:

  <1><830081>: Abbrev Number: 133 (DW_TAG_subprogram)
  <830083> DW_AT_external : 1
  <830083> DW_AT_name : (indirect string, offset: 0x3cea3): SyS_clone
  <830087> DW_AT_decl_file : 7
  <830088> DW_AT_decl_line : 1689
  <83008a> DW_AT_prototyped : 1
  <83008a> DW_AT_type : <0x8110eb>
  <83008e> DW_AT_low_pc : 0xc0000000000bc270
  <830096> DW_AT_high_pc : 0xc
  <83009e> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa)
  <8300a0> DW_AT_GNU_all_call_sites: 1
  <8300a0> DW_AT_sibling : <0x830178>
  <snip>
  <3><830147>: Abbrev Number: 125 (DW_TAG_GNU_call_site)
  <830148> DW_AT_low_pc : 0xc0000000000bc27c
  <830150> DW_AT_GNU_tail_call: 1
  <830150> DW_AT_abstract_origin: <0x82e7e1>

The frame ends at 0xc0000000000bc27c. I suppose this is why this particular call is a "tail" call. FWIW, systemtap seems to ignore these as well and requires users to explicitly place probes at these call sites if necessary. I print out the caller so that users know.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/1430394151-15928-1-git-send-email-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
- 12 May, 2015 12 commits
-
-
Arnaldo Carvalho de Melo authored
When introducing reference counting for struct thread instances I forgot to remove the synthetic threads from the machine's rbtree, so that the threads would have just one reference and thus the thread__put() replacing the thread__delete() really turns into a thread__delete() (thread->refcnt == 1 at thread__put() time) and thus drops the thread->mg refcount, as expected by this test. Fix it by calling machine__remove_thread() (the counterpart of machine__findnew_thread()) on all the synthetic threads after the checks that involve the rbtree are done.

Before:

  # perf test -v mg
  30: Test thread mg sharing :
  --- start ---
  test child forked, pid 26995
  FAILED tests/thread-mg-share.c:68 wrong refcnt (4 != 3)
  test child finished with -1
  ---- end ----
  Test thread mg sharing: FAILED!
  #

After:

  # perf test mg
  30: Test thread mg sharing: Ok
  #

Fixes: b91fc39f ("perf machine: Protect the machine->threads with a rwlock") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-uoqq0fjei90ohhhcboz6ay33@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Since it is all associated with the refcount for keeping the thread in the rbtree, it is excessive and unnecessarily complex to hold a refcount when changing machine->last_match. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-98kuesmfwtvhsrzx7ttyb0kt@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
To help understand the failure.

  [acme@zoo linux]$ perf test -v 30
  30: Test thread mg sharing :
  --- start ---
  test child forked, pid 12275
  FAILED tests/thread-mg-share.c:68 wrong refcnt (4 != 3)
  test child finished with -1
  ---- end ----
  Test thread mg sharing: FAILED!
  [acme@zoo linux]$

This is under investigation: the thread__delete() calls were replaced with thread__put(), and those cause mismatches because now we need to be more judicious with the thread lifetime management. I.e. previously thread__delete() would drop the map_group refcount, but now thread__put() doesn't necessarily call thread__delete(), because we have other refcount holders, so the map_group refcount will not be what we expected when this test was implemented. Will be fixed soon...

Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-9y8e3f7ukzco5loxvnlitpfq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
It seems there's no reason to suppress the per-thread event stats of the -T option when the -s or -p option is used. Make it work with those options. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1431351879-23798-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
WEXITSTATUS consists of the least significant 8 bits of the status argument, so we should convert the value to signed char if we have valid negative exit codes. And the return value of test->func() contains negative values:

  enum {
  TEST_OK = 0,
  TEST_FAIL = -1,
  TEST_SKIP = -2,
  };

Before this patch:

  $ perf test -v 1
  ...
  test child finished with 254
  ---- end ----
  vmlinux symtab matches kallsyms: FAILED!

After this patch:

  $ perf test -v 1
  ...
  test child finished with -2
  ---- end ----
  vmlinux symtab matches kallsyms: Skip

Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1431347316-30401-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
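The conversion can be sketched in isolation as follows. The enum matches the one quoted in the commit message; demo_child_exit() is a hypothetical helper written for this illustration, not the perf test harness code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum {
	TEST_OK   =  0,
	TEST_FAIL = -1,
	TEST_SKIP = -2,
};

/*
 * Hypothetical helper: fork a child that exits with `code`, then report
 * both the raw WEXITSTATUS() value and the value after the signed char
 * conversion. Returns 0 on success, -1 if the child could not be run.
 */
static int demo_child_exit(int code, int *raw, int *decoded)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* only the low 8 bits reach the parent */

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	*raw = WEXITSTATUS(status);		/* 0..255 */
	*decoded = (signed char)WEXITSTATUS(status);	/* -128..127 */
	return 0;
}
```

With TEST_SKIP the raw value is 254 while the converted one is -2, matching the before and after output above. The cast relies on the usual two's complement narrowing that gcc and clang implement.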
-
He Kuang authored
Point the user at the variable's location range in the error message shown when we fail to find the variable.

Before this patch:

  $ perf probe --add 'generic_perform_write+118 bytes'
  Failed to find the location of bytes at this address.
  Perhaps, it has been optimized out.
  Error: Failed to add events.

After this patch:

  $ perf probe --add 'generic_perform_write+118 bytes'
  Failed to find the location of the 'bytes' variable at this address.
  Perhaps it has been optimized out.
  Use -V with the --range option to show 'bytes' location range.
  Error: Failed to add events.

Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1431336304-16863-3-git-send-email-hekuang@huawei.com [ Improve the error message based on lkml thread ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
It is not easy for users to get the accurate byte offset or the line number where a local variable can be probed. With the '--range' option, local variables in the scope of the probe point are shown with a byte offset range, and can be added according to this range information.

For example, there are some variables in the function generic_perform_write():

  <generic_perform_write@mm/filemap.c:0>
  0 ssize_t generic_perform_write(struct file *file,
  1 struct iov_iter *i, loff_t pos)
  2 {
  3 struct address_space *mapping = file->f_mapping;
  4 const struct address_space_operations *a_ops = mapping->a_ops;
  ...
  42 status = a_ops->write_begin(file, mapping, pos, bytes, flags,
  &page, &fsdata);
  44 if (unlikely(status < 0))

But we fail when we try to probe the variable 'a_ops' at line 42 or 44.

  $ perf probe --add 'generic_perform_write:42 a_ops'
  Failed to find the location of a_ops at this address.
  Perhaps, it has been optimized out.

This is because the source code does not match the assembly, so a variable may not be available at the source code line where it appears. After this patch, we can look up the accurate byte offset range of a variable. 'INV' indicates that the variable is not valid at the given point, but is available in the scope:

  $ perf probe --vars 'generic_perform_write:42' --range
  Available variables at generic_perform_write:42
  @<generic_perform_write+141>
  [INV] ssize_t written @<generic_perform_write+[324-331]>
  [INV] struct address_space_operations* a_ops @<generic_perform_write+[55-61,170-176,223-246]>
  [VAL] (unknown_type) fsdata @<generic_perform_write+[70-307,346-411]>
  [VAL] loff_t pos @<generic_perform_write+[0-286,286-336,346-411]>
  [VAL] long int status @<generic_perform_write+[83-342,346-411]>
  [VAL] long unsigned int bytes @<generic_perform_write+[122-311,320-338,346-403,403-411]>
  [VAL] struct address_space* mapping @<generic_perform_write+[35-344,346-411]>
  [VAL] struct iov_iter* i @<generic_perform_write+[0-340,346-411]>
  [VAL] struct page* page @<generic_perform_write+[70-307,346-411]>

Then it is clearer how to add a probe with this variable:

  $ perf probe --add 'generic_perform_write+170 a_ops'
  Added new event:
  probe:generic_perform_write (on generic_perform_write+170 with a_ops)

Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1431336304-16863-2-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
Use struct strbuf instead of a bare char[] to remove the length limitation of variables in variable_list, so that they do not disappear for being too long, and to prepare for adding more description for variables. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1431336304-16863-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
He Kuang authored
No need to test trace.evlist against NULL twice. Signed-off-by: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1431347316-30401-2-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
The -T/--thread option is supported only in --stdio mode (at least for now). So enforce the tty output if the option was requested. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1431184784-30525-2-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Namhyung Kim authored
'perf record -s' and 'perf report -T' should be used together to see per-thread event counts. Document the relation between these commands. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1431184784-30525-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
The last argument to strtok_r doesn't need to be initialized, it's just a placeholder to make this routine reentrant, but gcc doesn't know about that and complains, breaking the build. Fix it by setting it to NULL. Fixes: 0e111156 ("perf kmem: Print gfp flags in human readable string") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-iyyvkbnkrd9g19f6ta9zfkem@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
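A small sketch of the pattern, with a hypothetical demo_count_tokens() helper rather than the builtin-kmem.c code: the save pointer is initialized to NULL purely to keep compilers that warn about possibly uninitialized variables quiet.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the tokens in `str` separated by any
 * character in `delim`. Note that strtok_r() modifies `str` in place.
 */
static int demo_count_tokens(char *str, const char *delim)
{
	/* strtok_r() sets this on the first call; NULL only silences gcc. */
	char *saveptr = NULL;
	char *tok;
	int n = 0;

	for (tok = strtok_r(str, delim, &saveptr); tok != NULL;
	     tok = strtok_r(NULL, delim, &saveptr))
		n++;

	return n;
}
```

The input must be a writable buffer, for instance a char array holding "GFP_KERNEL|GFP_ATOMIC" (a sample string chosen for illustration), since strtok_r() writes NUL bytes over the delimiters.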
-
- 11 May, 2015 3 commits
-
-
Stephane Eranian authored
This patch enables the uncore Memory Controller (IMC) PMU support for Intel Broadwell-U (Model 61) mobile processors. The IMC PMU enables measuring memory bandwidth. To use with perf: $ perf stat -a -I 1000 -e uncore_imc/data_reads/,uncore_imc/data_writes/ sleep 10 Tested-by: Sonny Rao <sonnyrao@chromium.org> Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kan.liang@intel.com Cc: peterz@infradead.org Link: http://lkml.kernel.org/r/20150423065642.GA4890@thinkpad Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Ingo Molnar authored
Conflicts: tools/perf/builtin-kmem.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
Stephane Eranian authored
This patch enables RAPL counters (energy consumption counters) support for Intel Broadwell-U processors (Model 61). To use: $ perf stat -a -I 1000 -e power/energy-cores/,power/energy-pkg/,power/energy-ram/ sleep 10 Signed-off-by: Stephane Eranian <eranian@google.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jacob.jun.pan@linux.intel.com Cc: kan.liang@intel.com Cc: peterz@infradead.org Cc: sonnyrao@chromium.org Link: http://lkml.kernel.org/r/20150423070709.GA4970@thinkpad Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 09 May, 2015 1 commit
-
-
Ingo Molnar authored
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - 'perf probe' improvements: (Masami Hiramatsu)
    - Support glob wildcards for function name
    - Support $params special probe argument: Collect all function arguments
    - Make --line checks validate C-style function name.
    - Add --no-inlines option to avoid searching inline functions
  - Introduce new 'perf bench futex' benchmark: 'wake-parallel', to measure parallel waker threads generating contention for kernel locks (hb->lock). (Davidlohr Bueso)

Bug fixes:

  - Improve 'perf top' to survive much longer on high core count machines, more work needed to refcount more data structures besides 'struct thread' and fix more races. (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - Move barrier.h mb/rmb/wmb API from tools/perf/ to kernel like tools/arch/ hierarchy. (Arnaldo Carvalho de Melo)
  - Borrow atomic.h from the kernel, initially the x86 implementations with a fallback to gcc intrinsics for the other arches, all the kernel like framework in place for doing arch specific implementations, preferably cloning what is in the kernel to the greatest extent possible. (Arnaldo Carvalho de Melo)
  - Protect the 'struct thread' lifetime with a reference counter, and protect data structures that contain its instances with a mutex. (Arnaldo Carvalho de Melo)
  - Disable libdw DWARF unwind when built with NO_DWARF (Naveen N. Rao)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
-
- 08 May, 2015 15 commits
-
-
Naveen N. Rao authored
We get a linker error if we try to build with NO_DWARF, since we build util/unwind-libdw.c but do not include -ldw. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1430306131-6780-1-git-send-email-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Support glob wildcards for function name when adding new probes. This will allow us to build caches of function-entry level information with $params.

e.g.
----
  # perf probe --no-inlines --add 'kmalloc* $params'
  Added new events:
  probe:kmalloc_slab (on kmalloc* with $params)
  probe:kmalloc_large_node (on kmalloc* with $params)
  probe:kmalloc_order_trace (on kmalloc* with $params)

  You can now use it in all perf tools, such as:

  perf record -e probe:kmalloc_order_trace -aR sleep 1

  # perf probe --list
  probe:kmalloc_large_node (on kmalloc_large_node@mm/slub.c with size flags node)
  probe:kmalloc_order_trace (on kmalloc_order_trace@mm/slub.c with size flags order)
  probe:kmalloc_slab (on kmalloc_slab@mm/slab_common.c with size flags)
----

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150508010335.24812.19972.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Add --no-inlines (--inlines) option to avoid searching inline functions. Searching all functions that match a glob pattern can take a long time and find a lot of inline functions. With this option perf-probe searches only non-inlined functions for the target. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150508010333.24812.86568.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Introduce probe_conf global configuration parameters for probe-event and probe-finder, and remove the related parameters from APIs. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150508010330.24812.21095.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Masami Hiramatsu authored
Use perf_probe_event.target field for the target binary instead of passing it as an argument. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150508010328.24812.67887.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Davidlohr Bueso authored
Wrap futex_wait in a loop and check for EINTR. Either a spurious wakeup occurred or a signal interrupted us; either way we need to block again. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1431110280-20231-2-git-send-email-dave@stgolabs.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
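One way to picture such a retry loop is sketched below. It is not the perf bench code: demo_futex_wait() is a hypothetical thin wrapper over the raw futex system call, and demo_futex_wait_retry() additionally rechecks the futex word on every pass so that a spurious wakeup also leads to blocking again.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: glibc exposes no futex() function of its own. */
static long demo_futex_wait(uint32_t *uaddr, uint32_t expected)
{
	return syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, expected,
		       NULL, NULL, 0);
}

/*
 * Block for as long as *uaddr still holds `expected`.
 * Returns 0 once the value has changed, -1 on an unexpected error.
 */
static int demo_futex_wait_retry(uint32_t *uaddr, uint32_t expected)
{
	while (__atomic_load_n(uaddr, __ATOMIC_ACQUIRE) == expected) {
		if (demo_futex_wait(uaddr, expected) == 0)
			continue;	/* woken, maybe spuriously: recheck */
		if (errno == EINTR || errno == EAGAIN)
			continue;	/* signal or racing update: recheck */
		return -1;
	}
	return 0;
}
```

EAGAIN is treated like EINTR here because it only means the word no longer matched `expected` when the kernel looked, and the loop condition then decides whether to block again.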
-
Davidlohr Bueso authored
The futex-wake benchmark only measures wakeups done within a single process. While this has value on its own, it does not really generate any hb->lock contention. A new benchmark 'wake-parallel' is added, by extending the futex-wake code such that we can measure parallel waker threads. The program output shows the avg per-thread latency in order to complete its share of wakeups:

  Run summary [PID 13474]: blocking on 512 threads (at [private] futex 0xa88668), 8 threads waking up 64 at a time.

  [Run 1]: Avg per-thread latency (waking 64/512 threads) in 0.6230 ms (+-15.31%)
  [Run 2]: Avg per-thread latency (waking 64/512 threads) in 0.5175 ms (+-29.95%)
  [Run 3]: Avg per-thread latency (waking 64/512 threads) in 0.7578 ms (+-18.03%)
  [Run 4]: Avg per-thread latency (waking 64/512 threads) in 0.8944 ms (+-12.54%)
  [Run 5]: Avg per-thread latency (waking 64/512 threads) in 1.1204 ms (+-23.85%)
  Avg per-thread latency (waking 64/512 threads) in 0.7826 ms (+-9.91%)

Naturally, different combinations of numbers of blocking and waker threads will exhibit different information.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Link: http://lkml.kernel.org/r/1431110280-20231-1-git-send-email-dave@stgolabs.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
In addition to using refcounts for the struct thread lifetime management, we need to protect access to machine->threads from concurrent access.

That happens in 'perf top', where a thread processes events, inserting and deleting entries from that rb_tree while another thread decays hist_entries, which ends up dropping references and ultimately deleting threads from the rb_tree and releasing their resources when no further hist_entry (or other data structures, like in 'perf sched') references them.

So the rule is the same as for refcounts + protected trees in the kernel: get the tree lock, find the object, bump the refcount, drop the tree lock, return, use the object, drop the refcount if no more use of it is needed, keep it if storing it in some other data structure, drop it when releasing that data structure.

I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and "perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".

The addr_location__put() one is because, as we return references to several data structures, we may end up adding more reference counting for the other data structures and then we'll drop it at addr_location__put() time.

Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Fixing bugs in 'perf top', where the thread-unsafe 'struct thread' refcount implementation in use was falling apart because we really use two threads. Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-hil2hol294u5ntcuof4jhmn6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
Uses the arch/x86/ kernel code for x86_64/i386, falling back to a gcc intrinsics implementation that has been tested on at least sparc64. Will be used for reference counting in tools/perf. Acked-by: David Ahern <dsahern@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-knfpjowhgyh6x4z0kfuk389j@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
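A gcc-intrinsics fallback of the kind mentioned here might look roughly like the sketch below. It is an assumption-laden illustration rather than the header added by this commit: the names mirror the kernel's atomic API (`atomic_t`, `atomic_read()`, `atomic_inc()`, `atomic_dec_and_test()`), the bodies use the `__sync_*` builtins that gcc provides, and `atomic_demo()` is a hypothetical helper added only to exercise them.

```c
#include <assert.h>

/* Kernel-style atomic counter type. */
typedef struct {
	int counter;
} atomic_t;

#define ATOMIC_INIT(i) { (i) }

/* Read through a volatile pointer so the compiler does not cache the value. */
static inline int atomic_read(const atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

static inline void atomic_set(atomic_t *v, int i)
{
	v->counter = i;
}

/* Atomically add one, using the gcc builtin. */
static inline void atomic_inc(atomic_t *v)
{
	__sync_add_and_fetch(&v->counter, 1);
}

/* Atomically subtract one; true if the counter reached zero. */
static inline int atomic_dec_and_test(atomic_t *v)
{
	return __sync_sub_and_fetch(&v->counter, 1) == 0;
}

/* Exercises the operations the way a refcount would; returns 0 on success. */
static int atomic_demo(void)
{
	atomic_t refcnt = ATOMIC_INIT(1);

	atomic_inc(&refcnt); /* get */
	if (atomic_read(&refcnt) != 2)
		return -1;
	if (atomic_dec_and_test(&refcnt)) /* put: 2 -> 1, not the last one */
		return -1;
	if (!atomic_dec_and_test(&refcnt)) /* put: 1 -> 0, last reference */
		return -1;
	return 0;
}
```

A refcount "put" built on `atomic_dec_and_test()` frees the object only in the one caller that sees the counter reach zero, which is the property the two-threaded 'perf top' case needs.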
-
Arnaldo Carvalho de Melo authored
We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. The parisc stuff was just using the asm-generic/barrier.h, no need to introduce a tools/arch/parisc/ tree just yet. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-tfas9bs1gje0hfsvhqgrosd6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jwcs4r1lo0ld8a4ricbe0zug@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-c5a8m8lbjuy0agep6giykxbz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-lp68dspbtjcwbpzd7x5c6zp5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-
Arnaldo Carvalho de Melo authored
We will need it for atomic.h, so move it from the ad-hoc tools/perf/ place to a tools/ subset of the kernel arch/ hierarchy. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-cgfhreaejd7ohitdjccu9k2o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-