Commit 41aad2a6 authored by Ingo Molnar's avatar Ingo Molnar

Merge tag 'perf-core-for-mingo-20160929' of...

Merge tag 'perf-core-for-mingo-20160929' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:
---------------------

New features:

- Add support for using symbols in address filters with Intel PT and ARM
  CoreSight (hardware assisted tracing facilities) (Adrian Hunter, Mathieu Poirier)

Fixes:

- Fix MMAP event synthesis for pre-existing threads when no hugetlbfs
  mount is in place (Adrian Hunter)

- Don't ignore kernel idle symbols in 'perf script' (Adrian Hunter)

- Assorted Intel PT fixes (Adrian Hunter)

Improvements:

- Fix handling of C++ symbols in 'perf probe' (Masami Hiramatsu)

- Beautify sched_[gs]et_attr return value in 'perf trace' (Arnaldo Carvalho de Melo)

Infrastructure changes:
-----------------------

New features:

- Add dwarf unwind 'perf test' for powerpc (Ravi Bangoria)

Fixes:

- Fix error paths in 'perf record' (Adrian Hunter)

Documentation:

- Update documentation info about quipper, a C++ parser for converting
  to/from perf.data/chromium profiling format (Simon Que)

Build Fixes:

  Fix building in 32 bit platform with libbabeltrace (Wang Nan)
Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
parents 6b652de2 d18019a5
......@@ -35,15 +35,15 @@ OPTIONS
- a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
'param1', 'param2', etc are defined as formats for the PMU in
/sys/bus/event_sources/devices/<pmu>/format/*.
/sys/bus/event_source/devices/<pmu>/format/*.
- a symbolically formed event like 'pmu/config=M,config1=N,config3=K/'
where M, N, K are numbers (in decimal, hex, octal format). Acceptable
values for each of 'config', 'config1' and 'config2' are defined by
corresponding entries in /sys/bus/event_sources/devices/<pmu>/format/*
corresponding entries in /sys/bus/event_source/devices/<pmu>/format/*
param1 and param2 are defined as formats for the PMU in:
/sys/bus/event_sources/devices/<pmu>/format/*
/sys/bus/event_source/devices/<pmu>/format/*
There are also some params which are not defined in .../<pmu>/format/*.
These params can be used to overload default config values per event.
......@@ -89,9 +89,62 @@ OPTIONS
--filter=<filter>::
Event filter. This option should follow a event selector (-e) which
selects tracepoint event(s). Multiple '--filter' options are combined
selects either tracepoint event(s) or a hardware trace PMU
(e.g. Intel PT or CoreSight).
- tracepoint filters
In the case of tracepoints, multiple '--filter' options are combined
using '&&'.
- address filters
A hardware trace PMU advertises its ability to accept a number of
address filters by specifying a non-zero value in
/sys/bus/event_source/devices/<pmu>/nr_addr_filters.
Address filters have the format:
filter|start|stop|tracestop <start> [/ <size>] [@<file name>]
Where:
- 'filter': defines a region that will be traced.
- 'start': defines an address at which tracing will begin.
- 'stop': defines an address at which tracing will stop.
- 'tracestop': defines a region in which tracing will stop.
<file name> is the name of the object file, <start> is the offset to the
code to trace in that file, and <size> is the size of the region to
trace. 'start' and 'stop' filters need not specify a <size>.
If no object file is specified then the kernel is assumed, in which case
the start address must be a current kernel memory address.
<start> can also be specified by providing the name of a symbol. If the
symbol name is not unique, it can be disambiguated by inserting #n where
'n' selects the n'th symbol in address order. Alternately #0, #g or #G
select only a global symbol. <size> can also be specified by providing
the name of a symbol, in which case the size is calculated to the end
of that symbol. For 'filter' and 'tracestop' filters, if <size> is
omitted and <start> is a symbol, then the size is calculated to the end
of that symbol.
If <size> is omitted and <start> is '*', then the start and size will
be calculated from the first and last symbols, i.e. to trace the whole
file.
If symbol names (or '*') are provided, they must be surrounded by white
space.
The filter passed to the kernel is not necessarily the same as entered.
To see the filter that is passed, use the -v option.
The kernel may not be able to configure a trace region if it is not
within a single mapping. MMAP events (or /proc/<pid>/maps) can be
examined to determine if that is a possibility.
Multiple filters can be separated with space or comma.
--exclude-perf::
Don't record events issued by perf itself. This option should follow
a event selector (-e) which selects tracepoint event(s). It adds a
......
......@@ -437,6 +437,10 @@ in pmu-tools parser. This allows to read perf.data from python and dump it.
quipper
The quipper C++ parser is available at
https://chromium.googlesource.com/chromiumos/platform/chromiumos-wide-profiling/
https://chromium.googlesource.com/chromiumos/platform2
It is under the chromiumos-wide-profiling/ subdirectory. This library can
convert a perf data file to a protobuf and vice versa.
Unfortunately this parser tends to be many versions behind and may not be able
to parse data files generated by recent perf.
libperf-y += util/
libperf-y += tests/
#ifndef ARCH_TESTS_H
#define ARCH_TESTS_H
#ifdef HAVE_DWARF_UNWIND_SUPPORT
struct thread;
struct perf_sample;
int test__arch_unwind_sample(struct perf_sample *sample,
struct thread *thread);
#endif
extern struct test arch_tests[];
#endif
......@@ -5,6 +5,8 @@
#include <linux/types.h>
#include <asm/perf_regs.h>
void perf_regs_load(u64 *regs);
#define PERF_REGS_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1)
#define PERF_REGS_MAX PERF_REG_POWERPC_MAX
#ifdef __powerpc64__
......
libperf-$(CONFIG_DWARF_UNWIND) += regs_load.o
libperf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
libperf-y += arch-tests.o
#include <string.h>
#include "tests/tests.h"
#include "arch-tests.h"
struct test arch_tests[] = {
#ifdef HAVE_DWARF_UNWIND_SUPPORT
{
.desc = "Test dwarf unwind",
.func = test__dwarf_unwind,
},
#endif
{
.func = NULL,
},
};
#include <string.h>
#include "perf_regs.h"
#include "thread.h"
#include "map.h"
#include "event.h"
#include "debug.h"
#include "tests/tests.h"
#include "arch-tests.h"
#define STACK_SIZE 8192
static int sample_ustack(struct perf_sample *sample,
struct thread *thread, u64 *regs)
{
struct stack_dump *stack = &sample->user_stack;
struct map *map;
unsigned long sp;
u64 stack_size, *buf;
buf = malloc(STACK_SIZE);
if (!buf) {
pr_debug("failed to allocate sample uregs data\n");
return -1;
}
sp = (unsigned long) regs[PERF_REG_POWERPC_R1];
map = map_groups__find(thread->mg, MAP__VARIABLE, (u64) sp);
if (!map) {
pr_debug("failed to get stack map\n");
free(buf);
return -1;
}
stack_size = map->end - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);
stack->data = (char *) buf;
stack->size = stack_size;
return 0;
}
int test__arch_unwind_sample(struct perf_sample *sample,
struct thread *thread)
{
struct regs_dump *regs = &sample->user_regs;
u64 *buf;
buf = calloc(1, sizeof(u64) * PERF_REGS_MAX);
if (!buf) {
pr_debug("failed to allocate sample uregs data\n");
return -1;
}
perf_regs_load(buf);
regs->abi = PERF_SAMPLE_REGS_ABI;
regs->regs = buf;
regs->mask = PERF_REGS_MASK;
return sample_ustack(sample, thread, buf);
}
#include <linux/linkage.h>
/* Offset is based on macros from arch/powerpc/include/uapi/asm/ptrace.h. */
#define R0 0
#define R1 1 * 8
#define R2 2 * 8
#define R3 3 * 8
#define R4 4 * 8
#define R5 5 * 8
#define R6 6 * 8
#define R7 7 * 8
#define R8 8 * 8
#define R9 9 * 8
#define R10 10 * 8
#define R11 11 * 8
#define R12 12 * 8
#define R13 13 * 8
#define R14 14 * 8
#define R15 15 * 8
#define R16 16 * 8
#define R17 17 * 8
#define R18 18 * 8
#define R19 19 * 8
#define R20 20 * 8
#define R21 21 * 8
#define R22 22 * 8
#define R23 23 * 8
#define R24 24 * 8
#define R25 25 * 8
#define R26 26 * 8
#define R27 27 * 8
#define R28 28 * 8
#define R29 29 * 8
#define R30 30 * 8
#define R31 31 * 8
#define NIP 32 * 8
#define CTR 35 * 8
#define LINK 36 * 8
#define XER 37 * 8
.globl perf_regs_load
perf_regs_load:
std 0, R0(3)
std 1, R1(3)
std 2, R2(3)
std 3, R3(3)
std 4, R4(3)
std 5, R5(3)
std 6, R6(3)
std 7, R7(3)
std 8, R8(3)
std 9, R9(3)
std 10, R10(3)
std 11, R11(3)
std 12, R12(3)
std 13, R13(3)
std 14, R14(3)
std 15, R15(3)
std 16, R16(3)
std 17, R17(3)
std 18, R18(3)
std 19, R19(3)
std 20, R20(3)
std 21, R21(3)
std 22, R22(3)
std 23, R23(3)
std 24, R24(3)
std 25, R25(3)
std 26, R26(3)
std 27, R27(3)
std 28, R28(3)
std 29, R29(3)
std 30, R30(3)
std 31, R31(3)
/* store NIP */
mflr 4
std 4, NIP(3)
/* Store LR */
std 4, LINK(3)
/* Store XER */
mfxer 4
std 4, XER(3)
/* Store CTR */
mfctr 4
std 4, CTR(3)
/* Restore original value of r4 */
ld 4, R4(3)
blr
......@@ -62,6 +62,7 @@ struct intel_pt_recording {
size_t snapshot_ref_buf_size;
int snapshot_ref_cnt;
struct intel_pt_snapshot_ref *snapshot_refs;
size_t priv_size;
};
static int intel_pt_parse_terms_with_default(struct list_head *formats,
......@@ -273,11 +274,37 @@ intel_pt_pmu_default_config(struct perf_pmu *intel_pt_pmu)
return attr;
}
static const char *intel_pt_find_filter(struct perf_evlist *evlist,
struct perf_pmu *intel_pt_pmu)
{
struct perf_evsel *evsel;
evlist__for_each_entry(evlist, evsel) {
if (evsel->attr.type == intel_pt_pmu->type)
return evsel->filter;
}
return NULL;
}
static size_t intel_pt_filter_bytes(const char *filter)
{
size_t len = filter ? strlen(filter) : 0;
return len ? roundup(len + 1, 8) : 0;
}
static size_t
intel_pt_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct perf_evlist *evlist __maybe_unused)
intel_pt_info_priv_size(struct auxtrace_record *itr, struct perf_evlist *evlist)
{
return INTEL_PT_AUXTRACE_PRIV_SIZE;
struct intel_pt_recording *ptr =
container_of(itr, struct intel_pt_recording, itr);
const char *filter = intel_pt_find_filter(evlist, ptr->intel_pt_pmu);
ptr->priv_size = (INTEL_PT_AUXTRACE_PRIV_MAX * sizeof(u64)) +
intel_pt_filter_bytes(filter);
return ptr->priv_size;
}
static void intel_pt_tsc_ctc_ratio(u32 *n, u32 *d)
......@@ -302,9 +329,13 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
bool cap_user_time_zero = false, per_cpu_mmaps;
u64 tsc_bit, mtc_bit, mtc_freq_bits, cyc_bit, noretcomp_bit;
u32 tsc_ctc_ratio_n, tsc_ctc_ratio_d;
unsigned long max_non_turbo_ratio;
size_t filter_str_len;
const char *filter;
u64 *info;
int err;
if (priv_size != INTEL_PT_AUXTRACE_PRIV_SIZE)
if (priv_size != ptr->priv_size)
return -EINVAL;
intel_pt_parse_terms(&intel_pt_pmu->format, "tsc", &tsc_bit);
......@@ -317,6 +348,13 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
intel_pt_tsc_ctc_ratio(&tsc_ctc_ratio_n, &tsc_ctc_ratio_d);
if (perf_pmu__scan_file(intel_pt_pmu, "max_nonturbo_ratio",
"%lu", &max_non_turbo_ratio) != 1)
max_non_turbo_ratio = 0;
filter = intel_pt_find_filter(session->evlist, ptr->intel_pt_pmu);
filter_str_len = filter ? strlen(filter) : 0;
if (!session->evlist->nr_mmaps)
return -EINVAL;
......@@ -351,6 +389,17 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
auxtrace_info->priv[INTEL_PT_TSC_CTC_N] = tsc_ctc_ratio_n;
auxtrace_info->priv[INTEL_PT_TSC_CTC_D] = tsc_ctc_ratio_d;
auxtrace_info->priv[INTEL_PT_CYC_BIT] = cyc_bit;
auxtrace_info->priv[INTEL_PT_MAX_NONTURBO_RATIO] = max_non_turbo_ratio;
auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] = filter_str_len;
info = &auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] + 1;
if (filter_str_len) {
size_t len = intel_pt_filter_bytes(filter);
strncpy((char *)info, filter, len);
info += len >> 3;
}
return 0;
}
......
......@@ -1573,29 +1573,39 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (!rec->itr) {
rec->itr = auxtrace_record__init(rec->evlist, &err);
if (err)
return err;
goto out;
}
err = auxtrace_parse_snapshot_options(rec->itr, &rec->opts,
rec->opts.auxtrace_snapshot_opts);
if (err)
return err;
goto out;
/*
* Allow aliases to facilitate the lookup of symbols for address
* filters. Refer to auxtrace_parse_filters().
*/
symbol_conf.allow_aliases = true;
symbol__init(NULL);
err = auxtrace_parse_filters(rec->evlist);
if (err)
goto out;
if (dry_run)
return 0;
goto out;
err = bpf__setup_stdout(rec->evlist);
if (err) {
bpf__strerror_setup_stdout(rec->evlist, err, errbuf, sizeof(errbuf));
pr_err("ERROR: Setup BPF stdout failed: %s\n",
errbuf);
return err;
goto out;
}
err = -ENOMEM;
symbol__init(NULL);
if (symbol_conf.kptr_restrict)
pr_warning(
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
......@@ -1643,7 +1653,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist->nr_entries == 0 &&
perf_evlist__add_default(rec->evlist) < 0) {
pr_err("Not enough memory for event selector list\n");
goto out_symbol_exit;
goto out;
}
if (rec->opts.target.tid && !rec->opts.no_inherit_set)
......@@ -1663,7 +1673,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
ui__error("%s", errbuf);
err = -saved_errno;
goto out_symbol_exit;
goto out;
}
err = -ENOMEM;
......@@ -1672,7 +1682,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
err = auxtrace_record__options(rec->itr, rec->evlist, &rec->opts);
if (err)
goto out_symbol_exit;
goto out;
/*
* We take all buildids when the file contains
......@@ -1684,11 +1694,11 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (record_opts__config(&rec->opts)) {
err = -EINVAL;
goto out_symbol_exit;
goto out;
}
err = __cmd_record(&record, argc, argv);
out_symbol_exit:
out:
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);
......
......@@ -742,6 +742,8 @@ static struct syscall_fmt {
.arg_scnprintf = { [1] = SCA_SIGNUM, /* sig */ }, },
{ .name = "rt_tgsigqueueinfo", .errmsg = true,
.arg_scnprintf = { [2] = SCA_SIGNUM, /* sig */ }, },
{ .name = "sched_getattr", .errmsg = true, },
{ .name = "sched_setattr", .errmsg = true, },
{ .name = "sched_setscheduler", .errmsg = true,
.arg_scnprintf = { [1] = SCA_SCHED_POLICY, /* policy */ }, },
{ .name = "seccomp", .errmsg = true,
......@@ -2141,6 +2143,7 @@ static int trace__add_syscall_newtp(struct trace *trace)
static int trace__set_ev_qualifier_filter(struct trace *trace)
{
int err = -1;
struct perf_evsel *sys_exit;
char *filter = asprintf_expr_inout_ints("id", !trace->not_ev_qualifier,
trace->ev_qualifier_ids.nr,
trace->ev_qualifier_ids.entries);
......@@ -2148,8 +2151,11 @@ static int trace__set_ev_qualifier_filter(struct trace *trace)
if (filter == NULL)
goto out_enomem;
if (!perf_evsel__append_filter(trace->syscalls.events.sys_enter, "&&", filter))
err = perf_evsel__append_filter(trace->syscalls.events.sys_exit, "&&", filter);
if (!perf_evsel__append_tp_filter(trace->syscalls.events.sys_enter,
filter)) {
sys_exit = trace->syscalls.events.sys_exit;
err = perf_evsel__append_tp_filter(sys_exit, filter);
}
free(filter);
out:
......
......@@ -71,7 +71,7 @@ $(OUTPUT)tests/llvm-src-relocation.c: tests/bpf-script-test-relocation.c tests/B
$(Q)sed -e 's/"/\\"/g' -e 's/\(.*\)/"\1\\n"/g' $< >> $@
$(Q)echo ';' >> $@
ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64))
ifeq ($(ARCH),$(filter $(ARCH),x86 arm arm64 powerpc))
perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
endif
......
......@@ -11,7 +11,7 @@
#include "thread.h"
#include "callchain.h"
#if defined (__x86_64__) || defined (__i386__)
#if defined (__x86_64__) || defined (__i386__) || defined (__powerpc__)
#include "arch-tests.h"
#endif
......
This diff is collapsed.
......@@ -318,6 +318,48 @@ struct auxtrace_record {
unsigned int alignment;
};
/**
* struct addr_filter - address filter.
* @list: list node
* @range: true if it is a range filter
* @start: true if action is 'filter' or 'start'
* @action: 'filter', 'start' or 'stop' ('tracestop' is accepted but converted
* to 'stop')
* @sym_from: symbol name for the filter address
* @sym_to: symbol name that determines the filter size
* @sym_from_idx: selects n'th from symbols with the same name (0 means global
* and less than 0 means symbol must be unique)
* @sym_to_idx: same as @sym_from_idx but for @sym_to
* @addr: filter address
* @size: filter region size (for range filters)
* @filename: DSO file name or NULL for the kernel
* @str: allocated string that contains the other string members
*/
struct addr_filter {
struct list_head list;
bool range;
bool start;
const char *action;
const char *sym_from;
const char *sym_to;
int sym_from_idx;
int sym_to_idx;
u64 addr;
u64 size;
const char *filename;
char *str;
};
/**
* struct addr_filters - list of address filters.
* @head: list of address filters
* @cnt: number of address filters
*/
struct addr_filters {
struct list_head head;
int cnt;
};
#ifdef HAVE_AUXTRACE_SUPPORT
/*
......@@ -482,6 +524,12 @@ void perf_session__auxtrace_error_inc(struct perf_session *session,
union perf_event *event);
void events_stats__auxtrace_error_warn(const struct events_stats *stats);
void addr_filters__init(struct addr_filters *filts);
void addr_filters__exit(struct addr_filters *filts);
int addr_filters__parse_bare_filter(struct addr_filters *filts,
const char *filter);
int auxtrace_parse_filters(struct perf_evlist *evlist);
static inline int auxtrace__process_event(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample,
......@@ -640,6 +688,12 @@ void auxtrace_index__free(struct list_head *head __maybe_unused)
{
}
static inline
int auxtrace_parse_filters(struct perf_evlist *evlist __maybe_unused)
{
return 0;
}
int auxtrace_mmap__mmap(struct auxtrace_mmap *mm,
struct auxtrace_mmap_params *mp,
void *userpg, int fd);
......
......@@ -620,7 +620,7 @@ static int build_id_cache__add_sdt_cache(const char *sbuild_id,
ret = probe_cache__scan_sdt(cache, realname);
if (ret >= 0) {
pr_debug("Found %d SDTs in %s\n", ret, realname);
pr_debug4("Found %d SDTs in %s\n", ret, realname);
if (probe_cache__commit(cache) < 0)
ret = -1;
}
......@@ -691,7 +691,7 @@ int build_id_cache__add_s(const char *sbuild_id, const char *name,
/* Update SDT cache : error is just warned */
if (build_id_cache__add_sdt_cache(sbuild_id, realname) < 0)
pr_debug("Failed to update/scan SDT cache for %s\n", realname);
pr_debug4("Failed to update/scan SDT cache for %s\n", realname);
out_free:
if (!is_kallsyms)
......
......@@ -437,7 +437,7 @@ add_bpf_output_values(struct bt_ctf_event_class *event_class,
int ret;
if (nr_elements * sizeof(u32) != raw_size)
pr_warning("Incorrect raw_size (%u) in bpf output event, skip %lu bytes\n",
pr_warning("Incorrect raw_size (%u) in bpf output event, skip %zu bytes\n",
raw_size, nr_elements * sizeof(u32) - raw_size);
len_type = bt_ctf_event_class_get_field_by_name(event_class, "raw_len");
......
......@@ -129,6 +129,22 @@ int cu_walk_functions_at(Dwarf_Die *cu_die, Dwarf_Addr addr,
}
/**
* die_get_linkage_name - Get the linkage name of the object
* @dw_die: A DIE of the object
*
* Get the linkage name attiribute of given @dw_die.
* For C++ binary, the linkage name will be the mangled symbol.
*/
const char *die_get_linkage_name(Dwarf_Die *dw_die)
{
Dwarf_Attribute attr;
if (dwarf_attr_integrate(dw_die, DW_AT_linkage_name, &attr) == NULL)
return NULL;
return dwarf_formstring(&attr);
}
/**
* die_compare_name - Compare diename and tname
* @dw_die: a DIE
......@@ -145,18 +161,26 @@ bool die_compare_name(Dwarf_Die *dw_die, const char *tname)
}
/**
* die_match_name - Match diename and glob
* die_match_name - Match diename/linkage name and glob
* @dw_die: a DIE
* @glob: a string of target glob pattern
*
* Glob matching the name of @dw_die and @glob. Return false if matching fail.
* This also match linkage name.
*/
bool die_match_name(Dwarf_Die *dw_die, const char *glob)
{
const char *name;
name = dwarf_diename(dw_die);
return name ? strglobmatch(name, glob) : false;
if (name && strglobmatch(name, glob))
return true;
/* fall back to check linkage name */
name = die_get_linkage_name(dw_die);
if (name && strglobmatch(name, glob))
return true;
return false;
}
/**
......
......@@ -38,6 +38,9 @@ int cu_find_lineinfo(Dwarf_Die *cudie, unsigned long addr,
int cu_walk_functions_at(Dwarf_Die *cu_die, Dwarf_Addr addr,
int (*callback)(Dwarf_Die *, void *), void *data);
/* Get DW_AT_linkage_name (should be NULL for C binary) */
const char *die_get_linkage_name(Dwarf_Die *dw_die);
/* Ensure that this DIE is a subprogram and definition (not declaration) */
bool die_is_func_def(Dwarf_Die *dw_die);
......
......@@ -346,7 +346,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
if (!strcmp(execname, ""))
strcpy(execname, anonstr);
if (!strncmp(execname, hugetlbfs_mnt, hugetlbfs_mnt_len)) {
if (hugetlbfs_mnt_len &&
!strncmp(execname, hugetlbfs_mnt, hugetlbfs_mnt_len)) {
strcpy(execname, anonstr);
event->mmap2.flags |= MAP_HUGETLB;
}
......
......@@ -1045,15 +1045,15 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, const char *filter)
return -1;
}
int perf_evsel__append_filter(struct perf_evsel *evsel,
const char *op, const char *filter)
static int perf_evsel__append_filter(struct perf_evsel *evsel,
const char *fmt, const char *filter)
{
char *new_filter;
if (evsel->filter == NULL)
return perf_evsel__set_filter(evsel, filter);
if (asprintf(&new_filter,"(%s) %s (%s)", evsel->filter, op, filter) > 0) {
if (asprintf(&new_filter, fmt, evsel->filter, filter) > 0) {
free(evsel->filter);
evsel->filter = new_filter;
return 0;
......@@ -1062,6 +1062,16 @@ int perf_evsel__append_filter(struct perf_evsel *evsel,
return -1;
}
int perf_evsel__append_tp_filter(struct perf_evsel *evsel, const char *filter)
{
return perf_evsel__append_filter(evsel, "(%s) && (%s)", filter);
}
int perf_evsel__append_addr_filter(struct perf_evsel *evsel, const char *filter)
{
return perf_evsel__append_filter(evsel, "%s,%s", filter);
}
int perf_evsel__enable(struct perf_evsel *evsel)
{
int nthreads = thread_map__nr(evsel->threads);
......
......@@ -235,8 +235,9 @@ void perf_evsel__set_sample_id(struct perf_evsel *evsel,
bool use_sample_identifier);
int perf_evsel__set_filter(struct perf_evsel *evsel, const char *filter);
int perf_evsel__append_filter(struct perf_evsel *evsel,
const char *op, const char *filter);
int perf_evsel__append_tp_filter(struct perf_evsel *evsel, const char *filter);
int perf_evsel__append_addr_filter(struct perf_evsel *evsel,
const char *filter);
int perf_evsel__apply_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
const char *filter);
int perf_evsel__enable(struct perf_evsel *evsel);
......
......@@ -122,9 +122,6 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
if (!node)
break;
if (node->sym && node->sym->idle)
goto next;
printed += fprintf(fp, "%-*.*s", left_alignment, left_alignment, " ");
if (print_ip)
......@@ -158,7 +155,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
if (!print_oneline)
printed += fprintf(fp, "\n");
next:
callchain_cursor_advance(cursor);
}
}
......@@ -181,7 +178,7 @@ int sample__fprintf_sym(struct perf_sample *sample, struct addr_location *al,
if (cursor != NULL) {
printed += sample__fprintf_callchain(sample, left_alignment,
print_opts, cursor, fp);
} else if (!(al->sym && al->sym->idle)) {
} else {
printed += fprintf(fp, "%-*.*s", left_alignment, left_alignment, " ");
if (print_ip)
......
......@@ -80,6 +80,7 @@ struct intel_pt_decoder {
int (*walk_insn)(struct intel_pt_insn *intel_pt_insn,
uint64_t *insn_cnt_ptr, uint64_t *ip, uint64_t to_ip,
uint64_t max_insn_cnt, void *data);
bool (*pgd_ip)(uint64_t ip, void *data);
void *data;
struct intel_pt_state state;
const unsigned char *buf;
......@@ -186,6 +187,7 @@ struct intel_pt_decoder *intel_pt_decoder_new(struct intel_pt_params *params)
decoder->get_trace = params->get_trace;
decoder->walk_insn = params->walk_insn;
decoder->pgd_ip = params->pgd_ip;
decoder->data = params->data;
decoder->return_compression = params->return_compression;
......@@ -1008,6 +1010,19 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
int err;
err = intel_pt_walk_insn(decoder, &intel_pt_insn, 0);
if (err == INTEL_PT_RETURN &&
decoder->pgd_ip &&
decoder->pkt_state == INTEL_PT_STATE_TIP_PGD &&
(decoder->state.type & INTEL_PT_BRANCH) &&
decoder->pgd_ip(decoder->state.to_ip, decoder->data)) {
/* Unconditional branch leaving filter region */
decoder->no_progress = 0;
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->state.to_ip = 0;
return 0;
}
if (err == INTEL_PT_RETURN)
return 0;
if (err)
......@@ -1036,6 +1051,21 @@ static int intel_pt_walk_tip(struct intel_pt_decoder *decoder)
}
if (intel_pt_insn.branch == INTEL_PT_BR_CONDITIONAL) {
uint64_t to_ip = decoder->ip + intel_pt_insn.length +
intel_pt_insn.rel;
if (decoder->pgd_ip &&
decoder->pkt_state == INTEL_PT_STATE_TIP_PGD &&
decoder->pgd_ip(to_ip, decoder->data)) {
/* Conditional branch leaving filter region */
decoder->pge = false;
decoder->continuous_period = false;
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
decoder->ip = to_ip;
decoder->state.from_ip = decoder->ip;
decoder->state.to_ip = 0;
return 0;
}
intel_pt_log_at("ERROR: Conditional branch when expecting indirect branch",
decoder->ip);
decoder->pkt_state = INTEL_PT_STATE_ERR_RESYNC;
......
......@@ -83,6 +83,7 @@ struct intel_pt_params {
int (*walk_insn)(struct intel_pt_insn *intel_pt_insn,
uint64_t *insn_cnt_ptr, uint64_t *ip, uint64_t to_ip,
uint64_t max_insn_cnt, void *data);
bool (*pgd_ip)(uint64_t ip, void *data);
void *data;
bool return_compression;
uint64_t period;
......
......@@ -103,6 +103,9 @@ struct intel_pt {
unsigned max_non_turbo_ratio;
unsigned long num_events;
char *filter;
struct addr_filters filts;
};
enum switch_state {
......@@ -241,7 +244,7 @@ static int intel_pt_get_trace(struct intel_pt_buffer *b, void *data)
}
queue = &ptq->pt->queues.queue_array[ptq->queue_nr];
next:
buffer = auxtrace_buffer__next(queue, buffer);
if (!buffer) {
if (old_buffer)
......@@ -264,9 +267,6 @@ static int intel_pt_get_trace(struct intel_pt_buffer *b, void *data)
intel_pt_do_fix_overlap(ptq->pt, old_buffer, buffer))
return -ENOMEM;
if (old_buffer)
auxtrace_buffer__drop_data(old_buffer);
if (buffer->use_data) {
b->len = buffer->use_size;
b->buf = buffer->use_data;
......@@ -276,6 +276,16 @@ static int intel_pt_get_trace(struct intel_pt_buffer *b, void *data)
}
b->ref_timestamp = buffer->reference;
/*
* If in snapshot mode and the buffer has no usable data, get next
* buffer and again check overlap against old_buffer.
*/
if (ptq->pt->snapshot_mode && !b->len)
goto next;
if (old_buffer)
auxtrace_buffer__drop_data(old_buffer);
if (!old_buffer || ptq->pt->sampling_mode || (ptq->pt->snapshot_mode &&
!buffer->consecutive)) {
b->consecutive = false;
......@@ -541,6 +551,76 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
return 0;
}
static bool intel_pt_match_pgd_ip(struct intel_pt *pt, uint64_t ip,
uint64_t offset, const char *filename)
{
struct addr_filter *filt;
bool have_filter = false;
bool hit_tracestop = false;
bool hit_filter = false;
list_for_each_entry(filt, &pt->filts.head, list) {
if (filt->start)
have_filter = true;
if ((filename && !filt->filename) ||
(!filename && filt->filename) ||
(filename && strcmp(filename, filt->filename)))
continue;
if (!(offset >= filt->addr && offset < filt->addr + filt->size))
continue;
intel_pt_log("TIP.PGD ip %#"PRIx64" offset %#"PRIx64" in %s hit filter: %s offset %#"PRIx64" size %#"PRIx64"\n",
ip, offset, filename ? filename : "[kernel]",
filt->start ? "filter" : "stop",
filt->addr, filt->size);
if (filt->start)
hit_filter = true;
else
hit_tracestop = true;
}
if (!hit_tracestop && !hit_filter)
intel_pt_log("TIP.PGD ip %#"PRIx64" offset %#"PRIx64" in %s is not in a filter region\n",
ip, offset, filename ? filename : "[kernel]");
return hit_tracestop || (have_filter && !hit_filter);
}
static int __intel_pt_pgd_ip(uint64_t ip, void *data)
{
struct intel_pt_queue *ptq = data;
struct thread *thread;
struct addr_location al;
u8 cpumode;
u64 offset;
if (ip >= ptq->pt->kernel_start)
return intel_pt_match_pgd_ip(ptq->pt, ip, ip, NULL);
cpumode = PERF_RECORD_MISC_USER;
thread = ptq->thread;
if (!thread)
return -EINVAL;
thread__find_addr_map(thread, cpumode, MAP__FUNCTION, ip, &al);
if (!al.map || !al.map->dso)
return -EINVAL;
offset = al.map->map_ip(al.map, ip);
return intel_pt_match_pgd_ip(ptq->pt, ip, offset,
al.map->dso->long_name);
}
static bool intel_pt_pgd_ip(uint64_t ip, void *data)
{
return __intel_pt_pgd_ip(ip, data) > 0;
}
static bool intel_pt_get_config(struct intel_pt *pt,
struct perf_event_attr *attr, u64 *config)
{
......@@ -717,6 +797,9 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
params.tsc_ctc_ratio_n = pt->tsc_ctc_ratio_n;
params.tsc_ctc_ratio_d = pt->tsc_ctc_ratio_d;
if (pt->filts.cnt > 0)
params.pgd_ip = intel_pt_pgd_ip;
if (pt->synth_opts.instructions) {
if (pt->synth_opts.period) {
switch (pt->synth_opts.period_type) {
......@@ -1767,6 +1850,8 @@ static void intel_pt_free(struct perf_session *session)
intel_pt_free_events(session);
session->auxtrace = NULL;
thread__put(pt->unknown_thread);
addr_filters__exit(&pt->filts);
zfree(&pt->filter);
free(pt);
}
......@@ -2016,6 +2101,8 @@ static const char * const intel_pt_info_fmts[] = {
[INTEL_PT_TSC_CTC_N] = " TSC:CTC numerator %"PRIu64"\n",
[INTEL_PT_TSC_CTC_D] = " TSC:CTC denominator %"PRIu64"\n",
[INTEL_PT_CYC_BIT] = " CYC bit %#"PRIx64"\n",
[INTEL_PT_MAX_NONTURBO_RATIO] = " Max non-turbo ratio %"PRIu64"\n",
[INTEL_PT_FILTER_STR_LEN] = " Filter string len. %"PRIu64"\n",
};
static void intel_pt_print_info(u64 *arr, int start, int finish)
......@@ -2029,12 +2116,28 @@ static void intel_pt_print_info(u64 *arr, int start, int finish)
fprintf(stdout, intel_pt_info_fmts[i], arr[i]);
}
static void intel_pt_print_info_str(const char *name, const char *str)
{
if (!dump_trace)
return;
fprintf(stdout, " %-20s%s\n", name, str ? str : "");
}
static bool intel_pt_has(struct auxtrace_info_event *auxtrace_info, int pos)
{
return auxtrace_info->header.size >=
sizeof(struct auxtrace_info_event) + (sizeof(u64) * (pos + 1));
}
int intel_pt_process_auxtrace_info(union perf_event *event,
struct perf_session *session)
{
struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
size_t min_sz = sizeof(u64) * INTEL_PT_PER_CPU_MMAPS;
struct intel_pt *pt;
void *info_end;
u64 *info;
int err;
if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) +
......@@ -2045,6 +2148,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
if (!pt)
return -ENOMEM;
addr_filters__init(&pt->filts);
perf_config(intel_pt_perf_config, pt);
err = auxtrace_queues__init(&pt->queues);
......@@ -2069,8 +2174,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
intel_pt_print_info(&auxtrace_info->priv[0], INTEL_PT_PMU_TYPE,
INTEL_PT_PER_CPU_MMAPS);
if (auxtrace_info->header.size >= sizeof(struct auxtrace_info_event) +
(sizeof(u64) * INTEL_PT_CYC_BIT)) {
if (intel_pt_has(auxtrace_info, INTEL_PT_CYC_BIT)) {
pt->mtc_bit = auxtrace_info->priv[INTEL_PT_MTC_BIT];
pt->mtc_freq_bits = auxtrace_info->priv[INTEL_PT_MTC_FREQ_BITS];
pt->tsc_ctc_ratio_n = auxtrace_info->priv[INTEL_PT_TSC_CTC_N];
......@@ -2080,6 +2184,54 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
INTEL_PT_CYC_BIT);
}
if (intel_pt_has(auxtrace_info, INTEL_PT_MAX_NONTURBO_RATIO)) {
pt->max_non_turbo_ratio =
auxtrace_info->priv[INTEL_PT_MAX_NONTURBO_RATIO];
intel_pt_print_info(&auxtrace_info->priv[0],
INTEL_PT_MAX_NONTURBO_RATIO,
INTEL_PT_MAX_NONTURBO_RATIO);
}
info = &auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] + 1;
info_end = (void *)info + auxtrace_info->header.size;
if (intel_pt_has(auxtrace_info, INTEL_PT_FILTER_STR_LEN)) {
size_t len;
len = auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN];
intel_pt_print_info(&auxtrace_info->priv[0],
INTEL_PT_FILTER_STR_LEN,
INTEL_PT_FILTER_STR_LEN);
if (len) {
const char *filter = (const char *)info;
len = roundup(len + 1, 8);
info += len >> 3;
if ((void *)info > info_end) {
pr_err("%s: bad filter string length\n", __func__);
err = -EINVAL;
goto err_free_queues;
}
pt->filter = memdup(filter, len);
if (!pt->filter) {
err = -ENOMEM;
goto err_free_queues;
}
if (session->header.needs_swap)
mem_bswap_64(pt->filter, len);
if (pt->filter[len - 1]) {
pr_err("%s: filter string not null terminated\n", __func__);
err = -EINVAL;
goto err_free_queues;
}
err = addr_filters__parse_bare_filter(&pt->filts,
filter);
if (err)
goto err_free_queues;
}
intel_pt_print_info_str("Filter string", pt->filter);
}
pt->timeless_decoding = intel_pt_timeless_decoding(pt);
pt->have_tsc = intel_pt_have_tsc(pt);
pt->sampling_mode = false;
......@@ -2121,11 +2273,13 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
pt->switch_evsel = intel_pt_find_sched_switch(session->evlist);
if (!pt->switch_evsel) {
pr_err("%s: missing sched_switch event\n", __func__);
err = -EINVAL;
goto err_delete_thread;
}
} else if (pt->have_sched_switch == 2 &&
!intel_pt_find_switch(session->evlist)) {
pr_err("%s: missing context_switch attribute flag\n", __func__);
err = -EINVAL;
goto err_delete_thread;
}
......@@ -2149,7 +2303,9 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
if (pt->tc.time_mult) {
u64 tsc_freq = intel_pt_ns_to_ticks(pt, 1000000000);
pt->max_non_turbo_ratio = (tsc_freq + 50000000) / 100000000;
if (!pt->max_non_turbo_ratio)
pt->max_non_turbo_ratio =
(tsc_freq + 50000000) / 100000000;
intel_pt_log("TSC frequency %"PRIu64"\n", tsc_freq);
intel_pt_log("Maximum non-turbo ratio %u\n",
pt->max_non_turbo_ratio);
......@@ -2193,6 +2349,8 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
auxtrace_queues__free(&pt->queues);
session->auxtrace = NULL;
err_free:
addr_filters__exit(&pt->filts);
zfree(&pt->filter);
free(pt);
return err;
}
......@@ -34,11 +34,11 @@ enum {
INTEL_PT_TSC_CTC_N,
INTEL_PT_TSC_CTC_D,
INTEL_PT_CYC_BIT,
INTEL_PT_MAX_NONTURBO_RATIO,
INTEL_PT_FILTER_STR_LEN,
INTEL_PT_AUXTRACE_PRIV_MAX,
};
#define INTEL_PT_AUXTRACE_PRIV_SIZE (INTEL_PT_AUXTRACE_PRIV_MAX * sizeof(u64))
struct auxtrace_record;
struct perf_tool;
union perf_event;
......
......@@ -1760,20 +1760,49 @@ foreach_evsel_in_last_glob(struct perf_evlist *evlist,
static int set_filter(struct perf_evsel *evsel, const void *arg)
{
const char *str = arg;
bool found = false;
int nr_addr_filters = 0;
struct perf_pmu *pmu = NULL;
if (evsel == NULL || evsel->attr.type != PERF_TYPE_TRACEPOINT) {
fprintf(stderr,
"--filter option should follow a -e tracepoint option\n");
return -1;
if (evsel == NULL)
goto err;
if (evsel->attr.type == PERF_TYPE_TRACEPOINT) {
if (perf_evsel__append_tp_filter(evsel, str) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
}
return 0;
}
if (perf_evsel__append_filter(evsel, "&&", str) < 0) {
while ((pmu = perf_pmu__scan(pmu)) != NULL)
if (pmu->type == evsel->attr.type) {
found = true;
break;
}
if (found)
perf_pmu__scan_file(pmu, "nr_addr_filters",
"%d", &nr_addr_filters);
if (!nr_addr_filters)
goto err;
if (perf_evsel__append_addr_filter(evsel, str) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
}
return 0;
err:
fprintf(stderr,
"--filter option should follow a -e tracepoint or HW tracer option\n");
return -1;
}
int parse_filter(const struct option *opt, const char *str,
......@@ -1798,7 +1827,7 @@ static int add_exclude_perf_filter(struct perf_evsel *evsel,
snprintf(new_filter, sizeof(new_filter), "common_pid != %d", getpid());
if (perf_evsel__append_filter(evsel, "&&", new_filter) < 0) {
if (perf_evsel__append_tp_filter(evsel, new_filter) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
......
......@@ -213,9 +213,13 @@ static int convert_exec_to_group(const char *exec, char **result)
goto out;
}
ptr2 = strpbrk(ptr1, "-._");
if (ptr2)
*ptr2 = '\0';
for (ptr2 = ptr1; ptr2 != '\0'; ptr2++) {
if (!isalnum(*ptr2) && *ptr2 != '_') {
*ptr2 = '\0';
break;
}
}
ret = e_snprintf(buf, 64, "%s_%s", PERFPROBE_GROUP, ptr1);
if (ret < 0)
goto out;
......
......@@ -699,7 +699,7 @@ int probe_cache__scan_sdt(struct probe_cache *pcache, const char *pathname)
INIT_LIST_HEAD(&sdtlist);
ret = get_sdt_note_list(&sdtlist, pathname);
if (ret < 0) {
pr_debug("Failed to get sdt note: %d\n", ret);
pr_debug4("Failed to get sdt note: %d\n", ret);
return ret;
}
list_for_each_entry(note, &sdtlist, note_list) {
......
......@@ -955,6 +955,11 @@ static int probe_point_inline_cb(Dwarf_Die *in_die, void *data)
dwarf_diename(in_die));
return -ENOENT;
}
if (addr == 0) {
pr_debug("%s has no valid entry address. skipped.\n",
dwarf_diename(in_die));
return -ENOENT;
}
pf->addr = addr;
pf->addr += pp->offset;
pr_debug("found inline addr: 0x%jx\n",
......@@ -988,7 +993,8 @@ static int probe_point_search_cb(Dwarf_Die *sp_die, void *data)
if (pp->file && strtailcmp(pp->file, dwarf_decl_file(sp_die)))
return DWARF_CB_OK;
pr_debug("Matched function: %s\n", dwarf_diename(sp_die));
pr_debug("Matched function: %s [%lx]\n", dwarf_diename(sp_die),
(unsigned long)dwarf_dieoffset(sp_die));
pf->fname = dwarf_decl_file(sp_die);
if (pp->line) { /* Function relative line */
dwarf_decl_line(sp_die, &pf->lno);
......@@ -997,8 +1003,13 @@ static int probe_point_search_cb(Dwarf_Die *sp_die, void *data)
} else if (die_is_func_instance(sp_die)) {
/* Instances always have the entry address */
dwarf_entrypc(sp_die, &pf->addr);
/* But in some case the entry address is 0 */
if (pf->addr == 0) {
pr_debug("%s has no entry PC. Skipped\n",
dwarf_diename(sp_die));
param->retval = 0;
/* Real function */
if (pp->lazy_line)
} else if (pp->lazy_line)
param->retval = find_probe_point_lazy(sp_die, pf);
else {
skip_prologue(sp_die, pf);
......@@ -1011,7 +1022,7 @@ static int probe_point_search_cb(Dwarf_Die *sp_die, void *data)
param->retval = die_walk_instances(sp_die,
probe_point_inline_cb, (void *)pf);
/* This could be a non-existed inline definition */
if (param->retval == -ENOENT && strisglob(pp->function))
if (param->retval == -ENOENT)
param->retval = 0;
}
......
......@@ -345,6 +345,16 @@ static struct symbol *symbols__first(struct rb_root *symbols)
return NULL;
}
static struct symbol *symbols__last(struct rb_root *symbols)
{
struct rb_node *n = rb_last(symbols);
if (n)
return rb_entry(n, struct symbol, rb_node);
return NULL;
}
static struct symbol *symbols__next(struct symbol *sym)
{
struct rb_node *n = rb_next(&sym->rb_node);
......@@ -466,6 +476,11 @@ struct symbol *dso__first_symbol(struct dso *dso, enum map_type type)
return symbols__first(&dso->symbols[type]);
}
struct symbol *dso__last_symbol(struct dso *dso, enum map_type type)
{
return symbols__last(&dso->symbols[type]);
}
struct symbol *dso__next_symbol(struct symbol *sym)
{
return symbols__next(sym);
......
......@@ -259,6 +259,7 @@ struct symbol *dso__find_symbol_by_name(struct dso *dso, enum map_type type,
struct symbol *symbol__next_by_name(struct symbol *sym);
struct symbol *dso__first_symbol(struct dso *dso, enum map_type type);
struct symbol *dso__last_symbol(struct dso *dso, enum map_type type);
struct symbol *dso__next_symbol(struct symbol *sym);
enum dso_type dso__type_fd(int fd);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment