Commit ad5d6989 authored by Linus Torvalds

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "As a first remark I'd like to note that the way to build perf tooling
  has been simplified and sped up, in the future it should be enough for
  you to build perf via:

        cd tools/perf/
        make install

  (ie without the -j option.) The build system will figure out the
  number of CPUs and will do a parallel build+install.
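
   The same build can also be driven from the top of the tree with
   make's -C option (an equivalent invocation, shown for illustration):

        make -C tools/perf install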

  The various build system inefficiencies and breakages Linus reported
  against the v3.12 pull request should now be resolved - please
  (re-)report any remaining annoyances or bugs.

  Main changes on the perf kernel side:

   * Performance optimizations:
      . perf ring-buffer code optimizations,          by Peter Zijlstra
      . perf ring-buffer code optimizations,          by Oleg Nesterov
      . x86 NMI call-stack processing optimizations,  by Peter Zijlstra
      . perf context-switch optimizations,            by Peter Zijlstra
      . perf sampling speedups,                       by Peter Zijlstra
      . x86 Intel PEBS processing speedups,           by Peter Zijlstra

   * Enhanced hardware support:
      . for Intel Ivy Bridge-EP uncore PMUs,          by Zheng Yan
      . for Haswell transactions,                     by Andi Kleen, Peter Zijlstra

   * Core perf events code enhancements and fixes by Oleg Nesterov:
      . fix uprobes when fork() is called with pending ret-probes
      . clean up the uprobes platform support code

   * New ABI details by Andi Kleen:
      . Report x86 Haswell TSX transaction abort cost as weight

  Main changes on the perf tooling side (some of these tooling changes
  utilize the above kernel side changes):

   * 'perf report/top' enhancements:

      . Convert callchain children list to rbtree, greatly reducing the
        time taken for callchain processing, from Namhyung Kim.

      . Add new COMM infrastructure, further improving histogram
        processing, from Frédéric Weisbecker, one fix from Namhyung Kim.

      . Add /proc/kcore based live-annotation improvements, including
        build-id cache support, multi map 'call' instruction navigation
        fixes, kcore address validation, objdump workarounds.  From
        Adrian Hunter.

      . Show progress on histogram collapsing, which can take a long
        time, from Namhyung Kim.

      . Add --max-stack option to limit callchain stack scan in 'top'
        and 'report', improving callchain processing when reducing the
        stack depth is an option, from Waiman Long.
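
        For example, to consider at most four entries of each callchain
        (an illustrative depth):

         'perf report --max-stack 4'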

      . Add new option --ignore-vmlinux for perf top, from Willy
        Tarreau.
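
        This makes symbol resolution skip the vmlinux image, e.g. when
        the on-disk image does not match the running kernel
        (illustrative invocation):

         'perf top --ignore-vmlinux'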

   * 'perf trace' enhancements:

      . 'perf trace' can now use 'perf probe' dynamic tracepoints to
        hook into the userspace -> kernel pathname copy, so that it
        can map fds to pathnames without reading /proc/pid/fd/ symlinks.
        From Arnaldo Carvalho de Melo.

      . Show VFS path associated with fd in live sessions, using a
        'vfs_getname' 'perf probe' created dynamic tracepoint or by
        looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.
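
        A sketch of such a session; the probe definition below is only
        illustrative, as the exact line offset and variable names depend
        on the kernel being probed:

         perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
         perf trace -e open touch /tmp/somefile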

      . Add 'trace' beautifiers for lots of syscall arguments, from
        Arnaldo Carvalho de Melo.

      . Implement more compact 'trace' output by suppressing zeroed
        args, from Arnaldo Carvalho de Melo.

      . Show thread COMM by default in 'trace', from Arnaldo Carvalho de
        Melo.

      . Add option to show full timestamp in 'trace', from David Ahern.

      . Add 'record' command in 'trace', to record raw_syscalls:*, from
        David Ahern.
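
        For example (an illustrative run), recording the raw syscall
        events of a workload into a perf.data file:

         'perf trace record sleep 1'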

      . Add summary option to dump syscall statistics in 'trace', from
        David Ahern.
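
        For example, to get per-syscall counts and timings for a
        workload instead of the full event stream:

         'perf trace --summary sleep 1'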

      . Improve error messages in 'trace', providing hints about system
        configuration steps needed for using it, from Ramkumar
        Ramachandra.

      . 'perf trace' now emits hints as to why tracing is not possible,
        helping the user to set up the system to allow tracing at the
        desired permission granularity, telling if the problem is due to
        debugfs not being mounted, insufficient permissions for !root,
        the /proc/sys/kernel/perf_event_paranoid value, etc.  From
        Arnaldo Carvalho de Melo.

   * 'perf record' enhancements:

      . Check maximum frequency rate for record/top, emitting better
        error messages, from Jiri Olsa.
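
        The requested rate is validated against the kernel's limit,
        which can be inspected via the usual sysctl file:

         cat /proc/sys/kernel/perf_event_max_sample_rate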

      . 'perf record' code cleanups, from David Ahern.

      . Improve write_output error message in 'perf record', from Adrian
        Hunter.

      . Allow specifying B/K/M/G unit to the --mmap-pages arguments,
        from Jiri Olsa.
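
        For example, requesting an 8 megabyte ring buffer instead of
        counting pages by hand (the size must still map to a
        power-of-two number of pages):

         'perf record --mmap-pages 8M sleep 1'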

      . Fix command line callchain attribute tests to handle the new
        -g/--call-graph semantics, from Arnaldo Carvalho de Melo.
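
        Under the new semantics -g takes no method argument anymore; the
        unwind method is selected separately. A sketch (the dwarf mode
        assumes a perf built with DWARF unwind support):

         'perf record -g sleep 1'
         'perf record --call-graph dwarf sleep 1'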

   * 'perf kvm' enhancements:

      . Disable live kvm command if timerfd is not supported, from David
        Ahern.

      . Fix detection of non-core features, from David Ahern.

   * 'perf list' enhancements:

      . Add usage to 'perf list', from David Ahern.

      . Show error in 'perf list' if tracepoints not available, from
        Pekka Enberg.

   * 'perf probe' enhancements:

      . Support "$vars" meta argument syntax for local variables,
        allowing asking for all possible variables at a given probe
        point to be collected when it hits, from Masami Hiramatsu.
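
        A sketch of the syntax; vfs_read is just an arbitrary probe
        point picked for illustration:

         perf probe --add 'vfs_read $vars'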

   * 'perf sched' enhancements:

      . Address the root cause of the 'perf sched' stack initialization
        build slowdown by initializing the big array programmatically
        after moving the global variable back to the stack.  Fix from
        Adrian Hunter.

   * 'perf script' enhancements:

      . Set up output options for in-stream attributes, from Adrian
        Hunter.

      . Print addr by default for BTS in 'perf script', from Adrian
        Hunter.

   * 'perf stat' enhancements:

      . Improved messages when doing profiling in all or a subset of
        CPUs using a workload as the session delimiter, as in:

         'perf stat --cpu 0,2 sleep 10s'

        from Arnaldo Carvalho de Melo.

      . Add units to nanosec-based counters in 'perf stat', from David
        Ahern.

      . Remove bogus info when using 'perf stat' -e cycles/instructions,
        from Ramkumar Ramachandra.

   * 'perf lock' enhancements:

      . 'perf lock' fixes and cleanups, from Davidlohr Bueso.

   * 'perf test' enhancements:

      . Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing
        and 'perf test', from Adrian Hunter.

      . Clarify the "sample parsing" test entry, from Arnaldo Carvalho
        de Melo.

      . Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test,
        from Arnaldo Carvalho de Melo.

      . Memory leak fixes in 'perf test', from Felipe Pena.

   * 'perf bench' enhancements:

      . Change the procps-visible command name of individual benchmark
        tests, plus cleanups, from Ingo Molnar.

   * Generic perf tooling infrastructure/plumbing changes:

      . Separate data file properties from the session, a code
        reorganization from Jiri Olsa.

      . Fix version when building out of tree, as when using one of
        these:

        $ make help | grep perf
          perf-tar-src-pkg    - Build perf-3.12.0.tar source tarball
          perf-targz-src-pkg  - Build perf-3.12.0.tar.gz source tarball
          perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
          perf-tarxz-src-pkg  - Build perf-3.12.0.tar.xz source tarball
        $

        from David Ahern.

      . Enhance option parse error message, showing just the help lines
        of the options affected, from Namhyung Kim.

      . libtraceevent updates from upstream trace-cmd repo, from Steven
        Rostedt.

      . Always use perf_evsel__set_sample_bit to set sample_type, from
        Adrian Hunter.

      . Memory and mmap leak fixes from Chenggang Qin.

      . Assorted build fixes from David Ahern and Jiri Olsa.

      . Speed up and prettify the build system, from Ingo Molnar.

      . Implement addr2line directly using libbfd, from Roberto Vitillo.

      . Separate the GTK support in a separate libperf-gtk.so DSO, that
        is only loaded when --gtk is specified, from Namhyung Kim.
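
        Usage is unchanged; the DSO gets loaded only when the GTK UI is
        actually requested, as in:

         'perf report --gtk'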

      . perf bash completion fixes and improvements from Ramkumar
        Ramachandra.

      . Support for Openembedded/Yocto -dbg packages, from Ricardo
        Ribalda Delgado.

  And lots and lots of other fixes and code reorganizations that did not
  make it into the list, see the shortlog, diffstat and the Git log for
  details!"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits)
  uprobes: Fix the memory out of bound overwrite in copy_insn()
  uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()
  perf tools: Remove unneeded include
  perf record: Remove post_processing_offset variable
  perf record: Remove advance_output function
  perf record: Refactor feature handling into a separate function
  perf trace: Don't relookup fields by name in each sample
  perf tools: Fix version when building out of tree
  perf evsel: Ditch evsel->handler.data field
  uprobes: Export write_opcode() as uprobe_write_opcode()
  uprobes: Introduce arch_uprobe->ixol
  uprobes: Kill module_init() and module_exit()
  uprobes: Move function declarations out of arch
  perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
  perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
  perf: Factor out strncpy() in perf_event_mmap_event()
  tools/perf: Add required memory barriers
  perf: Fix arch_perf_out_copy_user default
  perf: Update a stale comment
  perf: Optimize perf_output_begin() -- address calculation
  ...
parents ef1417a5 caea6cf5
@@ -37,6 +37,7 @@ typedef ppc_opcode_t uprobe_opcode_t;
struct arch_uprobe {
union {
u8 insn[MAX_UINSN_BYTES];
u8 ixol[MAX_UINSN_BYTES];
u32 ainsn;
};
};
@@ -45,11 +46,4 @@ struct arch_uprobe_task {
unsigned long saved_trap_nr;
};
extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
#endif /* _ASM_UPROBES_H */
@@ -35,7 +35,10 @@ typedef u8 uprobe_opcode_t;
struct arch_uprobe {
u16 fixups;
u8 insn[MAX_UINSN_BYTES];
union {
u8 insn[MAX_UINSN_BYTES];
u8 ixol[MAX_UINSN_BYTES];
};
#ifdef CONFIG_X86_64
unsigned long rip_rela_target_address;
#endif
@@ -49,11 +52,4 @@ struct arch_uprobe_task {
unsigned int saved_tf;
};
extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
#endif /* _ASM_UPROBES_H */
@@ -1989,7 +1989,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
frame.return_address = 0;
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
if (bytes != sizeof(frame))
if (bytes != 0)
break;
if (!valid_user_frame(fp, sizeof(frame)))
@@ -2041,7 +2041,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
frame.return_address = 0;
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
if (bytes != sizeof(frame))
if (bytes != 0)
break;
if (!valid_user_frame(fp, sizeof(frame)))
@@ -163,6 +163,11 @@ struct cpu_hw_events {
u64 intel_ctrl_host_mask;
struct perf_guest_switch_msr guest_switch_msrs[X86_PMC_IDX_MAX];
/*
* Intel checkpoint mask
*/
u64 intel_cp_status;
/*
* manage shared (per-core, per-cpu) registers
* used on Intel NHM/WSM/SNB
@@ -440,6 +445,7 @@ struct x86_pmu {
int lbr_nr; /* hardware stack size */
u64 lbr_sel_mask; /* LBR_SELECT valid bits */
const int *lbr_sel_map; /* lbr_select mappings */
bool lbr_double_abort; /* duplicated lbr aborts */
/*
* Extra registers for events
@@ -190,9 +190,9 @@ static struct extra_reg intel_snbep_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};
EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3");
EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2");
EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3");
EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2");
struct attribute *nhm_events_attrs[] = {
EVENT_PTR(mem_ld_nhm),
@@ -1184,6 +1184,11 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
wrmsrl(hwc->config_base, ctrl_val);
}
static inline bool event_is_checkpointed(struct perf_event *event)
{
return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
}
static void intel_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -1197,6 +1202,7 @@ static void intel_pmu_disable_event(struct perf_event *event)
cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
cpuc->intel_cp_status &= ~(1ull << hwc->idx);
/*
* must disable before any actual event
@@ -1271,6 +1277,9 @@ static void intel_pmu_enable_event(struct perf_event *event)
if (event->attr.exclude_guest)
cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
if (unlikely(event_is_checkpointed(event)))
cpuc->intel_cp_status |= (1ull << hwc->idx);
if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
intel_pmu_enable_fixed(hwc);
return;
@@ -1289,6 +1298,17 @@ static void intel_pmu_enable_event(struct perf_event *event)
int intel_pmu_save_and_restart(struct perf_event *event)
{
x86_perf_event_update(event);
/*
* For a checkpointed counter always reset back to 0. This
* avoids a situation where the counter overflows, aborts the
* transaction and is then set back to shortly before the
* overflow, and overflows and aborts again.
*/
if (unlikely(event_is_checkpointed(event))) {
/* No race with NMIs because the counter should not be armed */
wrmsrl(event->hw.event_base, 0);
local64_set(&event->hw.prev_count, 0);
}
return x86_perf_event_set_period(event);
}
@@ -1372,6 +1392,13 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
x86_pmu.drain_pebs(regs);
}
/*
* Checkpointed counters can lead to 'spurious' PMIs because the
* rollback caused by the PMI will have cleared the overflow status
* bit. Therefore always force probe these counters.
*/
status |= cpuc->intel_cp_status;
for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
struct perf_event *event = cpuc->events[bit];
@@ -1837,6 +1864,20 @@ static int hsw_hw_config(struct perf_event *event)
event->attr.precise_ip > 0))
return -EOPNOTSUPP;
if (event_is_checkpointed(event)) {
/*
* Sampling of checkpointed events can cause situations where
* the CPU constantly aborts because of a overflow, which is
* then checkpointed back and ignored. Forbid checkpointing
* for sampling.
*
* But still allow a long sampling period, so that perf stat
* from KVM works.
*/
if (event->attr.sample_period > 0 &&
event->attr.sample_period < 0x7fffffff)
return -EOPNOTSUPP;
}
return 0;
}
@@ -2182,10 +2223,36 @@ static __init void intel_nehalem_quirk(void)
}
}
EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82")
EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82")
/* Haswell special events */
EVENT_ATTR_STR(tx-start, tx_start, "event=0xc9,umask=0x1");
EVENT_ATTR_STR(tx-commit, tx_commit, "event=0xc9,umask=0x2");
EVENT_ATTR_STR(tx-abort, tx_abort, "event=0xc9,umask=0x4");
EVENT_ATTR_STR(tx-capacity, tx_capacity, "event=0x54,umask=0x2");
EVENT_ATTR_STR(tx-conflict, tx_conflict, "event=0x54,umask=0x1");
EVENT_ATTR_STR(el-start, el_start, "event=0xc8,umask=0x1");
EVENT_ATTR_STR(el-commit, el_commit, "event=0xc8,umask=0x2");
EVENT_ATTR_STR(el-abort, el_abort, "event=0xc8,umask=0x4");
EVENT_ATTR_STR(el-capacity, el_capacity, "event=0x54,umask=0x2");
EVENT_ATTR_STR(el-conflict, el_conflict, "event=0x54,umask=0x1");
EVENT_ATTR_STR(cycles-t, cycles_t, "event=0x3c,in_tx=1");
EVENT_ATTR_STR(cycles-ct, cycles_ct, "event=0x3c,in_tx=1,in_tx_cp=1");
static struct attribute *hsw_events_attrs[] = {
EVENT_PTR(tx_start),
EVENT_PTR(tx_commit),
EVENT_PTR(tx_abort),
EVENT_PTR(tx_capacity),
EVENT_PTR(tx_conflict),
EVENT_PTR(el_start),
EVENT_PTR(el_commit),
EVENT_PTR(el_abort),
EVENT_PTR(el_capacity),
EVENT_PTR(el_conflict),
EVENT_PTR(cycles_t),
EVENT_PTR(cycles_ct),
EVENT_PTR(mem_ld_hsw),
EVENT_PTR(mem_st_hsw),
NULL
@@ -2452,6 +2519,7 @@ __init int intel_pmu_init(void)
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
x86_pmu.cpu_events = hsw_events_attrs;
x86_pmu.lbr_double_abort = true;
pr_cont("Haswell events, ");
break;
@@ -12,6 +12,7 @@
#define BTS_BUFFER_SIZE (PAGE_SIZE << 4)
#define PEBS_BUFFER_SIZE PAGE_SIZE
#define PEBS_FIXUP_SIZE PAGE_SIZE
/*
* pebs_record_32 for p4 and core not supported
@@ -182,18 +183,32 @@ struct pebs_record_nhm {
* Same as pebs_record_nhm, with two additional fields.
*/
struct pebs_record_hsw {
struct pebs_record_nhm nhm;
/*
* Real IP of the event. In the Intel documentation this
* is called eventingrip.
*/
u64 real_ip;
/*
* TSX tuning information field: abort cycles and abort flags.
*/
u64 tsx_tuning;
u64 flags, ip;
u64 ax, bx, cx, dx;
u64 si, di, bp, sp;
u64 r8, r9, r10, r11;
u64 r12, r13, r14, r15;
u64 status, dla, dse, lat;
u64 real_ip, tsx_tuning;
};
union hsw_tsx_tuning {
struct {
u32 cycles_last_block : 32,
hle_abort : 1,
rtm_abort : 1,
instruction_abort : 1,
non_instruction_abort : 1,
retry : 1,
data_conflict : 1,
capacity_writes : 1,
capacity_reads : 1;
};
u64 value;
};
#define PEBS_HSW_TSX_FLAGS 0xff00000000ULL
void init_debug_store_on_cpu(int cpu)
{
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
@@ -214,12 +229,14 @@ void fini_debug_store_on_cpu(int cpu)
wrmsr_on_cpu(cpu, MSR_IA32_DS_AREA, 0, 0);
}
static DEFINE_PER_CPU(void *, insn_buffer);
static int alloc_pebs_buffer(int cpu)
{
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
int node = cpu_to_node(cpu);
int max, thresh = 1; /* always use a single PEBS record */
void *buffer;
void *buffer, *ibuffer;
if (!x86_pmu.pebs)
return 0;
@@ -228,6 +245,19 @@ static int alloc_pebs_buffer(int cpu)
if (unlikely(!buffer))
return -ENOMEM;
/*
* HSW+ already provides us the eventing ip; no need to allocate this
* buffer then.
*/
if (x86_pmu.intel_cap.pebs_format < 2) {
ibuffer = kzalloc_node(PEBS_FIXUP_SIZE, GFP_KERNEL, node);
if (!ibuffer) {
kfree(buffer);
return -ENOMEM;
}
per_cpu(insn_buffer, cpu) = ibuffer;
}
max = PEBS_BUFFER_SIZE / x86_pmu.pebs_record_size;
ds->pebs_buffer_base = (u64)(unsigned long)buffer;
@@ -248,6 +278,9 @@ static void release_pebs_buffer(int cpu)
if (!ds || !x86_pmu.pebs)
return;
kfree(per_cpu(insn_buffer, cpu));
per_cpu(insn_buffer, cpu) = NULL;
kfree((void *)(unsigned long)ds->pebs_buffer_base);
ds->pebs_buffer_base = 0;
}
@@ -715,6 +748,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
unsigned long old_to, to = cpuc->lbr_entries[0].to;
unsigned long ip = regs->ip;
int is_64bit = 0;
void *kaddr;
/*
* We don't need to fixup if the PEBS assist is fault like
@@ -738,7 +772,7 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
* unsigned math, either ip is before the start (impossible) or
* the basic block is larger than 1 page (sanity)
*/
if ((ip - to) > PAGE_SIZE)
if ((ip - to) > PEBS_FIXUP_SIZE)
return 0;
/*
@@ -749,29 +783,33 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
return 1;
}
if (!kernel_ip(ip)) {
int size, bytes;
u8 *buf = this_cpu_read(insn_buffer);
size = ip - to; /* Must fit our buffer, see above */
bytes = copy_from_user_nmi(buf, (void __user *)to, size);
if (bytes != 0)
return 0;
kaddr = buf;
} else {
kaddr = (void *)to;
}
do {
struct insn insn;
u8 buf[MAX_INSN_SIZE];
void *kaddr;
old_to = to;
if (!kernel_ip(ip)) {
int bytes, size = MAX_INSN_SIZE;
bytes = copy_from_user_nmi(buf, (void __user *)to, size);
if (bytes != size)
return 0;
kaddr = buf;
} else
kaddr = (void *)to;
#ifdef CONFIG_X86_64
is_64bit = kernel_ip(to) || !test_thread_flag(TIF_IA32);
#endif
insn_init(&insn, kaddr, is_64bit);
insn_get_length(&insn);
to += insn.length;
kaddr += insn.length;
} while (to < ip);
if (to == ip) {
@@ -786,16 +824,34 @@ static int intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
return 0;
}
static inline u64 intel_hsw_weight(struct pebs_record_hsw *pebs)
{
if (pebs->tsx_tuning) {
union hsw_tsx_tuning tsx = { .value = pebs->tsx_tuning };
return tsx.cycles_last_block;
}
return 0;
}
static inline u64 intel_hsw_transaction(struct pebs_record_hsw *pebs)
{
u64 txn = (pebs->tsx_tuning & PEBS_HSW_TSX_FLAGS) >> 32;
/* For RTM XABORTs also log the abort code from AX */
if ((txn & PERF_TXN_TRANSACTION) && (pebs->ax & 1))
txn |= ((pebs->ax >> 24) & 0xff) << PERF_TXN_ABORT_SHIFT;
return txn;
}
static void __intel_pmu_pebs_event(struct perf_event *event,
struct pt_regs *iregs, void *__pebs)
{
/*
* We cast to pebs_record_nhm to get the load latency data
* if extra_reg MSR_PEBS_LD_LAT_THRESHOLD used
* We cast to the biggest pebs_record but are careful not to
* unconditionally access the 'extra' entries.
*/
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct pebs_record_nhm *pebs = __pebs;
struct pebs_record_hsw *pebs_hsw = __pebs;
struct pebs_record_hsw *pebs = __pebs;
struct perf_sample_data data;
struct pt_regs regs;
u64 sample_type;
@@ -854,7 +910,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
regs.sp = pebs->sp;
if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
regs.ip = pebs_hsw->real_ip;
regs.ip = pebs->real_ip;
regs.flags |= PERF_EFLAGS_EXACT;
} else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
regs.flags |= PERF_EFLAGS_EXACT;
@@ -862,9 +918,18 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
regs.flags &= ~PERF_EFLAGS_EXACT;
if ((event->attr.sample_type & PERF_SAMPLE_ADDR) &&
x86_pmu.intel_cap.pebs_format >= 1)
x86_pmu.intel_cap.pebs_format >= 1)
data.addr = pebs->dla;
if (x86_pmu.intel_cap.pebs_format >= 2) {
/* Only set the TSX weight when no memory weight. */
if ((event->attr.sample_type & PERF_SAMPLE_WEIGHT) && !fll)
data.weight = intel_hsw_weight(pebs);
if (event->attr.sample_type & PERF_SAMPLE_TRANSACTION)
data.txn = intel_hsw_transaction(pebs);
}
if (has_branch_stack(event))
data.br_stack = &cpuc->lbr_stack;
@@ -913,17 +978,34 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
__intel_pmu_pebs_event(event, iregs, at);
}
static void __intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, void *at,
void *top)
static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
struct perf_event *event = NULL;
void *at, *top;
u64 status = 0;
int bit;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
ds->pebs_index = ds->pebs_buffer_base;
if (unlikely(at > top))
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(top - at > x86_pmu.max_pebs_events * x86_pmu.pebs_record_size,
"Unexpected number of pebs records %ld\n",
(long)(top - at) / x86_pmu.pebs_record_size);
for (; at < top; at += x86_pmu.pebs_record_size) {
struct pebs_record_nhm *p = at;
@@ -951,61 +1033,6 @@ static void __intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, void *at,
}
}
static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
struct pebs_record_nhm *at, *top;
int n;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
ds->pebs_index = ds->pebs_buffer_base;
n = top - at;
if (n <= 0)
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(n > x86_pmu.max_pebs_events,
"Unexpected number of pebs records %d\n", n);
return __intel_pmu_drain_pebs_nhm(iregs, at, top);
}
static void intel_pmu_drain_pebs_hsw(struct pt_regs *iregs)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
struct pebs_record_hsw *at, *top;
int n;
if (!x86_pmu.pebs_active)
return;
at = (struct pebs_record_hsw *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_record_hsw *)(unsigned long)ds->pebs_index;
n = top - at;
if (n <= 0)
return;
/*
* Should not happen, we program the threshold at 1 and do not
* set a reset value.
*/
WARN_ONCE(n > x86_pmu.max_pebs_events,
"Unexpected number of pebs records %d\n", n);
return __intel_pmu_drain_pebs_nhm(iregs, at, top);
}
/*
* BTS, PEBS probe and setup
*/
@@ -1040,7 +1067,7 @@ void intel_ds_init(void)
case 2:
pr_cont("PEBS fmt2%c, ", pebs_type);
x86_pmu.pebs_record_size = sizeof(struct pebs_record_hsw);
x86_pmu.drain_pebs = intel_pmu_drain_pebs_hsw;
x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm;
break;
default:
@@ -284,6 +284,7 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
int lbr_format = x86_pmu.intel_cap.lbr_format;
u64 tos = intel_pmu_lbr_tos();
int i;
int out = 0;
for (i = 0; i < x86_pmu.lbr_nr; i++) {
unsigned long lbr_idx = (tos - i) & mask;
@@ -306,15 +307,27 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
}
from = (u64)((((s64)from) << skip) >> skip);
cpuc->lbr_entries[i].from = from;
cpuc->lbr_entries[i].to = to;
cpuc->lbr_entries[i].mispred = mis;
cpuc->lbr_entries[i].predicted = pred;
cpuc->lbr_entries[i].in_tx = in_tx;
cpuc->lbr_entries[i].abort = abort;
cpuc->lbr_entries[i].reserved = 0;
/*
* Some CPUs report duplicated abort records,
* with the second entry not having an abort bit set.
* Skip them here. This loop runs backwards,
* so we need to undo the previous record.
* If the abort just happened outside the window
* the extra entry cannot be removed.
*/
if (abort && x86_pmu.lbr_double_abort && out > 0)
out--;
cpuc->lbr_entries[out].from = from;
cpuc->lbr_entries[out].to = to;
cpuc->lbr_entries[out].mispred = mis;
cpuc->lbr_entries[out].predicted = pred;
cpuc->lbr_entries[out].in_tx = in_tx;
cpuc->lbr_entries[out].abort = abort;
cpuc->lbr_entries[out].reserved = 0;
out++;
}
cpuc->lbr_stack.nr = i;
cpuc->lbr_stack.nr = out;
}
void intel_pmu_lbr_read(void)
@@ -478,7 +491,7 @@ static int branch_type(unsigned long from, unsigned long to, int abort)
/* may fail if text not present */
bytes = copy_from_user_nmi(buf, (void __user *)from, size);
if (bytes != size)
if (bytes != 0)
return X86_BR_NONE;
addr = buf;
@@ -997,6 +997,20 @@ static int snbep_pci2phy_map_init(int devid)
}
}
if (!err) {
/*
* For PCI bus with no UBOX device, find the next bus
* that has UBOX device and use its mapping.
*/
i = -1;
for (bus = 255; bus >= 0; bus--) {
if (pcibus_to_physid[bus] >= 0)
i = pcibus_to_physid[bus];
else
pcibus_to_physid[bus] = i;
}
}
if (ubox_dev)
pci_dev_put(ubox_dev);
@@ -1099,6 +1113,24 @@ static struct attribute *ivt_uncore_qpi_formats_attr[] = {
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_thresh8.attr,
&format_attr_match_rds.attr,
&format_attr_match_rnid30.attr,
&format_attr_match_rnid4.attr,
&format_attr_match_dnid.attr,
&format_attr_match_mc.attr,
&format_attr_match_opc.attr,
&format_attr_match_vnw.attr,
&format_attr_match0.attr,
&format_attr_match1.attr,
&format_attr_mask_rds.attr,
&format_attr_mask_rnid30.attr,
&format_attr_mask_rnid4.attr,
&format_attr_mask_dnid.attr,
&format_attr_mask_mc.attr,
&format_attr_mask_opc.attr,
&format_attr_mask_vnw.attr,
&format_attr_mask0.attr,
&format_attr_mask1.attr,
NULL,
};
@@ -1312,17 +1344,83 @@ static struct intel_uncore_type ivt_uncore_imc = {
IVT_UNCORE_PCI_COMMON_INIT(),
};
/* registers in IRP boxes are not properly aligned */
static unsigned ivt_uncore_irp_ctls[] = {0xd8, 0xdc, 0xe0, 0xe4};
static unsigned ivt_uncore_irp_ctrs[] = {0xa0, 0xb0, 0xb8, 0xc0};
static void ivt_uncore_irp_enable_event(struct intel_uncore_box *box, struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
pci_write_config_dword(pdev, ivt_uncore_irp_ctls[hwc->idx],
hwc->config | SNBEP_PMON_CTL_EN);
}
static void ivt_uncore_irp_disable_event(struct intel_uncore_box *box, struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
pci_write_config_dword(pdev, ivt_uncore_irp_ctls[hwc->idx], hwc->config);
}
static u64 ivt_uncore_irp_read_counter(struct intel_uncore_box *box, struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
u64 count = 0;
pci_read_config_dword(pdev, ivt_uncore_irp_ctrs[hwc->idx], (u32 *)&count);
pci_read_config_dword(pdev, ivt_uncore_irp_ctrs[hwc->idx] + 4, (u32 *)&count + 1);
return count;
}
static struct intel_uncore_ops ivt_uncore_irp_ops = {
.init_box = ivt_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = ivt_uncore_irp_disable_event,
.enable_event = ivt_uncore_irp_enable_event,
.read_counter = ivt_uncore_irp_read_counter,
};
static struct intel_uncore_type ivt_uncore_irp = {
.name = "irp",
.num_counters = 4,
.num_boxes = 1,
.perf_ctr_bits = 48,
.event_mask = IVT_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &ivt_uncore_irp_ops,
.format_group = &ivt_uncore_format_group,
};
static struct intel_uncore_ops ivt_uncore_qpi_ops = {
.init_box = ivt_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = snbep_uncore_pci_disable_event,
.enable_event = snbep_qpi_enable_event,
.read_counter = snbep_uncore_pci_read_counter,
.hw_config = snbep_qpi_hw_config,
.get_constraint = uncore_get_constraint,
.put_constraint = uncore_put_constraint,
};
static struct intel_uncore_type ivt_uncore_qpi = {
.name = "qpi",
.num_counters = 4,
.num_boxes = 3,
.perf_ctr_bits = 48,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = IVT_QPI_PCI_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &ivt_uncore_pci_ops,
.format_group = &ivt_uncore_qpi_format_group,
.name = "qpi",
.num_counters = 4,
.num_boxes = 3,
.perf_ctr_bits = 48,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = IVT_QPI_PCI_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.num_shared_regs = 1,
.ops = &ivt_uncore_qpi_ops,
.format_group = &ivt_uncore_qpi_format_group,
};
static struct intel_uncore_type ivt_uncore_r2pcie = {
@@ -1346,6 +1444,7 @@ static struct intel_uncore_type ivt_uncore_r3qpi = {
enum {
IVT_PCI_UNCORE_HA,
IVT_PCI_UNCORE_IMC,
IVT_PCI_UNCORE_IRP,
IVT_PCI_UNCORE_QPI,
IVT_PCI_UNCORE_R2PCIE,
IVT_PCI_UNCORE_R3QPI,
@@ -1354,6 +1453,7 @@ enum {
static struct intel_uncore_type *ivt_pci_uncores[] = {
[IVT_PCI_UNCORE_HA] = &ivt_uncore_ha,
[IVT_PCI_UNCORE_IMC] = &ivt_uncore_imc,
[IVT_PCI_UNCORE_IRP] = &ivt_uncore_irp,
[IVT_PCI_UNCORE_QPI] = &ivt_uncore_qpi,
[IVT_PCI_UNCORE_R2PCIE] = &ivt_uncore_r2pcie,
[IVT_PCI_UNCORE_R3QPI] = &ivt_uncore_r3qpi,
@@ -1401,6 +1501,10 @@ static DEFINE_PCI_DEVICE_TABLE(ivt_uncore_pci_ids) = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xef1),
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IMC, 7),
},
{ /* IRP */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe39),
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_IRP, 0),
},
{ /* QPI0 Port 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe32),
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_QPI, 0),
@@ -1429,6 +1533,16 @@ static DEFINE_PCI_DEVICE_TABLE(ivt_uncore_pci_ids) = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe3e),
.driver_data = UNCORE_PCI_DEV_DATA(IVT_PCI_UNCORE_R3QPI, 2),
},
{ /* QPI Port 0 filter */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe86),
.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
SNBEP_PCI_QPI_PORT0_FILTER),
},
{ /* QPI Port 0 filter */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xe96),
.driver_data = UNCORE_PCI_DEV_DATA(UNCORE_EXTRA_PCI_DEV,
SNBEP_PCI_QPI_PORT1_FILTER),
},
{ /* end: all zeroes */ }
};
@@ -11,39 +11,26 @@
#include <linux/sched.h>
/*
* best effort, GUP based copy_from_user() that is NMI-safe
* We rely on the nested NMI work to allow atomic faults from the NMI path; the
* nested NMI paths are careful to preserve CR2.
*/
unsigned long
copy_from_user_nmi(void *to, const void __user *from, unsigned long n)
{
unsigned long offset, addr = (unsigned long)from;
unsigned long size, len = 0;
struct page *page;
void *map;
int ret;
unsigned long ret;
if (__range_not_ok(from, n, TASK_SIZE))
return len;
do {
ret = __get_user_pages_fast(addr, 1, 0, &page);
if (!ret)
break;
offset = addr & (PAGE_SIZE - 1);
size = min(PAGE_SIZE - offset, n - len);
map = kmap_atomic(page);
memcpy(to, map+offset, size);
kunmap_atomic(map);
put_page(page);
len += size;
to += size;
addr += size;
} while (len < n);
return len;
return 0;
/*
* Even though this function is typically called from NMI/IRQ context
* disable pagefaults so that its behaviour is consistent even when
* called form other contexts.
*/
pagefault_disable();
ret = __copy_from_user_inatomic(to, from, n);
pagefault_enable();
return ret;
}
EXPORT_SYMBOL_GPL(copy_from_user_nmi);
@@ -51,7 +51,7 @@ kmmio_fault(struct pt_regs *regs, unsigned long addr)
return 0;
}
static inline int __kprobes notify_page_fault(struct pt_regs *regs)
static inline int __kprobes kprobes_fault(struct pt_regs *regs)
{
int ret = 0;
@@ -1048,7 +1048,7 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
return;
/* kprobes don't want to hook the spurious faults: */
if (notify_page_fault(regs))
if (kprobes_fault(regs))
return;
/*
* Don't take the mm semaphore here. If we fixup a prefetch
@@ -1060,23 +1060,8 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
}
/* kprobes don't want to hook the spurious faults: */
if (unlikely(notify_page_fault(regs)))
if (unlikely(kprobes_fault(regs)))
return;
/*
* It's safe to allow irq's after cr2 has been saved and the
* vmalloc fault has been handled.
*
* User-mode registers count as a user access even for any
* potential system fault or CPU buglet:
*/
if (user_mode_vm(regs)) {
local_irq_enable();
error_code |= PF_USER;
flags |= FAULT_FLAG_USER;
} else {
if (regs->flags & X86_EFLAGS_IF)
local_irq_enable();
}
if (unlikely(error_code & PF_RSVD))
pgtable_bad(regs, error_code, address);
@@ -1088,8 +1073,6 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
}
}
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
/*
* If we're in an interrupt, have no user context or are running
* in an atomic region then we must not take the fault:
@@ -1099,6 +1082,24 @@ __do_page_fault(struct pt_regs *regs, unsigned long error_code)
return;
}
/*
* It's safe to allow irq's after cr2 has been saved and the
* vmalloc fault has been handled.
*
* User-mode registers count as a user access even for any
* potential system fault or CPU buglet:
*/
if (user_mode_vm(regs)) {
local_irq_enable();
error_code |= PF_USER;
flags |= FAULT_FLAG_USER;
} else {
if (regs->flags & X86_EFLAGS_IF)
local_irq_enable();
}
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
if (error_code & PF_WRITE)
flags |= FAULT_FLAG_WRITE;
@@ -47,7 +47,7 @@ dump_user_backtrace_32(struct stack_frame_ia32 *head)
unsigned long bytes;
bytes = copy_from_user_nmi(bufhead, head, sizeof(bufhead));
if (bytes != sizeof(bufhead))
if (bytes != 0)
return NULL;
fp = (struct stack_frame_ia32 *) compat_ptr(bufhead[0].next_frame);
@@ -93,7 +93,7 @@ static struct stack_frame *dump_user_backtrace(struct stack_frame *head)
unsigned long bytes;
bytes = copy_from_user_nmi(bufhead, head, sizeof(bufhead));
if (bytes != sizeof(bufhead))
if (bytes != 0)
return NULL;
oprofile_add_trace(bufhead[0].return_address);
@@ -584,6 +584,10 @@ struct perf_sample_data {
struct perf_regs_user regs_user;
u64 stack_user_size;
u64 weight;
/*
* Transaction flags for abort events:
*/
u64 txn;
};
static inline void perf_sample_data_init(struct perf_sample_data *data,
@@ -599,6 +603,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
data->stack_user_size = 0;
data->weight = 0;
data->data_src.val = 0;
data->txn = 0;
}
extern void perf_output_sample(struct perf_output_handle *handle,
@@ -30,6 +30,7 @@
struct vm_area_struct;
struct mm_struct;
struct inode;
struct notifier_block;
#ifdef CONFIG_ARCH_SUPPORTS_UPROBES
# include <asm/uprobes.h>
@@ -108,6 +109,7 @@ extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsign
extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr);
extern bool __weak is_swbp_insn(uprobe_opcode_t *insn);
extern bool __weak is_trap_insn(uprobe_opcode_t *insn);
extern int uprobe_write_opcode(struct mm_struct *mm, unsigned long vaddr, uprobe_opcode_t);
extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
extern int uprobe_apply(struct inode *inode, loff_t offset, struct uprobe_consumer *uc, bool);
extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
@@ -117,14 +119,21 @@ extern void uprobe_start_dup_mmap(void);
extern void uprobe_end_dup_mmap(void);
extern void uprobe_dup_mmap(struct mm_struct *oldmm, struct mm_struct *newmm);
extern void uprobe_free_utask(struct task_struct *t);
extern void uprobe_copy_process(struct task_struct *t);
extern void uprobe_copy_process(struct task_struct *t, unsigned long flags);
extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs);
extern int uprobe_post_sstep_notifier(struct pt_regs *regs);
extern int uprobe_pre_sstep_notifier(struct pt_regs *regs);
extern void uprobe_notify_resume(struct pt_regs *regs);
extern bool uprobe_deny_signal(void);
extern bool __weak arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs);
extern void uprobe_clear_state(struct mm_struct *mm);
extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long addr);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern unsigned long arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs *regs);
#else /* !CONFIG_UPROBES */
struct uprobes_state {
};
@@ -174,7 +183,7 @@ static inline unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
static inline void uprobe_free_utask(struct task_struct *t)
{
}
static inline void uprobe_copy_process(struct task_struct *t)
static inline void uprobe_copy_process(struct task_struct *t, unsigned long flags)
{
}
static inline void uprobe_clear_state(struct mm_struct *mm)
@@ -136,8 +136,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
PERF_SAMPLE_IDENTIFIER = 1U << 16,
PERF_SAMPLE_TRANSACTION = 1U << 17,
PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
PERF_SAMPLE_MAX = 1U << 18, /* non-ABI */
};
/*
@@ -180,6 +181,28 @@ enum perf_sample_regs_abi {
PERF_SAMPLE_REGS_ABI_64 = 2,
};
/*
* Values for the memory transaction event qualifier, mostly for
* abort events. Multiple bits can be set.
*/
enum {
PERF_TXN_ELISION = (1 << 0), /* From elision */
PERF_TXN_TRANSACTION = (1 << 1), /* From transaction */
PERF_TXN_SYNC = (1 << 2), /* Instruction is related */
PERF_TXN_ASYNC = (1 << 3), /* Instruction not related */
PERF_TXN_RETRY = (1 << 4), /* Retry possible */
PERF_TXN_CONFLICT = (1 << 5), /* Conflict abort */
PERF_TXN_CAPACITY_WRITE = (1 << 6), /* Capacity write abort */
PERF_TXN_CAPACITY_READ = (1 << 7), /* Capacity read abort */
PERF_TXN_MAX = (1 << 8), /* non-ABI */
/* bits 32..63 are reserved for the abort code */
PERF_TXN_ABORT_MASK = (0xffffffffULL << 32),
PERF_TXN_ABORT_SHIFT = 32,
};
/*
* The format of the data returned by read() on a perf event fd,
* as specified by attr.read_format:
@@ -82,16 +82,16 @@ static inline unsigned long perf_data_size(struct ring_buffer *rb)
}
#define DEFINE_OUTPUT_COPY(func_name, memcpy_func) \
static inline unsigned int \
static inline unsigned long \
func_name(struct perf_output_handle *handle, \
const void *buf, unsigned int len) \
const void *buf, unsigned long len) \
{ \
unsigned long size, written; \
\
do { \
size = min_t(unsigned long, handle->size, len); \
\
size = min(handle->size, len); \
written = memcpy_func(handle->addr, buf, size); \
written = size - written; \
\
len -= written; \
handle->addr += written; \
@@ -110,20 +110,37 @@ func_name(struct perf_output_handle *handle, \
return len; \
}
static inline int memcpy_common(void *dst, const void *src, size_t n)
static inline unsigned long
memcpy_common(void *dst, const void *src, unsigned long n)
{
memcpy(dst, src, n);
return n;
return 0;
}
DEFINE_OUTPUT_COPY(__output_copy, memcpy_common)
#define MEMCPY_SKIP(dst, src, n) (n)
static inline unsigned long
memcpy_skip(void *dst, const void *src, unsigned long n)
{
return 0;
}
DEFINE_OUTPUT_COPY(__output_skip, MEMCPY_SKIP)
DEFINE_OUTPUT_COPY(__output_skip, memcpy_skip)
#ifndef arch_perf_out_copy_user
#define arch_perf_out_copy_user __copy_from_user_inatomic
#define arch_perf_out_copy_user arch_perf_out_copy_user
static inline unsigned long
arch_perf_out_copy_user(void *dst, const void *src, unsigned long n)
{
unsigned long ret;
pagefault_disable();
ret = __copy_from_user_inatomic(dst, src, n);
pagefault_enable();
return ret;
}
#endif
DEFINE_OUTPUT_COPY(__output_copy_user, arch_perf_out_copy_user)
@@ -12,40 +12,10 @@
#include <linux/perf_event.h>
#include <linux/vmalloc.h>
#include <linux/slab.h>
#include <linux/circ_buf.h>
#include "internal.h"
static bool perf_output_space(struct ring_buffer *rb, unsigned long tail,
unsigned long offset, unsigned long head)
{
unsigned long sz = perf_data_size(rb);
unsigned long mask = sz - 1;
/*
* check if user-writable
* overwrite : over-write its own tail
* !overwrite: buffer possibly drops events.
*/
if (rb->overwrite)
return true;
/*
* verify that payload is not bigger than buffer
* otherwise masking logic may fail to detect
* the "not enough space" condition
*/
if ((head - offset) > sz)
return false;
offset = (offset - tail) & mask;
head = (head - tail) & mask;
if ((int)(head - offset) < 0)
return false;
return true;
}
static void perf_output_wakeup(struct perf_output_handle *handle)
{
atomic_set(&handle->rb->poll, POLL_IN);
@@ -115,8 +85,8 @@ static void perf_output_put_handle(struct perf_output_handle *handle)
rb->user_page->data_head = head;
/*
* Now check if we missed an update, rely on the (compiler)
* barrier in atomic_dec_and_test() to re-read rb->head.
* Now check if we missed an update -- rely on previous implied
* compiler barriers to force a re-read.
*/
if (unlikely(head != local_read(&rb->head))) {
local_inc(&rb->nest);
@@ -135,8 +105,7 @@ int perf_output_begin(struct perf_output_handle *handle,
{
struct ring_buffer *rb;
unsigned long tail, offset, head;
int have_lost;
struct perf_sample_data sample_data;
int have_lost, page_shift;
struct {
struct perf_event_header header;
u64 id;
@@ -151,57 +120,63 @@
event = event->parent;
rb = rcu_dereference(event->rb);
if (!rb)
if (unlikely(!rb))
goto out;
handle->rb = rb;
handle->event = event;
if (!rb->nr_pages)
if (unlikely(!rb->nr_pages))
goto out;
handle->rb = rb;
handle->event = event;
have_lost = local_read(&rb->lost);
if (have_lost) {
lost_event.header.size = sizeof(lost_event);
perf_event_header__init_id(&lost_event.header, &sample_data,
event);
size += lost_event.header.size;
if (unlikely(have_lost)) {
size += sizeof(lost_event);
if (event->attr.sample_id_all)
size += event->id_header_size;
}
perf_output_get_handle(handle);
do {
/*
* Userspace could choose to issue a mb() before updating the
* tail pointer. So that all reads will be completed before the
* write is issued.
*
* See perf_output_put_handle().
*/
tail = ACCESS_ONCE(rb->user_page->data_tail);
smp_mb();
offset = head = local_read(&rb->head);
head += size;
if (unlikely(!perf_output_space(rb, tail, offset, head)))
if (!rb->overwrite &&
unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size))
goto fail;
head += size;
} while (local_cmpxchg(&rb->head, offset, head) != offset);
if (head - local_read(&rb->wakeup) > rb->watermark)
/*
* Separate the userpage->tail read from the data stores below.
* Matches the MB userspace SHOULD issue after reading the data
* and before storing the new tail position.
*
* See perf_output_put_handle().
*/
smp_mb();
if (unlikely(head - local_read(&rb->wakeup) > rb->watermark))
local_add(rb->watermark, &rb->wakeup);
handle->page = offset >> (PAGE_SHIFT + page_order(rb));
handle->page &= rb->nr_pages - 1;
handle->size = offset & ((PAGE_SIZE << page_order(rb)) - 1);
handle->addr = rb->data_pages[handle->page];
handle->addr += handle->size;
handle->size = (PAGE_SIZE << page_order(rb)) - handle->size;
page_shift = PAGE_SHIFT + page_order(rb);
if (have_lost) {
handle->page = (offset >> page_shift) & (rb->nr_pages - 1);
offset &= (1UL << page_shift) - 1;
handle->addr = rb->data_pages[handle->page] + offset;
handle->size = (1UL << page_shift) - offset;
if (unlikely(have_lost)) {
struct perf_sample_data sample_data;
lost_event.header.size = sizeof(lost_event);
lost_event.header.type = PERF_RECORD_LOST;
lost_event.header.misc = 0;
lost_event.id = event->id;
lost_event.lost = local_xchg(&rb->lost, 0);
perf_event_header__init_id(&lost_event.header,
&sample_data, event);
perf_output_put(handle, lost_event);
perf_event__output_id_sample(event, handle, &sample_data);
}
@@ -1373,7 +1373,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
INIT_LIST_HEAD(&p->pi_state_list);
p->pi_state_cache = NULL;
#endif
uprobe_copy_process(p);
/*
* sigaltstack should be cleared when sharing the same VM
*/
@@ -1490,6 +1489,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
perf_event_fork(p);
trace_task_newtask(p, clone_flags);
uprobe_copy_process(p, clone_flags);
return p;
@@ -1049,6 +1049,7 @@ static struct ctl_table kern_table[] = {
.maxlen = sizeof(sysctl_perf_event_sample_rate),
.mode = 0644,
.proc_handler = perf_proc_update_handler,
.extra1 = &one,
},
{
.procname = "perf_cpu_time_max_percent",
@@ -115,7 +115,9 @@ git --git-dir=$(srctree)/.git archive --prefix=$(perf-tar)/ \
-o $(perf-tar).tar; \
mkdir -p $(perf-tar); \
git --git-dir=$(srctree)/.git rev-parse HEAD > $(perf-tar)/HEAD; \
tar rf $(perf-tar).tar $(perf-tar)/HEAD; \
(cd $(srctree)/tools/perf; \
util/PERF-VERSION-GEN ../../$(perf-tar)/ 2>/dev/null); \
tar rf $(perf-tar).tar $(perf-tar)/HEAD $(perf-tar)/PERF-VERSION-FILE; \
rm -r $(perf-tar); \
$(if $(findstring tar-src,$@),, \
$(if $(findstring bz2,$@),bzip2, \
@@ -134,14 +134,14 @@ ifeq ($(VERBOSE),1)
print_install =
else
Q = @
print_compile = echo ' CC '$(OBJ);
print_app_build = echo ' BUILD '$(OBJ);
print_fpic_compile = echo ' CC FPIC '$(OBJ);
print_shared_lib_compile = echo ' BUILD SHARED LIB '$(OBJ);
print_plugin_obj_compile = echo ' CC PLUGIN OBJ '$(OBJ);
print_plugin_build = echo ' CC PLUGI '$(OBJ);
print_static_lib_build = echo ' BUILD STATIC LIB '$(OBJ);
print_install = echo ' INSTALL '$1' to $(DESTDIR_SQ)$2';
print_compile = echo ' CC '$(OBJ);
print_app_build = echo ' BUILD '$(OBJ);
print_fpic_compile = echo ' CC FPIC '$(OBJ);
print_shared_lib_compile = echo ' BUILD SHARED LIB '$(OBJ);
print_plugin_obj_compile = echo ' BUILD PLUGIN OBJ '$(OBJ);
print_plugin_build = echo ' BUILD PLUGIN '$(OBJ);
print_static_lib_build = echo ' BUILD STATIC LIB '$(OBJ);
print_install = echo ' INSTALL '$1' to $(DESTDIR_SQ)$2';
endif
do_fpic_compile = \
@@ -268,7 +268,7 @@ TRACK_CFLAGS = $(subst ','\'',$(CFLAGS)):$(ARCH):$(CROSS_COMPILE)
TRACEEVENT-CFLAGS: force
@FLAGS='$(TRACK_CFLAGS)'; \
if test x"$$FLAGS" != x"`cat TRACEEVENT-CFLAGS 2>/dev/null`" ; then \
echo 1>&2 " * new build flags or cross compiler"; \
echo 1>&2 " FLAGS: * new build flags or cross compiler"; \
echo "$$FLAGS" >TRACEEVENT-CFLAGS; \
fi
@@ -305,6 +305,11 @@ int pevent_register_comm(struct pevent *pevent, const char *comm, int pid)
return 0;
}
void pevent_register_trace_clock(struct pevent *pevent, char *trace_clock)
{
pevent->trace_clock = trace_clock;
}
struct func_map {
unsigned long long addr;
char *func;
@@ -599,10 +604,11 @@ find_printk(struct pevent *pevent, unsigned long long addr)
* This registers a string by the address it was stored in the kernel.
* The @fmt passed in is duplicated.
*/
int pevent_register_print_string(struct pevent *pevent, char *fmt,
int pevent_register_print_string(struct pevent *pevent, const char *fmt,
unsigned long long addr)
{
struct printk_list *item = malloc(sizeof(*item));
char *p;
if (!item)
return -1;
@@ -610,10 +616,21 @@ int pevent_register_print_string(struct pevent *pevent, char *fmt,
item->next = pevent->printklist;
item->addr = addr;
/* Strip off quotes and '\n' from the end */
if (fmt[0] == '"')
fmt++;
item->printk = strdup(fmt);
if (!item->printk)
goto out_free;
p = item->printk + strlen(item->printk) - 1;
if (*p == '"')
*p = 0;
p -= 2;
if (strcmp(p, "\\n") == 0)
*p = 0;
pevent->printklist = item;
pevent->printk_count++;
@@ -3488,6 +3505,7 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
struct pevent *pevent = event->pevent;
struct print_flag_sym *flag;
struct format_field *field;
struct printk_map *printk;
unsigned long long val, fval;
unsigned long addr;
char *str;
@@ -3523,7 +3541,12 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
if (!(field->flags & FIELD_IS_ARRAY) &&
field->size == pevent->long_size) {
addr = *(unsigned long *)(data + field->offset);
trace_seq_printf(s, "%lx", addr);
/* Check if it matches a print format */
printk = find_printk(pevent, addr);
if (printk)
trace_seq_puts(s, printk->printk);
else
trace_seq_printf(s, "%lx", addr);
break;
}
str = malloc(len + 1);
@@ -3565,15 +3588,23 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
}
break;
case PRINT_HEX:
field = arg->hex.field->field.field;
if (!field) {
str = arg->hex.field->field.name;
field = pevent_find_any_field(event, str);
if (!field)
goto out_warning_field;
arg->hex.field->field.field = field;
if (arg->hex.field->type == PRINT_DYNAMIC_ARRAY) {
unsigned long offset;
offset = pevent_read_number(pevent,
data + arg->hex.field->dynarray.field->offset,
arg->hex.field->dynarray.field->size);
hex = data + (offset & 0xffff);
} else {
field = arg->hex.field->field.field;
if (!field) {
str = arg->hex.field->field.name;
field = pevent_find_any_field(event, str);
if (!field)
goto out_warning_field;
arg->hex.field->field.field = field;
}
hex = data + field->offset;
}
hex = data + field->offset;
len = eval_num_arg(data, size, event, arg->hex.size);
for (i = 0; i < len; i++) {
if (i)
@@ -3771,8 +3802,8 @@ static struct print_arg *make_bprint_args(char *fmt, void *data, int size, struc
if (asprintf(&arg->atom.atom, "%lld", ip) < 0)
goto out_free;
/* skip the first "%pf : " */
for (ptr = fmt + 6, bptr = data + field->offset;
/* skip the first "%pf: " */
for (ptr = fmt + 5, bptr = data + field->offset;
bptr < data + size && *ptr; ptr++) {
int ls = 0;
@@ -3882,7 +3913,6 @@ get_bprint_format(void *data, int size __maybe_unused,
struct format_field *field;
struct printk_map *printk;
char *format;
char *p;
field = pevent->bprint_fmt_field;
@@ -3899,25 +3929,13 @@ get_bprint_format(void *data, int size __maybe_unused,
printk = find_printk(pevent, addr);
if (!printk) {
if (asprintf(&format, "%%pf : (NO FORMAT FOUND at %llx)\n", addr) < 0)
if (asprintf(&format, "%%pf: (NO FORMAT FOUND at %llx)\n", addr) < 0)
return NULL;
return format;
}
p = printk->printk;
/* Remove any quotes. */
if (*p == '"')
p++;
if (asprintf(&format, "%s : %s", "%pf", p) < 0)
if (asprintf(&format, "%s: %s", "%pf", printk->printk) < 0)
return NULL;
/* remove ending quotes and new line since we will add one too */
p = format + strlen(format) - 1;
if (*p == '"')
*p = 0;
p -= 2;
if (strcmp(p, "\\n") == 0)
*p = 0;
return format;
}
@@ -3963,7 +3981,7 @@ static int is_printable_array(char *p, unsigned int len)
unsigned int i;
for (i = 0; i < len && p[i]; i++)
if (!isprint(p[i]))
if (!isprint(p[i]) && !isspace(p[i]))
return 0;
return 1;
}
@@ -4428,11 +4446,11 @@ void pevent_event_info(struct trace_seq *s, struct event_format *event,
{
int print_pretty = 1;
if (event->pevent->print_raw)
if (event->pevent->print_raw || (event->flags & EVENT_FL_PRINTRAW))
print_event_fields(s, record->data, record->size, event);
else {
if (event->handler)
if (event->handler && !(event->flags & EVENT_FL_NOHANDLE))
print_pretty = event->handler(s, record, event,
event->context);
@@ -4443,8 +4461,21 @@ void pevent_event_info(struct trace_seq *s, struct event_format *event,
trace_seq_terminate(s);
}
static bool is_timestamp_in_us(char *trace_clock, bool use_trace_clock)
{
if (!use_trace_clock)
return true;
if (!strcmp(trace_clock, "local") || !strcmp(trace_clock, "global")
|| !strcmp(trace_clock, "uptime") || !strcmp(trace_clock, "perf"))
return true;
/* trace_clock is setting in tsc or counter mode */
return false;
}
void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
struct pevent_record *record)
struct pevent_record *record, bool use_trace_clock)
{
static const char *spaces = " "; /* 20 spaces */
struct event_format *event;
@@ -4457,9 +4488,14 @@ void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
int pid;
int len;
int p;
bool use_usec_format;
secs = record->ts / NSECS_PER_SEC;
nsecs = record->ts - secs * NSECS_PER_SEC;
use_usec_format = is_timestamp_in_us(pevent->trace_clock,
use_trace_clock);
if (use_usec_format) {
secs = record->ts / NSECS_PER_SEC;
nsecs = record->ts - secs * NSECS_PER_SEC;
}
if (record->size < 0) {
do_warning("ug! negative record size %d", record->size);
@@ -4484,15 +4520,20 @@ void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
} else
trace_seq_printf(s, "%16s-%-5d [%03d]", comm, pid, record->cpu);
if (pevent->flags & PEVENT_NSEC_OUTPUT) {
usecs = nsecs;
p = 9;
} else {
usecs = (nsecs + 500) / NSECS_PER_USEC;
p = 6;
}
if (use_usec_format) {
if (pevent->flags & PEVENT_NSEC_OUTPUT) {
usecs = nsecs;
p = 9;
} else {
usecs = (nsecs + 500) / NSECS_PER_USEC;
p = 6;
}
trace_seq_printf(s, " %5lu.%0*lu: %s: ", secs, p, usecs, event->name);
trace_seq_printf(s, " %5lu.%0*lu: %s: ",
secs, p, usecs, event->name);
} else
trace_seq_printf(s, " %12llu: %s: ",
record->ts, event->name);
/* Space out the event names evenly. */
len = strlen(event->name);
@@ -5326,6 +5367,48 @@ int pevent_print_num_field(struct trace_seq *s, const char *fmt,
return -1;
}
/**
* pevent_print_func_field - print a field and a format for function pointers
* @s: The seq to print to
* @fmt: The printf format to print the field with.
* @event: the event that the field is for
* @name: The name of the field
* @record: The record with the field name.
* @err: print default error if failed.
*
* Returns: 0 on success, -1 field not found, or 1 if buffer is full.
*/
int pevent_print_func_field(struct trace_seq *s, const char *fmt,
struct event_format *event, const char *name,
struct pevent_record *record, int err)
{
struct format_field *field = pevent_find_field(event, name);
struct pevent *pevent = event->pevent;
unsigned long long val;
struct func_map *func;
char tmp[128];
if (!field)
goto failed;
if (pevent_read_number_field(field, record->data, &val))
goto failed;
func = find_func(pevent, val);
if (func)
snprintf(tmp, 128, "%s/0x%llx", func->func, func->addr - val);
else
sprintf(tmp, "0x%08llx", val);
return trace_seq_printf(s, fmt, tmp);
failed:
if (err)
trace_seq_printf(s, "CAN'T FIND FIELD \"%s\"", name);
return -1;
}
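A sketch of how a registered event handler might use the new helper; the event and field names here are illustrative, not taken from the commit:
	static int wakeup_handler(struct trace_seq *s, struct pevent_record *record,
				  struct event_format *event, void *context)
	{
		/* Prints e.g. "func=try_to_wake_up/0x38", or a raw address when
		 * the function is unknown; err=1 prints a default error note. */
		if (pevent_print_func_field(s, "func=%s", event, "function",
					    record, 1) < 0)
			return -1;
		return 0;
	}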
static void free_func_handle(struct pevent_function_handler *func)
{
struct pevent_func_params *params;
......
......@@ -20,6 +20,7 @@
#ifndef _PARSE_EVENTS_H
#define _PARSE_EVENTS_H
#include <stdbool.h>
#include <stdarg.h>
#include <regex.h>
......@@ -307,6 +308,8 @@ enum {
EVENT_FL_ISBPRINT = 0x04,
EVENT_FL_ISFUNCENT = 0x10,
EVENT_FL_ISFUNCRET = 0x20,
EVENT_FL_NOHANDLE = 0x40,
EVENT_FL_PRINTRAW = 0x80,
EVENT_FL_FAILED = 0x80000000
};
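The two new flags give per-event control that previously existed only process-wide (pevent->print_raw) or not at all. A hypothetical use, forcing a single event to bypass its registered handler and print raw fields; the event name is illustrative:
	struct event_format *ev;
	ev = pevent_find_event_by_name(pevent, "sched", "sched_switch");
	if (ev)
		ev->flags |= EVENT_FL_NOHANDLE | EVENT_FL_PRINTRAW;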
......@@ -450,6 +453,8 @@ struct pevent {
/* cache */
struct event_format *last_event;
char *trace_clock;
};
static inline void pevent_set_flag(struct pevent *pevent, int flag)
......@@ -527,14 +532,15 @@ enum trace_flag_type {
};
int pevent_register_comm(struct pevent *pevent, const char *comm, int pid);
void pevent_register_trace_clock(struct pevent *pevent, char *trace_clock);
int pevent_register_function(struct pevent *pevent, char *name,
unsigned long long addr, char *mod);
int pevent_register_print_string(struct pevent *pevent, char *fmt,
int pevent_register_print_string(struct pevent *pevent, const char *fmt,
unsigned long long addr);
int pevent_pid_is_registered(struct pevent *pevent, int pid);
void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
struct pevent_record *record);
struct pevent_record *record, bool use_trace_clock);
int pevent_parse_header_page(struct pevent *pevent, char *buf, unsigned long size,
int long_size);
......@@ -563,6 +569,10 @@ int pevent_print_num_field(struct trace_seq *s, const char *fmt,
struct event_format *event, const char *name,
struct pevent_record *record, int err);
int pevent_print_func_field(struct trace_seq *s, const char *fmt,
struct event_format *event, const char *name,
struct pevent_record *record, int err);
int pevent_register_event_handler(struct pevent *pevent, int id,
const char *sys_name, const char *event_name,
pevent_event_handler_func func, void *context);
......
......@@ -13,6 +13,7 @@ perf*.html
common-cmds.h
perf.data
perf.data.old
output.svg
perf-archive
tags
TAGS
......
......@@ -145,16 +145,17 @@ endif
ifneq ($(findstring $(MAKEFLAGS),s),s)
ifneq ($(V),1)
QUIET_ASCIIDOC = @echo ' ' ASCIIDOC $@;
QUIET_XMLTO = @echo ' ' XMLTO $@;
QUIET_DB2TEXI = @echo ' ' DB2TEXI $@;
QUIET_MAKEINFO = @echo ' ' MAKEINFO $@;
QUIET_DBLATEX = @echo ' ' DBLATEX $@;
QUIET_XSLTPROC = @echo ' ' XSLTPROC $@;
QUIET_GEN = @echo ' ' GEN $@;
QUIET_ASCIIDOC = @echo ' ASCIIDOC '$@;
QUIET_XMLTO = @echo ' XMLTO '$@;
QUIET_DB2TEXI = @echo ' DB2TEXI '$@;
QUIET_MAKEINFO = @echo ' MAKEINFO '$@;
QUIET_DBLATEX = @echo ' DBLATEX '$@;
QUIET_XSLTPROC = @echo ' XSLTPROC '$@;
QUIET_GEN = @echo ' GEN '$@;
QUIET_STDERR = 2> /dev/null
QUIET_SUBDIR0 = +@subdir=
QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo ' ' SUBDIR $$subdir; \
QUIET_SUBDIR1 = ;$(NO_SUBDIR) \
echo ' SUBDIR ' $$subdir; \
$(MAKE) $(PRINT_DIR) -C $$subdir
export V
endif
......@@ -183,47 +184,43 @@ ifdef missing_tools
endif
do-install-man: man
$(INSTALL) -d -m 755 $(DESTDIR)$(man1dir)
# $(INSTALL) -d -m 755 $(DESTDIR)$(man5dir)
# $(INSTALL) -d -m 755 $(DESTDIR)$(man7dir)
$(INSTALL) -m 644 $(DOC_MAN1) $(DESTDIR)$(man1dir)
# $(INSTALL) -m 644 $(DOC_MAN5) $(DESTDIR)$(man5dir)
# $(INSTALL) -m 644 $(DOC_MAN7) $(DESTDIR)$(man7dir)
$(call QUIET_INSTALL, Documentation-man) \
$(INSTALL) -d -m 755 $(DESTDIR)$(man1dir); \
# $(INSTALL) -d -m 755 $(DESTDIR)$(man5dir); \
# $(INSTALL) -d -m 755 $(DESTDIR)$(man7dir); \
$(INSTALL) -m 644 $(DOC_MAN1) $(DESTDIR)$(man1dir); \
# $(INSTALL) -m 644 $(DOC_MAN5) $(DESTDIR)$(man5dir); \
# $(INSTALL) -m 644 $(DOC_MAN7) $(DESTDIR)$(man7dir)
install-man: check-man-tools man
try-install-man:
ifdef missing_tools
$(warning Please install $(missing_tools) to have the man pages installed)
DO_INSTALL_MAN = $(warning Please install $(missing_tools) to have the man pages installed)
else
$(MAKE) do-install-man
DO_INSTALL_MAN = do-install-man
endif
try-install-man: $(DO_INSTALL_MAN)
install-info: info
$(INSTALL) -d -m 755 $(DESTDIR)$(infodir)
$(INSTALL) -m 644 $(OUTPUT)perf.info $(OUTPUT)perfman.info $(DESTDIR)$(infodir)
$(call QUIET_INSTALL, Documentation-info) \
$(INSTALL) -d -m 755 $(DESTDIR)$(infodir); \
$(INSTALL) -m 644 $(OUTPUT)perf.info $(OUTPUT)perfman.info $(DESTDIR)$(infodir); \
if test -r $(DESTDIR)$(infodir)/dir; then \
$(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perf.info ;\
$(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perfman.info ;\
$(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perf.info ;\
$(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perfman.info ;\
else \
echo "No directory found in $(DESTDIR)$(infodir)" >&2 ; \
fi
install-pdf: pdf
$(INSTALL) -d -m 755 $(DESTDIR)$(pdfdir)
$(INSTALL) -m 644 $(OUTPUT)user-manual.pdf $(DESTDIR)$(pdfdir)
$(call QUIET_INSTALL, Documentation-pdf) \
$(INSTALL) -d -m 755 $(DESTDIR)$(pdfdir); \
$(INSTALL) -m 644 $(OUTPUT)user-manual.pdf $(DESTDIR)$(pdfdir)
#install-html: html
# '$(SHELL_PATH_SQ)' ./install-webdoc.sh $(DESTDIR)$(htmldir)
ifneq ($(MAKECMDGOALS),clean)
ifneq ($(MAKECMDGOALS),tags)
$(OUTPUT)PERF-VERSION-FILE: .FORCE-PERF-VERSION-FILE
$(QUIET_SUBDIR0)../ $(QUIET_SUBDIR1) $(OUTPUT)PERF-VERSION-FILE
-include $(OUTPUT)PERF-VERSION-FILE
endif
endif
#
# Determine "include::" file references in asciidoc files.
......@@ -253,15 +250,17 @@ $(OUTPUT)cmd-list.made: cmd-list.perl ../command-list.txt $(MAN1_TXT)
$(PERL_PATH) ./cmd-list.perl ../command-list.txt $(QUIET_STDERR) && \
date >$@
CLEAN_FILES = \
$(MAN_XML) $(addsuffix +,$(MAN_XML)) \
$(MAN_HTML) $(addsuffix +,$(MAN_HTML)) \
$(DOC_HTML) $(DOC_MAN1) $(DOC_MAN5) $(DOC_MAN7) \
$(OUTPUT)*.texi $(OUTPUT)*.texi+ $(OUTPUT)*.texi++ \
$(OUTPUT)perf.info $(OUTPUT)perfman.info \
$(OUTPUT)howto-index.txt $(OUTPUT)howto/*.html $(OUTPUT)doc.dep \
$(OUTPUT)technical/api-*.html $(OUTPUT)technical/api-index.txt \
$(cmds_txt) $(OUTPUT)*.made
clean:
$(RM) $(MAN_XML) $(addsuffix +,$(MAN_XML))
$(RM) $(MAN_HTML) $(addsuffix +,$(MAN_HTML))
$(RM) $(DOC_HTML) $(DOC_MAN1) $(DOC_MAN5) $(DOC_MAN7)
$(RM) $(OUTPUT)*.texi $(OUTPUT)*.texi+ $(OUTPUT)*.texi++
$(RM) $(OUTPUT)perf.info $(OUTPUT)perfman.info
$(RM) $(OUTPUT)howto-index.txt $(OUTPUT)howto/*.html $(OUTPUT)doc.dep
$(RM) $(OUTPUT)technical/api-*.html $(OUTPUT)technical/api-index.txt
$(RM) $(cmds_txt) $(OUTPUT)*.made
$(call QUIET_CLEAN, Documentation) $(RM) $(CLEAN_FILES)
$(MAN_HTML): $(OUTPUT)%.html : %.txt
$(QUIET_ASCIIDOC)$(RM) $@+ $@ && \
......@@ -342,5 +341,3 @@ $(patsubst %.txt,%.html,$(wildcard howto/*.txt)): %.html : %.txt
#quick-install-html:
# '$(SHELL_PATH_SQ)' ./install-doc-quick.sh $(HTML_REF) $(DESTDIR)$(htmldir)
.PHONY: .FORCE-PERF-VERSION-FILE
......@@ -21,6 +21,19 @@ OPTIONS
-a::
--add=::
Add specified file to the cache.
-k::
--kcore::
Add specified kcore file to the cache. For the current host that is
/proc/kcore, which requires root permissions to read. Be aware that
running 'perf buildid-cache' as root may update root's build-id cache,
not the user's. Use the -v option to see where the file is created.
Note that the copied file contains only code sections, not the whole core
image. Note also that the files "kallsyms" and "modules" must be in the
same directory and are copied as well. All 3 files are created with read
permissions for root only. kcore will not be added if the cache already
contains a kcore with the same build-id and the same modules at the same
addresses. Use the -v option to see whether a copy of kcore is actually
made (see also the example below).
-r::
--remove=::
Remove specified file from the cache.
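An illustrative --kcore invocation for the workflow described above (run as root so /proc/kcore is readable; -v shows where the copy lands):
	# perf buildid-cache -v --kcore /proc/kcore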
......
......@@ -109,7 +109,9 @@ STAT LIVE OPTIONS
-m::
--mmap-pages=::
Number of mmap data pages. Must be a power of two.
Number of mmap data pages (must be a power of two) or a size
specification with an appended unit character - B/K/M/G. The
size is rounded up to the nearest power-of-two number of pages.
-a::
--all-cpus::
......
......@@ -48,7 +48,7 @@ REPORT OPTIONS
-k::
--key=<value>::
Sorting key. Possible values: acquired (default), contended,
wait_total, wait_max, wait_min.
avg_wait, wait_total, wait_max, wait_min.
INFO OPTIONS
------------
......
......@@ -87,7 +87,9 @@ OPTIONS
-m::
--mmap-pages=::
Number of mmap data pages. Must be a power of two.
Number of mmap data pages (must be a power of two) or a size
specification with an appended unit character - B/K/M/G. The
size is rounded up to the nearest power-of-two number of pages;
for example, with 4 KiB pages, -m 10 and -m 40K both yield a
16-page buffer.
-g::
Enables call-graph (stack chain/backtrace) recording.
......@@ -178,6 +180,9 @@ following filters are defined:
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the target is at the hypervisor level
- in_tx: only when the target is in a hardware transaction
- no_tx: only when the target is not in a hardware transaction
- abort_tx: only when the target is a hardware transaction abort
+
The option requires at least one branch type among any, any_call, any_ret, ind_call.
......@@ -188,12 +193,14 @@ is enabled for all the sampling events. The sampled branch type is the same for
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.
-W::
--weight::
Enable weighted sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys. This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.
--transaction::
Record transaction flags for transaction related events.
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
......@@ -71,7 +71,11 @@ OPTIONS
entries are displayed as "[other]".
- cpu: cpu number the task ran at the time of sample
- srcline: filename and line number executed at the time of sample. The
DWARF debuggin info must be provided.
DWARF debugging info must be provided.
- weight: Event specific weight, e.g. memory latency or transaction
abort cost. This is the global weight.
- local_weight: Local weight version of the weight above.
- transaction: Transaction abort flags.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
......@@ -85,6 +89,8 @@ OPTIONS
- symbol_from: name of function branched from
- symbol_to: name of function branched to
- mispredict: "N" for predicted branch, "Y" for mispredicted branch
- in_tx: branch in TSX transaction
- abort: TSX transaction abort.
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
......@@ -135,6 +141,14 @@ OPTIONS
Default: fractal,0.5,callee,function.
--max-stack::
Set the stack depth limit when parsing the callchain; anything
beyond the specified depth is ignored. This is a trade-off
between information loss and faster processing, especially for
workloads with very long callchain stacks.
Default: 127
-G::
--inverted::
Alias for an inverted caller-based call graph.
......
......@@ -137,6 +137,11 @@ core number and the number of online logical processors on that physical process
After starting the program, wait msecs before measuring. This is useful to
filter out the startup phase of the program, which is often very different.
-T::
--transaction::
Print statistics of transactional execution if supported.
EXAMPLES
--------
......
......@@ -8,7 +8,8 @@ perf-timechart - Tool to visualize total system behavior during a workload
SYNOPSIS
--------
[verse]
'perf timechart' {record}
'perf timechart' record <command>
'perf timechart' [<options>]
DESCRIPTION
-----------
......@@ -41,6 +42,18 @@ OPTIONS
--symfs=<directory>::
Look for files with symbols relative to this directory.
EXAMPLES
--------
$ perf timechart record git pull
[ perf record: Woken up 13 times to write data ]
[ perf record: Captured and wrote 4.253 MB perf.data (~185801 samples) ]
$ perf timechart
Written 10.2 seconds of trace to output.svg.
SEE ALSO
--------
linkperf:perf-record[1]
......@@ -68,7 +68,9 @@ Default is to monitor all CPUS.
-m <pages>::
--mmap-pages=<pages>::
Number of mmapped data pages.
Number of mmap data pages (must be a power of two) or a size
specification with an appended unit character - B/K/M/G. The
size is rounded up to the nearest power-of-two number of pages.
-p <pid>::
--pid=<pid>::
......@@ -112,7 +114,8 @@ Default is to monitor all CPUS.
-s::
--sort::
Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight, local_weight.
Sort by key(s): pid, comm, dso, symbol, parent, srcline, weight,
local_weight, abort, in_tx, transaction
-n::
--show-nr-samples::
......@@ -147,6 +150,14 @@ Default is to monitor all CPUS.
Setup and enable call-graph (stack chain/backtrace) recording,
implies -G.
--max-stack::
Set the stack depth limit when parsing the callchain; anything
beyond the specified depth is ignored. This is a trade-off
between information loss and faster processing, especially for
workloads with very long callchain stacks.
Default: 127
--ignore-callees=<regex>::
Ignore callees of the function(s) matching the given regex.
This has the effect of collecting the callers of each such
......
......@@ -9,6 +9,7 @@ SYNOPSIS
--------
[verse]
'perf trace'
'perf trace record'
DESCRIPTION
-----------
......@@ -16,9 +17,14 @@ This command will show the events associated with the target, initially
syscalls, but other system events like pagefaults, task lifetime events,
scheduling events, etc.
Initially this is a live mode only tool, but eventually will work with
perf.data files like the other tools, allowing a detached 'record' from
analysis phases.
This is a live mode tool in addition to working with perf.data files like
the other perf tools. Files can be generated using the 'perf record' command
but the session needs to include the raw_syscalls events (-e 'raw_syscalls:*').
Alternatively, 'perf trace record' can be used as a shortcut to
automatically include the raw_syscalls events when writing events to a file.
The following options apply to perf trace; options to perf trace record are
found in the perf record man page.
OPTIONS
-------
......@@ -59,7 +65,9 @@ OPTIONS
-m::
--mmap-pages=::
Number of mmap data pages. Must be a power of two.
Number of mmap data pages (must be a power of two) or a size
specification with an appended unit character - B/K/M/G. The
size is rounded up to the nearest power-of-two number of pages.
-C::
--cpu::
......@@ -78,6 +86,21 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
--input::
Process events from a given perf data file.
-T::
--time::
Print the full timestamp rather than the time relative to the first sample.
--comm::
Show process COMM right beside its ID; on by default, disable with --no-comm.
--summary::
Show a summary of syscalls by thread with min, max, and average times (in
msec) and relative stddev.
--tool_stats::
Show tool stats such as the number of times an fd->pathname mapping was
discovered, either through hooking the open syscall return with
'vfs_getname' or by reading /proc/pid/fd, etc.
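An illustrative invocation combining the options above with an arbitrary workload:
	$ perf trace --summary sleep 1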
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script[1]
......@@ -5,7 +5,7 @@
#include "../../util/types.h"
#include <asm/perf_regs.h>
#ifndef ARCH_X86_64
#ifndef HAVE_ARCH_X86_64_SUPPORT
#define PERF_REGS_MASK ((1ULL << PERF_REG_X86_32_MAX) - 1)
#else
#define REG_NOSUPPORT ((1ULL << PERF_REG_X86_DS) | \
......@@ -52,7 +52,7 @@ static inline const char *perf_reg_name(int id)
return "FS";
case PERF_REG_X86_GS:
return "GS";
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
case PERF_REG_X86_R8:
return "R8";
case PERF_REG_X86_R9:
......@@ -69,7 +69,7 @@ static inline const char *perf_reg_name(int id)
return "R14";
case PERF_REG_X86_R15:
return "R15";
#endif /* ARCH_X86_64 */
#endif /* HAVE_ARCH_X86_64_SUPPORT */
default:
return NULL;
}
......
......@@ -4,7 +4,7 @@
#include "perf_regs.h"
#include "../../util/unwind.h"
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
int unwind__arch_reg_id(int regnum)
{
int id;
......@@ -108,4 +108,4 @@ int unwind__arch_reg_id(int regnum)
return id;
}
#endif /* ARCH_X86_64 */
#endif /* HAVE_ARCH_X86_64_SUPPORT */
# perf completion
function_exists()
{
declare -F $1 > /dev/null
return $?
}
# Taken from git.git's completion script.
__my_reassemble_comp_words_by_ref()
{
local exclude i j first
# Which word separators to exclude?
exclude="${1//[^$COMP_WORDBREAKS]}"
cword_=$COMP_CWORD
if [ -z "$exclude" ]; then
words_=("${COMP_WORDS[@]}")
return
fi
# List of word completion separators has shrunk;
# re-assemble words to complete.
for ((i=0, j=0; i < ${#COMP_WORDS[@]}; i++, j++)); do
# Append each nonempty word consisting of just
# word separator characters to the current word.
first=t
while
[ $i -gt 0 ] &&
[ -n "${COMP_WORDS[$i]}" ] &&
# word consists of excluded word separators
[ "${COMP_WORDS[$i]//[^$exclude]}" = "${COMP_WORDS[$i]}" ]
do
# Attach to the previous token,
# unless the previous token is the command name.
if [ $j -ge 2 ] && [ -n "$first" ]; then
((j--))
fi
first=
words_[$j]=${words_[j]}${COMP_WORDS[i]}
if [ $i = $COMP_CWORD ]; then
cword_=$j
fi
if (($i < ${#COMP_WORDS[@]} - 1)); then
((i++))
else
# Done.
return
fi
done
words_[$j]=${words_[j]}${COMP_WORDS[i]}
if [ $i = $COMP_CWORD ]; then
cword_=$j
fi
done
}
function_exists __ltrim_colon_completions ||
type _get_comp_words_by_ref &>/dev/null ||
_get_comp_words_by_ref()
{
local exclude cur_ words_ cword_
if [ "$1" = "-n" ]; then
exclude=$2
shift 2
fi
__my_reassemble_comp_words_by_ref "$exclude"
cur_=${words_[cword_]}
while [ $# -gt 0 ]; do
case "$1" in
cur)
cur=$cur_
;;
prev)
prev=${words_[$cword_-1]}
;;
words)
words=("${words_[@]}")
;;
cword)
cword=$cword_
;;
esac
shift
done
}
type __ltrim_colon_completions &>/dev/null ||
__ltrim_colon_completions()
{
if [[ "$1" == *:* && "$COMP_WORDBREAKS" == *:* ]]; then
# Remove colon-word prefix from COMPREPLY items
local colon_word=${1%${1##*:}}
local colon_word=${1%"${1##*:}"}
local i=${#COMPREPLY[*]}
while [[ $((--i)) -ge 0 ]]; do
COMPREPLY[$i]=${COMPREPLY[$i]#"$colon_word"}
......@@ -19,23 +89,18 @@ __ltrim_colon_completions()
fi
}
have perf &&
type perf &>/dev/null &&
_perf()
{
local cur prev cmd
local cur words cword prev cmd
COMPREPLY=()
if function_exists _get_comp_words_by_ref; then
_get_comp_words_by_ref -n : cur prev
else
cur=$(_get_cword :)
prev=${COMP_WORDS[COMP_CWORD-1]}
fi
_get_comp_words_by_ref -n =: cur words cword prev
cmd=${COMP_WORDS[0]}
cmd=${words[0]}
# List perf subcommands or long options
if [ $COMP_CWORD -eq 1 ]; then
if [ $cword -eq 1 ]; then
if [[ $cur == --* ]]; then
COMPREPLY=( $( compgen -W '--help --version \
--exec-path --html-path --paginate --no-pager \
......@@ -45,18 +110,17 @@ _perf()
COMPREPLY=( $( compgen -W '$cmds' -- "$cur" ) )
fi
# List possible events for -e option
elif [[ $prev == "-e" && "${COMP_WORDS[1]}" == @(record|stat|top) ]]; then
elif [[ $prev == "-e" && "${words[1]}" == @(record|stat|top) ]]; then
evts=$($cmd list --raw-dump)
COMPREPLY=( $( compgen -W '$evts' -- "$cur" ) )
__ltrim_colon_completions $cur
# List long option names
elif [[ $cur == --* ]]; then
subcmd=${COMP_WORDS[1]}
subcmd=${words[1]}
opts=$($cmd $subcmd --list-opts)
COMPREPLY=( $( compgen -W '$opts' -- "$cur" ) )
# Fall down to list regular files
else
_filedir
fi
} &&
complete -F _perf perf
complete -o bashdefault -o default -o nospace -F _perf perf 2>/dev/null \
|| complete -o default -o nospace -F _perf perf
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
#define MEMCPY_FN(fn, name, desc) \
extern void *fn(void *, const void *, size_t);
......
......@@ -58,7 +58,7 @@ struct routine routines[] = {
{ "default",
"Default memcpy() provided by glibc",
memcpy },
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
#define MEMCPY_FN(fn, name, desc) { name, desc, fn },
#include "mem-memcpy-x86-64-asm-def.h"
......
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
#define MEMSET_FN(fn, name, desc) \
extern void *fn(void *, int, size_t);
......
......@@ -58,7 +58,7 @@ static const struct routine routines[] = {
{ "default",
"Default memset() provided by glibc",
memset },
#ifdef ARCH_X86_64
#ifdef HAVE_ARCH_X86_64_SUPPORT
#define MEMSET_FN(fn, name, desc) { name, desc, fn },
#include "mem-memset-x86-64-asm-def.h"
......
......@@ -429,14 +429,14 @@ static int parse_cpu_list(const char *arg)
return 0;
}
static void parse_setup_cpu_list(void)
static int parse_setup_cpu_list(void)
{
struct thread_data *td;
char *str0, *str;
int t;
if (!g->p.cpu_list_str)
return;
return 0;
dprintf("g->p.nr_tasks: %d\n", g->p.nr_tasks);
......@@ -500,8 +500,12 @@ static void parse_setup_cpu_list(void)
dprintf("CPUs: %d_%d-%d#%dx%d\n", bind_cpu_0, bind_len, bind_cpu_1, step, mul);
BUG_ON(bind_cpu_0 < 0 || bind_cpu_0 >= g->p.nr_cpus);
BUG_ON(bind_cpu_1 < 0 || bind_cpu_1 >= g->p.nr_cpus);
if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {
printf("\nTest not applicable, system has only %d CPUs.\n", g->p.nr_cpus);
return -1;
}
BUG_ON(bind_cpu_0 < 0 || bind_cpu_1 < 0);
BUG_ON(bind_cpu_0 > bind_cpu_1);
for (bind_cpu = bind_cpu_0; bind_cpu <= bind_cpu_1; bind_cpu += step) {
......@@ -541,6 +545,7 @@ static void parse_setup_cpu_list(void)
printf("# NOTE: %d tasks bound, %d tasks unbound\n", t, g->p.nr_tasks - t);
free(str0);
return 0;
}
static int parse_cpus_opt(const struct option *opt __maybe_unused,
......@@ -561,14 +566,14 @@ static int parse_node_list(const char *arg)
return 0;
}
static void parse_setup_node_list(void)
static int parse_setup_node_list(void)
{
struct thread_data *td;
char *str0, *str;
int t;
if (!g->p.node_list_str)
return;
return 0;
dprintf("g->p.nr_tasks: %d\n", g->p.nr_tasks);
......@@ -619,8 +624,12 @@ static void parse_setup_node_list(void)
dprintf("NODEs: %d-%d #%d\n", bind_node_0, bind_node_1, step);
BUG_ON(bind_node_0 < 0 || bind_node_0 >= g->p.nr_nodes);
BUG_ON(bind_node_1 < 0 || bind_node_1 >= g->p.nr_nodes);
if (bind_node_0 >= g->p.nr_nodes || bind_node_1 >= g->p.nr_nodes) {
printf("\nTest not applicable, system has only %d nodes.\n", g->p.nr_nodes);
return -1;
}
BUG_ON(bind_node_0 < 0 || bind_node_1 < 0);
BUG_ON(bind_node_0 > bind_node_1);
for (bind_node = bind_node_0; bind_node <= bind_node_1; bind_node += step) {
......@@ -651,6 +660,7 @@ static void parse_setup_node_list(void)
printf("# NOTE: %d tasks mem-bound, %d tasks unbound\n", t, g->p.nr_tasks - t);
free(str0);
return 0;
}
static int parse_nodes_opt(const struct option *opt __maybe_unused,
......@@ -1110,7 +1120,7 @@ static void *worker_thread(void *__tdata)
/* Check whether our max runtime timed out: */
if (g->p.nr_secs) {
timersub(&stop, &start0, &diff);
if (diff.tv_sec >= g->p.nr_secs) {
if ((u32)diff.tv_sec >= g->p.nr_secs) {
g->stop_work = true;
break;
}
......@@ -1157,7 +1167,7 @@ static void *worker_thread(void *__tdata)
runtime_ns_max += diff.tv_usec * 1000;
if (details >= 0) {
printf(" #%2d / %2d: %14.2lf nsecs/op [val: %016lx]\n",
printf(" #%2d / %2d: %14.2lf nsecs/op [val: %016"PRIx64"]\n",
process_nr, thread_nr, runtime_ns_max / bytes_done, val);
}
fflush(stdout);
......@@ -1356,8 +1366,8 @@ static int init(void)
init_thread_data();
tprintf("#\n");
parse_setup_cpu_list();
parse_setup_node_list();
if (parse_setup_cpu_list() || parse_setup_node_list())
return -1;
tprintf("#\n");
print_summary();
......@@ -1600,7 +1610,6 @@ static int run_bench_numa(const char *name, const char **argv)
return 0;
err:
usage_with_options(numa_usage, options);
return -1;
}
......@@ -1701,8 +1710,7 @@ static int bench_all(void)
BUG_ON(ret < 0);
for (i = 0; i < nr; i++) {
if (run_bench_numa(tests[i][0], tests[i] + 1))
return -1;
run_bench_numa(tests[i][0], tests[i] + 1);
}
printf("\n");
......
......@@ -7,9 +7,7 @@
* Based on pipe-test-1m.c by Ingo Molnar <mingo@redhat.com>
* http://people.redhat.com/mingo/cfs-scheduler/tools/pipe-test-1m.c
* Ported to perf by Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
*
*/
#include "../perf.h"
#include "../util/util.h"
#include "../util/parse-options.h"
......@@ -28,12 +26,24 @@
#include <sys/time.h>
#include <sys/types.h>
#include <pthread.h>
struct thread_data {
int nr;
int pipe_read;
int pipe_write;
pthread_t pthread;
};
#define LOOPS_DEFAULT 1000000
static int loops = LOOPS_DEFAULT;
static int loops = LOOPS_DEFAULT;
/* Use processes by default: */
static bool threaded;
static const struct option options[] = {
OPT_INTEGER('l', "loop", &loops,
"Specify number of loops"),
OPT_INTEGER('l', "loop", &loops, "Specify number of loops"),
OPT_BOOLEAN('T', "threaded", &threaded, "Specify threads/process based task setup"),
OPT_END()
};
......@@ -42,13 +52,37 @@ static const char * const bench_sched_pipe_usage[] = {
NULL
};
int bench_sched_pipe(int argc, const char **argv,
const char *prefix __maybe_unused)
static void *worker_thread(void *__tdata)
{
int pipe_1[2], pipe_2[2];
struct thread_data *td = __tdata;
int m = 0, i;
int ret;
for (i = 0; i < loops; i++) {
if (!td->nr) {
ret = read(td->pipe_read, &m, sizeof(int));
BUG_ON(ret != sizeof(int));
ret = write(td->pipe_write, &m, sizeof(int));
BUG_ON(ret != sizeof(int));
} else {
ret = write(td->pipe_write, &m, sizeof(int));
BUG_ON(ret != sizeof(int));
ret = read(td->pipe_read, &m, sizeof(int));
BUG_ON(ret != sizeof(int));
}
}
return NULL;
}
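With the loop body factored into worker_thread() as above, the same read/write ping-pong drives both the default process mode and the new -T thread mode; a hypothetical run using the options defined earlier:
	$ perf bench sched pipe -T -l 100000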
int bench_sched_pipe(int argc, const char **argv, const char *prefix __maybe_unused)
{
struct thread_data threads[2], *td;
int pipe_1[2], pipe_2[2];
struct timeval start, stop, diff;
unsigned long long result_usec = 0;
int nr_threads = 2;
int t;
/*
* why does "ret" exist?
......@@ -58,43 +92,66 @@ int bench_sched_pipe(int argc, const char **argv,
int __maybe_unused ret, wait_stat;
pid_t pid, retpid __maybe_unused;
argc = parse_options(argc, argv, options,
bench_sched_pipe_usage, 0);
argc = parse_options(argc, argv, options, bench_sched_pipe_usage, 0);
BUG_ON(pipe(pipe_1));
BUG_ON(pipe(pipe_2));
pid = fork();
assert(pid >= 0);
gettimeofday(&start, NULL);
if (!pid) {
for (i = 0; i < loops; i++) {
ret = read(pipe_1[0], &m, sizeof(int));
ret = write(pipe_2[1], &m, sizeof(int));
}
} else {
for (i = 0; i < loops; i++) {
ret = write(pipe_1[1], &m, sizeof(int));
ret = read(pipe_2[0], &m, sizeof(int));
for (t = 0; t < nr_threads; t++) {
td = threads + t;
td->nr = t;
if (t == 0) {
td->pipe_read = pipe_1[0];
td->pipe_write = pipe_2[1];
} else {
td->pipe_write = pipe_1[1];
td->pipe_read = pipe_2[0];
}
}
gettimeofday(&stop, NULL);
timersub(&stop, &start, &diff);
if (pid) {
if (threaded) {
for (t = 0; t < nr_threads; t++) {
td = threads + t;
ret = pthread_create(&td->pthread, NULL, worker_thread, td);
BUG_ON(ret);
}
for (t = 0; t < nr_threads; t++) {
td = threads + t;
ret = pthread_join(td->pthread, NULL);
BUG_ON(ret);
}
} else {
pid = fork();
assert(pid >= 0);
if (!pid) {
worker_thread(threads + 0);
exit(0);
} else {
worker_thread(threads + 1);
}
retpid = waitpid(pid, &wait_stat, 0);
assert((retpid == pid) && WIFEXITED(wait_stat));
} else {
exit(0);
}
gettimeofday(&stop, NULL);
timersub(&stop, &start, &diff);
switch (bench_format) {
case BENCH_FORMAT_DEFAULT:
printf("# Executed %d pipe operations between two tasks\n\n",
loops);
printf("# Executed %d pipe operations between two %s\n\n",
loops, threaded ? "threads" : "processes");
result_usec = diff.tv_sec * 1000000;
result_usec += diff.tv_usec;
......
......@@ -28,8 +28,10 @@
#include "util/hist.h"
#include "util/session.h"
#include "util/tool.h"
#include "util/data.h"
#include "arch/common.h"
#include <dlfcn.h>
#include <linux/bitmap.h>
struct perf_annotate {
......@@ -63,7 +65,7 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
he = __hists__add_entry(&evsel->hists, al, NULL, 1, 1);
he = __hists__add_entry(&evsel->hists, al, NULL, NULL, NULL, 1, 1, 0);
if (he == NULL)
return -ENOMEM;
......@@ -116,11 +118,11 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
ann->print_line, ann->full_paths, 0, 0);
}
static void hists__find_annotations(struct hists *self,
static void hists__find_annotations(struct hists *hists,
struct perf_evsel *evsel,
struct perf_annotate *ann)
{
struct rb_node *nd = rb_first(&self->entries), *next;
struct rb_node *nd = rb_first(&hists->entries), *next;
int key = K_RIGHT;
while (nd) {
......@@ -142,8 +144,18 @@ static void hists__find_annotations(struct hists *self,
if (use_browser == 2) {
int ret;
int (*annotate)(struct hist_entry *he,
struct perf_evsel *evsel,
struct hist_browser_timer *hbt);
annotate = dlsym(perf_gtk_handle,
"hist_entry__gtk_annotate");
if (annotate == NULL) {
ui__error("GTK browser not found!\n");
return;
}
ret = hist_entry__gtk_annotate(he, evsel, NULL);
ret = annotate(he, evsel, NULL);
if (!ret || !ann->skip_missing)
return;
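The dlsym() lookups here follow from the GTK browser being moved out of the perf binary into a separately dlopen()ed shared object, with perf_gtk_handle holding the handle. A generic sketch of the pattern, with library and symbol names as placeholders:
	#include <dlfcn.h>
	/* Resolve an optional UI entry point at run time so the core binary
	 * does not need to link against the UI libraries. */
	static void *optional_sym(const char *so_name, const char *sym)
	{
		void *handle = dlopen(so_name, RTLD_LAZY);
		return handle ? dlsym(handle, sym) : NULL;
	}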
......@@ -188,9 +200,13 @@ static int __cmd_annotate(struct perf_annotate *ann)
struct perf_session *session;
struct perf_evsel *pos;
u64 total_nr_samples;
struct perf_data_file file = {
.path = input_name,
.mode = PERF_DATA_MODE_READ,
.force = ann->force,
};
session = perf_session__new(input_name, O_RDONLY,
ann->force, false, &ann->tool);
session = perf_session__new(&file, false, &ann->tool);
if (session == NULL)
return -ENOMEM;
......@@ -231,7 +247,7 @@ static int __cmd_annotate(struct perf_annotate *ann)
if (nr_samples > 0) {
total_nr_samples += nr_samples;
hists__collapse_resort(hists);
hists__collapse_resort(hists, NULL);
hists__output_resort(hists);
if (symbol_conf.event_group &&
......@@ -243,12 +259,21 @@ static int __cmd_annotate(struct perf_annotate *ann)
}
if (total_nr_samples == 0) {
ui__error("The %s file has no samples!\n", session->filename);
ui__error("The %s file has no samples!\n", file.path);
goto out_delete;
}
if (use_browser == 2)
perf_gtk__show_annotations();
if (use_browser == 2) {
void (*show_annotations)(void);
show_annotations = dlsym(perf_gtk_handle,
"perf_gtk__show_annotations");
if (show_annotations == NULL) {
ui__error("GTK browser not found!\n");
goto out_delete;
}
show_annotations();
}
out_delete:
/*
......
......@@ -15,6 +15,7 @@
#include "util/parse-options.h"
#include "util/session.h"
#include "util/symbol.h"
#include "util/data.h"
static int sysfs__fprintf_build_id(FILE *fp)
{
......@@ -52,6 +53,11 @@ static bool dso__skip_buildid(struct dso *dso, int with_hits)
static int perf_session__list_build_ids(bool force, bool with_hits)
{
struct perf_session *session;
struct perf_data_file file = {
.path = input_name,
.mode = PERF_DATA_MODE_READ,
.force = force,
};
symbol__elf_init();
/*
......@@ -60,15 +66,14 @@ static int perf_session__list_build_ids(bool force, bool with_hits)
if (filename__fprintf_build_id(input_name, stdout))
goto out;
session = perf_session__new(input_name, O_RDONLY, force, false,
&build_id__mark_dso_hit_ops);
session = perf_session__new(&file, false, &build_id__mark_dso_hit_ops);
if (session == NULL)
return -1;
/*
* in pipe-mode, the only way to get the buildids is to parse
* the record stream. Buildids are stored as RECORD_HEADER_BUILD_ID
*/
if (with_hits || session->fd_pipe)
if (with_hits || perf_data_file__is_pipe(&file))
perf_session__process_events(session, &build_id__mark_dso_hit_ops);
perf_session__fprintf_dsos_buildid(session, stdout, dso__skip_buildid, with_hits);
......
......
#include <execinfo.h>
#include <stdio.h>
int main(void)
{
void *backtrace_fns[10];
size_t entries;
entries = backtrace(backtrace_fns, 10);
backtrace_symbols_fd(backtrace_fns, entries, 1);
return 0;
}
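This file is a standalone feature probe: if it compiles and links, the build can assume glibc backtrace support. A hypothetical manual check (meaningful symbol names in the output generally require -rdynamic):
	$ gcc -rdynamic -o test-backtrace test-backtrace.c
	$ ./test-backtrace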