Commit 654443e2 authored by Linus Torvalds

Merge branch 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull user-space probe instrumentation from Ingo Molnar:
 "The uprobes code originates from SystemTap and has been used for years
  in Fedora and RHEL kernels.  This version is much rewritten, reviews
  from PeterZ, Oleg and myself shaped the end result.

  This tree includes uprobes support in 'perf probe' - but SystemTap
  (and other tools) can take advantage of user probe points as well.

  Sample usage of uprobes via perf, for example to profile malloc()
  calls without modifying user-space binaries.

  First boot a new kernel with CONFIG_UPROBE_EVENT=y enabled.

  If you don't know which function you want to probe, you can pick one
  from 'perf top' or get a list of all functions that can be probed
  within libc (binaries can be specified as well):

	$ perf probe -F -x /lib/libc.so.6

  To probe libc's malloc():

	$ perf probe -x /lib64/libc.so.6 malloc
	Added new event:
	probe_libc:malloc    (on 0x7eac0)

  You can now use it in all perf tools, such as:

	perf record -e probe_libc:malloc -aR sleep 1

  Make use of it to create a call graph (as the flat profile is going to
  look very boring):

	$ perf record -e probe_libc:malloc -gR make
	[ perf record: Woken up 173 times to write data ]
	[ perf record: Captured and wrote 44.190 MB perf.data (~1930712

	$ perf report | less

	  32.03%            git  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	                    |
	                    --- malloc
	                       |
	                       |--0.95%-- 0x208eb1000000000
	                       |
	                       |--0.63%-- htab_traverse_noresize

	  11.04%             as  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   7.15%             ld  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |

	   5.07%             sh  libc-2.15.so   [.] malloc
	                     |
	                     --- malloc
	                        |
	   4.99%  python-config  libc-2.15.so   [.] malloc
	          |
	          --- malloc
	             |
	   4.54%           make  libc-2.15.so   [.] malloc
	                   |
	                   --- malloc
	                      |
	                      |--7.34%-- glob
	                      |          |
	                      |          |--93.18%-- 0x41588f
	                      |          |
	                      |           --6.82%-- glob
	                      |                     0x41588f

	   ...

  Or:

	$ perf report -g flat | less

	# Overhead        Command  Shared Object      Symbol
	# ........  .............  .............  ..........
	#
	  32.03%            git  libc-2.15.so   [.] malloc
	          27.19%
	              malloc

	  29.49%            cc1  libc-2.15.so   [.] malloc
	          24.77%
	              malloc

	  11.04%             as  libc-2.15.so   [.] malloc
	          11.02%
	              malloc

	   7.15%             ld  libc-2.15.so   [.] malloc
	           6.57%
	              malloc

	 ...

  The core uprobes design is fairly straightforward: uprobes probe
  points register themselves at (inode:offset) addresses of
  libraries/binaries, after which all existing (or new) vmas that map
  that address will have a software breakpoint injected at that address.
  vmas are COW-ed to preserve original content.  The probe points are
  kept in an rbtree.

  If user-space executes the probed inode:offset instruction address
  then an event is generated which can be recovered from the regular
  perf event channels and mmap-ed ring-buffer.

  Multiple probes at the same address are supported, they create a
  dynamic callback list of event consumers.

  The basic model is further complicated by the XOL speedup: the
  original instruction that is probed is copied (in an architecture
  specific fashion) and executed out of line when the probe triggers.
  The XOL area is a single vma per process, with a fixed number of
  entries (which limits probe execution parallelism).

  The API: uprobes are installed/removed via
  /sys/kernel/debug/tracing/uprobe_events, the API is integrated to
  align with the kprobes interface as much as possible, but is separate
  to it.

  Injecting a probe point is a privileged operation, which can be relaxed
  by setting perf_paranoid to -1.

  You can use multiple probes as well and mix them with kprobes and
  regular PMU events or tracepoints, when instrumenting a task."
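
The fixed-size XOL area described in the pull message can be pictured with a small user-space sketch. This is not the kernel implementation: the 4096-byte page size, the `xol_sketch` structure, and both helper names are assumptions made for illustration. Only the 128-byte slot size mirrors UPROBE_XOL_SLOT_BYTES from this merge.

```c
#include <stdint.h>

#define XOL_SLOT_BYTES 128                      /* mirrors UPROBE_XOL_SLOT_BYTES */
#define XOL_PAGE_BYTES 4096                     /* assumed page size */
#define XOL_NR_SLOTS   (XOL_PAGE_BYTES / XOL_SLOT_BYTES)

/* Hypothetical stand-in for the per-process XOL area. */
struct xol_sketch {
	uint32_t bitmap;        /* bit i set = slot i is in use */
	unsigned long vaddr;    /* start address of the slot page */
};

/* Claim the first free slot; returns its address, or 0 if all are busy. */
static unsigned long xol_take_slot(struct xol_sketch *area)
{
	for (int i = 0; i < XOL_NR_SLOTS; i++) {
		uint32_t bit = UINT32_C(1) << i;

		if (!(area->bitmap & bit)) {
			area->bitmap |= bit;
			return area->vaddr + (unsigned long)i * XOL_SLOT_BYTES;
		}
	}
	return 0;
}

/* Release a slot previously returned by xol_take_slot(). */
static void xol_free_slot(struct xol_sketch *area, unsigned long slot_addr)
{
	unsigned long i = (slot_addr - area->vaddr) / XOL_SLOT_BYTES;

	if (i < XOL_NR_SLOTS)
		area->bitmap &= ~(UINT32_C(1) << i);
}
```

Under these assumptions one page holds 32 slots, so at most 32 threads of a process can single-step a probed instruction at the same time; a further thread would have to wait for a slot to be freed.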

Fix up trivial conflicts in mm/memory.c due to previous cleanup of
unmap_single_vma().

* 'perf-uprobes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf probe: Detect probe target when m/x options are absent
  perf probe: Provide perf interface for uprobes
  tracing: Fix kconfig warning due to a typo
  tracing: Provide trace events interface for uprobes
  tracing: Extract out common code for kprobes/uprobes trace events
  tracing: Modify is_delete, is_return from int to bool
  uprobes/core: Decrement uprobe count before the pages are unmapped
  uprobes/core: Make background page replacement logic account for rss_stat counters
  uprobes/core: Optimize probe hits with the help of a counter
  uprobes/core: Allocate XOL slots for uprobes use
  uprobes/core: Handle breakpoint and singlestep exceptions
  uprobes/core: Rename bkpt to swbp
  uprobes/core: Make order of function parameters consistent across functions
  uprobes/core: Make macro names consistent
  uprobes: Update copyright notices
  uprobes/core: Move insn to arch specific structure
  uprobes/core: Remove uprobe_opcode_sz
  uprobes/core: Make instruction tables volatile
  uprobes: Move to kernel/events/
  uprobes/core: Clean up, refactor and improve the code
  ...
parents 2c01e7bc 9cba26e6
Uprobe-tracer: Uprobe-based Event Tracing
=========================================
Documentation written by Srikar Dronamraju
Overview
--------
Uprobe based trace events are similar to kprobe based trace events.
To enable this feature, build your kernel with CONFIG_UPROBE_EVENT=y.
Similar to the kprobe-event tracer, this doesn't need to be activated via
current_tracer. Instead, add probe points via
/sys/kernel/debug/tracing/uprobe_events, and enable them via
/sys/kernel/debug/tracing/events/uprobes/<EVENT>/enable.
However, unlike the kprobe-event tracer, the uprobe event interface expects
the user to calculate the offset of the probepoint in the object.
Synopsis of uprobe_tracer
-------------------------
  p[:[GRP/]EVENT] PATH:SYMBOL[+offs] [FETCHARGS]  : Set a probe

 GRP            : Group name. If omitted, use "uprobes" for it.
 EVENT          : Event name. If omitted, the event name is generated
                  based on SYMBOL+offs.
 PATH           : Path to an executable or a library.
 SYMBOL[+offs]  : Symbol+offset where the probe is inserted.

 FETCHARGS      : Arguments. Each probe can have up to 128 args.
  %REG          : Fetch register REG
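
As a rough illustration of the synopsis, the following user-space helper assembles a definition string of that shape. The function name `format_uprobe_def` and its parameters are hypothetical; it only formats text and does not touch uprobe_events.

```c
#include <stdio.h>

/*
 * Build "p[:[GRP/]EVENT] PATH:0xOFFSET [FETCHARGS]" into buf.
 * group, event and fetchargs may be NULL when that part is omitted.
 * Returns the snprintf() result for the final string.
 */
static int format_uprobe_def(char *buf, size_t len, const char *group,
			     const char *event, const char *path,
			     unsigned long offset, const char *fetchargs)
{
	char name[128] = "";

	if (event && group)
		snprintf(name, sizeof(name), ":%s/%s", group, event);
	else if (event)
		snprintf(name, sizeof(name), ":%s", event);

	return snprintf(buf, len, "p%s %s:0x%lx%s%s", name, path, offset,
			fetchargs ? " " : "", fetchargs ? fetchargs : "");
}
```

Called with no group or event, the path /bin/zsh, offset 0x46420 and the fetch arguments "%ip %ax", it yields the string `p /bin/zsh:0x46420 %ip %ax`.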
Event Profiling
---------------
You can check the total number of probe hits and probe miss-hits via
/sys/kernel/debug/tracing/uprobe_profile.
The first column is event name, the second is the number of probe hits,
the third is the number of probe miss-hits.
Usage examples
--------------
To add a probe as a new event, write a new definition to uprobe_events
as below.
echo 'p /bin/bash:0x4245c0' > /sys/kernel/debug/tracing/uprobe_events
This sets a uprobe at an offset of 0x4245c0 in the executable /bin/bash.
echo > /sys/kernel/debug/tracing/uprobe_events
This clears all probe points.
The following example shows how to dump the instruction pointer and the %ax
register at the probed text address. Here we are trying to probe
function zfree in /bin/zsh.
# cd /sys/kernel/debug/tracing/
# cat /proc/`pgrep zsh`/maps | grep /bin/zsh | grep r-xp
00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh
# objdump -T /bin/zsh | grep -w zfree
0000000000446420 g DF .text 0000000000000012 Base zfree
0x46420 is the offset of zfree in object /bin/zsh, which is loaded at
0x00400000. Hence the command to set the probe is:
# echo 'p /bin/zsh:0x46420 %ip %ax' > uprobe_events
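
The offset arithmetic above can be sketched in C. The helper names are hypothetical, and the sketch assumes a non-relocated executable in which the address printed by objdump is the run-time address of the symbol.

```c
#include <stdio.h>

/*
 * Parse the start address and file offset from one /proc/<pid>/maps line,
 * e.g. "00400000-0048a000 r-xp 00000000 08:03 130904 /bin/zsh".
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_maps_line(const char *line, unsigned long *start,
			   unsigned long *file_off)
{
	unsigned long end;
	char perms[5];

	if (sscanf(line, "%lx-%lx %4s %lx", start, &end, perms, file_off) != 4)
		return -1;
	return 0;
}

/* Offset of a symbol within the object file backing the mapping. */
static unsigned long probe_offset(unsigned long sym_addr,
				  unsigned long map_start,
				  unsigned long map_file_off)
{
	return sym_addr - map_start + map_file_off;
}
```

With the values from the example, 0x446420 - 0x400000 + 0 gives 0x46420, the offset written to uprobe_events.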
Please note: the user has to explicitly calculate the offset of the probepoint
in the object. We can see the events that are registered by looking at the
uprobe_events file.
# cat uprobe_events
p:uprobes/p_zsh_0x46420 /bin/zsh:0x00046420 arg1=%ip arg2=%ax
The format of events can be seen by viewing the file events/uprobes/p_zsh_0x46420/format
# cat events/uprobes/p_zsh_0x46420/format
name: p_zsh_0x46420
ID: 922
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int common_padding; offset:8; size:4; signed:1;
field:unsigned long __probe_ip; offset:12; size:4; signed:0;
field:u32 arg1; offset:16; size:4; signed:0;
field:u32 arg2; offset:20; size:4; signed:0;
print fmt: "(%lx) arg1=%lx arg2=%lx", REC->__probe_ip, REC->arg1, REC->arg2
Right after definition, each event is disabled by default. To trace these
events, you need to enable them:
# echo 1 > events/uprobes/enable
Let's disable the event after sleeping for some time.
# sleep 20
# echo 0 > events/uprobes/enable
And you can see the traced information via /sys/kernel/debug/tracing/trace.
# cat trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
zsh-24842 [006] 258544.995456: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [007] 258545.000270: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [002] 258545.043929: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
zsh-24842 [004] 258547.046129: p_zsh_0x46420: (0x446420) arg1=446421 arg2=79
Each line shows that the probe was triggered for pid 24842, with ip being
0x446421 and the contents of the ax register being 79.
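
A trace line of this shape can be picked apart with sscanf(). The sketch below is only an assumption about the layout shown in the sample output above; `struct uprobe_hit` and `parse_trace_line` are hypothetical names. Values are read as hexadecimal because the print fmt uses %lx.

```c
#include <stdio.h>

/* Fields extracted from one line of the sample trace output. */
struct uprobe_hit {
	unsigned long probe_ip;  /* address in parentheses */
	unsigned long arg1;      /* %ip as seen by the fetch argument */
	unsigned long arg2;      /* %ax */
};

/* Returns 0 if all three values were parsed, -1 otherwise. */
static int parse_trace_line(const char *line, struct uprobe_hit *hit)
{
	/* task-pid, [cpu], timestamp and event name are skipped */
	int n = sscanf(line,
		       "%*s [%*d] %*f: %*[^:]: (0x%lx) arg1=%lx arg2=%lx",
		       &hit->probe_ip, &hit->arg1, &hit->arg2);

	return n == 3 ? 0 : -1;
}
```

For the sample lines, arg1 is exactly one byte past the probe address, which matches the single-byte breakpoint instruction that uprobes inserts on x86.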
@@ -76,6 +76,23 @@ config OPTPROBES
 	depends on KPROBES && HAVE_OPTPROBES
 	depends on !PREEMPT

+config UPROBES
+	bool "Transparent user-space probes (EXPERIMENTAL)"
+	depends on UPROBE_EVENT && PERF_EVENTS
+	default n
+	help
+	  Uprobes is the user-space counterpart to kprobes: they
+	  enable instrumentation applications (such as 'perf probe')
+	  to establish unintrusive probes in user-space binaries and
+	  libraries, by executing handler functions when the probes
+	  are hit by user-space applications.
+
+	  ( These probes come in the form of single-byte breakpoints,
+	    managed by the kernel and kept transparent to the probed
+	    application. )
+
+	  If in doubt, say "N".
+
 config HAVE_EFFICIENT_UNALIGNED_ACCESS
 	bool
 	help
......
@@ -87,7 +87,7 @@ config X86
 	select BUILDTIME_EXTABLE_SORT

 config INSTRUCTION_DECODER
-	def_bool (KPROBES || PERF_EVENTS)
+	def_bool (KPROBES || PERF_EVENTS || UPROBES)

 config OUTPUT_FORMAT
 	string
@@ -243,6 +243,9 @@ config ARCH_CPU_PROBE_RELEASE
 	def_bool y
 	depends on HOTPLUG_CPU

+config ARCH_SUPPORTS_UPROBES
+	def_bool y
+
 source "init/Kconfig"

 source "kernel/Kconfig.freezer"
......
@@ -85,6 +85,7 @@ struct thread_info {
 #define TIF_SECCOMP		8	/* secure computing */
 #define TIF_MCE_NOTIFY		10	/* notify userspace of an MCE */
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
+#define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
 #define TIF_FORK		18	/* ret_from_fork */
@@ -109,6 +110,7 @@ struct thread_info {
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
 #define _TIF_MCE_NOTIFY		(1 << TIF_MCE_NOTIFY)
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
+#define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
 #define _TIF_FORK		(1 << TIF_FORK)
......
#ifndef _ASM_UPROBES_H
#define _ASM_UPROBES_H
/*
* User-space Probes (UProbes) for x86
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*
* Copyright (C) IBM Corporation, 2008-2011
* Authors:
* Srikar Dronamraju
* Jim Keniston
*/
#include <linux/notifier.h>
typedef u8 uprobe_opcode_t;
#define MAX_UINSN_BYTES 16
#define UPROBE_XOL_SLOT_BYTES 128 /* to keep it cache aligned */
#define UPROBE_SWBP_INSN 0xcc
#define UPROBE_SWBP_INSN_SIZE 1
struct arch_uprobe {
u16 fixups;
u8 insn[MAX_UINSN_BYTES];
#ifdef CONFIG_X86_64
unsigned long rip_rela_target_address;
#endif
};
struct arch_uprobe_task {
unsigned long saved_trap_nr;
#ifdef CONFIG_X86_64
unsigned long saved_scratch_register;
#endif
};
extern int arch_uprobe_analyze_insn(struct arch_uprobe *aup, struct mm_struct *mm);
extern int arch_uprobe_pre_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern int arch_uprobe_post_xol(struct arch_uprobe *aup, struct pt_regs *regs);
extern bool arch_uprobe_xol_was_trapped(struct task_struct *tsk);
extern int arch_uprobe_exception_notify(struct notifier_block *self, unsigned long val, void *data);
extern void arch_uprobe_abort_xol(struct arch_uprobe *aup, struct pt_regs *regs);
#endif /* _ASM_UPROBES_H */
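
The constants in this header suggest how a probe point is armed on x86: the first byte of the probed instruction is replaced by the one-byte breakpoint opcode, and after the trap the instruction pointer sits one byte past the probe address. The following user-space sketch works on an ordinary byte buffer rather than on process memory; the `sketch_*` helpers are hypothetical and are not the kernel's set_swbp() or uprobe_get_swbp_addr().

```c
#include <stdint.h>
#include <string.h>

#define MAX_UINSN_BYTES       16
#define UPROBE_SWBP_INSN      0xcc
#define UPROBE_SWBP_INSN_SIZE 1

/* Keep a copy of the original bytes, then plant the breakpoint opcode. */
static void sketch_set_swbp(uint8_t *text, uint8_t saved[MAX_UINSN_BYTES])
{
	memcpy(saved, text, MAX_UINSN_BYTES);
	text[0] = UPROBE_SWBP_INSN;
}

/* Put the original first byte back when the probe is removed. */
static void sketch_restore_insn(uint8_t *text,
				const uint8_t saved[MAX_UINSN_BYTES])
{
	text[0] = saved[0];
}

/* Address of the breakpoint, given the instruction pointer after the trap. */
static unsigned long sketch_swbp_addr(unsigned long ip_after_trap)
{
	return ip_after_trap - UPROBE_SWBP_INSN_SIZE;
}
```

This matches the documentation example in this commit, where a probe at 0x446420 reports %ip as 0x446421.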
@@ -100,6 +100,7 @@ obj-$(CONFIG_X86_CHECK_BIOS_CORRUPTION) += check.o

 obj-$(CONFIG_SWIOTLB)			+= pci-swiotlb.o
 obj-$(CONFIG_OF)			+= devicetree.o
+obj-$(CONFIG_UPROBES)			+= uprobes.o

 ###
 # 64 bit specific files
......
@@ -18,6 +18,7 @@
 #include <linux/personality.h>
 #include <linux/uaccess.h>
 #include <linux/user-return-notifier.h>
+#include <linux/uprobes.h>

 #include <asm/processor.h>
 #include <asm/ucontext.h>
@@ -814,6 +815,11 @@ do_notify_resume(struct pt_regs *regs, void *unused, __u32 thread_info_flags)
 		mce_notify_process();
 #endif /* CONFIG_X86_64 && CONFIG_X86_MCE */

+	if (thread_info_flags & _TIF_UPROBE) {
+		clear_thread_flag(TIF_UPROBE);
+		uprobe_notify_resume(regs);
+	}
+
 	/* deal with pending signal delivery */
 	if (thread_info_flags & _TIF_SIGPENDING)
 		do_signal(regs);
......
This diff is collapsed.
@@ -12,6 +12,7 @@
 #include <linux/completion.h>
 #include <linux/cpumask.h>
 #include <linux/page-debug-flags.h>
+#include <linux/uprobes.h>

 #include <asm/page.h>
 #include <asm/mmu.h>
@@ -388,6 +389,7 @@ struct mm_struct {
 #ifdef CONFIG_CPUMASK_OFFSTACK
 	struct cpumask cpumask_allocation;
 #endif
+	struct uprobes_state uprobes_state;
 };

 static inline void mm_init_cpumask(struct mm_struct *mm)
......
@@ -1572,6 +1572,10 @@ struct task_struct {
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 	atomic_t ptrace_bp_refcnt;
 #endif
+#ifdef CONFIG_UPROBES
+	struct uprobe_task *utask;
+	int uprobe_srcu_id;
+#endif
 };

 /* Future-safe accessor for struct task_struct's cpus_allowed. */
......
#ifndef _LINUX_UPROBES_H
#define _LINUX_UPROBES_H
/*
* User-space Probes (UProbes)
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*
* Copyright (C) IBM Corporation, 2008-2012
* Authors:
* Srikar Dronamraju
* Jim Keniston
* Copyright (C) 2011-2012 Red Hat, Inc., Peter Zijlstra <pzijlstr@redhat.com>
*/
#include <linux/errno.h>
#include <linux/rbtree.h>
struct vm_area_struct;
struct mm_struct;
struct inode;
#ifdef CONFIG_ARCH_SUPPORTS_UPROBES
# include <asm/uprobes.h>
#endif
/* flags that denote/change uprobes behaviour */
/* Have a copy of original instruction */
#define UPROBE_COPY_INSN 0x1
/* Dont run handlers when first register/ last unregister in progress*/
#define UPROBE_RUN_HANDLER 0x2
/* Can skip singlestep */
#define UPROBE_SKIP_SSTEP 0x4
struct uprobe_consumer {
int (*handler)(struct uprobe_consumer *self, struct pt_regs *regs);
/*
* filter is optional; If a filter exists, handler is run
* if and only if filter returns true.
*/
bool (*filter)(struct uprobe_consumer *self, struct task_struct *task);
struct uprobe_consumer *next;
};
#ifdef CONFIG_UPROBES
enum uprobe_task_state {
UTASK_RUNNING,
UTASK_BP_HIT,
UTASK_SSTEP,
UTASK_SSTEP_ACK,
UTASK_SSTEP_TRAPPED,
};
/*
* uprobe_task: Metadata of a task while it singlesteps.
*/
struct uprobe_task {
enum uprobe_task_state state;
struct arch_uprobe_task autask;
struct uprobe *active_uprobe;
unsigned long xol_vaddr;
unsigned long vaddr;
};
/*
* On a breakpoint hit, thread contests for a slot. It frees the
* slot after singlestep. Currently a fixed number of slots are
* allocated.
*/
struct xol_area {
wait_queue_head_t wq; /* if all slots are busy */
atomic_t slot_count; /* number of in-use slots */
unsigned long *bitmap; /* 0 = free slot */
struct page *page;
/*
* We keep the vma's vm_start rather than a pointer to the vma
* itself. The probed process or a naughty kernel module could make
* the vma go away, and we must handle that reasonably gracefully.
*/
unsigned long vaddr; /* Page(s) of instruction slots */
};
struct uprobes_state {
struct xol_area *xol_area;
atomic_t count;
};
extern int __weak set_swbp(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr);
extern int __weak set_orig_insn(struct arch_uprobe *aup, struct mm_struct *mm, unsigned long vaddr, bool verify);
extern bool __weak is_swbp_insn(uprobe_opcode_t *insn);
extern int uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
extern void uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc);
extern int uprobe_mmap(struct vm_area_struct *vma);
extern void uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end);
extern void uprobe_free_utask(struct task_struct *t);
extern void uprobe_copy_process(struct task_struct *t);
extern unsigned long __weak uprobe_get_swbp_addr(struct pt_regs *regs);
extern int uprobe_post_sstep_notifier(struct pt_regs *regs);
extern int uprobe_pre_sstep_notifier(struct pt_regs *regs);
extern void uprobe_notify_resume(struct pt_regs *regs);
extern bool uprobe_deny_signal(void);
extern bool __weak arch_uprobe_skip_sstep(struct arch_uprobe *aup, struct pt_regs *regs);
extern void uprobe_clear_state(struct mm_struct *mm);
extern void uprobe_reset_state(struct mm_struct *mm);
#else /* !CONFIG_UPROBES */
struct uprobes_state {
};
static inline int
uprobe_register(struct inode *inode, loff_t offset, struct uprobe_consumer *uc)
{
return -ENOSYS;
}
static inline void
uprobe_unregister(struct inode *inode, loff_t offset, struct uprobe_consumer *uc)
{
}
static inline int uprobe_mmap(struct vm_area_struct *vma)
{
return 0;
}
static inline void
uprobe_munmap(struct vm_area_struct *vma, unsigned long start, unsigned long end)
{
}
static inline void uprobe_notify_resume(struct pt_regs *regs)
{
}
static inline bool uprobe_deny_signal(void)
{
return false;
}
static inline unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
{
return 0;
}
static inline void uprobe_free_utask(struct task_struct *t)
{
}
static inline void uprobe_copy_process(struct task_struct *t)
{
}
static inline void uprobe_clear_state(struct mm_struct *mm)
{
}
static inline void uprobe_reset_state(struct mm_struct *mm)
{
}
#endif /* !CONFIG_UPROBES */
#endif /* _LINUX_UPROBES_H */
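
The `uprobe_consumer` structure above chains consumers through `next`, and its comment says a handler runs only if the optional filter returns true. A minimal user-space sketch of walking such a chain is shown below; every `sketch_*` name, the `hits` and `only_pid` fields, and the simplified handler signature are assumptions for illustration and do not reproduce the kernel's handler loop.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_task {
	int pid;
};

/* Simplified stand-in for struct uprobe_consumer. */
struct sketch_consumer {
	int (*handler)(struct sketch_consumer *self, unsigned long ip);
	bool (*filter)(struct sketch_consumer *self, struct sketch_task *task);
	struct sketch_consumer *next;
	int hits;       /* illustration only: number of handler calls */
	int only_pid;   /* illustration only: pid accepted by the filter */
};

/* Example handler: count the hit. */
static int sketch_count_hit(struct sketch_consumer *self, unsigned long ip)
{
	(void)ip;
	self->hits++;
	return 0;
}

/* Example filter: accept a single pid. */
static bool sketch_match_pid(struct sketch_consumer *self,
			     struct sketch_task *task)
{
	return task->pid == self->only_pid;
}

/* Run every consumer whose filter is absent or returns true. */
static int sketch_run_consumers(struct sketch_consumer *head,
				struct sketch_task *task, unsigned long ip)
{
	int ran = 0;

	for (struct sketch_consumer *uc = head; uc; uc = uc->next) {
		if (uc->filter && !uc->filter(uc, task))
			continue;
		uc->handler(uc, ip);
		ran++;
	}
	return ran;
}
```

Keeping the filter optional lets a consumer that wants every hit leave the pointer NULL, while a per-task consumer can reject hits cheaply before its handler runs.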
@@ -3,4 +3,7 @@ CFLAGS_REMOVE_core.o = -pg
 endif

 obj-y := core.o ring_buffer.o callchain.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
+
+obj-$(CONFIG_UPROBES) += uprobes.o
+
This diff is collapsed.
@@ -69,6 +69,7 @@
 #include <linux/oom.h>
 #include <linux/khugepaged.h>
 #include <linux/signalfd.h>
+#include <linux/uprobes.h>

 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -451,6 +452,9 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)

 		if (retval)
 			goto out;
+
+		if (file && uprobe_mmap(tmp))
+			goto out;
 	}
 	/* a new mm has just been created */
 	arch_dup_mmap(oldmm, mm);
@@ -599,6 +603,7 @@ void mmput(struct mm_struct *mm)
 	might_sleep();

 	if (atomic_dec_and_test(&mm->mm_users)) {
+		uprobe_clear_state(mm);
 		exit_aio(mm);
 		ksm_exit(mm);
 		khugepaged_exit(mm); /* must run before exit_mmap */
@@ -777,6 +782,8 @@ void mm_release(struct task_struct *tsk, struct mm_struct *mm)
 		exit_pi_state_list(tsk);
 #endif

+	uprobe_free_utask(tsk);
+
 	/* Get rid of any cached register state */
 	deactivate_mm(tsk, mm);
@@ -831,6 +838,7 @@ struct mm_struct *dup_mm(struct task_struct *tsk)
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	mm->pmd_huge_pte = NULL;
 #endif
+	uprobe_reset_state(mm);

 	if (!mm_init(mm, tsk))
 		goto fail_nomem;
@@ -1373,6 +1381,7 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	INIT_LIST_HEAD(&p->pi_state_list);
 	p->pi_state_cache = NULL;
 #endif
+	uprobe_copy_process(p);
 	/*
 	 * sigaltstack should be cleared when sharing the same VM
 	 */
......
@@ -29,6 +29,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/nsproxy.h>
 #include <linux/user_namespace.h>
+#include <linux/uprobes.h>

 #define CREATE_TRACE_POINTS
 #include <trace/events/signal.h>
@@ -2191,6 +2192,9 @@ int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
 	struct signal_struct *signal = current->signal;
 	int signr;

+	if (unlikely(uprobe_deny_signal()))
+		return 0;
+
 relock:
 	/*
 	 * We'll jump back here after any time we were stopped in TASK_STOPPED.
......
@@ -372,6 +372,7 @@ config KPROBE_EVENT
 	depends on HAVE_REGS_AND_STACK_ACCESS_API
 	bool "Enable kprobes-based dynamic events"
 	select TRACING
+	select PROBE_EVENTS
 	default y
 	help
 	  This allows the user to add tracing events (similar to tracepoints)
@@ -384,6 +385,25 @@ config KPROBE_EVENT
 	  This option is also required by perf-probe subcommand of perf tools.
 	  If you want to use perf tools, this option is strongly recommended.

+config UPROBE_EVENT
+	bool "Enable uprobes-based dynamic events"
+	depends on ARCH_SUPPORTS_UPROBES
+	depends on MMU
+	select UPROBES
+	select PROBE_EVENTS
+	select TRACING
+	default n
+	help
+	  This allows the user to add tracing events on top of userspace
+	  dynamic events (similar to tracepoints) on the fly via the trace
+	  events interface. Those events can be inserted wherever uprobes
+	  can probe, and record various registers.
+	  This option is required if you plan to use perf-probe subcommand
+	  of perf tools on user space applications.
+
+config PROBE_EVENTS
+	def_bool n
+
 config DYNAMIC_FTRACE
 	bool "enable/disable ftrace tracepoints dynamically"
 	depends on FUNCTION_TRACER
......
@@ -60,5 +60,7 @@ endif
 ifeq ($(CONFIG_TRACING),y)
 obj-$(CONFIG_KGDB_KDB) += trace_kdb.o
 endif
+obj-$(CONFIG_PROBE_EVENTS) += trace_probe.o
+obj-$(CONFIG_UPROBE_EVENT) += trace_uprobe.o

 libftrace-y := ftrace.o
@@ -103,6 +103,11 @@ struct kretprobe_trace_entry_head {
 	unsigned long ret_ip;
 };

+struct uprobe_trace_entry_head {
+	struct trace_entry ent;
+	unsigned long ip;
+};
+
 /*
  * trace_flag_type is an enumeration that holds different
  * states when a trace occurs. These are:
......
This diff is collapsed.
This diff is collapsed.
/*
* Common header file for probe-based Dynamic events.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
* This code was copied from kernel/trace/trace_kprobe.h written by
* Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
*
* Updates to make this generic:
* Copyright (C) IBM Corporation, 2010-2011
* Author: Srikar Dronamraju
*/
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/debugfs.h>
#include <linux/types.h>
#include <linux/string.h>
#include <linux/ctype.h>
#include <linux/ptrace.h>
#include <linux/perf_event.h>
#include <linux/kprobes.h>
#include <linux/stringify.h>
#include <linux/limits.h>
#include <linux/uaccess.h>
#include <asm/bitsperlong.h>
#include "trace.h"
#include "trace_output.h"
#define MAX_TRACE_ARGS 128
#define MAX_ARGSTR_LEN 63
#define MAX_EVENT_NAME_LEN 64
#define MAX_STRING_SIZE PATH_MAX
/* Reserved field names */
#define FIELD_STRING_IP "__probe_ip"
#define FIELD_STRING_RETIP "__probe_ret_ip"
#define FIELD_STRING_FUNC "__probe_func"
#undef DEFINE_FIELD
#define DEFINE_FIELD(type, item, name, is_signed) \
do { \
ret = trace_define_field(event_call, #type, name, \
offsetof(typeof(field), item), \
sizeof(field.item), is_signed, \
FILTER_OTHER); \
if (ret) \
return ret; \
} while (0)
/* Flags for trace_probe */
#define TP_FLAG_TRACE 1
#define TP_FLAG_PROFILE 2
#define TP_FLAG_REGISTERED 4
#define TP_FLAG_UPROBE 8
/* data_rloc: data relative location, compatible with u32 */
#define make_data_rloc(len, roffs) \
(((u32)(len) << 16) | ((u32)(roffs) & 0xffff))
#define get_rloc_len(dl) ((u32)(dl) >> 16)
#define get_rloc_offs(dl) ((u32)(dl) & 0xffff)
/*
* Convert data_rloc to data_loc:
* data_rloc stores the offset from data_rloc itself, but data_loc
* stores the offset from event entry.
*/
#define convert_rloc_to_loc(dl, offs) ((u32)(dl) + (offs))
/* Data fetch function type */
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);
/* Printing function type */
typedef int (*print_type_func_t)(struct trace_seq *, const char *, void *, void *);
/* Fetch types */
enum {
FETCH_MTD_reg = 0,
FETCH_MTD_stack,
FETCH_MTD_retval,
FETCH_MTD_memory,
FETCH_MTD_symbol,
FETCH_MTD_deref,
FETCH_MTD_bitfield,
FETCH_MTD_END,
};
/* Fetch type information table */
struct fetch_type {
const char *name; /* Name of type */
size_t size; /* Byte size of type */
int is_signed; /* Signed flag */
print_type_func_t print; /* Print functions */
const char *fmt; /* Fromat string */
const char *fmttype; /* Name in format file */
/* Fetch functions */
fetch_func_t fetch[FETCH_MTD_END];
};
struct fetch_param {
	fetch_func_t fn;
	void *data;
};

struct probe_arg {
	struct fetch_param fetch;
	struct fetch_param fetch_size;
	unsigned int offset; /* Offset from argument entry */
	const char *name; /* Name of this argument */
	const char *comm; /* Command of this argument */
	const struct fetch_type *type; /* Type of this argument */
};

static inline __kprobes void call_fetch(struct fetch_param *fprm,
					struct pt_regs *regs, void *dest)
{
	return fprm->fn(regs, fprm->data, dest);
}
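The `fetch_param` / `call_fetch()` pair is a plain function-pointer dispatch: each argument carries a fetch function plus an opaque `data` cookie. A minimal user-space sketch of that pattern follows. The `struct pt_regs` layout and the `fetch_reg_demo()` method are hypothetical stand-ins invented for illustration; the real kernel fetch methods differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct pt_regs. */
struct pt_regs {
	unsigned long regs[4];
};

/* Same shape as the header's fetch_func_t: (registers, method data, destination). */
typedef void (*fetch_func_t)(struct pt_regs *, void *, void *);

struct fetch_param {
	fetch_func_t fn;
	void *data;
};

/* Hypothetical fetch method: data encodes a register index, dest receives the value. */
static void fetch_reg_demo(struct pt_regs *regs, void *data, void *dest)
{
	uintptr_t idx = (uintptr_t)data;

	memcpy(dest, &regs->regs[idx], sizeof(regs->regs[idx]));
}

/* Dispatch through the stored function pointer, as call_fetch() does. */
static void call_fetch(struct fetch_param *fprm, struct pt_regs *regs, void *dest)
{
	fprm->fn(regs, fprm->data, dest);
}
```

A caller would fill in a `struct fetch_param` once when the probe argument is parsed and then invoke `call_fetch()` on every probe hit, so the per-hit path does no parsing.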
/* Check the name is good for event/group/fields */
static inline int is_good_name(const char *name)
{
	if (!isalpha(*name) && *name != '_')
		return 0;
	while (*++name != '\0') {
		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
			return 0;
	}
	return 1;
}
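The rule enforced by `is_good_name()` is the C identifier rule: a letter or underscore first, then letters, digits, or underscores. A small user-space copy makes the accepted and rejected forms easy to check. This is a sketch, not the kernel function itself: it uses the `<ctype.h>` versions of `isalpha()`/`isdigit()` and adds `unsigned char` casts, which user-space C needs for defined behavior on negative `char` values.

```c
#include <assert.h>
#include <ctype.h>

/* User-space copy of the check: returns 1 for a valid name, 0 otherwise. */
static int is_good_name(const char *name)
{
	/* First character: letter or underscore only (an empty string fails here). */
	if (!isalpha((unsigned char)*name) && *name != '_')
		return 0;
	/* Remaining characters: letters, digits, or underscores. */
	while (*++name != '\0') {
		if (!isalpha((unsigned char)*name) &&
		    !isdigit((unsigned char)*name) && *name != '_')
			return 0;
	}
	return 1;
}
```

Under this rule a name such as `probe_libc` is accepted, while names starting with a digit or containing `-` are rejected.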
extern int traceprobe_parse_probe_arg(char *arg, ssize_t *size,
		struct probe_arg *parg, bool is_return, bool is_kprobe);

extern int traceprobe_conflict_field_name(const char *name,
		struct probe_arg *args, int narg);

extern void traceprobe_update_arg(struct probe_arg *arg);
extern void traceprobe_free_probe_arg(struct probe_arg *arg);

extern int traceprobe_split_symbol_offset(char *symbol, unsigned long *offset);

extern ssize_t traceprobe_probes_write(struct file *file,
		const char __user *buffer, size_t count, loff_t *ppos,
		int (*createfn)(int, char**));

extern int traceprobe_command(const char *buf, int (*createfn)(int, char**));
@@ -1307,6 +1307,9 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (end <= vma->vm_start)
 		return;
+	if (vma->vm_file)
+		uprobe_munmap(vma, start, end);
 	if (unlikely(is_pfn_mapping(vma)))
 		untrack_pfn_vma(vma, 0, 0);
@@ -77,7 +77,8 @@ OPTIONS
 -F::
 --funcs::
-	Show available functions in given module or kernel.
+	Show available functions in given module or kernel. With -x/--exec,
+	can also list functions in a user space executable / shared library.
 --filter=FILTER::
 	(Only for --vars and --funcs) Set filter. FILTER is a combination of glob
@@ -98,6 +99,15 @@ OPTIONS
 --max-probes::
 	Set the maximum number of probe points for an event. Default is 128.
+-x::
+--exec=PATH::
+	Specify path to the executable or shared library file for user
+	space tracing. Can also be used with --funcs option.
+In absence of -m/-x options, perf probe checks if the first argument after
+the options is an absolute path name. If it's an absolute path, perf probe
+uses it as a target module/target user space binary to probe.
 PROBE SYNTAX
 ------------
 Probe points are defined by following syntax.
@@ -182,6 +192,13 @@ Delete all probes on schedule().
  ./perf probe --del='schedule*'
+Add probes at zfree() function on /bin/zsh
+ ./perf probe -x /bin/zsh zfree or ./perf probe /bin/zsh zfree
+Add probes at malloc() function on libc
+ ./perf probe -x /lib/libc.so.6 malloc or ./perf probe /lib/libc.so.6 malloc
 SEE ALSO
 --------
@@ -2783,3 +2783,11 @@ int machine__load_vmlinux_path(struct machine *machine, enum map_type type,
 	return ret;
 }
+struct map *dso__new_map(const char *name)
+{
+	struct dso *dso = dso__new(name);
+	struct map *map = map__new2(0, dso, MAP__FUNCTION);
+	return map;
+}
@@ -242,6 +242,7 @@ void dso__set_long_name(struct dso *dso, char *name);
 void dso__set_build_id(struct dso *dso, void *build_id);
 void dso__read_running_kernel_build_id(struct dso *dso,
 				       struct machine *machine);
+struct map *dso__new_map(const char *name);
 struct symbol *dso__find_symbol(struct dso *dso, enum map_type type,
 				u64 addr);
 struct symbol *dso__find_symbol_by_name(struct dso *dso, enum map_type type,