Commits · fb8c5a56c7ddbc2b0d2ee7a8da60fe1355f75141 · nexedi / linux

21 Oct, 2010 7 commits

perf probe: Show accessible global variables · fb8c5a56

Masami Hiramatsu authored Oct 21, 2010

Add --externs for allowing --vars to show accessible global (externally
defined) variables from a given probe point too.

This will give you a hint which globals can be accessible from the probe point.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101021101335.3542.31003.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

fb8c5a56

perf probe: Function style fix · c82ec0a2

Masami Hiramatsu authored Oct 21, 2010

Just change the order of function arguments for ease of read; moving optional
bool flag to the last.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101021101329.3542.51200.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

c82ec0a2

perf probe: Show accessible local variables · cf6eb489

Masami Hiramatsu authored Oct 21, 2010

Add -V (--vars) option for listing accessible local variables at given probe
point. This will help finding which local variables are available for event
arguments.

e.g.)
 # perf probe -V call_timer_fn:23
 Available variables at call_timer_fn:23
         @<run_timer_softirq+345>
                 function_type*  fn
                 int     preempt_count
                 long unsigned int       data
                 struct list_head        work_list
                 struct list_head*       head
                 struct timer_list*      timer
                 struct tvec_base*       base

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101021101323.3542.40282.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

cf6eb489

perf probe: Support global variables · 632941c4

Masami Hiramatsu authored Oct 21, 2010

Allow users to set external defined global variables as event arguments (e.g.
jiffies).

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20101021101316.3542.1999.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

632941c4

perf probe: Fix local variable searching loop · 378eeaad

Masami Hiramatsu authored Oct 21, 2010

Fix to check the die's address and search into the die only if it has given
address.

This will avoid finding wrong variables in wrong basic block.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20101021101309.3542.46434.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

378eeaad

perf probe: Fix type searching · 4046b8bb

Masami Hiramatsu authored Oct 21, 2010

Fix to get the actual type die of variables by using dwarf_attr_integrate()
which gets attribute from die even if the type die is connected by
DW_AT_abstract_origin.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20101021101302.3542.38549.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

4046b8bb

tracing: Cleanup the convoluted softirq tracepoints · f4bc6bb2

Thomas Gleixner authored Oct 19, 2010

With the addition of trace_softirq_raise() the softirq tracepoint got
even more convoluted. Why the tracepoints take two pointers to assign
an integer is beyond my comprehension.

But adding an extra case which treats the first pointer as an unsigned
long when the second pointer is NULL including the back and forth
type casting is just horrible.

Convert the softirq tracepoints to take a single unsigned int argument
for the softirq vector number and fix the call sites.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <alpine.LFD.2.00.1010191428560.6815@localhost6.localdomain6>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: mathieu.desnoyers@efficios.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

f4bc6bb2

19 Oct, 2010 6 commits

Merge branch 'tip/perf/core' of... · 750ed158

Ingo Molnar authored Oct 19, 2010

Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core

750ed158

tracing: Fix compile issue for trace_sched_wakeup.c · 7e40798f

Steven Rostedt authored Oct 19, 2010

The function start_func_tracer() was incorrectly added in the
 #ifdef CONFIG_FUNCTION_TRACER condition, but is still used even
when function tracing is not enabled.

The calls to register_ftrace_function() and register_ftrace_graph()
become nops (and their arguments are even ignored), thus there is
no reason to hide start_func_tracer() when function tracing is
not enabled.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

7e40798f

[S390] hardirq: remove pointless header file includes · 3f7edb16

Heiko Carstens authored Oct 12, 2010

Remove a couple of pointless header file includes.
Fixes a compile bug caused by header file include dependencies with
"irq: Add tracepoint to softirq_raise" within linux-next.
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ cherry-picked from the s390 tree to fix "2bf2160d: irq: Add tracepoint to softirq_raise" ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3f7edb16

[IA64] Move local_softirq_pending() definition · 3c4ea5b4

Tony Luck authored Sep 20, 2010

Ugly #include dependencies. We need to have local_softirq_pending()
defined before it gets used in <linux/interrupt.h>. But <asm/hardirq.h>
provides the definition *after* this #include chain:
  <linux/irq.h>
    <asm/irq.h>
      <asm/hw_irq.h>
        <linux/interrupt.h>
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ cherry-picked from the ia64 tree to fix "2bf2160d: irq: Add tracepoint to softirq_raise" ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3c4ea5b4

perf, powerpc: Fix power_pmu_event_init to not use event->ctx · 57fa7214

Paul Mackerras authored Oct 19, 2010

Commit c3f00c70 ("perf: Separate find_get_context() from event
initialization") changed the generic perf_event code to call
perf_event_alloc, which calls the arch-specific event_init code,
before looking up the context for the new event.  Unfortunately,
power_pmu_event_init uses event->ctx->task to see whether the
new event is a per-task event or a system-wide event, and thus
crashes since event->ctx is NULL at the point where
power_pmu_event_init gets called.

(The reason it needs to know whether it is a per-task event is
because there are some hardware events on Power systems which
only count when the processor is not idle, and there are some
fixed-function counters which count such events.  For example,
the "run cycles" event counts cycles when the processor is not
idle.  If the user asks to count cycles, we can use "run cycles"
if this is a per-task event, since the processor is running when
the task is running, by definition.  We can't use "run cycles"
if the user asks for "cycles" on a system-wide counter.)

Fortunately the information we need is in the
event->attach_state field, so we just use that instead.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101019055535.GA10398@drongo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Reported-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

57fa7214

Merge branch 'tip/perf/recordmcount-2' of... · 1fa41266

Ingo Molnar authored Oct 19, 2010

Merge branch 'tip/perf/recordmcount-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core

1fa41266

18 Oct, 2010 21 commits

ftrace: Remove recursion between recordmcount and scripts/mod/empty · d7b4d6de

Steven Rostedt authored Oct 18, 2010

When DYNAMIC_FTRACE is enabled and we use the C version of recordmcount,
all objects are run through the recordmcount program to create a
separate section that stores all the callers of mcount.

The build process has a special file: scripts/mod/empty.o. This is
built from empty.c which is literally an empty file (except for a
single comment). This file is used to find information about the target
elf format, like endianness and word size.

The problem comes up when we need to build recordmcount. The
build process requires that empty.o is built first. The build rules
for empty.o will try to execute recordmcount on the empty.o file.
We get an error that recordmcount does not exist.

To avoid this recursion, the build file will skip running recordmcount
if the file that it is building is script/mod/empty.o.

[ extra comment Suggested-by: Sam Ravnborg <sam@ravnborg.org> ]
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

d7b4d6de

jump_label: Add COND_STMT(), reducer wrappery · ebf31f50

Peter Zijlstra authored Oct 17, 2010

The use of the JUMP_LABEL() construct ends up creating endless silly
wrappers, create a higher level construct to reduce this clutter.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ebf31f50

perf: Optimize sw events · 7e54a5a0

Peter Zijlstra authored Oct 14, 2010

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

7e54a5a0

perf: Use jump_labels to optimize the scheduler hooks · 82cd6def

Peter Zijlstra authored Oct 14, 2010

Trades a call + conditional + ret for an unconditional jmp.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

82cd6def

jump_label: Add atomic_t interface · 8b92538d

Peter Zijlstra authored Oct 14, 2010

Add an interface to allow usage of jump_labels with atomic counters.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.501657727@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8b92538d

jump_label: Use more consistent naming · 3b6e901f

Peter Zijlstra authored Oct 14, 2010

Now that there's still only a few users around, rename things to make
them more consistent.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.448565169@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

3b6e901f

perf, hw_breakpoint: Fix crash in hw_breakpoint creation · d580ff86

Peter Zijlstra authored Oct 14, 2010

hw_breakpoint creation needs to account stuff per-task to ensure there
is always sufficient hardware resources to back these things due to
ptrace.

With the perf per pmu context changes the event initialization no
longer has access to the event context, for the simple reason that we
need to first find the pmu (result of initialization) before we can
find the context.

This makes hw_breakpoints unhappy, because it can no longer do per
task accounting, cure this by frobbing a task pointer in the event::hw
bits for now...
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101014203625.391543667@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d580ff86

perf: Find task before event alloc · c6be5a5c

Peter Zijlstra authored Oct 14, 2010

So that we can pass the task pointer to the event allocation, so that
we can use task associated data during event initialization.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101014203625.340789919@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c6be5a5c

perf: Fix task refcount bugs · e7d0bc04

Peter Zijlstra authored Oct 14, 2010

Currently it looks like find_lively_task_by_vpid() takes a task ref
and relies on find_get_context() to drop it.

The problem is that perf_event_create_kernel_counter() shouldn't be
dropping task refs.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
LKML-Reference: <20101014203625.278436085@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e7d0bc04

perf: Fix group moving · 74c3337c

Peter Zijlstra authored Oct 15, 2010

Matt found we trigger the WARN_ON_ONCE() in perf_group_attach() when we take
the move_group path in perf_event_open().

Since we cannot de-construct the group (we rely on it to move the events), we
have to simply ignore the double attach. The group state is context invariant
and doesn't need changing.
Reported-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1287135757.29097.1368.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

74c3337c

irq_work: Add generic hardirq context callbacks · e360adbe

Peter Zijlstra authored Oct 14, 2010

Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

e360adbe

perf_events: Fix transaction recovery in group_sched_in() · 8e5fc1a7

Stephane Eranian authored Oct 15, 2010

The group_sched_in() function uses a transactional approach to schedule
a group of events. In a group, either all events can be scheduled or
none are. To schedule each event in, the function calls event_sched_in().
In case of error, event_sched_out() is called on each event in the group.

The problem is that event_sched_out() does not completely cancel the
effects of event_sched_in(). Furthermore event_sched_out() changes the
state of the event as if it had run which is not true is this particular
case.

Those inconsistencies impact time tracking fields and may lead to events
in a group not all reporting the same time_enabled and time_running values.
This is demonstrated with the example below:

$ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
1946101 unhalted_core_cycles (32.85% scaling, ena=829181, run=556827)
  11423 baclears (32.85% scaling, ena=829181, run=556827)
   7671 baclears (0.00% scaling, ena=556827, run=556827)

2250443 unhalted_core_cycles (57.83% scaling, ena=962822, run=405995)
  11705 baclears (57.83% scaling, ena=962822, run=405995)
  11705 baclears (57.83% scaling, ena=962822, run=405995)

Notice that in the first group, the last baclears event does not
report the same timings as its siblings.

This issue comes from the fact that tstamp_stopped is updated
by event_sched_out() as if the event had actually run.

To solve the issue, we must ensure that, in case of error, there is
no change in the event state whatsoever. That means timings must
remain as they were when entering group_sched_in().

To do this we defer updating tstamp_running until we know the
transaction succeeded. Therefore, we have split event_sched_in()
in two parts separating the update to tstamp_running.

Similarly, in case of error, we do not want to update tstamp_stopped.
Therefore, we have split event_sched_out() in two parts separating
the update to tstamp_stopped.

With this patch, we now get the following output:

$ task -eunhalted_core_cycles,baclears,baclears -e unhalted_core_cycles,baclears,baclears sleep 5
2492050 unhalted_core_cycles (71.75% scaling, ena=1093330, run=308841)
  11243 baclears (71.75% scaling, ena=1093330, run=308841)
  11243 baclears (71.75% scaling, ena=1093330, run=308841)

1852746 unhalted_core_cycles (0.00% scaling, ena=784489, run=784489)
   9253 baclears (0.00% scaling, ena=784489, run=784489)
   9253 baclears (0.00% scaling, ena=784489, run=784489)

Note that the uneven timing between groups is a side effect of
the process spending most of its time sleeping, i.e., not enough
event rotations (but that's a separate issue).
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4cb86b4c.41e9d80a.44e9.3e19@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8e5fc1a7

perf_events: Fix bogus AMD64 generic TLB events · ba0cef3d

Stephane Eranian authored Oct 15, 2010

PERF_COUNT_HW_CACHE_DTLB:READ:MISS had a bogus umask value of 0 which
counts nothing. Needed to be 0x7 (to count all possibilities).

PERF_COUNT_HW_CACHE_ITLB:READ:MISS had a bogus umask value of 0 which
counts nothing. Needed to be 0x3 (to count all possibilities).
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: <stable@kernel.org> # as far back as it applies
LKML-Reference: <4cb85478.41e9d80a.44e2.3f00@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ba0cef3d

perf_events: Fix bogus context time tracking · c530ccd9

Stephane Eranian authored Oct 15, 2010

You can only call update_context_time() when the context
is active, i.e., the thread it is attached to is still running.

However, perf_event_read() can be called even when the context
is inactive, e.g., user read() the counters. The call to
update_context_time() must be conditioned on the status of
the context, otherwise, bogus time_enabled, time_running may
be returned. Here is an example on AMD64. The task program
is an example from libpfm4. The -p prints deltas every 1s.

$ task -p -e cpu_clk_unhalted sleep 5
    2,266,610 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
	    0 cpu_clk_unhalted (0.00% scaling, ena=2,158,982, run=2,158,982)
5,242,358,071 cpu_clk_unhalted (99.95% scaling, ena=5,000,359,984, run=2,319,270)

Whereas if you don't read deltas, e.g., no call to perf_event_read() until
the process terminates:

$ task -e cpu_clk_unhalted sleep 5
    2,497,783 cpu_clk_unhalted (0.00% scaling, ena=2,376,899, run=2,376,899)

Notice that time_enable, time_running are bogus in the first example
causing bogus scaling.

This patch fixes the problem, by conditionally calling update_context_time()
in perf_event_read().
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
LKML-Reference: <4cb856dc.51edd80a.5ae0.38fb@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c530ccd9

tracing: Remove parent recording in latency tracer graph options · 78c89ba1

Steven Rostedt authored Oct 05, 2010

Even though the parent is recorded with the normal function tracing
of the latency tracers (irqsoff and wakeup), the function graph
recording is bogus.

This is due to the function graph messing with the return stack.
The latency tracers pass in as the parent CALLER_ADDR0, which
works fine for plain function tracing. But this causes bogus output
with the graph tracer:

 3)    <idle>-0    |  d.s3.  0.000 us    |  return_to_handler();
 3)    <idle>-0    |  d.s3.  0.000 us    |  _raw_spin_unlock_irqrestore();
 3)    <idle>-0    |  d.s3.  0.000 us    |  return_to_handler();
 3)    <idle>-0    |  d.s3.  0.000 us    |  trace_hardirqs_on();

The "return_to_handle()" call is the trampoline of the
function graph tracer, and is meaningless in this context.

Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

78c89ba1

tracing: Use one prologue for the preempt irqs off tracer function tracers · 5e6d2b9c

Steven Rostedt authored Oct 05, 2010

The preempt and irqsoff tracers have three types of function tracers.
Normal function tracer, function graph entry, and function graph return.
Each of these use a complex dance to prevent recursion and whether
to trace the data or not (depending if interrupts are enabled or not).

This patch moves the duplicate code into a single routine, to
prevent future mistakes with modifying duplicate complex code.

Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

5e6d2b9c

tracing: Use one prologue for the wakeup tracer function tracers · 542181d3

Steven Rostedt authored Oct 05, 2010

The wakeup tracer has three types of function tracers. Normal
function tracer, function graph entry, and function graph return.
Each of these use a complex dance to prevent recursion and whether
to trace the data or not (depending on the wake_task variable).

This patch moves the duplicate code into a single routine, to
prevent future mistakes with modifying duplicate complex code.

Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

542181d3

tracing: Graph support for wakeup tracer · 7495a5be

Jiri Olsa authored Sep 23, 2010

Add function graph support for wakeup latency tracer.
The graph output is enabled by setting the 'display-graph'
trace option.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1285243253-7372-4-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

7495a5be

tracing: Make graph related irqs/preemptsoff functions global · 0a772620

Jiri Olsa authored Sep 23, 2010

Move trace_graph_function() and print_graph_headers_flags() functions
to the trace_function_graph.c to be globaly available.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1285243253-7372-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

0a772620

tracing: Add proper check for irq_depth routines · a9d61173

Jiri Olsa authored Sep 24, 2010

The check_irq_entry and check_irq_return could be called
from graph event context. In such case there's no graph
private data allocated. Adding checks to handle this case.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <20100924154102.GB1818@jolsa.brq.redhat.com>

[ Fixed some grammar in the comments ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

a9d61173

tracing/trivial: Remove cast from void* · 907f2784

matt mooney authored Sep 27, 2010

Unnecessary cast from void* in assignment.
Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

907f2784

16 Oct, 2010 2 commits
- Merge branch 'core' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/core · f92f6e6e
  Ingo Molnar authored Oct 16, 2010
  
  f92f6e6e
- Merge branch 'tip/perf/recordmcount' of... · 66af86e2
  Ingo Molnar authored Oct 16, 2010
```
Merge branch 'tip/perf/recordmcount' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
```
  66af86e2
15 Oct, 2010 4 commits

ftrace: Use objtree for C version of recordmcount · 85caa993

Steven Rostedt authored Oct 15, 2010

The C version of recordmcount is compiled to a binary, which will
end up located in the objtree. If the kernel is built with O=path,
the srctree will not include the binary recordmcount caller.

Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

85caa993

ftrace: Do not process kernel/trace/ftrace.o with C recordmcount program · 44475863

Steven Rostedt authored Oct 15, 2010

The file kernel/trace/ftrace.c references the mcount() call to
convert the mcount() callers to nops. But because it references
mcount(), the mcount() address is placed in the relocation table.

The C version of recordmcount reads the relocation table of all
object files, and it will add all references to mcount to the
__mcount_loc table that is used to find the places that call mcount()
and change the call to a nop. When recordmcount finds the mcount reference
in kernel/trace/ftrace.o, it saves that location even though the code
is not a call, but references mcount as data.

On boot up, when all calls are converted to nops, the code has a safety
check to determine what op code it is actually replacing before it
replaces it. If that op code at the address does not match, then
a warning is printed and the function tracer is disabled.

The reference to mcount in ftrace.c, causes this warning to trigger,
since the reference is not a call to mcount(). The ftrace.c file is
not compiled with the -pg flag, so no calls to mcount() should be
expected.

This patch simply makes recordmcount.c skip the kernel/trace/ftrace.c
file. This was the same solution used by the perl version of
recordmcount.
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: John Reiser <jreiser@bitwagon.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

44475863

oprofile: make !CONFIG_PM function stubs static inline · cd254f29

Robert Richter authored Oct 15, 2010

Make !CONFIG_PM function stubs static inline and remove section
attribute.
Signed-off-by: Robert Richter <robert.richter@amd.com>

cd254f29

oprofile: fix linker errors · b3b3a9b6

Anand Gadiyar authored Oct 14, 2010

Commit e9677b3c (oprofile, ARM: Use oprofile_arch_exit() to
cleanup on failure) caused oprofile_perf_exit to be called
in the cleanup path of oprofile_perf_init. The __exit tag
for oprofile_perf_exit should therefore be dropped.

The same has to be done for exit_driverfs as well, as this
function is called from oprofile_perf_exit. Else, we get
the following two linker errors.

  LD      .tmp_vmlinux1
`oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1

  LD      .tmp_vmlinux1
`exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>

b3b3a9b6