Commits · dba39448abb7340f86ae9b062f99d7acacb5d2d2 · nexedi / linux

19 Nov, 2014 17 commits

tracing: Remove return values of most trace_seq_*() functions · dba39448

Steven Rostedt (Red Hat) authored Nov 12, 2014

The trace_seq_printf() and friends are used to store strings into a buffer
that can be passed around from function to function. If the trace_seq buffer
fills up, it will not print any more. The return values were somewhat
inconsistant and using trace_seq_has_overflowed() was a better way to know
if the write to the trace_seq buffer succeeded or not.

Now that all users have removed reading the return value of the printf()
type functions, they can safely return void and keep future users of them
from reading the inconsistent values as well.

Link: http://lkml.kernel.org/r/20141114011411.992510720@goodmis.orgSigned-off-by: Steven Rostedt <rostedt@goodmis.org>

dba39448

tracing: Do not use return values of trace_seq_printf() in syscall tracing · 183742f0

Steven Rostedt (Red Hat) authored Nov 12, 2014

The functions trace_seq_printf() and friends will not be returning values
soon and will be void functions. To know if they succeeded or not, the
functions trace_seq_has_overflowed() and trace_handle_return() should be
used instead.
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

183742f0

tracing/uprobes: Do not use return values of trace_seq_printf() · 8579a107

Steven Rostedt (Red Hat) authored Nov 12, 2014

The functions trace_seq_printf() and friends will soon no longer have
return values. Using trace_seq_has_overflowed() and trace_handle_return()
should be used instead.

Link: http://lkml.kernel.org/r/20141114011411.693008134@goodmis.org
Link: http://lkml.kernel.org/r/20141115050602.333705855@goodmis.orgReviewed-by: Masami Hiramatsu <masami.hiramatu.pt@hitachi.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

8579a107

tracing/probes: Do not use return value of trace_seq_printf() · d2b0191a

Steven Rostedt (Red Hat) authored Nov 12, 2014

The functions trace_seq_printf() and friends will soon not have a return
value and will only be a void function. Use trace_seq_has_overflowed()
instead to know if the trace_seq operations succeeded or not.

Link: http://lkml.kernel.org/r/20141114011411.530216306@goodmis.orgReviewed-by: Petr Mladek <pmladek@suse.cz>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

d2b0191a

tracing: Do not check return values of trace_seq_p*() for mmio tracer · a72e10af

Steven Rostedt (Red Hat) authored Nov 12, 2014

The return values for trace_seq_printf() and friends are going to be
removed and they will become void functions. The mmio tracer checked
their return and even did so incorrectly.

Some of the funtions which returned the values were never checked
themselves. Removing all the checks simplifies the code.

Use trace_seq_has_overflowed() and trace_handle_return() where
necessary instead.
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

a72e10af

kprobes/tracing: Use trace_seq_has_overflowed() for overflow checks · 85224da0

Steven Rostedt (Red Hat) authored Nov 12, 2014

Instead of checking the return value of trace_seq_printf() and friends
for overflowing of the buffer, use the trace_seq_has_overflowed() helper
function.

This cleans up the code quite a bit and also takes us a step closer to
changing the return values of trace_seq_printf() and friends to void.

Link: http://lkml.kernel.org/r/20141114011411.181812785@goodmis.orgReviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

85224da0

tracing: Have function_graph use trace_seq_has_overflowed() · 9d9add34

Steven Rostedt (Red Hat) authored Nov 12, 2014

Instead of doing individual checks all over the place that makes the code
very messy. Just check trace_seq_has_overflowed() at the end or in
strategic places.

This makes the code much cleaner and also helps with getting closer
to removing the return values of trace_seq_printf() and friends.

Link: http://lkml.kernel.org/r/20141114011410.987913836@goodmis.orgReviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

9d9add34

tracing: Have branch tracer use trace_handle_return() helper function · 7d40f671

Steven Rostedt (Red Hat) authored Nov 12, 2014

The branch tracer should not be checking the trace_seq_printf() return value
as that will soon be void. There's a new trace_handle_return() helper function
that will return TRACE_TYPE_PARTIAL_LINE if the trace_seq overflowed
and TRACE_TYPE_HANDLED otherwise.
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

7d40f671

ring-buffer: Remove check of trace_seq_{puts,printf}() return values · c0cd93aa

Steven Rostedt (Red Hat) authored Nov 12, 2014

Remove checking the return value of all trace_seq_puts(). It was wrong
anyway as only the last return value mattered. But as the trace_seq_puts()
is going to be a void function in the future, we should not be checking
the return value of it anyway.

Just return !trace_seq_has_overflowed() instead.
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

c0cd93aa

blktrace/tracing: Use trace_seq_has_overflowed() helper function · f4a1d08c

Steven Rostedt (Red Hat) authored Nov 12, 2014

Checking the return code of every trace_seq_printf() operation and having
to return early if it overflowed makes the code messy.

Using the new trace_seq_has_overflowed() and trace_handle_return() functions
allows us to clean up the code.

In the future, trace_seq_printf() and friends will be turning into void
functions and not returning a value. The trace_seq_has_overflowed() is to
be used instead. This cleanup allows that change to take place.

Cc: Jens Axboe <axboe@fb.com>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

f4a1d08c

tracing: Add trace_seq_has_overflowed() and trace_handle_return() · 19a7fe20

Steven Rostedt (Red Hat) authored Nov 12, 2014

Adding a trace_seq_has_overflowed() which returns true if the trace_seq
had too much written into it allows us to simplify the code.

Instead of checking the return value of every call to trace_seq_printf()
and friends, they can all be called normally, and at the end we can
return !trace_seq_has_overflowed() instead.

Several functions also return TRACE_TYPE_PARTIAL_LINE when the trace_seq
overflowed and TRACE_TYPE_HANDLED otherwise. Another helper function
was created called trace_handle_return() which takes a trace_seq and
returns these enums. Using this helper function also simplifies the
code.

This change also makes it possible to remove the return values of
trace_seq_printf() and friends. They should instead just be
void functions.

Link: http://lkml.kernel.org/r/20141114011410.365183157@goodmis.orgReviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

19a7fe20

tracing: Fix trace_seq_bitmask() to start at current position · e400a40c

Steven Rostedt (Red Hat) authored Nov 12, 2014

In trace_seq_bitmask() it calls bitmap_scnprintf() not from the current
position of the trace_seq buffer (s->buffer + s->len), but instead from
the beginning of the buffer (s->buffer).

Luckily, the only user of this "ipi_raise tracepoint" uses it as the
first parameter, and as such, the start of the temp buffer in
include/trace/ftrace.h (see __get_bitmask()).
Reported-by: Petr Mladek <pmladek@suse.cz>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

e400a40c

RAS/tracing: Use trace_seq_buffer_ptr() helper instead of open coded · dbcf3e06

Steven Rostedt (Red Hat) authored Oct 31, 2014

Use the helper function trace_seq_buffer_ptr() to get the current location
of the next buffer write of a trace_seq object, instead of open coding
it.

This facilitates the conversion of trace_seq to use seq_buf.
Tested-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

dbcf3e06

x86/kvm/tracing: Use helper function trace_seq_buffer_ptr() · 467aa1f2

Steven Rostedt (Red Hat) authored Sep 24, 2014

To allow for the restructiong of the trace_seq code, we need users
of it to use the helper functions instead of accessing the internals
of the trace_seq structure itself.

Link: http://lkml.kernel.org/r/20141104160221.585025609@goodmis.orgTested-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Mark Rustad <mark.d.rustad@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

467aa1f2

ftrace/x86/extable: Add is_ftrace_trampoline() function · aec0be2d

Steven Rostedt (Red Hat) authored Nov 18, 2014

Stack traces that happen from function tracing check if the address
on the stack is a __kernel_text_address(). That is, is the address
kernel code. This calls core_kernel_text() which returns true
if the address is part of the builtin kernel code. It also calls
is_module_text_address() which returns true if the address belongs
to module code.

But what is missing is ftrace dynamically allocated trampolines.
These trampolines are allocated for individual ftrace_ops that
call the ftrace_ops callback functions directly. But if they do a
stack trace, the code checking the stack wont detect them as they
are neither core kernel code nor module address space.

Adding another field to ftrace_ops that also stores the size of
the trampoline assigned to it we can create a new function called
is_ftrace_trampoline() that returns true if the address is a
dynamically allocate ftrace trampoline. Note, it ignores trampolines
that are not dynamically allocated as they will return true with
the core_kernel_text() function.

Link: http://lkml.kernel.org/r/20141119034829.497125839@goodmis.org

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

aec0be2d

ftrace/x86: Add frames pointers to trampoline as necessary · 9960efeb

Steven Rostedt (Red Hat) authored Nov 18, 2014

When CONFIG_FRAME_POINTERS are enabled, it is required that the
ftrace_caller and ftrace_regs_caller trampolines set up frame pointers
otherwise a stack trace from a function call wont print the functions
that called the trampoline. This is due to a check in
__save_stack_address():

 #ifdef CONFIG_FRAME_POINTER
	if (!reliable)
		return;
 #endif

The "reliable" variable is only set if the function address is equal to
contents of the address before the address the frame pointer register
points to. If the frame pointer is not set up for the ftrace caller
then this will fail the reliable test. It will miss the function that
called the trampoline. Worse yet, if fentry is used (gcc 4.6 and
beyond), it will also miss the parent, as the fentry is called before
the stack frame is set up. That means the bp frame pointer points
to the stack of just before the parent function was called.

Link: http://lkml.kernel.org/r/20141119034829.355440340@goodmis.org

Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: stable@vger.kernel.org # 3.7+
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

9960efeb

tracing: Fix race of function probes counting · a9ce7c36

Steven Rostedt (Red Hat) authored Nov 17, 2014

The function probe counting for traceon and traceoff suffered a race
condition where if the probe was executing on two or more CPUs at the
same time, it could decrement the counter by more than one when
disabling (or enabling) the tracer only once.

The way the traceon and traceoff probes are suppose to work is that
they disable (or enable) tracing once per count. If a user were to
echo 'schedule:traceoff:3' into set_ftrace_filter, then when the
schedule function was called, it would disable tracing. But the count
should only be decremented once (to 2). Then if the user enabled tracing
again (via tracing_on file), the next call to schedule would disable
tracing again and the count would be decremented to 1.

But if multiple CPUS called schedule at the same time, it is possible
that the count would be decremented more than once because of the
simple "count--" used.

By reading the count into a local variable and using memory barriers
we can guarantee that the count would only be decremented once per
disable (or enable).

The stack trace probe had a similar race, but here the stack trace will
decrement for each time it is called. But this had the read-modify-
write race, where it could stack trace more than the number of times
that was specified. This case we use a cmpxchg to stack trace only the
number of times specified.

The dump probes can still use the old "update_count()" function as
they only run once, and that is controlled by the dump logic
itself.

Link: http://lkml.kernel.org/r/20141118134643.4b550ee4@gandalf.local.homeSigned-off-by: Steven Rostedt <rostedt@goodmis.org>

a9ce7c36

14 Nov, 2014 9 commits

function_graph: Fix micro seconds notations · 4526d067

Byungchul Park authored Nov 05, 2014

Usually, "msecs" notation means milli-seconds, and "usecs" notation
means micro-seconds. Since the unit used in the code is micro-seconds,
the notation should be replaced from msecs to usecs.

Link: http://lkml.kernel.org/r/1415171926-9782-2-git-send-email-byungchul.park@lge.comSigned-off-by: Byungchul Park <byungchul.park@lge.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

4526d067

ftrace-graph: show latency-format on print_graph_irq() · 678f845e

Daniel Bristot de Oliveira authored Nov 06, 2014

On the function_graph tracer, the print_graph_irq() function prints a
trace line with the flag ==========> on an irq handler entry, and the
flag <========== on an irq handler return.

But when the latency-format is enable, it is not printing the
latency-format flags, causing the following error in the trace output:

0) ==========> |
0) d... | smp_apic_timer_interrupt() {

This patch fixes this issue by printing the latency-format flags when
it is enable.

Link: http://lkml.kernel.org/r/7c2e226dac20c940b6242178fab7f0e3c9b5ce58.1415233316.git.bristot@redhat.comReviewed-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

678f845e

trace: Replace single-character seq_puts with seq_putc · 1177e436

Rasmus Villemoes authored Nov 08, 2014

Printing a single character to a seqfile might as well be done with
seq_putc instead of seq_puts; this avoids a strlen() call and a memory
access. It also shaves another few bytes off the generated code.

Link: http://lkml.kernel.org/r/1415479332-25944-4-git-send-email-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

1177e436

tracing: Merge consecutive seq_puts calls · d79ac28f

Rasmus Villemoes authored Nov 08, 2014

Consecutive seq_puts calls with literal strings can be merged to a
single call. This reduces the size of the generated code, and can also
lead to slight .rodata reduction (because of fewer nul and padding
bytes). It should also shave a off a few clock cycles.

Link: http://lkml.kernel.org/r/1415479332-25944-3-git-send-email-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

d79ac28f

tracing: Replace seq_printf by simpler equivalents · fa6f0cc7

Rasmus Villemoes authored Nov 08, 2014

Using seq_printf to print a simple string or a single character is a
lot more expensive than it needs to be, since seq_puts and seq_putc
exist.

These patches do

  seq_printf(m, s) -> seq_puts(m, s)
  seq_printf(m, "%s", s) -> seq_puts(m, s)
  seq_printf(m, "%c", c) -> seq_putc(m, c)

Subsequent patches will simplify further.

Link: http://lkml.kernel.org/r/1415479332-25944-2-git-send-email-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

fa6f0cc7

tracing: kdb: Fix kernel livelock with empty buffers · 8520dedb

Daniel Thompson authored Nov 06, 2014

Currently kdb's ftdump command will livelock by constantly printk'ing
the empty string at KERN_EMERG level if it run when the ftrace system is
not in use. This occurs because trace_empty() never returns false when
the ring buffers are left at the start of a non-consuming read [launched
by ring_buffer_read_start()].

This patch changes the loop exit condition to use the result of
trace_find_next_entry_inc(). Effectively this switches the non-consuming
kdb dumper to follow the approach of the non-consuming userspace
interface [s_next()] rather than the consuming ftrace_dump().

Link: http://lkml.kernel.org/r/1415277716-19419-3-git-send-email-daniel.thompson@linaro.org

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

8520dedb

tracing: kdb: Fix kernel panic during ftdump · c270cc75

Daniel Thompson authored Nov 06, 2014

Currently kdb's ftdump command unconditionally crashes due to a null
pointer de-reference whenever the command is run. This in turn causes
the kernel to panic.

The abridged stacktrace (gathered with ARCH=arm) is:

--- cut here ---
[<c09535ac>] (panic) from [<c02132dc>] (die+0x264/0x440)
[<c02132dc>] (die) from [<c0952eb8>]
(__do_kernel_fault.part.11+0x74/0x84)
[<c0952eb8>] (__do_kernel_fault.part.11) from [<c021f954>]
(do_page_fault+0x1d0/0x3c4)
[<c021f954>] (do_page_fault) from [<c020846c>] (do_DataAbort+0x48/0xac)

[<c020846c>] (do_DataAbort) from [<c0213c58>] (__dabt_svc+0x38/0x60)
Exception stack(0xc0deba88 to 0xc0debad0)
ba80:                   e8c29180 00000001 e9854304 e9854300 c0f567d8
c0df2580
baa0: 00000000 00000000 00000000 c0f117b8 c0e3a3c0 c0debb0c 00000000
c0debad0
bac0: 0000672e c02f4d60 60000193 ffffffff
[<c0213c58>] (__dabt_svc) from [<c02f4d60>] (kdb_ftdump+0x1e4/0x3d8)
[<c02f4d60>] (kdb_ftdump) from [<c02ce328>] (kdb_parse+0x2b8/0x698)
[<c02ce328>] (kdb_parse) from [<c02ceef0>] (kdb_main_loop+0x52c/0x784)
[<c02ceef0>] (kdb_main_loop) from [<c02d1b0c>] (kdb_stub+0x238/0x490)
--- cut here ---

The NULL deref occurs due to the initialized use of struct trace_iter's
buffer_iter member.

This is a regression, albeit a fairly elderly one. It was introduced
by commit 6d158a81 ("tracing: Remove NR_CPUS array from
trace_iterator").

This patch solves this by providing a collection of ring_buffer_iter(s)
and using this to initialize buffer_iter. Note that static allocation
is used solely because the trace_iter itself is also static allocated.
Static allocation also means that we have to NULL-ify the pointer during
cleanup to avoid use-after-free problems.

Link: http://lkml.kernel.org/r/1415277716-19419-2-git-send-email-daniel.thompson@linaro.org

Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

c270cc75

tracing: Fix traceoff_on_warning handling on boot command line · 933ff9f2

Luis Claudio R. Goncalves authored Nov 12, 2014

According to the documentation, adding "traceoff_on_warning" to the boot
command line should be enough to enable the feature. But right now it is
necessary to specify "traceoff_on_warning=". Along with fixing that, also
verify if the value passed, if any, is either "0" or "off".

Link: http://lkml.kernel.org/r/20141112231400.GL12281@uudg.orgSigned-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

933ff9f2

ftrace: Have the control_ops get a trampoline · fe578ba3

Steven Rostedt (Red Hat) authored Nov 13, 2014

With the new logic, if only a single user of ftrace function hooks is
used, it will get its own trampoline assigned to it.

The problem is that the control_ops is an indirect ops that perf ops
uses. What that means is that when perf registers its ops with
register_ftrace_function(), it has the CONTROL flag set and gets added
to the control list instead of the global ftrace list. The control_ops
gets added to that instead and the mcount trampoline calls the control_ops
function. The control_ops function will iterate the control list and
call the ops functions that are attached to it.

But currently the trampoline is added to the perf ops and not the
control ops, and when ftrace tries to find a trampoline hook for it,
it fails to find one and gives the following splat:

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 10133 at kernel/trace/ftrace.c:2033 ftrace_get_addr_new+0x6f/0xc0()
 Modules linked in: [...]
 CPU: 0 PID: 10133 Comm: perf Tainted: P               3.18.0-rc1-test+ #388
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  00000000000007f1 ffff8800c2643bc8 ffffffff814fca6e ffff88011ea0ed01
  0000000000000000 ffff8800c2643c08 ffffffff81041ffd 0000000000000000
  ffffffff810c388c ffffffff81a5a350 ffff880119b00000 ffffffff810001c8
 Call Trace:
  [<ffffffff814fca6e>] dump_stack+0x46/0x58
  [<ffffffff81041ffd>] warn_slowpath_common+0x81/0x9b
  [<ffffffff810c388c>] ? ftrace_get_addr_new+0x6f/0xc0
  [<ffffffff810001c8>] ? 0xffffffff810001c8
  [<ffffffff81042031>] warn_slowpath_null+0x1a/0x1c
  [<ffffffff810c388c>] ftrace_get_addr_new+0x6f/0xc0
  [<ffffffff8102e938>] ftrace_replace_code+0xd6/0x334
  [<ffffffff810c4116>] ftrace_modify_all_code+0x41/0xc5
  [<ffffffff8102eba6>] arch_ftrace_update_code+0x10/0x19
  [<ffffffff810c293c>] ftrace_run_update_code+0x21/0x42
  [<ffffffff810c298f>] ftrace_startup_enable+0x32/0x34
  [<ffffffff810c3049>] ftrace_startup+0x14e/0x15a
  [<ffffffff810c307c>] register_ftrace_function+0x27/0x40
  [<ffffffff810dc118>] perf_ftrace_event_register+0x3e/0xee
  [<ffffffff810dbfbe>] perf_trace_init+0x29d/0x2a9
  [<ffffffff810eb422>] perf_tp_event_init+0x27/0x3a
  [<ffffffff810f18bc>] perf_init_event+0x9e/0xed
  [<ffffffff810f1ba4>] perf_event_alloc+0x299/0x330
  [<ffffffff810f236b>] SYSC_perf_event_open+0x3ee/0x816
  [<ffffffff8115a066>] ? mntput+0x2d/0x2f
  [<ffffffff81142b00>] ? __fput+0xa7/0x1b2
  [<ffffffff81091300>] ? do_gettimeofday+0x22/0x3a
  [<ffffffff810f279c>] SyS_perf_event_open+0x9/0xb
  [<ffffffff81502a92>] system_call_fastpath+0x12/0x17
 ---[ end trace 81a53565150e4982 ]---
 Bad trampoline accounting at: ffffffff810001c8 (run_init_process+0x0/0x2d) (10000001)

Update the control_ops trampoline instead of the perf ops one.

Reported-by: lkp@01.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

fe578ba3

11 Nov, 2014 6 commits

tracing: Add entry->next_cpu to trace_ctxwake_bin() · 26488b37

Jiang Liu authored Aug 22, 2013

Function trace_ctxwake_bin() misses ctx_switch_entry->next_cpu field,
so user will get stale value for "next_cpu".

Link: http://lkml.kernel.org/p/1377176379-27908-1-git-send-email-liuj97@gmail.comSigned-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

26488b37

tracing: Move tracing_sched_{switch,wakeup}() into wakeup tracer · 243f7610

Steven Rostedt (Red Hat) authored Oct 30, 2014

The only code that references tracing_sched_switch_trace() and
tracing_sched_wakeup_trace() is the wakeup latency tracer. Those
two functions use to belong to the sched_switch tracer which has
long been removed. These functions were left behind because the
wakeup latency tracer used them. But since the wakeup latency tracer
is the only one to use them, they should be static functions inside
that code.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

243f7610

tracing: Kill the dead code in probe_sched_switch() and probe_sched_wakeup() · 458faf0b

Oleg Nesterov authored Jul 23, 2014

After the previous patch it is clear that "tracer_enabled" can never be
true, we can remove the "if (tracer_enabled)" code in probe_sched_switch()
and probe_sched_wakeup(). Plus we can obviously remove tracer_enabled,
ctx_trace, and sched_stopped as well.

Link: http://lkml.kernel.org/p/20140723193503.GA30217@redhat.comSigned-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

458faf0b

tracing: Kill tracing_{start,stop}_sched_switch_record() and tracing_sched_switch_assign_trace() · 63253725

Oleg Nesterov authored Jul 23, 2014

tracing_{start,stop}_sched_switch_record() have no callers since
87d80de2 "tracing: Remove obsolete sched_switch tracer".

The last caller of tracing_sched_switch_assign_trace() was removed
by 30dbb20e "tracing: Remove boot tracer".

Link: http://lkml.kernel.org/p/20140723193501.GA30214@redhat.comSigned-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

63253725

ftrace: Add more information to ftrace_bug() output · 4fd3279b

Steven Rostedt (Red Hat) authored Oct 24, 2014

With the introduction of the dynamic trampolines, it is useful that if
things go wrong that ftrace_bug() produces more information about what
the current state is. This can help debug issues that may arise.

Ftrace has lots of checks to make sure that the state of the system it
touchs is exactly what it expects it to be. When it detects an abnormality
it calls ftrace_bug() and disables itself to prevent any further damage.
It is crucial that ftrace_bug() produces sufficient information that
can be used to debug the situation.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

4fd3279b

ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines · 12cce594

Steven Rostedt (Red Hat) authored Jul 03, 2014

When the static ftrace_ops (like function tracer) enables tracing, and it
is the only callback that is referencing a function, a trampoline is
dynamically allocated to the function that calls the callback directly
instead of calling a loop function that iterates over all the registered
ftrace ops (if more than one ops is registered).

But when it comes to dynamically allocated ftrace_ops, where they may be
freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
to free the trampoline. If a task was preempted while executing on the
trampoline, there's currently no way to know when it will be off that
trampoline.

But this is not true when it comes to !CONFIG_PREEMPT. The current method
of calling schedule_on_each_cpu() will force tasks off the trampoline,
becaues they can not schedule while on it (kernel preemption is not
configured). That means it is safe to free a dynamically allocated
ftrace ops trampoline when CONFIG_PREEMPT is not configured.

Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

12cce594

31 Oct, 2014 2 commits

ftrace/x86: Show trampoline call function in enabled_functions · 15d5b02c

Steven Rostedt (Red Hat) authored Jul 03, 2014

The file /sys/kernel/debug/tracing/eneabled_functions is used to debug
ftrace function hooks. Add to the output what function is being called
by the trampoline if the arch supports it.

Add support for this feature in x86_64.

Cc: H. Peter Anvin <hpa@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

15d5b02c

ftrace/x86: Add dynamic allocated trampoline for ftrace_ops · f3bea491

Steven Rostedt (Red Hat) authored Jul 02, 2014

The current method of handling multiple function callbacks is to register
a list function callback that calls all the other callbacks based on
their hash tables and compare it to the function that the callback was
called on. But this is very inefficient.

For example, if you are tracing all functions in the kernel and then
add a kprobe to a function such that the kprobe uses ftrace, the
mcount trampoline will switch from calling the function trace callback
to calling the list callback that will iterate over all registered
ftrace_ops (in this case, the function tracer and the kprobes callback).
That means for every function being traced it checks the hash of the
ftrace_ops for function tracing and kprobes, even though the kprobes
is only set at a single function. The kprobes ftrace_ops is checked
for every function being traced!

Instead of calling the list function for functions that are only being
traced by a single callback, we can call a dynamically allocated
trampoline that calls the callback directly. The function graph tracer
already uses a direct call trampoline when it is being traced by itself
but it is not dynamically allocated. It's trampoline is static in the
kernel core. The infrastructure that called the function graph trampoline
can also be used to call a dynamically allocated one.

For now, only ftrace_ops that are not dynamically allocated can have
a trampoline. That is, users such as function tracer or stack tracer.
kprobes and perf allocate their ftrace_ops, and until there's a safe
way to free the trampoline, it can not be used. The dynamically allocated
ftrace_ops may, although, use the trampoline if the kernel is not
compiled with CONFIG_PREEMPT. But that will come later.
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

f3bea491

24 Oct, 2014 2 commits

ftrace: Fix checking of trampoline ftrace_ops in finding trampoline · 4fc40904

Steven Rostedt (Red Hat) authored Oct 24, 2014

When modifying code, ftrace has several checks to make sure things
are being done correctly. One of them is to make sure any code it
modifies is exactly what it expects it to be before it modifies it.
In order to do so with the new trampoline logic, it must be able
to find out what trampoline a function is hooked to in order to
see if the code that hooks to it is what's expected.

The logic to find the trampoline from a record (accounting descriptor
for a function that is hooked) needs to only look at the "old_hash"
of an ops that is being modified. The old_hash is the list of function
an ops is hooked to before its update. Since a record would only be
pointing to an ops that is being modified if it was already hooked
before.

Currently, it can pick a modified ops based on its new functions it
will be hooked to, and this picks the wrong trampoline and causes
the check to fail, disabling ftrace.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

ftrace: squash into ordering of ops for modification

4fc40904

ftrace: Set ops->old_hash on modifying what an ops hooks to · 8252ecf3

Steven Rostedt (Red Hat) authored Oct 24, 2014

The code that checks for trampolines when modifying function hooks
tests against a modified ops "old_hash". But the ops old_hash pointer
is not being updated before the changes are made, making it possible
to not find the right hash to the callback and possibly causing
ftrace to break in accounting and disable itself.

Have the ops set its old_hash before the modifying takes place.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

8252ecf3

20 Oct, 2014 2 commits

Linux 3.18-rc1 · f114040e
Linus Torvalds authored Oct 19, 2014

f114040e

Merge tag 'arm-soc-fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 4d3639ac

Linus Torvalds authored Oct 19, 2014

Pull ARM SoC fixes from Olof Johansson:
 "A batch of fixes that have come in during the merge window.

  Some of them are defconfig updates for things that have now landed,
  some errata additions and a few general scattered fixes.

  There's also a qcom DT update that adds support for SATA on AP148, and
  basic support for Sony Xperia Z1 and CM-QS600 platforms that seemed
  isolated enough that we could merge it even if it's late"

* tag 'arm-soc-fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: corrected bcm2835 search
  ARM: dts: Explicitly set dr_mode on exynos5420-arndale-octa
  ARM: dts: Explicitly set dr_mode on exynos Peach boards
  ARM: dts: qcom: add CM-QS600 board
  ARM: dts: qcom: Add initial DTS file for Sony Xperia Z1 phone
  ARM: dts: qcom: Add SATA support on IPQ8064/AP148
  MAINTAINERS: Update Santosh Shilimkar's email id
  ARM: sunxi_defconfig: enable CONFIG_REGULATOR
  ARM: dts: Disable smc91x on n900 until bootloader dependency is removed
  ARM: omap2plus_defconfig: Enable ARM erratum 430973 for omap3
  ARM: exynos_defconfig: enable USB gadget support
  ARM: exynos_defconfig: Enable Maxim 77693 and I2C GPIO drivers
  ARM: mm: Fix ifdef around cpu_*_do_[suspend, resume] ops
  ARM: EXYNOS: Fix build with PM_SLEEP=n and ARM_EXYNOS_CPUIDLE=n
  ARM: SAMSUNG: Restore Samsung PM Debug functionality
  ARM: dts: Fix pull setting in sd4_width8 pin group for exynos4x12
  ARM: exynos_defconfig: Enable SBS battery support
  ARM: exynos_defconfig: Enable Control Groups support
  ARM: exynos_defconfig: Enable Atmel maXTouch support
  ARM: exynos_defconfig: Enable MAX77802

4d3639ac

19 Oct, 2014 2 commits

Merge git://git.infradead.org/users/eparis/audit · ab074ade

Linus Torvalds authored Oct 19, 2014

Pull audit updates from Eric Paris:
 "So this change across a whole bunch of arches really solves one basic
  problem.  We want to audit when seccomp is killing a process.  seccomp
  hooks in before the audit syscall entry code.  audit_syscall_entry
  took as an argument the arch of the given syscall.  Since the arch is
  part of what makes a syscall number meaningful it's an important part
  of the record, but it isn't available when seccomp shoots the
  syscall...

  For most arch's we have a better way to get the arch (syscall_get_arch)
  So the solution was two fold: Implement syscall_get_arch() everywhere
  there is audit which didn't have it.  Use syscall_get_arch() in the
  seccomp audit code.  Having syscall_get_arch() everywhere meant it was
  a useless flag on the stack and we could get rid of it for the typical
  syscall entry.

  The other changes inside the audit system aren't grand, fixed some
  records that had invalid spaces.  Better locking around the task comm
  field.  Removing some dead functions and structs.  Make some things
  static.  Really minor stuff"

* git://git.infradead.org/users/eparis/audit: (31 commits)
  audit: rename audit_log_remove_rule to disambiguate for trees
  audit: cull redundancy in audit_rule_change
  audit: WARN if audit_rule_change called illegally
  audit: put rule existence check in canonical order
  next: openrisc: Fix build
  audit: get comm using lock to avoid race in string printing
  audit: remove open_arg() function that is never used
  audit: correct AUDIT_GET_FEATURE return message type
  audit: set nlmsg_len for multicast messages.
  audit: use union for audit_field values since they are mutually exclusive
  audit: invalid op= values for rules
  audit: use atomic_t to simplify audit_serial()
  kernel/audit.c: use ARRAY_SIZE instead of sizeof/sizeof[0]
  audit: reduce scope of audit_log_fcaps
  audit: reduce scope of audit_net_id
  audit: arm64: Remove the audit arch argument to audit_syscall_entry
  arm64: audit: Add audit hook in syscall_trace_enter/exit()
  audit: x86: drop arch from __audit_syscall_entry() interface
  sparc: implement is_32bit_task
  sparc: properly conditionalize use of TIF_32BIT
  ...

ab074ade

Merge tag 'qcom-dt-for-3.18-3' of... · 57764512

Olof Johansson authored Oct 19, 2014

Merge tag 'qcom-dt-for-3.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/linux-qcom into fixes

Merge "qcom DT changes for v3.18-3" from Kumar Gala:

Qualcomm ARM Based Device Tree Updates for v3.18-3

* Added Board support for CM-QS600 and Sony Xperia Z1 phone
* Added SATA support on IPQ8064/AP148

* tag 'qcom-dt-for-3.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/linux-qcom:
  ARM: dts: qcom: add CM-QS600 board
  ARM: dts: qcom: Add initial DTS file for Sony Xperia Z1 phone
  ARM: dts: qcom: Add SATA support on IPQ8064/AP148

57764512