Commit af9c191a authored by Linus Torvalds

Merge tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer updates from Steven Rostedt:

 - tracing/ring-buffer: persistent buffer across reboots

   This allows for the tracing instance ring buffer to stay persistent
   across reboots. The way this is done is by adding to the kernel
   command line:

     trace_instance=boot_map@0x285400000:12M

   This will reserve 12 megabytes at the address 0x285400000, and then
   map the tracing instance "boot_map" ring buffer to that memory. This
   will appear as a normal instance in the tracefs system:

     /sys/kernel/tracing/instances/boot_map

   A user could enable tracing in that instance, and on reboot or kernel
   crash, if the memory is not wiped by the firmware, it will recreate
   the trace in that instance. For example, if one was debugging the
   shutdown path of a kernel reboot:

     # cd /sys/kernel/tracing
     # echo function > instances/boot_map/current_tracer
     # reboot
     [..]
     # cd /sys/kernel/tracing
     # tail instances/boot_map/trace
           swapper/0-1       [000] d..1.   164.549800: restore_boot_irq_mode <-native_machine_shutdown
           swapper/0-1       [000] d..1.   164.549801: native_restore_boot_irq_mode <-native_machine_shutdown
           swapper/0-1       [000] d..1.   164.549802: disconnect_bsp_APIC <-native_machine_shutdown
           swapper/0-1       [000] d..1.   164.549811: hpet_disable <-native_machine_shutdown
           swapper/0-1       [000] d..1.   164.549812: iommu_shutdown_noop <-native_machine_restart
           swapper/0-1       [000] d..1.   164.549813: native_machine_emergency_restart <-__do_sys_reboot
           swapper/0-1       [000] d..1.   164.549813: tboot_shutdown <-native_machine_emergency_restart
           swapper/0-1       [000] d..1.   164.549820: acpi_reboot <-native_machine_emergency_restart
           swapper/0-1       [000] d..1.   164.549821: acpi_reset <-acpi_reboot
           swapper/0-1       [000] d..1.   164.549822: acpi_os_write_port <-acpi_reboot

   On reboot, the buffer is examined to make sure it is valid. The
   validation check even steps through every event to make sure the meta
   data of the event is correct. If any test fails, it will simply reset
   the buffer, and the buffer will be empty on boot.

 - Allow the tracing persistent boot buffer to use the "reserve_mem"
   option

   Instead of having the admin find a physical address to store the
   persistent buffer, which can be very tedious if they have to
   administer several different machines, allow them to use the
   "reserve_mem" option that will find a location for them. It is not as
   reliable, because with KASLR the kernel may be loaded at a different
   location each boot, which can make the reserved memory end up at an
   inconsistent address. Booting with "nokaslr" makes reserve_mem more
   reliable.

 - Have the function graph tracer handle offsets from a previous boot

   The ring buffer output from a previous boot may contain different
   addresses due to KASLR. Have the function graph tracer handle these
   by applying the delta between the previous boot's address space and
   the current one.
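
   In essence (taken from the trace_functions_graph.c hunk below), a
   recorded address is shifted by the saved text delta before it is
   symbolized:

     func = call->func + iter->tr->text_delta;
     trace_seq_printf(s, "%ps() {\n", (void *)func);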

 - Only reset the saved meta offset when the buffer is started or reset

   The persistent memory meta data holds the address space information
   of the previous boot, so that the delta needed to make function
   tracing work can be calculated. This used to be updated with the new
   address space as soon as it was read. But if the buffer was not used
   during a given boot, then on the next reboot the delta would be
   calculated against that intermediate boot instead of the boot that
   actually wrote the data into the ring buffer, causing the functions
   not to be shown. Do not save the address space information of the
   current kernel until recording actually starts.

 - Add a magic number to validate the meta data

   Add a magic value to the meta data that can also be used for
   validation. The validator of the previous buffer does not strictly
   need it, but it guards against a newer kernel whose meta data happens
   to have the same format, passes the validator, yet is used
   differently. The magic number can also serve as a "version" of the
   meta data.

 - Align user space mapped ring buffer sub-buffers to improve TLB
   entries

   Linus pointed out that the mapped ring buffer sub-buffers were
   misaligned relative to the meta page, so if the sub-buffers were
   bigger than PAGE_SIZE, the TLB could not use bigger entries.

 - Add new kernel command line flag "traceoff" to disable tracing on
   boot for instances

   If tracing is enabled for a boot instance, there needs to be a way to
   disable it at boot so that new events do not get entered into the
   ring buffer and mixed with events from a previous boot, as that can
   be confusing.
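
   For example:

     reserve_mem=12M:4096:trace trace_instance=boot_map^traceoff@trace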

 - Allow trace_printk() to go to other instances

   Currently, trace_printk() can only go to the top level instance. When
   debugging with a persistent buffer, it is really useful to be able to
   have trace_printk() go to that buffer instead, so that its output is
   available after a crash.
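
   After boot, an instance can be made the destination via its
   "trace_printk_dest" option:

     echo 1 > /sys/kernel/tracing/instances/boot_map/options/trace_printk_dest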

 - Do not use "bin_printk()" for traces to a boot instance

   bin_printk() saves only a pointer to the printk format in the ring
   buffer, as the reader of the buffer still has access to the format
   string. But that is not the case if the buffer is from a previous
   boot. If the trace_printk() is going to a "persistent" buffer, use
   the slower version that writes the printk format itself into the
   buffer.

 - Add command line option to allow trace_printk() to go to an instance

   Allow the kernel command line to define which instance the
   trace_printk() goes to, instead of forcing the admin to set it for
   every boot via the tracefs options.
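
   For example:

     reserve_mem=12M:4096:trace trace_instance=boot_map^traceprintk^traceoff@trace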

 - Start a document that explains how to use tracefs to debug the kernel

 - Add some more kernel selftests to test user mapped ring buffer

* tag 'trace-ring-buffer-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
  selftests/ring-buffer: Handle meta-page bigger than the system
  selftests/ring-buffer: Verify the entire meta-page padding
  tracing/Documentation: Start a document on how to debug with tracing
  tracing: Add option to set an instance to be the trace_printk destination
  tracing: Have trace_printk not use binary prints if boot buffer
  tracing: Allow trace_printk() to go to other instance buffers
  tracing: Add "traceoff" flag to boot time tracing instances
  ring-buffer: Align meta-page to sub-buffers for improved TLB usage
  ring-buffer: Add magic and struct size to boot up meta data
  ring-buffer: Don't reset persistent ring-buffer meta saved addresses
  tracing/fgraph: Have fgraph handle previous boot function addresses
  tracing: Allow boot instances to use reserve_mem boot memory
  tracing: Fix ifdef of snapshots to not prevent last_boot_info file
  ring-buffer: Use vma_pages() helper function
  tracing: Fix NULL vs IS_ERR() check in enable_instances()
  tracing: Add last boot delta offset for stack traces
  tracing: Update function tracing output for previous boot buffer
  tracing: Handle old buffer mappings for event strings and functions
  tracing/ring-buffer: Add last_boot_info file to boot instance
  ring-buffer: Save text and data locations in mapped meta data
  ...
parents dd609b8a 75d7ff9a
...@@ -6808,6 +6808,51 @@
the same thing would happen if it was left off). The irq_handler_entry
event, and all events under the "initcall" system.

Flags can be added to the instance to modify its behavior when it is
created. The flags are separated by '^'.

The available flags are:

  traceoff    - Have tracing disabled in the instance after it is created.
  traceprintk - Have trace_printk() write into this trace instance
                (note, "printk" and "trace_printk" can also be used)

  trace_instance=foo^traceoff^traceprintk,sched,irq

The flags must come before the defined events.

If memory has been reserved (see memmap for x86), the instance
can use that memory:

  memmap=12M$0x284500000 trace_instance=boot_map@0x284500000:12M

The above will create a "boot_map" instance that uses the physical
memory at 0x284500000 that is 12 megabytes in size. The per-CPU buffers of
that instance will be split up accordingly.

Alternatively, the memory can be reserved by the reserve_mem option:

  reserve_mem=12M:4096:trace trace_instance=boot_map@trace

This will reserve 12 megabytes at boot up with a 4096 byte alignment
and place the ring buffer in this memory. Note that due to KASLR, the
memory may not be at the same location each time, which will not preserve
the buffer content.

Also note that the layout of the ring buffer data may change between
kernel versions, in which case the validator will fail and reset the ring
buffer if the layout is not the same as the previous kernel.

If the ring buffer is used for persistent bootups and has events enabled,
it is recommended to disable tracing so that events from a previous boot do
not mix with events of the current boot (unless you are debugging a random
crash at boot up):

  reserve_mem=12M:4096:trace trace_instance=boot_map^traceoff^traceprintk@trace,sched,irq

See also Documentation/trace/debugging.rst

trace_options=[option-list]
		[FTRACE] Enable or disable tracer options at boot.
		The option-list is a comma delimited list of options
...
==============================
Using the tracer for debugging
==============================
Copyright 2024 Google LLC.
:Author: Steven Rostedt <rostedt@goodmis.org>
:License: The GNU Free Documentation License, Version 1.2
(dual licensed under the GPL v2)
- Written for: 6.12
Introduction
------------
The tracing infrastructure can be very useful for debugging the Linux
kernel. This document is a place to add various methods of using the tracer
for debugging.
First, make sure that the tracefs file system is mounted::

  $ sudo mount -t tracefs tracefs /sys/kernel/tracing
Using trace_printk()
--------------------
trace_printk() is a very lightweight utility that can be used in any context
inside the kernel, with the exception of "noinstr" sections. It can be used
in normal, softirq, interrupt and even NMI context. The trace data is
written to the tracing ring buffer in a lockless way. To make it even
lighter weight, when possible, it will only record the pointer to the format
string, and save the raw arguments into the buffer. The format and the
arguments will be post processed when the ring buffer is read. This way the
trace_printk() format conversions are not done during the hot path, where
the trace is being recorded.
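
As a minimal sketch (the surrounding function and its arguments here are
hypothetical, only trace_printk() itself is real), it is used just like
printk()::

  /* Hypothetical example function, for illustration only. */
  static void my_irq_handler(int irq, u64 status)
  {
          /*
           * Writes into the tracing ring buffer locklessly. When
           * possible, only the format pointer and the raw arguments are
           * recorded; the string is rendered when the buffer is read.
           */
          trace_printk("irq=%d status=%llx\n", irq, status);
  }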
trace_printk() is meant only for debugging, and should never be added into
a subsystem of the kernel. If you need debugging traces, add trace events
instead. If a trace_printk() is found in the kernel, the following will
appear in dmesg::
**********************************************************
** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
** **
** trace_printk() being used. Allocating extra memory. **
** **
** This means that this is a DEBUG kernel and it is **
** unsafe for production use. **
** **
** If you see this message and you are not debugging **
** the kernel, report this immediately to your vendor! **
** **
** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE **
**********************************************************
Debugging kernel crashes
------------------------
There are various methods of acquiring the state of the system when a kernel
crash occurs. This could be from the oops message in printk, or one could
use kexec/kdump. But these only show what happened at the time of the crash.
It can also be very useful to know what happened leading up to the crash.
The tracing ring buffer, by default, is a circular buffer that will
overwrite older events with newer ones. When a crash happens, the content of
the ring buffer will be all the events that led up to the crash.
There are several kernel command line parameters that can be used to help in
this. The first is "ftrace_dump_on_oops". This will dump the tracing ring
buffer to the console when an oops occurs. This can be useful if the console
is being logged somewhere. If a serial console is used, it may be prudent to
make sure the ring buffer is relatively small, otherwise the dumping of the
ring buffer may take several minutes to hours to finish. Here's an example
of the kernel command line::

  ftrace_dump_on_oops trace_buf_size=50K
Note, the tracing buffer is made up of per CPU buffers where each of these
buffers is broken up into sub-buffers that are by default PAGE_SIZE. The
trace_buf_size option above sets each of the per CPU buffers to 50K, so, on
a machine with 8 CPUs, that's actually 400K total.
Persistent buffers across boots
-------------------------------
If the system memory allows it, the tracing ring buffer can be placed at a
specific location in memory. If the location is the same across boots and
the memory is not modified, the tracing buffer can be retrieved from the
following boot. There are two ways to reserve memory for the use of the ring
buffer.
The more reliable way (on x86) is to reserve memory with the "memmap" kernel
command line option and then use that memory for the trace_instance. This
requires a bit of knowledge of the physical memory layout of the system. The
advantage of using this method is that the memory for the ring buffer will
always be at the same location::

  memmap=12M$0x284500000 trace_instance=boot_map@0x284500000:12M
The memmap above reserves 12 megabytes of memory at the physical memory
location 0x284500000. Then the trace_instance option will create a trace
instance "boot_map" at that same location with the same amount of memory
reserved. As the ring buffer is broken up into per CPU buffers, the 12
megabytes will be split evenly between those CPUs. If you have 8 CPUs, each
per CPU ring buffer will be 1.5 megabytes in size. Note, that also includes
meta data, so the amount of memory actually used by the ring buffer will be
slightly smaller.
Another more generic but less robust way to allocate a ring buffer mapping
at boot is with the "reserve_mem" option::

  reserve_mem=12M:4096:trace trace_instance=boot_map@trace
The reserve_mem option above will find 12 megabytes that are available at
boot up, and align the allocation to 4096 bytes. It will label this memory
as "trace", which can then be referenced by later command line options.
The trace_instance option creates a "boot_map" instance and will use the
memory reserved by reserve_mem that was labeled as "trace". This method is
more generic but may not be as reliable. Due to KASLR, the memory reserved
by reserve_mem may not end up at the same location on every boot. If that
happens, the ring buffer will not be from the previous boot and will be
reset.
Sometimes a larger alignment can keep KASLR from moving the reserve_mem
location around. With a larger alignment, you may find that the buffer is
placed more consistently::

  reserve_mem=12M:0x2000000:trace trace_instance=boot_map@trace
On boot up, the memory reserved for the ring buffer is validated. It will go
through a series of tests to make sure that the ring buffer contains valid
data. If it does, the buffer is set up and made available to read from the
instance. If it fails any of the tests, the entire ring buffer is cleared
and initialized as new.
The layout of this mapped memory may not be consistent from kernel to
kernel, so only the same kernel is guaranteed to work if the mapping is
preserved. Switching to a different kernel version may find a different
layout and mark the buffer as invalid.
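
As an illustrative sketch (the names and fields below are hypothetical, not
the actual implementation), the validation can be thought of as a magic
number and structure size check over the saved meta data::

  /* Hypothetical meta data layout, for illustration only. */
  struct boot_meta {
          unsigned int  magic;        /* constant identifying the layout */
          unsigned int  struct_size;  /* sizeof(struct boot_meta) when written */
          unsigned long text_addr;    /* kernel text address when recorded */
          unsigned long data_addr;    /* kernel data address when recorded */
  };

  #define BOOT_META_MAGIC 0x527bffe5  /* hypothetical value */

  static bool boot_meta_valid(struct boot_meta *meta)
  {
          /* A layout change shows up as a magic or size mismatch. */
          return meta->magic == BOOT_META_MAGIC &&
                 meta->struct_size == sizeof(*meta);
  }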
Using trace_printk() in the boot instance
-----------------------------------------
By default, the content of trace_printk() goes into the top level tracing
instance, but that instance is never preserved across boots. To have the
trace_printk() content, and some other internal tracing (like stack dumps),
go to the preserved buffer instead, either set the instance to be the
trace_printk() destination from the kernel command line, or set it after
boot up via the trace_printk_dest option.
After boot up::

  echo 1 > /sys/kernel/tracing/instances/boot_map/options/trace_printk_dest

From the kernel command line::

  reserve_mem=12M:4096:trace trace_instance=boot_map^traceprintk^traceoff@trace

If setting it from the kernel command line, it is recommended to also
disable tracing with the "traceoff" flag, and enable tracing after boot up.
Otherwise the trace from the most recent boot will be mixed with the trace
from the previous boot, which can make it confusing to read.
...@@ -1186,6 +1186,18 @@ Here are the available options:

  trace_printk
	Can disable trace_printk() from writing into the buffer.

  trace_printk_dest
	Set to have trace_printk() and similar internal tracing functions
	write into this instance. Note, only one trace instance can have
	this set. Setting this flag clears it from the instance that
	previously had it set. By default, the top level instance has this
	flag set, and will get it set again if another instance sets and
	then clears it.

	This flag cannot be cleared by the top level instance, as it is the
	default instance. The only way the top level instance loses this
	flag is by it being set in another instance.

  annotate
	It is sometimes confusing when the CPU buffers are full
	and one CPU buffer had a lot of events recently, thus
...
...@@ -89,6 +89,14 @@ void ring_buffer_discard_commit(struct trace_buffer *buffer,
struct trace_buffer *
__ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *key);

struct trace_buffer *__ring_buffer_alloc_range(unsigned long size, unsigned flags,
					       int order, unsigned long start,
					       unsigned long range_size,
					       struct lock_class_key *key);

bool ring_buffer_last_boot_delta(struct trace_buffer *buffer, long *text,
				 long *data);

/*
 * Because the ring buffer is generic, if other users of the ring buffer get
 * traced by ftrace, it can produce lockdep warnings. We need to keep each
...@@ -100,6 +108,18 @@ __ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *k
		__ring_buffer_alloc((size), (flags), &__key);	\
	})

/*
 * Because the ring buffer is generic, if other users of the ring buffer get
 * traced by ftrace, it can produce lockdep warnings. We need to keep each
 * ring buffer's lock class separate.
 */
#define ring_buffer_alloc_range(size, flags, order, start, range_size)	\
({									\
	static struct lock_class_key __key;				\
	__ring_buffer_alloc_range((size), (flags), (order), (start),	\
				  (range_size), &__key);		\
})

typedef bool (*ring_buffer_cond_fn)(void *data);
int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full,
		     ring_buffer_cond_fn cond, void *data);
...
...@@ -336,7 +336,6 @@ struct trace_array {
	bool			allocated_snapshot;
	spinlock_t		snapshot_trigger_lock;
	unsigned int		snapshot;
	unsigned int		mapped;
	unsigned long		max_latency;
#ifdef CONFIG_FSNOTIFY
	struct dentry		*d_max_latency;
...@@ -344,6 +343,13 @@ struct trace_array {
	struct irq_work		fsnotify_irqwork;
#endif
#endif
	/* The below is for memory mapped ring buffer */
	unsigned int		mapped;
	unsigned long		range_addr_start;
	unsigned long		range_addr_size;
	long			text_delta;
	long			data_delta;

	struct trace_pid_list	__rcu *filtered_pids;
	struct trace_pid_list	__rcu *filtered_no_pids;
	/*
...@@ -423,7 +429,8 @@ struct trace_array {
};

enum {
	TRACE_ARRAY_FL_GLOBAL	= BIT(0),
	TRACE_ARRAY_FL_BOOT	= BIT(1),
};

extern struct list_head ftrace_trace_arrays;
...@@ -644,6 +651,8 @@ trace_buffer_lock_reserve(struct trace_buffer *buffer,
			  unsigned long len,
			  unsigned int trace_ctx);

int ring_buffer_meta_seq_init(struct file *file, struct trace_buffer *buffer, int cpu);

struct trace_entry *tracing_get_trace_entry(struct trace_array *tr,
					    struct trace_array_cpu *data);
...@@ -1312,6 +1321,7 @@ extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
		C(IRQ_INFO,		"irq-info"),		\
		C(MARKERS,		"markers"),		\
		C(EVENT_FORK,		"event-fork"),		\
		C(TRACE_PRINTK,		"trace_printk_dest"),	\
		C(PAUSE_ON_TRACE,	"pause-on-trace"),	\
		C(HASH_PTR,		"hash-ptr"),	/* Print hashed pointer */ \
		FUNCTION_FLAGS					\
...
...@@ -544,6 +544,8 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
	struct trace_seq *s = &iter->seq;
	struct trace_entry *ent = iter->ent;

	addr += iter->tr->text_delta;

	if (addr < (unsigned long)__irqentry_text_start ||
	    addr >= (unsigned long)__irqentry_text_end)
		return;
...@@ -710,6 +712,7 @@ print_graph_entry_leaf(struct trace_iterator *iter,
	struct ftrace_graph_ret *graph_ret;
	struct ftrace_graph_ent *call;
	unsigned long long duration;
	unsigned long func;
	int cpu = iter->cpu;
	int i;
...@@ -717,6 +720,8 @@ print_graph_entry_leaf(struct trace_iterator *iter,
	call = &entry->graph_ent;
	duration = graph_ret->rettime - graph_ret->calltime;

	func = call->func + iter->tr->text_delta;

	if (data) {
		struct fgraph_cpu_data *cpu_data;
...@@ -747,10 +752,10 @@ print_graph_entry_leaf(struct trace_iterator *iter,
	 * enabled.
	 */
	if (flags & __TRACE_GRAPH_PRINT_RETVAL)
		print_graph_retval(s, graph_ret->retval, true, (void *)func,
				   !!(flags & TRACE_GRAPH_PRINT_RETVAL_HEX));
	else
		trace_seq_printf(s, "%ps();\n", (void *)func);

	print_graph_irq(iter, graph_ret->func, TRACE_GRAPH_RET,
			cpu, iter->ent->pid, flags);
...@@ -766,6 +771,7 @@ print_graph_entry_nested(struct trace_iterator *iter,
	struct ftrace_graph_ent *call = &entry->graph_ent;
	struct fgraph_data *data = iter->private;
	struct trace_array *tr = iter->tr;
	unsigned long func;
	int i;

	if (data) {
...@@ -788,7 +794,9 @@ print_graph_entry_nested(struct trace_iterator *iter,
	for (i = 0; i < call->depth * TRACE_GRAPH_INDENT; i++)
		trace_seq_putc(s, ' ');

	func = call->func + iter->tr->text_delta;

	trace_seq_printf(s, "%ps() {\n", (void *)func);

	if (trace_seq_has_overflowed(s))
		return TRACE_TYPE_PARTIAL_LINE;
...@@ -863,6 +871,8 @@ check_irq_entry(struct trace_iterator *iter, u32 flags,
	int *depth_irq;
	struct fgraph_data *data = iter->private;

	addr += iter->tr->text_delta;

	/*
	 * If we are either displaying irqs, or we got called as
	 * a graph event and private data does not exist,
...@@ -990,11 +1000,14 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
	unsigned long long duration = trace->rettime - trace->calltime;
	struct fgraph_data *data = iter->private;
	struct trace_array *tr = iter->tr;
	unsigned long func;
	pid_t pid = ent->pid;
	int cpu = iter->cpu;
	int func_match = 1;
	int i;

	func = trace->func + iter->tr->text_delta;

	if (check_irq_return(iter, flags, trace->depth))
		return TRACE_TYPE_HANDLED;
...@@ -1033,7 +1046,7 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
	 * function-retval option is enabled.
	 */
	if (flags & __TRACE_GRAPH_PRINT_RETVAL) {
		print_graph_retval(s, trace->retval, false, (void *)func,
				   !!(flags & TRACE_GRAPH_PRINT_RETVAL_HEX));
	} else {
		/*
...@@ -1046,7 +1059,7 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
		if (func_match && !(flags & TRACE_GRAPH_PRINT_TAIL))
			trace_seq_puts(s, "}\n");
		else
			trace_seq_printf(s, "} /* %ps */\n", (void *)func);
	}

	/* Overrun */
...
...@@ -990,8 +990,11 @@ enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
}

static void print_fn_trace(struct trace_seq *s, unsigned long ip,
			   unsigned long parent_ip, long delta, int flags)
{
	ip += delta;
	parent_ip += delta;

	seq_print_ip_sym(s, ip, flags);

	if ((flags & TRACE_ITER_PRINT_PARENT) && parent_ip) {
...@@ -1009,7 +1012,7 @@ static enum print_line_t trace_fn_trace(struct trace_iterator *iter, int flags,

	trace_assign_type(field, iter->ent);

	print_fn_trace(s, field->ip, field->parent_ip, iter->tr->text_delta, flags);
	trace_seq_putc(s, '\n');

	return trace_handle_return(s);
...@@ -1230,6 +1233,7 @@ static enum print_line_t trace_stack_print(struct trace_iterator *iter,
	struct trace_seq *s = &iter->seq;
	unsigned long *p;
	unsigned long *end;
	long delta = iter->tr->text_delta;

	trace_assign_type(field, iter->ent);

	end = (unsigned long *)((long)iter->ent + iter->ent_size);
...@@ -1242,7 +1246,7 @@ static enum print_line_t trace_stack_print(struct trace_iterator *iter,
			break;

		trace_seq_puts(s, " => ");
		seq_print_ip_sym(s, (*p) + delta, flags);
		trace_seq_putc(s, '\n');
	}
...@@ -1587,10 +1591,13 @@ static enum print_line_t trace_print_print(struct trace_iterator *iter,
{
	struct print_entry *field;
	struct trace_seq *s = &iter->seq;
	unsigned long ip;

	trace_assign_type(field, iter->ent);

	ip = field->ip + iter->tr->text_delta;

	seq_print_ip_sym(s, ip, flags);
	trace_seq_printf(s, ": %s", field->buf);

	return trace_handle_return(s);
...@@ -1674,7 +1681,7 @@ trace_func_repeats_print(struct trace_iterator *iter, int flags,

	trace_assign_type(field, iter->ent);

	print_fn_trace(s, field->ip, field->parent_ip, iter->tr->text_delta, flags);
	trace_seq_printf(s, " (repeats: %u, last_ts:", field->count);
	trace_print_time(s, iter,
			 iter->ts - FUNC_REPEATS_GET_DELTA_TS(field));
...
...@@ -92,12 +92,22 @@ int tracefs_cpu_map(struct tracefs_cpu_map_desc *desc, int cpu)
	if (desc->cpu_fd < 0)
		return -ENODEV;

again:
	map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, desc->cpu_fd, 0);
	if (map == MAP_FAILED)
		return -errno;

	desc->meta = (struct trace_buffer_meta *)map;

	/* the meta-page is bigger than the original mapping */
	if (page_size < desc->meta->meta_struct_len) {
		int meta_page_size = desc->meta->meta_page_size;

		munmap(desc->meta, page_size);
		page_size = meta_page_size;
		goto again;
	}

	return 0;
}
...@@ -228,6 +238,20 @@ TEST_F(map, data_mmap)
	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED,
		    desc->cpu_fd, meta_len);
	ASSERT_EQ(data, MAP_FAILED);

	/* Verify meta-page padding */
	if (desc->meta->meta_page_size > getpagesize()) {
		data_len = desc->meta->meta_page_size;
		data = mmap(NULL, data_len,
			    PROT_READ, MAP_SHARED, desc->cpu_fd, 0);
		ASSERT_NE(data, MAP_FAILED);

		for (int i = desc->meta->meta_struct_len;
		     i < desc->meta->meta_page_size; i += sizeof(int))
			ASSERT_EQ(*(int *)(data + i), 0);

		munmap(data, data_len);
	}
}

FIXTURE(snapshot) {
...