Commit aa1a8ce5 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "New tracing features:

   - The ring buffer is no longer disabled when reading the trace file.

     The trace_pipe file was made to be used for live tracing and
     reading as it acted like the normal producer/consumer. As the trace
     file would not consume the data, the easy way of handling it was to
     just disable writes to the ring buffer.

     This came to a surprise to the BPF folks who complained about lost
     events due to reading. This is no longer an issue. If someone wants
     to keep the old disabling there's a new option "pause-on-trace"
     that can be set.

   - New set_ftrace_notrace_pid file. PIDs in this file will not be
     traced by the function tracer.

     Similar to set_ftrace_pid, which makes the function tracer only
     trace those tasks with PIDs in the file, the set_ftrace_notrace_pid
     does the reverse.

   - New set_event_notrace_pid file. PIDs in this file will cause events
     not to be traced if triggered by a task with a matching PID.

     Similar to the set_event_pid file but will not be traced. Note,
     sched_waking and sched_switch events may still be traced if one of
     the tasks referenced by those events contains a PID that is allowed
     to be traced.

  Tracing related features:

   - New bootconfig option, that is attached to the initrd file.

     If bootconfig is on the command line, then the initrd file is
     searched looking for a bootconfig appended at the end.

   - New GPU tracepoint infrastructure to help the gfx drivers to get
     off debugfs (acked by Greg Kroah-Hartman)

  And other minor updates and fixes"

* tag 'trace-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
  tracing: Do not allocate buffer in trace_find_next_entry() in atomic
  tracing: Add documentation on set_ftrace_notrace_pid and set_event_notrace_pid
  selftests/ftrace: Add test to test new set_event_notrace_pid file
  selftests/ftrace: Add test to test new set_ftrace_notrace_pid file
  tracing: Create set_event_notrace_pid to not trace tasks
  ftrace: Create set_ftrace_notrace_pid to not trace tasks
  ftrace: Make function trace pid filtering a bit more exact
  ftrace/kprobe: Show the maxactive number on kprobe_events
  tracing: Have the document reflect that the trace file keeps tracing enabled
  ring-buffer/tracing: Have iterator acknowledge dropped events
  tracing: Do not disable tracing when reading the trace file
  ring-buffer: Do not disable recording when there is an iterator
  ring-buffer: Make resize disable per cpu buffer instead of total buffer
  ring-buffer: Optimize rb_iter_head_event()
  ring-buffer: Do not die if rb_iter_peek() fails more than thrice
  ring-buffer: Have rb_iter_head_event() handle concurrent writer
  ring-buffer: Add page_stamp to iterator for synchronization
  ring-buffer: Rename ring_buffer_read() to read_buffer_iter_advance()
  ring-buffer: Have ring_buffer_empty() not depend on tracing stopped
  tracing: Save off entry when peeking at next entry
  ...
parents 4c205c84 8e99cf91
......@@ -125,10 +125,13 @@ of ftrace. Here is a list of some of the key files:
trace:
This file holds the output of the trace in a human
readable format (described below). Note, tracing is temporarily
disabled when the file is open for reading. Once all readers
are closed, tracing is re-enabled. Opening this file for
readable format (described below). Opening this file for
writing with the O_TRUNC flag clears the ring buffer content.
Note, this file is not a consumer. If tracing is off
(no tracer running, or tracing_on is zero), it will produce
the same output each time it is read. When tracing is on,
it may produce inconsistent results as it tries to read
the entire buffer without consuming it.
trace_pipe:
......@@ -142,9 +145,7 @@ of ftrace. Here is a list of some of the key files:
will not be read again with a sequential read. The
"trace" file is static, and if the tracer is not
adding more data, it will display the same
information every time it is read. Unlike the
"trace" file, opening this file for reading will not
temporarily disable tracing.
information every time it is read.
trace_options:
......@@ -262,6 +263,20 @@ of ftrace. Here is a list of some of the key files:
traced by the function tracer as well. This option will also
cause PIDs of tasks that exit to be removed from the file.
set_ftrace_notrace_pid:
Have the function tracer ignore threads whose PID are listed in
this file.
If the "function-fork" option is set, then when a task whose
PID is listed in this file forks, the child's PID will
automatically be added to this file, and the child will not be
traced by the function tracer as well. This option will also
cause PIDs of tasks that exit to be removed from the file.
If a PID is in both this file and "set_ftrace_pid", then this
file takes precedence, and the thread will not be traced.
set_event_pid:
Have the events only trace a task with a PID listed in this file.
......@@ -273,6 +288,19 @@ of ftrace. Here is a list of some of the key files:
cause the PIDs of tasks to be removed from this file when the task
exits.
set_event_notrace_pid:
Have the events not trace a task with a PID listed in this file.
Note, sched_switch and sched_wakeup will trace threads not listed
in this file, even if a thread's PID is in the file if the
sched_switch or sched_wakeup events also trace a thread that should
be traced.
To have the PIDs of children of tasks with their PID in this file
added on fork, enable the "event-fork" option. That option will also
cause the PIDs of tasks to be removed from this file when the task
exits.
set_graph_function:
Functions listed in this file will cause the function graph
......@@ -1125,6 +1153,12 @@ Here are the available options:
the trace displays additional information about the
latency, as described in "Latency trace format".
pause-on-trace
When set, opening the trace file for read, will pause
writing to the ring buffer (as if tracing_on was set to zero).
This simulates the original behavior of the trace file.
When the file is closed, tracing will be enabled again.
record-cmd
When any event or tracer is enabled, a hook is enabled
in the sched_switch trace point to fill comm cache
......@@ -1176,6 +1210,8 @@ Here are the available options:
tasks fork. Also, when tasks with PIDs in set_event_pid exit,
their PIDs will be removed from the file.
This affects PIDs listed in set_event_notrace_pid as well.
function-trace
The latency tracers will enable function tracing
if this option is enabled (default it is). When
......@@ -1190,6 +1226,8 @@ Here are the available options:
set_ftrace_pid exit, their PIDs will be removed from the
file.
This affects PIDs in set_ftrace_notrace_pid as well.
display-graph
When set, the latency tracers (irqsoff, wakeup, etc) will
use function graph tracing instead of function tracing.
......@@ -2126,6 +2164,8 @@ periodically make a CPU constantly busy with interrupts disabled.
# cat trace
# tracer: hwlat
#
# entries-in-buffer/entries-written: 13/13 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
......@@ -2133,12 +2173,18 @@ periodically make a CPU constantly busy with interrupts disabled.
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<...>-3638 [001] d... 19452.055471: #1 inner/outer(us): 12/14 ts:1499801089.066141940
<...>-3638 [003] d... 19454.071354: #2 inner/outer(us): 11/9 ts:1499801091.082164365
<...>-3638 [002] dn.. 19461.126852: #3 inner/outer(us): 12/9 ts:1499801098.138150062
<...>-3638 [001] d... 19488.340960: #4 inner/outer(us): 8/12 ts:1499801125.354139633
<...>-3638 [003] d... 19494.388553: #5 inner/outer(us): 8/12 ts:1499801131.402150961
<...>-3638 [003] d... 19501.283419: #6 inner/outer(us): 0/12 ts:1499801138.297435289 nmi-total:4 nmi-count:1
<...>-1729 [001] d... 678.473449: #1 inner/outer(us): 11/12 ts:1581527483.343962693 count:6
<...>-1729 [004] d... 689.556542: #2 inner/outer(us): 16/9 ts:1581527494.889008092 count:1
<...>-1729 [005] d... 714.756290: #3 inner/outer(us): 16/16 ts:1581527519.678961629 count:5
<...>-1729 [001] d... 718.788247: #4 inner/outer(us): 9/17 ts:1581527523.889012713 count:1
<...>-1729 [002] d... 719.796341: #5 inner/outer(us): 13/9 ts:1581527524.912872606 count:1
<...>-1729 [006] d... 844.787091: #6 inner/outer(us): 9/12 ts:1581527649.889048502 count:2
<...>-1729 [003] d... 849.827033: #7 inner/outer(us): 18/9 ts:1581527654.889013793 count:1
<...>-1729 [007] d... 853.859002: #8 inner/outer(us): 9/12 ts:1581527658.889065736 count:1
<...>-1729 [001] d... 855.874978: #9 inner/outer(us): 9/11 ts:1581527660.861991877 count:1
<...>-1729 [001] d... 863.938932: #10 inner/outer(us): 9/11 ts:1581527668.970010500 count:1 nmi-total:7 nmi-count:1
<...>-1729 [007] d... 878.050780: #11 inner/outer(us): 9/12 ts:1581527683.385002600 count:1 nmi-total:5 nmi-count:1
<...>-1729 [007] d... 886.114702: #12 inner/outer(us): 9/12 ts:1581527691.385001600 count:1
The above output is somewhat the same in the header. All events will have
......@@ -2148,7 +2194,7 @@ interrupts disabled 'd'. Under the FUNCTION title there is:
This is the count of events recorded that were greater than the
tracing_threshold (See below).
inner/outer(us): 12/14
inner/outer(us): 11/11
This shows two numbers as "inner latency" and "outer latency". The test
runs in a loop checking a timestamp twice. The latency detected within
......@@ -2156,11 +2202,15 @@ interrupts disabled 'd'. Under the FUNCTION title there is:
after the previous timestamp and the next timestamp in the loop is
the "outer latency".
ts:1499801089.066141940
ts:1581527483.343962693
The absolute timestamp that the first latency was recorded in the window.
count:6
The absolute timestamp that the event happened.
The number of times a latency was detected during the window.
nmi-total:4 nmi-count:1
nmi-total:7 nmi-count:1
On architectures that support it, if an NMI comes in during the
test, the time spent in NMI is reported in "nmi-total" (in
......
......@@ -200,6 +200,8 @@ source "drivers/thunderbolt/Kconfig"
source "drivers/android/Kconfig"
source "drivers/gpu/trace/Kconfig"
source "drivers/nvdimm/Kconfig"
source "drivers/dax/Kconfig"
......
......@@ -5,3 +5,4 @@
obj-$(CONFIG_TEGRA_HOST1X) += host1x/
obj-y += drm/ vga/
obj-$(CONFIG_IMX_IPUV3_CORE) += ipu-v3/
obj-$(CONFIG_TRACE_GPU_MEM) += trace/
# SPDX-License-Identifier: GPL-2.0-only
config TRACE_GPU_MEM
bool
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_TRACE_GPU_MEM) += trace_gpu_mem.o
// SPDX-License-Identifier: GPL-2.0
/*
* GPU memory trace points
*
* Copyright (C) 2020 Google, Inc.
*/
#include <linux/module.h>
#define CREATE_TRACE_POINTS
#include <trace/events/gpu_mem.h>
EXPORT_TRACEPOINT_SYMBOL(gpu_mem_total);
......@@ -216,7 +216,8 @@ static inline int __init xbc_node_compose_key(struct xbc_node *node,
}
/* XBC node initializer */
int __init xbc_init(char *buf);
int __init xbc_init(char *buf, const char **emsg, int *epos);
/* XBC cleanup data structures */
void __init xbc_destroy_all(void);
......
......@@ -135,10 +135,10 @@ void ring_buffer_read_finish(struct ring_buffer_iter *iter);
struct ring_buffer_event *
ring_buffer_iter_peek(struct ring_buffer_iter *iter, u64 *ts);
struct ring_buffer_event *
ring_buffer_read(struct ring_buffer_iter *iter, u64 *ts);
void ring_buffer_iter_advance(struct ring_buffer_iter *iter);
void ring_buffer_iter_reset(struct ring_buffer_iter *iter);
int ring_buffer_iter_empty(struct ring_buffer_iter *iter);
bool ring_buffer_iter_dropped(struct ring_buffer_iter *iter);
unsigned long ring_buffer_size(struct trace_buffer *buffer, int cpu);
......
......@@ -85,6 +85,8 @@ struct trace_iterator {
struct mutex mutex;
struct ring_buffer_iter **buffer_iter;
unsigned long iter_flags;
void *temp; /* temp holder */
unsigned int temp_size;
/* trace_seq for __print_flags() and __print_symbolic() etc. */
struct trace_seq tmp_seq;
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* GPU memory trace points
*
* Copyright (C) 2020 Google, Inc.
*/
#undef TRACE_SYSTEM
#define TRACE_SYSTEM gpu_mem
#if !defined(_TRACE_GPU_MEM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_GPU_MEM_H
#include <linux/tracepoint.h>
/*
* The gpu_memory_total event indicates that there's an update to either the
* global or process total gpu memory counters.
*
* This event should be emitted whenever the kernel device driver allocates,
* frees, imports, unimports memory in the GPU addressable space.
*
* @gpu_id: This is the gpu id.
*
* @pid: Put 0 for global total, while positive pid for process total.
*
* @size: Virtual size of the allocation in bytes.
*
*/
TRACE_EVENT(gpu_mem_total,
TP_PROTO(uint32_t gpu_id, uint32_t pid, uint64_t size),
TP_ARGS(gpu_id, pid, size),
TP_STRUCT__entry(
__field(uint32_t, gpu_id)
__field(uint32_t, pid)
__field(uint64_t, size)
),
TP_fast_assign(
__entry->gpu_id = gpu_id;
__entry->pid = pid;
__entry->size = size;
),
TP_printk("gpu_id=%u pid=%u size=%llu",
__entry->gpu_id,
__entry->pid,
__entry->size)
);
#endif /* _TRACE_GPU_MEM_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
......@@ -353,6 +353,8 @@ static int __init bootconfig_params(char *param, char *val,
static void __init setup_boot_config(const char *cmdline)
{
static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
const char *msg;
int pos;
u32 size, csum;
char *data, *copy;
u32 *hdr;
......@@ -400,10 +402,14 @@ static void __init setup_boot_config(const char *cmdline)
memcpy(copy, data, size);
copy[size] = '\0';
ret = xbc_init(copy);
if (ret < 0)
pr_err("Failed to parse bootconfig\n");
else {
ret = xbc_init(copy, &msg, &pos);
if (ret < 0) {
if (pos < 0)
pr_err("Failed to init bootconfig: %s.\n", msg);
else
pr_err("Failed to parse bootconfig: %s at %d.\n",
msg, pos);
} else {
pr_info("Load bootconfig: %d bytes %d nodes\n", size, ret);
/* keys starting with "kernel." are passed via cmdline */
extra_command_line = xbc_make_cmdline("kernel");
......
This diff is collapsed.
This diff is collapsed.
......@@ -386,16 +386,22 @@ trace_find_filtered_pid(struct trace_pid_list *filtered_pids, pid_t search_pid)
* Returns false if @task should be traced.
*/
bool
trace_ignore_this_task(struct trace_pid_list *filtered_pids, struct task_struct *task)
trace_ignore_this_task(struct trace_pid_list *filtered_pids,
struct trace_pid_list *filtered_no_pids,
struct task_struct *task)
{
/*
* Return false, because if filtered_pids does not exist,
* all pids are good to trace.
* If filterd_no_pids is not empty, and the task's pid is listed
* in filtered_no_pids, then return true.
* Otherwise, if filtered_pids is empty, that means we can
* trace all tasks. If it has content, then only trace pids
* within filtered_pids.
*/
if (!filtered_pids)
return false;
return !trace_find_filtered_pid(filtered_pids, task->pid);
return (filtered_pids &&
!trace_find_filtered_pid(filtered_pids, task->pid)) ||
(filtered_no_pids &&
trace_find_filtered_pid(filtered_no_pids, task->pid));
}
/**
......@@ -3378,7 +3384,7 @@ static void trace_iterator_increment(struct trace_iterator *iter)
iter->idx++;
if (buf_iter)
ring_buffer_read(buf_iter, NULL);
ring_buffer_iter_advance(buf_iter);
}
static struct trace_entry *
......@@ -3388,11 +3394,15 @@ peek_next_entry(struct trace_iterator *iter, int cpu, u64 *ts,
struct ring_buffer_event *event;
struct ring_buffer_iter *buf_iter = trace_buffer_iter(iter, cpu);
if (buf_iter)
if (buf_iter) {
event = ring_buffer_iter_peek(buf_iter, ts);
else
if (lost_events)
*lost_events = ring_buffer_iter_dropped(buf_iter) ?
(unsigned long)-1 : 0;
} else {
event = ring_buffer_peek(iter->array_buffer->buffer, cpu, ts,
lost_events);
}
if (event) {
iter->ent_size = ring_buffer_event_length(event);
......@@ -3462,11 +3472,51 @@ __find_next_entry(struct trace_iterator *iter, int *ent_cpu,
return next;
}
#define STATIC_TEMP_BUF_SIZE 128
static char static_temp_buf[STATIC_TEMP_BUF_SIZE];
/* Find the next real entry, without updating the iterator itself */
struct trace_entry *trace_find_next_entry(struct trace_iterator *iter,
int *ent_cpu, u64 *ent_ts)
{
return __find_next_entry(iter, ent_cpu, NULL, ent_ts);
/* __find_next_entry will reset ent_size */
int ent_size = iter->ent_size;
struct trace_entry *entry;
/*
* If called from ftrace_dump(), then the iter->temp buffer
* will be the static_temp_buf and not created from kmalloc.
* If the entry size is greater than the buffer, we can
* not save it. Just return NULL in that case. This is only
* used to add markers when two consecutive events' time
* stamps have a large delta. See trace_print_lat_context()
*/
if (iter->temp == static_temp_buf &&
STATIC_TEMP_BUF_SIZE < ent_size)
return NULL;
/*
* The __find_next_entry() may call peek_next_entry(), which may
* call ring_buffer_peek() that may make the contents of iter->ent
* undefined. Need to copy iter->ent now.
*/
if (iter->ent && iter->ent != iter->temp) {
if ((!iter->temp || iter->temp_size < iter->ent_size) &&
!WARN_ON_ONCE(iter->temp == static_temp_buf)) {
kfree(iter->temp);
iter->temp = kmalloc(iter->ent_size, GFP_KERNEL);
if (!iter->temp)
return NULL;
}
memcpy(iter->temp, iter->ent, iter->ent_size);
iter->temp_size = iter->ent_size;
iter->ent = iter->temp;
}
entry = __find_next_entry(iter, ent_cpu, NULL, ent_ts);
/* Put back the original ent_size */
iter->ent_size = ent_size;
return entry;
}
/* Find the next real entry, and increment the iterator to the next entry */
......@@ -3538,7 +3588,7 @@ void tracing_iter_reset(struct trace_iterator *iter, int cpu)
if (ts >= iter->array_buffer->time_start)
break;
entries++;
ring_buffer_read(buf_iter, NULL);
ring_buffer_iter_advance(buf_iter);
}
per_cpu_ptr(iter->array_buffer->data, cpu)->skipped_entries = entries;
......@@ -3981,8 +4031,12 @@ enum print_line_t print_trace_line(struct trace_iterator *iter)
enum print_line_t ret;
if (iter->lost_events) {
trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
iter->cpu, iter->lost_events);
if (iter->lost_events == (unsigned long)-1)
trace_seq_printf(&iter->seq, "CPU:%d [LOST EVENTS]\n",
iter->cpu);
else
trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
iter->cpu, iter->lost_events);
if (trace_seq_has_overflowed(&iter->seq))
return TRACE_TYPE_PARTIAL_LINE;
}
......@@ -4197,6 +4251,18 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
if (!iter->buffer_iter)
goto release;
/*
* trace_find_next_entry() may need to save off iter->ent.
* It will place it into the iter->temp buffer. As most
* events are less than 128, allocate a buffer of that size.
* If one is greater, then trace_find_next_entry() will
* allocate a new buffer to adjust for the bigger iter->ent.
* It's not critical if it fails to get allocated here.
*/
iter->temp = kmalloc(128, GFP_KERNEL);
if (iter->temp)
iter->temp_size = 128;
/*
* We make a copy of the current tracer to avoid concurrent
* changes on it while we are reading.
......@@ -4237,8 +4303,11 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
if (trace_clocks[tr->clock_id].in_ns)
iter->iter_flags |= TRACE_FILE_TIME_IN_NS;
/* stop the trace while dumping if we are not opening "snapshot" */
if (!iter->snapshot)
/*
* If pause-on-trace is enabled, then stop the trace while
* dumping, unless this is the "snapshot" file
*/
if (!iter->snapshot && (tr->trace_flags & TRACE_ITER_PAUSE_ON_TRACE))
tracing_stop_tr(tr);
if (iter->cpu_file == RING_BUFFER_ALL_CPUS) {
......@@ -4269,6 +4338,7 @@ __tracing_open(struct inode *inode, struct file *file, bool snapshot)
fail:
mutex_unlock(&trace_types_lock);
kfree(iter->trace);
kfree(iter->temp);
kfree(iter->buffer_iter);
release:
seq_release_private(inode, file);
......@@ -4334,7 +4404,7 @@ static int tracing_release(struct inode *inode, struct file *file)
if (iter->trace && iter->trace->close)
iter->trace->close(iter);
if (!iter->snapshot)
if (!iter->snapshot && tr->stop_count)
/* reenable tracing if it was previously enabled */
tracing_start_tr(tr);
......@@ -4344,6 +4414,7 @@ static int tracing_release(struct inode *inode, struct file *file)
mutex_destroy(&iter->mutex);
free_cpumask_var(iter->started);
kfree(iter->temp);
kfree(iter->trace);
kfree(iter->buffer_iter);
seq_release_private(inode, file);
......@@ -4964,6 +5035,8 @@ static const char readme_msg[] =
#ifdef CONFIG_FUNCTION_TRACER
" set_ftrace_pid\t- Write pid(s) to only function trace those pids\n"
"\t\t (function)\n"
" set_ftrace_notrace_pid\t- Write pid(s) to not function trace those pids\n"
"\t\t (function)\n"
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
" set_graph_function\t- Trace the nested calls of a function (function_graph)\n"
......@@ -9146,6 +9219,9 @@ void ftrace_dump(enum ftrace_dump_mode oops_dump_mode)
/* Simulate the iterator */
trace_init_global_iter(&iter);
/* Can not use kmalloc for iter.temp */
iter.temp = static_temp_buf;
iter.temp_size = STATIC_TEMP_BUF_SIZE;
for_each_tracing_cpu(cpu) {
atomic_inc(&per_cpu_ptr(iter.array_buffer->data, cpu)->disabled);
......@@ -9334,7 +9410,7 @@ __init static int tracer_alloc_buffers(void)
goto out_free_buffer_mask;
/* Only allocate trace_printk buffers if a trace_printk exists */
if (__stop___trace_bprintk_fmt != __start___trace_bprintk_fmt)
if (&__stop___trace_bprintk_fmt != &__start___trace_bprintk_fmt)
/* Must be called before global_trace.buffer is allocated */
trace_printk_init_buffers();
......
......@@ -178,10 +178,10 @@ struct trace_array_cpu {
kuid_t uid;
char comm[TASK_COMM_LEN];
bool ignore_pid;
#ifdef CONFIG_FUNCTION_TRACER
bool ftrace_ignore_pid;
int ftrace_ignore_pid;
#endif
bool ignore_pid;
};
struct tracer;
......@@ -207,6 +207,30 @@ struct trace_pid_list {
unsigned long *pids;
};
enum {
TRACE_PIDS = BIT(0),
TRACE_NO_PIDS = BIT(1),
};
static inline bool pid_type_enabled(int type, struct trace_pid_list *pid_list,
struct trace_pid_list *no_pid_list)
{
/* Return true if the pid list in type has pids */
return ((type & TRACE_PIDS) && pid_list) ||
((type & TRACE_NO_PIDS) && no_pid_list);
}
static inline bool still_need_pid_events(int type, struct trace_pid_list *pid_list,
struct trace_pid_list *no_pid_list)
{
/*
* Turning off what is in @type, return true if the "other"
* pid list, still has pids in it.
*/
return (!(type & TRACE_PIDS) && pid_list) ||
(!(type & TRACE_NO_PIDS) && no_pid_list);
}
typedef bool (*cond_update_fn_t)(struct trace_array *tr, void *cond_data);
/**
......@@ -285,6 +309,7 @@ struct trace_array {
#endif
#endif
struct trace_pid_list __rcu *filtered_pids;
struct trace_pid_list __rcu *filtered_no_pids;
/*
* max_lock is used to protect the swapping of buffers
* when taking a max snapshot. The buffers themselves are
......@@ -331,6 +356,7 @@ struct trace_array {
#ifdef CONFIG_FUNCTION_TRACER
struct ftrace_ops *ops;
struct trace_pid_list __rcu *function_pids;
struct trace_pid_list __rcu *function_no_pids;
#ifdef CONFIG_DYNAMIC_FTRACE
/* All of these are protected by the ftrace_lock */
struct list_head func_probes;
......@@ -557,12 +583,7 @@ struct tracer {
* caller, and we can skip the current check.
*/
enum {
TRACE_BUFFER_BIT,
TRACE_BUFFER_NMI_BIT,
TRACE_BUFFER_IRQ_BIT,
TRACE_BUFFER_SIRQ_BIT,
/* Start of function recursion bits */
/* Function recursion bits */
TRACE_FTRACE_BIT,
TRACE_FTRACE_NMI_BIT,
TRACE_FTRACE_IRQ_BIT,
......@@ -787,6 +808,7 @@ extern int pid_max;
bool trace_find_filtered_pid(struct trace_pid_list *filtered_pids,
pid_t search_pid);
bool trace_ignore_this_task(struct trace_pid_list *filtered_pids,
struct trace_pid_list *filtered_no_pids,
struct task_struct *task);
void trace_filter_add_remove_task(struct trace_pid_list *pid_list,
struct task_struct *self,
......@@ -1307,6 +1329,7 @@ extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
C(IRQ_INFO, "irq-info"), \
C(MARKERS, "markers"), \
C(EVENT_FORK, "event-fork"), \
C(PAUSE_ON_TRACE, "pause-on-trace"), \
FUNCTION_FLAGS \
FGRAPH_FLAGS \
STACK_FLAGS \
......
......@@ -325,14 +325,16 @@ FTRACE_ENTRY(hwlat, hwlat_entry,
__field_desc( long, timestamp, tv_nsec )
__field( unsigned int, nmi_count )
__field( unsigned int, seqnum )
__field( unsigned int, count )
),
F_printk("cnt:%u\tts:%010llu.%010lu\tinner:%llu\touter:%llu\tnmi-ts:%llu\tnmi-count:%u\n",
F_printk("cnt:%u\tts:%010llu.%010lu\tinner:%llu\touter:%llu\tcount:%d\tnmi-ts:%llu\tnmi-count:%u\n",
__entry->seqnum,
__entry->tv_sec,
__entry->tv_nsec,
__entry->duration,
__entry->outer_duration,
__entry->count,
__entry->nmi_total_ts,
__entry->nmi_count)
);
This diff is collapsed.
......@@ -482,7 +482,7 @@ get_return_for_leaf(struct trace_iterator *iter,
/* this is a leaf, now advance the iterator */
if (ring_iter)
ring_buffer_read(ring_iter, NULL);
ring_buffer_iter_advance(ring_iter);
return next;
}
......
......@@ -83,6 +83,7 @@ struct hwlat_sample {
u64 nmi_total_ts; /* Total time spent in NMIs */
struct timespec64 timestamp; /* wall time */
int nmi_count; /* # NMIs during this sample */
int count; /* # of iteratons over threash */
};
/* keep the global state somewhere. */
......@@ -124,6 +125,7 @@ static void trace_hwlat_sample(struct hwlat_sample *sample)
entry->timestamp = sample->timestamp;
entry->nmi_total_ts = sample->nmi_total_ts;
entry->nmi_count = sample->nmi_count;
entry->count = sample->count;
if (!call_filter_check_discard(call, entry, buffer, event))
trace_buffer_unlock_commit_nostack(buffer, event);
......@@ -167,12 +169,14 @@ void trace_hwlat_callback(bool enter)
static int get_sample(void)
{
struct trace_array *tr = hwlat_trace;
struct hwlat_sample s;
time_type start, t1, t2, last_t2;
s64 diff, total, last_total = 0;
s64 diff, outer_diff, total, last_total = 0;
u64 sample = 0;
u64 thresh = tracing_thresh;
u64 outer_sample = 0;
int ret = -1;
unsigned int count = 0;
do_div(thresh, NSEC_PER_USEC); /* modifies interval value */
......@@ -186,6 +190,7 @@ static int get_sample(void)
init_time(last_t2, 0);
start = time_get(); /* start timestamp */
outer_diff = 0;
do {
......@@ -194,14 +199,14 @@ static int get_sample(void)
if (time_u64(last_t2)) {
/* Check the delta from outer loop (t2 to next t1) */
diff = time_to_us(time_sub(t1, last_t2));
outer_diff = time_to_us(time_sub(t1, last_t2));
/* This shouldn't happen */
if (diff < 0) {
if (outer_diff < 0) {
pr_err(BANNER "time running backwards\n");
goto out;
}
if (diff > outer_sample)
outer_sample = diff;
if (outer_diff > outer_sample)
outer_sample = outer_diff;
}
last_t2 = t2;
......@@ -217,6 +222,12 @@ static int get_sample(void)
/* This checks the inner loop (t1 to t2) */
diff = time_to_us(time_sub(t2, t1)); /* current diff */
if (diff > thresh || outer_diff > thresh) {
if (!count)
ktime_get_real_ts64(&s.timestamp);
count++;
}
/* This shouldn't happen */
if (diff < 0) {
pr_err(BANNER "time running backwards\n");
......@@ -236,7 +247,6 @@ static int get_sample(void)
/* If we exceed the threshold value, we have found a hardware latency */
if (sample > thresh || outer_sample > thresh) {
struct hwlat_sample s;
u64 latency;
ret = 1;
......@@ -249,9 +259,9 @@ static int get_sample(void)
s.seqnum = hwlat_data.count;
s.duration = sample;
s.outer_duration = outer_sample;
ktime_get_real_ts64(&s.timestamp);
s.nmi_total_ts = nmi_total_ts;
s.nmi_count = nmi_count;
s.count = count;
trace_hwlat_sample(&s);
latency = max(sample, outer_sample);
......
......@@ -1078,6 +1078,8 @@ static int trace_kprobe_show(struct seq_file *m, struct dyn_event *ev)
int i;
seq_putc(m, trace_kprobe_is_return(tk) ? 'r' : 'p');
if (trace_kprobe_is_return(tk) && tk->rp.maxactive)
seq_printf(m, "%d", tk->rp.maxactive);
seq_printf(m, ":%s/%s", trace_probe_group_name(&tk->tp),
trace_probe_name(&tk->tp));
......
......@@ -617,22 +617,19 @@ int trace_print_context(struct trace_iterator *iter)
int trace_print_lat_context(struct trace_iterator *iter)
{
struct trace_entry *entry, *next_entry;
struct trace_array *tr = iter->tr;
/* trace_find_next_entry will reset ent_size */
int ent_size = iter->ent_size;
struct trace_seq *s = &iter->seq;
u64 next_ts;
struct trace_entry *entry = iter->ent,
*next_entry = trace_find_next_entry(iter, NULL,
&next_ts);
unsigned long verbose = (tr->trace_flags & TRACE_ITER_VERBOSE);
u64 next_ts;
/* Restore the original ent_size */
iter->ent_size = ent_size;
next_entry = trace_find_next_entry(iter, NULL, &next_ts);
if (!next_entry)
next_ts = iter->ts;
/* trace_find_next_entry() may change iter->ent */
entry = iter->ent;
if (verbose) {
char comm[TASK_COMM_LEN];
......@@ -1158,12 +1155,12 @@ trace_hwlat_print(struct trace_iterator *iter, int flags,
trace_assign_type(field, entry);
trace_seq_printf(s, "#%-5u inner/outer(us): %4llu/%-5llu ts:%lld.%09ld",
trace_seq_printf(s, "#%-5u inner/outer(us): %4llu/%-5llu ts:%lld.%09ld count:%d",
field->seqnum,
field->duration,
field->outer_duration,
(long long)field->timestamp.tv_sec,
field->timestamp.tv_nsec);
field->timestamp.tv_nsec, field->count);
if (field->nmi_count) {
/*
......
......@@ -29,12 +29,14 @@ static int xbc_node_num __initdata;
static char *xbc_data __initdata;
static size_t xbc_data_size __initdata;
static struct xbc_node *last_parent __initdata;
static const char *xbc_err_msg __initdata;
static int xbc_err_pos __initdata;
static int __init xbc_parse_error(const char *msg, const char *p)
{
int pos = p - xbc_data;
xbc_err_msg = msg;
xbc_err_pos = (int)(p - xbc_data);
pr_err("Parse error at pos %d: %s\n", pos, msg);
return -EINVAL;
}
......@@ -738,33 +740,44 @@ void __init xbc_destroy_all(void)
/**
* xbc_init() - Parse given XBC file and build XBC internal tree
* @buf: boot config text
* @emsg: A pointer of const char * to store the error message
* @epos: A pointer of int to store the error position
*
* This parses the boot config text in @buf. @buf must be a
* null terminated string and smaller than XBC_DATA_MAX.
* Return the number of stored nodes (>0) if succeeded, or -errno
* if there is any error.
* In error cases, @emsg will be updated with an error message and
* @epos will be updated with the error position which is the byte offset
* of @buf. If the error is not a parser error, @epos will be -1.
*/
int __init xbc_init(char *buf)
int __init xbc_init(char *buf, const char **emsg, int *epos)
{
char *p, *q;
int ret, c;
if (epos)
*epos = -1;
if (xbc_data) {
pr_err("Error: bootconfig is already initialized.\n");
if (emsg)
*emsg = "Bootconfig is already initialized";
return -EBUSY;
}
ret = strlen(buf);
if (ret > XBC_DATA_MAX - 1 || ret == 0) {
pr_err("Error: Config data is %s.\n",
ret ? "too big" : "empty");
if (emsg)
*emsg = ret ? "Config data is too big" :
"Config data is empty";
return -ERANGE;
}
xbc_nodes = memblock_alloc(sizeof(struct xbc_node) * XBC_NODE_MAX,
SMP_CACHE_BYTES);
if (!xbc_nodes) {
pr_err("Failed to allocate memory for bootconfig nodes.\n");
if (emsg)
*emsg = "Failed to allocate bootconfig nodes";
return -ENOMEM;
}
memset(xbc_nodes, 0, sizeof(struct xbc_node) * XBC_NODE_MAX);
......@@ -814,9 +827,13 @@ int __init xbc_init(char *buf)
if (!ret)
ret = xbc_verify_tree();
if (ret < 0)
if (ret < 0) {
if (epos)
*epos = xbc_err_pos;
if (emsg)
*emsg = xbc_err_msg;
xbc_destroy_all();
else
} else
ret = xbc_node_num;
return ret;
......
# SPDX-License-Identifier: GPL-2.0
# Makefile for bootconfig command
include ../scripts/Makefile.include
bindir ?= /usr/bin
HEADER = include/linux/bootconfig.h
CFLAGS = -Wall -g -I./include
ifeq ($(srctree),)
srctree := $(patsubst %/,%,$(dir $(CURDIR)))
srctree := $(patsubst %/,%,$(dir $(srctree)))
endif
PROGS = bootconfig
LIBSRC = $(srctree)/lib/bootconfig.c $(srctree)/include/linux/bootconfig.h
CFLAGS = -Wall -g -I$(CURDIR)/include
all: $(PROGS)
ALL_TARGETS := bootconfig
ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
bootconfig: ../../lib/bootconfig.c main.c $(HEADER)
all: $(ALL_PROGRAMS)
$(OUTPUT)bootconfig: main.c $(LIBSRC)
$(CC) $(filter %.c,$^) $(CFLAGS) -o $@
install: $(PROGS)
install bootconfig $(DESTDIR)$(bindir)
test: $(ALL_PROGRAMS) test-bootconfig.sh
./test-bootconfig.sh $(OUTPUT)
test: bootconfig
./test-bootconfig.sh
install: $(ALL_PROGRAMS)
install $(OUTPUT)bootconfig $(DESTDIR)$(bindir)
clean:
$(RM) -f *.o bootconfig
$(RM) -f $(OUTPUT)*.o $(ALL_PROGRAMS)
......@@ -130,6 +130,7 @@ int load_xbc_from_initrd(int fd, char **buf)
int ret;
u32 size = 0, csum = 0, rcsum;
char magic[BOOTCONFIG_MAGIC_LEN];
const char *msg;
ret = fstat(fd, &stat);
if (ret < 0)
......@@ -182,10 +183,12 @@ int load_xbc_from_initrd(int fd, char **buf)
return -EINVAL;
}
ret = xbc_init(*buf);
ret = xbc_init(*buf, &msg, NULL);
/* Wrong data */
if (ret < 0)
if (ret < 0) {
pr_err("parse error: %s.\n", msg);
return ret;
}
return size;
}
......@@ -244,11 +247,34 @@ int delete_xbc(const char *path)
return ret;
}
static void show_xbc_error(const char *data, const char *msg, int pos)
{
int lin = 1, col, i;
if (pos < 0) {
pr_err("Error: %s.\n", msg);
return;
}
/* Note that pos starts from 0 but lin and col should start from 1. */
col = pos + 1;
for (i = 0; i < pos; i++) {
if (data[i] == '\n') {
lin++;
col = pos - i;
}
}
pr_err("Parse Error: %s at %d:%d\n", msg, lin, col);
}
int apply_xbc(const char *path, const char *xbc_path)
{
u32 size, csum;
char *buf, *data;
int ret, fd;
const char *msg;
int pos;
ret = load_xbc_file(xbc_path, &buf);
if (ret < 0) {
......@@ -267,11 +293,12 @@ int apply_xbc(const char *path, const char *xbc_path)
*(u32 *)(data + size + 4) = csum;
/* Check the data format */
ret = xbc_init(buf);
ret = xbc_init(buf, &msg, &pos);
if (ret < 0) {
pr_err("Failed to parse %s: %d\n", xbc_path, ret);
show_xbc_error(data, msg, pos);
free(data);
free(buf);
return ret;
}
printf("Apply %s to %s\n", xbc_path, path);
......
......@@ -3,9 +3,16 @@
echo "Boot config test script"
BOOTCONF=./bootconfig
INITRD=`mktemp initrd-XXXX`
TEMPCONF=`mktemp temp-XXXX.bconf`
if [ -d "$1" ]; then
TESTDIR=$1
else
TESTDIR=.
fi
BOOTCONF=${TESTDIR}/bootconfig
INITRD=`mktemp ${TESTDIR}/initrd-XXXX`
TEMPCONF=`mktemp ${TESTDIR}/temp-XXXX.bconf`
OUTFILE=`mktemp ${TESTDIR}/tempout-XXXX`
NG=0
cleanup() {
......@@ -65,7 +72,6 @@ new_size=$(stat -c %s $INITRD)
xpass test $new_size -eq $initrd_size
echo "No error messge while applying"
OUTFILE=`mktemp tempout-XXXX`
dd if=/dev/zero of=$INITRD bs=4096 count=1
printf " \0\0\0 \0\0\0" >> $INITRD
$BOOTCONF -a $TEMPCONF $INITRD > $OUTFILE 2>&1
......
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: event tracing - restricts events based on pid notrace filtering
# flags: instance
do_reset() {
echo > set_event
echo > set_event_pid
echo > set_event_notrace_pid
echo 0 > options/event-fork
echo 0 > events/enable
clear_trace
echo 1 > tracing_on
}
fail() { #msg
cat trace
do_reset
echo $1
exit_fail
}
count_pid() {
pid=$@
cat trace | grep -v '^#' | sed -e 's/[^-]*-\([0-9]*\).*/\1/' | grep $pid | wc -l
}
count_no_pid() {
pid=$1
cat trace | grep -v '^#' | sed -e 's/[^-]*-\([0-9]*\).*/\1/' | grep -v $pid | wc -l
}
enable_system() {
system=$1
if [ -d events/$system ]; then
echo 1 > events/$system/enable
fi
}
enable_events() {
echo 0 > tracing_on
# Enable common groups of events, as all events can allow for
# events to be traced via scheduling that we don't care to test.
enable_system syscalls
enable_system rcu
enable_system block
enable_system exceptions
enable_system irq
enable_system net
enable_system power
enable_system signal
enable_system sock
enable_system timer
enable_system thermal
echo 1 > tracing_on
}
if [ ! -f set_event -o ! -d events/sched ]; then
echo "event tracing is not supported"
exit_unsupported
fi
if [ ! -f set_event_pid -o ! -f set_event_notrace_pid ]; then
echo "event pid notrace filtering is not supported"
exit_unsupported
fi
echo 0 > options/event-fork
do_reset
read mypid rest < /proc/self/stat
echo $mypid > set_event_notrace_pid
grep -q $mypid set_event_notrace_pid
enable_events
yield
echo 0 > tracing_on
cnt=`count_pid $mypid`
if [ $cnt -ne 0 ]; then
fail "Filtered out task has events"
fi
cnt=`count_no_pid $mypid`
if [ $cnt -eq 0 ]; then
fail "No other events were recorded"
fi
do_reset
echo $mypid > set_event_notrace_pid
echo 1 > options/event-fork
enable_events
yield &
child=$!
echo "child = $child"
wait $child
echo 0 > tracing_on
cnt=`count_pid $mypid`
if [ $cnt -ne 0 ]; then
fail "Filtered out task has events"
fi
cnt=`count_pid $child`
if [ $cnt -ne 0 ]; then
fail "Child of filtered out taskhas events"
fi
cnt=`count_no_pid $mypid`
if [ $cnt -eq 0 ]; then
fail "No other events were recorded"
fi
do_reset
exit 0
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# description: ftrace - function pid notrace filters
# flags: instance
# Make sure that function pid matching filter with notrace works.
if ! grep -q function available_tracers; then
echo "no function tracer configured"
exit_unsupported
fi
if [ ! -f set_ftrace_notrace_pid ]; then
echo "set_ftrace_notrace_pid not found? Is function tracer not set?"
exit_unsupported
fi
if [ ! -f set_ftrace_filter ]; then
echo "set_ftrace_filter not found? Is function tracer not set?"
exit_unsupported
fi
do_function_fork=1
if [ ! -f options/function-fork ]; then
do_function_fork=0
echo "no option for function-fork found. Option will not be tested."
fi
read PID _ < /proc/self/stat
if [ $do_function_fork -eq 1 ]; then
# default value of function-fork option
orig_value=`grep function-fork trace_options`
fi
do_reset() {
if [ $do_function_fork -eq 0 ]; then
return
fi
echo > set_ftrace_notrace_pid
echo $orig_value > trace_options
}
fail() { # msg
do_reset
echo $1
exit_fail
}
do_test() {
disable_tracing
echo do_execve* > set_ftrace_filter
echo *do_fork >> set_ftrace_filter
echo $PID > set_ftrace_notrace_pid
echo function > current_tracer
if [ $do_function_fork -eq 1 ]; then
# don't allow children to be traced
echo nofunction-fork > trace_options
fi
enable_tracing
yield
count_pid=`cat trace | grep -v ^# | grep $PID | wc -l`
count_other=`cat trace | grep -v ^# | grep -v $PID | wc -l`
# count_pid should be 0
if [ $count_pid -ne 0 -o $count_other -eq 0 ]; then
fail "PID filtering not working? traced task = $count_pid; other tasks = $count_other "
fi
disable_tracing
clear_trace
if [ $do_function_fork -eq 0 ]; then
return
fi
# allow children to be traced
echo function-fork > trace_options
# With pid in both set_ftrace_pid and set_ftrace_notrace_pid
# there should not be any tasks traced.
echo $PID > set_ftrace_pid
enable_tracing
yield
count_pid=`cat trace | grep -v ^# | grep $PID | wc -l`
count_other=`cat trace | grep -v ^# | grep -v $PID | wc -l`
# both should be zero
if [ $count_pid -ne 0 -o $count_other -ne 0 ]; then
fail "PID filtering not following fork? traced task = $count_pid; other tasks = $count_other "
fi
}
do_test
do_reset
exit 0
......@@ -41,7 +41,7 @@ fi
echo '** ENABLE EVENTS'
echo 1 > events/enable
echo 1 > events/sched/enable
echo '** ENABLE TRACING'
enable_tracing
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment