Commit 54d5f16e authored by Ingo Molnar's avatar Ingo Molnar

Merge branch 'for-mingo' of...

Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu

Pull RCU changes from Paul E. McKenney:

 - Documentation updates.  Just some simple changes, no design-level
   additions.

 - Miscellaneous fixes.

 - Torture-test updates.
Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
parents 4c2e07c6 4d03754f
......@@ -2391,6 +2391,41 @@ and <tt>RCU_NONIDLE()</tt> on the other while inspecting
idle-loop code.
Steven Rostedt supplied <tt>_rcuidle</tt> event tracing,
which is used quite heavily in the idle loop.
However, there are some restrictions on the code placed within
<tt>RCU_NONIDLE()</tt>:
<ol>
<li> Blocking is prohibited.
In practice, this is not a serious restriction given that idle
tasks are prohibited from blocking to begin with.
<li> Although nesting <tt>RCU_NONIDLE()</tt> is permited, they cannot
nest indefinitely deeply.
However, given that they can be nested on the order of a million
deep, even on 32-bit systems, this should not be a serious
restriction.
This nesting limit would probably be reached long after the
compiler OOMed or the stack overflowed.
<li> Any code path that enters <tt>RCU_NONIDLE()</tt> must sequence
out of that same <tt>RCU_NONIDLE()</tt>.
For example, the following is grossly illegal:
<blockquote>
<pre>
1 RCU_NONIDLE({
2 do_something();
3 goto bad_idea; /* BUG!!! */
4 do_something_else();});
5 bad_idea:
</pre>
</blockquote>
<p>
It is just as illegal to transfer control into the middle of
<tt>RCU_NONIDLE()</tt>'s argument.
Yes, in theory, you could transfer in as long as you also
transferred out, but in practice you could also expect to get sharply
worded review comments.
</ol>
<p>
It is similarly socially unacceptable to interrupt an
......
......@@ -49,7 +49,7 @@ rcupdate.rcu_task_stall_timeout
This boot/sysfs parameter controls the RCU-tasks stall warning
interval. A value of zero or less suppresses RCU-tasks stall
warnings. A positive value sets the stall-warning interval
in jiffies. An RCU-tasks stall warning starts wtih the line:
in jiffies. An RCU-tasks stall warning starts with the line:
INFO: rcu_tasks detected stalls on tasks:
......
......@@ -5,6 +5,9 @@ to start learning about RCU:
2. What is RCU? Part 2: Usage http://lwn.net/Articles/263130/
3. RCU part 3: the RCU API http://lwn.net/Articles/264090/
4. The RCU API, 2010 Edition http://lwn.net/Articles/418853/
2010 Big API Table http://lwn.net/Articles/419086/
5. The RCU API, 2014 Edition http://lwn.net/Articles/609904/
2014 Big API Table http://lwn.net/Articles/609973/
What is RCU?
......
......@@ -58,6 +58,7 @@ show up in /proc/sys/kernel:
- panic_on_stackoverflow
- panic_on_unrecovered_nmi
- panic_on_warn
- panic_on_rcu_stall
- perf_cpu_time_max_percent
- perf_event_paranoid
- perf_event_max_stack
......@@ -618,6 +619,17 @@ a kernel rebuild when attempting to kdump at the location of a WARN().
==============================================================
panic_on_rcu_stall:
When set to 1, calls panic() after RCU stall detection messages. This
is useful to define the root cause of RCU stalls using a vmcore.
0: do not panic() when RCU stall takes place, default behavior.
1: panic() after printing RCU stall messages.
==============================================================
perf_cpu_time_max_percent:
Hints to the kernel how much CPU time it should be allowed to
......
......@@ -451,6 +451,7 @@ extern int panic_on_oops;
extern int panic_on_unrecovered_nmi;
extern int panic_on_io_nmi;
extern int panic_on_warn;
extern int sysctl_panic_on_rcu_stall;
extern int sysctl_panic_on_stackoverflow;
extern bool crash_kexec_post_notifiers;
......
......@@ -45,6 +45,7 @@
#include <linux/bug.h>
#include <linux/compiler.h>
#include <linux/ktime.h>
#include <linux/irqflags.h>
#include <asm/barrier.h>
......@@ -379,12 +380,13 @@ static inline void rcu_init_nohz(void)
* in the inner idle loop.
*
* This macro provides the way out: RCU_NONIDLE(do_something_with_RCU())
* will tell RCU that it needs to pay attending, invoke its argument
* (in this example, a call to the do_something_with_RCU() function),
* will tell RCU that it needs to pay attention, invoke its argument
* (in this example, calling the do_something_with_RCU() function),
* and then tell RCU to go back to ignoring this CPU. It is permissible
* to nest RCU_NONIDLE() wrappers, but the nesting level is currently
* quite limited. If deeper nesting is required, it will be necessary
* to adjust DYNTICK_TASK_NESTING_VALUE accordingly.
* to nest RCU_NONIDLE() wrappers, but not indefinitely (but the limit is
* on the order of a million or so, even on 32-bit systems). It is
* not legal to block within RCU_NONIDLE(), nor is it permissible to
* transfer control either into or out of RCU_NONIDLE()'s statement.
*/
#define RCU_NONIDLE(a) \
do { \
......@@ -649,7 +651,16 @@ static inline void rcu_preempt_sleep_check(void)
* please be careful when making changes to rcu_assign_pointer() and the
* other macros that it invokes.
*/
#define rcu_assign_pointer(p, v) smp_store_release(&p, RCU_INITIALIZER(v))
#define rcu_assign_pointer(p, v) \
({ \
uintptr_t _r_a_p__v = (uintptr_t)(v); \
\
if (__builtin_constant_p(v) && (_r_a_p__v) == (uintptr_t)NULL) \
WRITE_ONCE((p), (typeof(p))(_r_a_p__v)); \
else \
smp_store_release(&p, RCU_INITIALIZER((typeof(p))_r_a_p__v)); \
_r_a_p__v; \
})
/**
* rcu_access_pointer() - fetch RCU pointer with no dereferencing
......
......@@ -50,6 +50,10 @@
do { if (verbose) pr_alert("%s" TORTURE_FLAG "!!! %s\n", torture_type, s); } while (0)
/* Definitions for online/offline exerciser. */
bool torture_offline(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_offl, int *min_onl, int *max_onl);
bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_onl, int *min_onl, int *max_onl);
int torture_onoff_init(long ooholdoff, long oointerval);
void torture_onoff_stats(void);
bool torture_onoff_failures(void);
......
......@@ -517,6 +517,7 @@ config SRCU
config TASKS_RCU
bool
default n
depends on !UML
select SRCU
help
This option enables a task-based RCU implementation that uses
......
......@@ -58,7 +58,7 @@ MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.vnet.ibm.com>");
#define VERBOSE_PERFOUT_ERRSTRING(s) \
do { if (verbose) pr_alert("%s" PERF_FLAG "!!! %s\n", perf_type, s); } while (0)
torture_param(bool, gp_exp, true, "Use expedited GP wait primitives");
torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
torture_param(int, nreaders, -1, "Number of RCU reader threads");
torture_param(int, nwriters, -1, "Number of RCU updater threads");
......@@ -96,12 +96,7 @@ static int rcu_perf_writer_state;
#define MAX_MEAS 10000
#define MIN_MEAS 100
#if defined(MODULE) || defined(CONFIG_RCU_PERF_TEST_RUNNABLE)
#define RCUPERF_RUNNABLE_INIT 1
#else
#define RCUPERF_RUNNABLE_INIT 0
#endif
static int perf_runnable = RCUPERF_RUNNABLE_INIT;
static int perf_runnable = IS_ENABLED(MODULE);
module_param(perf_runnable, int, 0444);
MODULE_PARM_DESC(perf_runnable, "Start rcuperf at boot");
......@@ -363,8 +358,6 @@ rcu_perf_writer(void *arg)
u64 *wdpp = writer_durations[me];
VERBOSE_PERFOUT_STRING("rcu_perf_writer task started");
WARN_ON(rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp);
WARN_ON(rcu_gp_is_normal() && gp_exp);
WARN_ON(!wdpp);
set_cpus_allowed_ptr(current, cpumask_of(me % nr_cpu_ids));
sp.sched_priority = 1;
......@@ -631,12 +624,24 @@ rcu_perf_init(void)
firsterr = -ENOMEM;
goto unwind;
}
if (rcu_gp_is_expedited() && !rcu_gp_is_normal() && !gp_exp) {
VERBOSE_PERFOUT_ERRSTRING("All grace periods expedited, no normal ones to measure!");
firsterr = -EINVAL;
goto unwind;
}
if (rcu_gp_is_normal() && gp_exp) {
VERBOSE_PERFOUT_ERRSTRING("All grace periods normal, no expedited ones to measure!");
firsterr = -EINVAL;
goto unwind;
}
for (i = 0; i < nrealwriters; i++) {
writer_durations[i] =
kcalloc(MAX_MEAS, sizeof(*writer_durations[i]),
GFP_KERNEL);
if (!writer_durations[i])
if (!writer_durations[i]) {
firsterr = -ENOMEM;
goto unwind;
}
firsterr = torture_create_kthread(rcu_perf_writer, (void *)i,
writer_tasks[i]);
if (firsterr)
......
......@@ -182,12 +182,7 @@ static const char *rcu_torture_writer_state_getname(void)
return rcu_torture_writer_state_names[i];
}
#if defined(MODULE) || defined(CONFIG_RCU_TORTURE_TEST_RUNNABLE)
#define RCUTORTURE_RUNNABLE_INIT 1
#else
#define RCUTORTURE_RUNNABLE_INIT 0
#endif
static int torture_runnable = RCUTORTURE_RUNNABLE_INIT;
static int torture_runnable = IS_ENABLED(MODULE);
module_param(torture_runnable, int, 0444);
MODULE_PARM_DESC(torture_runnable, "Start rcutorture at boot");
......@@ -1476,7 +1471,7 @@ static int rcu_torture_barrier_cbs(void *arg)
break;
/*
* The above smp_load_acquire() ensures barrier_phase load
* is ordered before the folloiwng ->call().
* is ordered before the following ->call().
*/
local_irq_disable(); /* Just to test no-irq call_rcu(). */
cur_ops->call(&rcu, rcu_torture_barrier_cbf);
......
This diff is collapsed.
......@@ -253,6 +253,13 @@ struct rcu_node {
wait_queue_head_t exp_wq[4];
} ____cacheline_internodealigned_in_smp;
/*
* Bitmasks in an rcu_node cover the interval [grplo, grphi] of CPU IDs, and
* are indexed relative to this interval rather than the global CPU ID space.
* This generates the bit for a CPU in node-local masks.
*/
#define leaf_node_cpu_bit(rnp, cpu) (1UL << ((cpu) - (rnp)->grplo))
/*
* Do a full breadth-first scan of the rcu_node structures for the
* specified rcu_state structure.
......@@ -280,6 +287,14 @@ struct rcu_node {
for ((rnp) = (rsp)->level[rcu_num_lvls - 1]; \
(rnp) < &(rsp)->node[rcu_num_nodes]; (rnp)++)
/*
* Iterate over all possible CPUs in a leaf RCU node.
*/
#define for_each_leaf_node_possible_cpu(rnp, cpu) \
for ((cpu) = cpumask_next(rnp->grplo - 1, cpu_possible_mask); \
cpu <= rnp->grphi; \
cpu = cpumask_next((cpu), cpu_possible_mask))
/*
* Union to allow "aggregate OR" operation on the need for a quiescent
* state by the normal and expedited grace periods.
......
This diff is collapsed.
......@@ -79,8 +79,6 @@ static void __init rcu_bootup_announce_oddness(void)
pr_info("\tRCU dyntick-idle grace-period acceleration is enabled.\n");
if (IS_ENABLED(CONFIG_PROVE_RCU))
pr_info("\tRCU lockdep checking is enabled.\n");
if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST_RUNNABLE))
pr_info("\tRCU torture testing starts during boot.\n");
if (RCU_NUM_LVLS >= 4)
pr_info("\tFour(or more)-level hierarchy is enabled.\n");
if (RCU_FANOUT_LEAF != 16)
......@@ -681,84 +679,6 @@ void synchronize_rcu(void)
}
EXPORT_SYMBOL_GPL(synchronize_rcu);
/*
* Remote handler for smp_call_function_single(). If there is an
* RCU read-side critical section in effect, request that the
* next rcu_read_unlock() record the quiescent state up the
* ->expmask fields in the rcu_node tree. Otherwise, immediately
* report the quiescent state.
*/
static void sync_rcu_exp_handler(void *info)
{
struct rcu_data *rdp;
struct rcu_state *rsp = info;
struct task_struct *t = current;
/*
* Within an RCU read-side critical section, request that the next
* rcu_read_unlock() report. Unless this RCU read-side critical
* section has already blocked, in which case it is already set
* up for the expedited grace period to wait on it.
*/
if (t->rcu_read_lock_nesting > 0 &&
!t->rcu_read_unlock_special.b.blocked) {
t->rcu_read_unlock_special.b.exp_need_qs = true;
return;
}
/*
* We are either exiting an RCU read-side critical section (negative
* values of t->rcu_read_lock_nesting) or are not in one at all
* (zero value of t->rcu_read_lock_nesting). Or we are in an RCU
* read-side critical section that blocked before this expedited
* grace period started. Either way, we can immediately report
* the quiescent state.
*/
rdp = this_cpu_ptr(rsp->rda);
rcu_report_exp_rdp(rsp, rdp, true);
}
/**
* synchronize_rcu_expedited - Brute-force RCU grace period
*
* Wait for an RCU-preempt grace period, but expedite it. The basic
* idea is to IPI all non-idle non-nohz online CPUs. The IPI handler
* checks whether the CPU is in an RCU-preempt critical section, and
* if so, it sets a flag that causes the outermost rcu_read_unlock()
* to report the quiescent state. On the other hand, if the CPU is
* not in an RCU read-side critical section, the IPI handler reports
* the quiescent state immediately.
*
* Although this is a greate improvement over previous expedited
* implementations, it is still unfriendly to real-time workloads, so is
* thus not recommended for any sort of common-case code. In fact, if
* you are using synchronize_rcu_expedited() in a loop, please restructure
* your code to batch your updates, and then Use a single synchronize_rcu()
* instead.
*/
void synchronize_rcu_expedited(void)
{
struct rcu_state *rsp = rcu_state_p;
unsigned long s;
/* If expedited grace periods are prohibited, fall back to normal. */
if (rcu_gp_is_normal()) {
wait_rcu_gp(call_rcu);
return;
}
s = rcu_exp_gp_seq_snap(rsp);
if (exp_funnel_lock(rsp, s))
return; /* Someone else did our work for us. */
/* Initialize the rcu_node tree in preparation for the wait. */
sync_rcu_exp_select_cpus(rsp, sync_rcu_exp_handler);
/* Wait for ->blkd_tasks lists to drain, then wake everyone up. */
rcu_exp_wait_wake(rsp, s);
}
EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
/**
* rcu_barrier - Wait until all in-flight call_rcu() callbacks complete.
*
......@@ -882,16 +802,6 @@ static void rcu_preempt_check_callbacks(void)
{
}
/*
* Wait for an rcu-preempt grace period, but make it happen quickly.
* But because preemptible RCU does not exist, map to rcu-sched.
*/
void synchronize_rcu_expedited(void)
{
synchronize_sched_expedited();
}
EXPORT_SYMBOL_GPL(synchronize_rcu_expedited);
/*
* Because preemptible RCU does not exist, rcu_barrier() is just
* another name for rcu_barrier_sched().
......@@ -1254,8 +1164,9 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
return;
if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
return;
for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++, mask >>= 1)
if ((mask & 0x1) && cpu != outgoingcpu)
for_each_leaf_node_possible_cpu(rnp, cpu)
if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
cpu != outgoingcpu)
cpumask_set_cpu(cpu, cm);
if (cpumask_weight(cm) == 0)
cpumask_setall(cm);
......
......@@ -528,6 +528,7 @@ static int rcu_task_stall_timeout __read_mostly = HZ * 60 * 10;
module_param(rcu_task_stall_timeout, int, 0644);
static void rcu_spawn_tasks_kthread(void);
static struct task_struct *rcu_tasks_kthread_ptr;
/*
* Post an RCU-tasks callback. First call must be from process context
......@@ -537,6 +538,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
{
unsigned long flags;
bool needwake;
bool havetask = READ_ONCE(rcu_tasks_kthread_ptr);
rhp->next = NULL;
rhp->func = func;
......@@ -545,7 +547,9 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
*rcu_tasks_cbs_tail = rhp;
rcu_tasks_cbs_tail = &rhp->next;
raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
if (needwake) {
/* We can't create the thread unless interrupts are enabled. */
if ((needwake && havetask) ||
(!havetask && !irqs_disabled_flags(flags))) {
rcu_spawn_tasks_kthread();
wake_up(&rcu_tasks_cbs_wq);
}
......@@ -790,7 +794,6 @@ static int __noreturn rcu_tasks_kthread(void *arg)
static void rcu_spawn_tasks_kthread(void)
{
static DEFINE_MUTEX(rcu_tasks_kthread_mutex);
static struct task_struct *rcu_tasks_kthread_ptr;
struct task_struct *t;
if (READ_ONCE(rcu_tasks_kthread_ptr)) {
......
......@@ -1204,6 +1204,17 @@ static struct ctl_table kern_table[] = {
.extra1 = &one,
.extra2 = &one,
},
#endif
#if defined(CONFIG_TREE_RCU) || defined(CONFIG_PREEMPT_RCU)
{
.procname = "panic_on_rcu_stall",
.data = &sysctl_panic_on_rcu_stall,
.maxlen = sizeof(sysctl_panic_on_rcu_stall),
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = &zero,
.extra2 = &one,
},
#endif
{ }
};
......
......@@ -82,37 +82,26 @@ static int min_online = -1;
static int max_online;
/*
* Execute random CPU-hotplug operations at the interval specified
* by the onoff_interval.
* Attempt to take a CPU offline. Return false if the CPU is already
* offline or if it is not subject to CPU-hotplug operations. The
* caller can detect other failures by looking at the statistics.
*/
static int
torture_onoff(void *arg)
bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
unsigned long *sum_offl, int *min_offl, int *max_offl)
{
int cpu;
unsigned long delta;
int maxcpu = -1;
DEFINE_TORTURE_RANDOM(rand);
int ret;
unsigned long starttime;
VERBOSE_TOROUT_STRING("torture_onoff task started");
for_each_online_cpu(cpu)
maxcpu = cpu;
WARN_ON(maxcpu < 0);
if (onoff_holdoff > 0) {
VERBOSE_TOROUT_STRING("torture_onoff begin holdoff");
schedule_timeout_interruptible(onoff_holdoff);
VERBOSE_TOROUT_STRING("torture_onoff end holdoff");
}
while (!torture_must_stop()) {
cpu = (torture_random(&rand) >> 4) % (maxcpu + 1);
if (cpu_online(cpu) && cpu_is_hotpluggable(cpu)) {
if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false;
if (verbose)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlining %d\n",
torture_type, cpu);
starttime = jiffies;
n_offline_attempts++;
(*n_offl_attempts)++;
ret = cpu_down(cpu);
if (ret) {
if (verbose)
......@@ -124,25 +113,44 @@ torture_onoff(void *arg)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: offlined %d\n",
torture_type, cpu);
n_offline_successes++;
(*n_offl_successes)++;
delta = jiffies - starttime;
sum_offline += delta;
if (min_offline < 0) {
min_offline = delta;
max_offline = delta;
sum_offl += delta;
if (*min_offl < 0) {
*min_offl = delta;
*max_offl = delta;
}
if (min_offline > delta)
min_offline = delta;
if (max_offline < delta)
max_offline = delta;
if (*min_offl > delta)
*min_offl = delta;
if (*max_offl < delta)
*max_offl = delta;
}
} else if (cpu_is_hotpluggable(cpu)) {
return true;
}
EXPORT_SYMBOL_GPL(torture_offline);
/*
* Attempt to bring a CPU online. Return false if the CPU is already
* online or if it is not subject to CPU-hotplug operations. The
* caller can detect other failures by looking at the statistics.
*/
bool torture_online(int cpu, long *n_onl_attempts, long *n_onl_successes,
unsigned long *sum_onl, int *min_onl, int *max_onl)
{
unsigned long delta;
int ret;
unsigned long starttime;
if (cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
return false;
if (verbose)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlining %d\n",
torture_type, cpu);
starttime = jiffies;
n_online_attempts++;
(*n_onl_attempts)++;
ret = cpu_up(cpu);
if (ret) {
if (verbose)
......@@ -154,21 +162,61 @@ torture_onoff(void *arg)
pr_alert("%s" TORTURE_FLAG
"torture_onoff task: onlined %d\n",
torture_type, cpu);
n_online_successes++;
(*n_onl_successes)++;
delta = jiffies - starttime;
sum_online += delta;
if (min_online < 0) {
min_online = delta;
max_online = delta;
*sum_onl += delta;
if (*min_onl < 0) {
*min_onl = delta;
*max_onl = delta;
}
if (min_online > delta)
min_online = delta;
if (max_online < delta)
max_online = delta;
if (*min_onl > delta)
*min_onl = delta;
if (*max_onl < delta)
*max_onl = delta;
}
return true;
}
EXPORT_SYMBOL_GPL(torture_online);
/*
* Execute random CPU-hotplug operations at the interval specified
* by the onoff_interval.
*/
static int
torture_onoff(void *arg)
{
int cpu;
int maxcpu = -1;
DEFINE_TORTURE_RANDOM(rand);
VERBOSE_TOROUT_STRING("torture_onoff task started");
for_each_online_cpu(cpu)
maxcpu = cpu;
WARN_ON(maxcpu < 0);
if (maxcpu == 0) {
VERBOSE_TOROUT_STRING("Only one CPU, so CPU-hotplug testing is disabled");
goto stop;
}
if (onoff_holdoff > 0) {
VERBOSE_TOROUT_STRING("torture_onoff begin holdoff");
schedule_timeout_interruptible(onoff_holdoff);
VERBOSE_TOROUT_STRING("torture_onoff end holdoff");
}
while (!torture_must_stop()) {
cpu = (torture_random(&rand) >> 4) % (maxcpu + 1);
if (!torture_offline(cpu,
&n_offline_attempts, &n_offline_successes,
&sum_offline, &min_offline, &max_offline))
torture_online(cpu,
&n_online_attempts, &n_online_successes,
&sum_online, &min_online, &max_online);
schedule_timeout_interruptible(onoff_interval);
}
stop:
torture_kthread_stopping("torture_onoff");
return 0;
}
......
......@@ -1307,22 +1307,6 @@ config RCU_PERF_TEST
Say M if you want the RCU performance tests to build as a module.
Say N if you are unsure.
config RCU_PERF_TEST_RUNNABLE
bool "performance tests for RCU runnable by default"
depends on RCU_PERF_TEST = y
default n
help
This option provides a way to build the RCU performance tests
directly into the kernel without them starting up at boot time.
You can use /sys/module to manually override this setting.
This /proc file is available only when the RCU performance
tests have been built into the kernel.
Say Y here if you want the RCU performance tests to start during
boot (you probably don't).
Say N here if you want the RCU performance tests to start only
after being manually enabled via /sys/module.
config RCU_TORTURE_TEST
tristate "torture tests for RCU"
depends on DEBUG_KERNEL
......@@ -1340,23 +1324,6 @@ config RCU_TORTURE_TEST
Say M if you want the RCU torture tests to build as a module.
Say N if you are unsure.
config RCU_TORTURE_TEST_RUNNABLE
bool "torture tests for RCU runnable by default"
depends on RCU_TORTURE_TEST = y
default n
help
This option provides a way to build the RCU torture tests
directly into the kernel without them starting up at boot
time. You can use /proc/sys/kernel/rcutorture_runnable
to manually override this setting. This /proc file is
available only when the RCU torture tests have been built
into the kernel.
Say Y here if you want the RCU torture tests to start during
boot (you probably don't).
Say N here if you want the RCU torture tests to start only
after being manually enabled via /proc.
config RCU_TORTURE_TEST_SLOW_PREINIT
bool "Slow down RCU grace-period pre-initialization to expose races"
depends on RCU_TORTURE_TEST
......
......@@ -99,8 +99,9 @@ configfrag_hotplug_cpu () {
# identify_boot_image qemu-cmd
#
# Returns the relative path to the kernel build image. This will be
# arch/<arch>/boot/bzImage unless overridden with the TORTURE_BOOT_IMAGE
# environment variable.
# arch/<arch>/boot/bzImage or vmlinux if bzImage is not a target for the
# architecture, unless overridden with the TORTURE_BOOT_IMAGE environment
# variable.
identify_boot_image () {
if test -n "$TORTURE_BOOT_IMAGE"
then
......@@ -110,11 +111,8 @@ identify_boot_image () {
qemu-system-x86_64|qemu-system-i386)
echo arch/x86/boot/bzImage
;;
qemu-system-ppc64)
echo arch/powerpc/boot/bzImage
;;
*)
echo ""
echo vmlinux
;;
esac
fi
......@@ -175,7 +173,7 @@ identify_qemu_args () {
qemu-system-x86_64|qemu-system-i386)
;;
qemu-system-ppc64)
echo -enable-kvm -M pseries -cpu POWER7 -nodefaults
echo -enable-kvm -M pseries -nodefaults
echo -device spapr-vscsi
if test -n "$TORTURE_QEMU_INTERACTIVE" -a -n "$TORTURE_QEMU_MAC"
then
......
......@@ -8,9 +8,9 @@
#
# Usage: kvm-test-1-run.sh config builddir resdir seconds qemu-args boot_args
#
# qemu-args defaults to "-enable-kvm -soundhw pcspk -nographic", along with
# arguments specifying the number of CPUs and other
# options generated from the underlying CPU architecture.
# qemu-args defaults to "-enable-kvm -nographic", along with arguments
# specifying the number of CPUs and other options
# generated from the underlying CPU architecture.
# boot_args defaults to value returned by the per_version_boot_params
# shell function.
#
......@@ -96,7 +96,8 @@ if test "$base_resdir" != "$resdir" -a -f $base_resdir/bzImage -a -f $base_resdi
then
# Rerunning previous test, so use that test's kernel.
QEMU="`identify_qemu $base_resdir/vmlinux`"
KERNEL=$base_resdir/bzImage
BOOT_IMAGE="`identify_boot_image $QEMU`"
KERNEL=$base_resdir/${BOOT_IMAGE##*/} # use the last component of ${BOOT_IMAGE}
ln -s $base_resdir/Make*.out $resdir # for kvm-recheck.sh
ln -s $base_resdir/.config $resdir # for kvm-recheck.sh
elif kvm-build.sh $config_template $builddir $T
......@@ -110,7 +111,7 @@ then
if test -n "$BOOT_IMAGE"
then
cp $builddir/$BOOT_IMAGE $resdir
KERNEL=$resdir/bzImage
KERNEL=$resdir/${BOOT_IMAGE##*/}
else
echo No identifiable boot image, not running KVM, see $resdir.
echo Do the torture scripts know about your architecture?
......@@ -147,7 +148,7 @@ then
fi
# Generate -smp qemu argument.
qemu_args="-enable-kvm -soundhw pcspk -nographic $qemu_args"
qemu_args="-enable-kvm -nographic $qemu_args"
cpu_count=`configNR_CPUS.sh $config_template`
cpu_count=`configfrag_boot_cpus "$boot_args" "$config_template" "$cpu_count"`
vcpus=`identify_qemu_vcpus`
......@@ -229,6 +230,7 @@ fi
if test $commandcompleted -eq 0 -a -n "$qemu_pid"
then
echo Grace period for qemu job at pid $qemu_pid
oldline="`tail $resdir/console.log`"
while :
do
kruntime=`awk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null`
......@@ -238,13 +240,29 @@ then
else
break
fi
if test $kruntime -ge $((seconds + $TORTURE_SHUTDOWN_GRACE))
must_continue=no
newline="`tail $resdir/console.log`"
if test "$newline" != "$oldline" && echo $newline | grep -q ' [0-9]\+us : '
then
must_continue=yes
fi
last_ts="`tail $resdir/console.log | grep '^\[ *[0-9]\+\.[0-9]\+]' | tail -1 | sed -e 's/^\[ *//' -e 's/\..*$//'`"
if test -z "last_ts"
then
last_ts=0
fi
if test "$newline" != "$oldline" -a "$last_ts" -lt $((seconds + $TORTURE_SHUTDOWN_GRACE))
then
must_continue=yes
fi
if test $must_continue = no -a $kruntime -ge $((seconds + $TORTURE_SHUTDOWN_GRACE))
then
echo "!!! PID $qemu_pid hung at $kruntime vs. $seconds seconds" >> $resdir/Warnings 2>&1
kill -KILL $qemu_pid
break
fi
sleep 1
oldline=$newline
sleep 10
done
elif test -z "$qemu_pid"
then
......
......@@ -48,7 +48,7 @@ resdir=""
configs=""
cpus=0
ds=`date +%Y.%m.%d-%H:%M:%S`
jitter=0
jitter="-1"
. functions.sh
......
......@@ -33,7 +33,7 @@ if grep -Pq '\x00' < $file
then
print_warning Console output contains nul bytes, old qemu still running?
fi
egrep 'Badness|WARNING:|Warn|BUG|===========|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state' < $file | grep -v 'ODEBUG: ' | grep -v 'Warning: unable to open an initial console' > $1.diags
egrep 'Badness|WARNING:|Warn|BUG|===========|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for' < $file | grep -v 'ODEBUG: ' | grep -v 'Warning: unable to open an initial console' > $1.diags
if test -s $1.diags
then
print_warning Assertion failure in $file $title
......@@ -69,6 +69,11 @@ then
then
summary="$summary Stalls: $n_stalls"
fi
n_starves=`grep -c 'rcu_.*kthread starved for' $1`
if test "$n_starves" -ne 0
then
summary="$summary Starves: $n_starves"
fi
print_warning Summary: $summary
else
rm $1.diags
......
......@@ -13,6 +13,22 @@ cd initrd
cpio -id < /tmp/initrd.img.zcat
------------------------------------------------------------------------
Another way to create an initramfs image is using "dracut"[1], which is
available on many distros, however the initramfs dracut generates is a cpio
archive with another cpio archive in it, so an extra step is needed to create
the initrd directory hierarchy.
Here are the commands to create a initrd directory for rcutorture using
dracut:
------------------------------------------------------------------------
dracut --no-hostonly --no-hostonly-cmdline --module "base bash shutdown" /tmp/initramfs.img
cd tools/testing/selftests/rcutorture
mkdir initrd
cd initrd
/usr/lib/dracut/skipcpio /tmp/initramfs.img | zcat | cpio -id < /tmp/initramfs.img
------------------------------------------------------------------------
Interestingly enough, if you are running rcutorture, you don't really
need userspace in many cases. Running without userspace has the
advantage of allowing you to test your kernel independently of the
......@@ -89,3 +105,9 @@ while :
do
sleep 10
done
------------------------------------------------------------------------
References:
[1]: https://dracut.wiki.kernel.org/index.php/Main_Page
[2]: http://blog.elastocloud.org/2015/06/rapid-linux-kernel-devtest-with-qemu.html
[3]: https://www.centos.org/forums/viewtopic.php?t=51621
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment