Commit c0b9620b authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'rcu.next.v6.10' of https://github.com/urezki/linux

Pull RCU updates from Uladzislau Rezki:

 - Fix a lockdep complain for lazy-preemptible kernel, remove redundant
   BH disable for TINY_RCU, remove redundant READ_ONCE() in tree.c, fix
   false positives KCSAN splat and fix buffer overflow in the
   print_cpu_stall_info().

 - Misc updates related to bpf, tracing and update the MAINTAINERS file.

 - An improvement of a normal synchronize_rcu() call in terms of
   latency. It maintains a separate track for sync. users only. This
   approach bypasses per-cpu nocb-lists thus sync-users do not depend on
   nocb-list length and how fast regular callbacks are processed.

 - RCU tasks: switch tasks RCU grace periods to sleep at TASK_IDLE
   priority, fix some comments, add some diagnostic warning to the
   exit_tasks_rcu_start() and fix a buffer overflow in the
   show_rcu_tasks_trace_gp_kthread().

 - RCU torture: Increase memory to guest OS, fix a Tasks Rude RCU
   testing, some updates for TREE09, dump mode information to debug GP
   kthread state, remove redundant READ_ONCE(), fix some comments about
   RCU_TORTURE_PIPE_LEN and pipe_count, remove some redundant pointer
   initialization, fix a hung splat task by when the rcutorture tests
   start to exit, fix invalid context warning, add '--do-kvfree'
   parameter to torture test and use slow register unregister callbacks
   only for rcutype test.

* tag 'rcu.next.v6.10' of https://github.com/urezki/linux: (48 commits)
  rcutorture: Use rcu_gp_slow_register/unregister() only for rcutype test
  torture: Scale --do-kvfree test time
  rcutorture: Fix invalid context warning when enable srcu barrier testing
  rcutorture: Make stall-tasks directly exit when rcutorture tests end
  rcutorture: Removing redundant function pointer initialization
  rcutorture: Make rcutorture support print rcu-tasks gp state
  rcutorture: Use the gp_kthread_dbg operation specified by cur_ops
  rcutorture: Re-use value stored to ->rtort_pipe_count instead of re-reading
  rcutorture: Fix rcu_torture_one_read() pipe_count overflow comment
  rcutorture: Remove extraneous rcu_torture_pipe_update_one() READ_ONCE()
  rcu: Allocate WQ with WQ_MEM_RECLAIM bit set
  rcu: Support direct wake-up of synchronize_rcu() users
  rcu: Add a trace event for synchronize_rcu_normal()
  rcu: Reduce synchronize_rcu() latency
  rcu: Fix buffer overflow in print_cpu_stall_info()
  rcu: Mollify sparse with RCU guard
  rcu-tasks: Fix show_rcu_tasks_trace_gp_kthread buffer overflow
  rcu-tasks: Fix the comments for tasks_rcu_exit_srcu_stall_timer
  rcu-tasks: Replace exit_tasks_rcu_start() initialization with WARN_ON_ONCE()
  rcu: Remove redundant CONFIG_PROVE_RCU #if condition
  ...
parents 736676f5 64619b28
......@@ -467,7 +467,8 @@ Nadia Yvette Chambers <nyc@holomorphy.com> William Lee Irwin III <wli@holomorphy
Naoya Horiguchi <nao.horiguchi@gmail.com> <n-horiguchi@ah.jp.nec.com>
Naoya Horiguchi <nao.horiguchi@gmail.com> <naoya.horiguchi@nec.com>
Nathan Chancellor <nathan@kernel.org> <natechancellor@gmail.com>
Neeraj Upadhyay <quic_neeraju@quicinc.com> <neeraju@codeaurora.org>
Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <quic_neeraju@quicinc.com>
Neeraj Upadhyay <neeraj.upadhyay@kernel.org> <neeraju@codeaurora.org>
Neil Armstrong <neil.armstrong@linaro.org> <narmstrong@baylibre.com>
Nguyen Anh Quynh <aquynh@gmail.com>
Nicholas Piggin <npiggin@gmail.com> <npiggen@suse.de>
......
......@@ -427,7 +427,7 @@ their assorted primitives.
This section shows a simple use of the core RCU API to protect a
global pointer to a dynamically allocated structure. More-typical
uses of RCU may be found in listRCU.rst, arrayRCU.rst, and NMI-RCU.rst.
uses of RCU may be found in listRCU.rst and NMI-RCU.rst.
::
struct foo {
......@@ -510,8 +510,8 @@ So, to sum up:
data item.
See checklist.rst for additional rules to follow when using RCU.
And again, more-typical uses of RCU may be found in listRCU.rst,
arrayRCU.rst, and NMI-RCU.rst.
And again, more-typical uses of RCU may be found in listRCU.rst
and NMI-RCU.rst.
.. _4_whatisRCU:
......
......@@ -5098,6 +5098,20 @@
delay, memory pressure or callback list growing too
big.
rcutree.rcu_normal_wake_from_gp= [KNL]
Reduces a latency of synchronize_rcu() call. This approach
maintains its own track of synchronize_rcu() callers, so it
does not interact with regular callbacks because it does not
use a call_rcu[_hurry]() path. Please note, this is for a
normal grace period.
How to enable it:
echo 1 > /sys/module/rcutree/parameters/rcu_normal_wake_from_gp
or pass a boot parameter "rcutree.rcu_normal_wake_from_gp=1"
Default is 0.
rcuscale.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().
......
......@@ -18600,7 +18600,7 @@ F: tools/testing/selftests/resctrl/
READ-COPY UPDATE (RCU)
M: "Paul E. McKenney" <paulmck@kernel.org>
M: Frederic Weisbecker <frederic@kernel.org> (kernel/rcu/tree_nocb.h)
M: Neeraj Upadhyay <quic_neeraju@quicinc.com> (kernel/rcu/tasks.h)
M: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> (kernel/rcu/tasks.h)
M: Joel Fernandes <joel@joelfernandes.org>
M: Josh Triplett <josh@joshtriplett.org>
M: Boqun Feng <boqun.feng@gmail.com>
......
......@@ -63,7 +63,7 @@ config KPROBES
depends on MODULES
depends on HAVE_KPROBES
select KALLSYMS
select TASKS_RCU if PREEMPTION
select NEED_TASKS_RCU
help
Kprobes allows you to trap at almost any kernel address and
execute a callback function. register_kprobe() establishes
......@@ -112,7 +112,7 @@ config STATIC_CALL_SELFTEST
config OPTPROBES
def_bool y
depends on KPROBES && HAVE_OPTPROBES
select TASKS_RCU if PREEMPTION
select NEED_TASKS_RCU
config KPROBES_ON_FTRACE
def_bool y
......
......@@ -401,15 +401,15 @@ static inline int debug_lockdep_rcu_enabled(void)
} \
} while (0)
#if defined(CONFIG_PROVE_RCU) && !defined(CONFIG_PREEMPT_RCU)
#ifndef CONFIG_PREEMPT_RCU
static inline void rcu_preempt_sleep_check(void)
{
RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map),
"Illegal context switch in RCU read-side critical section");
}
#else /* #ifdef CONFIG_PROVE_RCU */
#else // #ifndef CONFIG_PREEMPT_RCU
static inline void rcu_preempt_sleep_check(void) { }
#endif /* #else #ifdef CONFIG_PROVE_RCU */
#endif // #else // #ifndef CONFIG_PREEMPT_RCU
#define rcu_sleep_check() \
do { \
......@@ -809,9 +809,9 @@ static inline void rcu_read_unlock(void)
{
RCU_LOCKDEP_WARN(!rcu_is_watching(),
"rcu_read_unlock() used illegally while idle");
rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
__release(RCU);
__rcu_read_unlock();
rcu_lock_release(&rcu_lock_map); /* Keep acq info for rls diags. */
}
/**
......@@ -1090,6 +1090,18 @@ rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
extern int rcu_expedited;
extern int rcu_normal;
DEFINE_LOCK_GUARD_0(rcu, rcu_read_lock(), rcu_read_unlock())
DEFINE_LOCK_GUARD_0(rcu,
do {
rcu_read_lock();
/*
* sparse doesn't call the cleanup function,
* so just release immediately and don't track
* the context. We don't need to anyway, since
* the whole point of the guard is to not need
* the explicit unlock.
*/
__release(RCU);
} while (0),
rcu_read_unlock())
#endif /* __LINUX_RCUPDATE_H */
......@@ -19,18 +19,18 @@ struct rcu_synchronize {
};
void wakeme_after_rcu(struct rcu_head *head);
void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
struct rcu_synchronize *rs_array);
#define _wait_rcu_gp(checktiny, ...) \
do { \
call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \
struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)]; \
__wait_rcu_gp(checktiny, ARRAY_SIZE(__crcu_array), \
__crcu_array, __rs_array); \
#define _wait_rcu_gp(checktiny, state, ...) \
do { \
call_rcu_func_t __crcu_array[] = { __VA_ARGS__ }; \
struct rcu_synchronize __rs_array[ARRAY_SIZE(__crcu_array)]; \
__wait_rcu_gp(checktiny, state, ARRAY_SIZE(__crcu_array), __crcu_array, __rs_array); \
} while (0)
#define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__)
#define wait_rcu_gp(...) _wait_rcu_gp(false, TASK_UNINTERRUPTIBLE, __VA_ARGS__)
#define wait_rcu_gp_state(state, ...) _wait_rcu_gp(false, state, __VA_ARGS__)
/**
* synchronize_rcu_mult - Wait concurrently for multiple grace periods
......@@ -54,7 +54,7 @@ do { \
* grace period.
*/
#define synchronize_rcu_mult(...) \
_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
_wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), TASK_UNINTERRUPTIBLE, __VA_ARGS__)
static inline void cond_resched_rcu(void)
{
......
......@@ -64,8 +64,10 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
{
int idx;
preempt_disable(); // Needed for PREEMPT_AUTO
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1);
preempt_enable();
return idx;
}
......
......@@ -707,6 +707,33 @@ TRACE_EVENT_RCU(rcu_invoke_kfree_bulk_callback,
__entry->rcuname, __entry->p, __entry->nr_records)
);
/*
* Tracepoint for a normal synchronize_rcu() states. The first argument
* is the RCU flavor, the second argument is a pointer to rcu_head the
* last one is an event.
*/
TRACE_EVENT_RCU(rcu_sr_normal,
TP_PROTO(const char *rcuname, struct rcu_head *rhp, const char *srevent),
TP_ARGS(rcuname, rhp, srevent),
TP_STRUCT__entry(
__field(const char *, rcuname)
__field(void *, rhp)
__field(const char *, srevent)
),
TP_fast_assign(
__entry->rcuname = rcuname;
__entry->rhp = rhp;
__entry->srevent = srevent;
),
TP_printk("%s rhp=0x%p event=%s",
__entry->rcuname, __entry->rhp, __entry->srevent)
);
/*
* Tracepoint for exiting rcu_do_batch after RCU callbacks have been
* invoked. The first argument is the name of the RCU flavor,
......
......@@ -28,7 +28,7 @@ config BPF_SYSCALL
bool "Enable bpf() system call"
select BPF
select IRQ_WORK
select TASKS_RCU if PREEMPTION
select NEED_TASKS_RCU
select TASKS_TRACE_RCU
select BINARY_PRINTF
select NET_SOCK_MSG if NET
......
......@@ -333,7 +333,7 @@ static void bpf_tramp_image_put(struct bpf_tramp_image *im)
int err = bpf_arch_text_poke(im->ip_after_call, BPF_MOD_JUMP,
NULL, im->ip_epilogue);
WARN_ON(err);
if (IS_ENABLED(CONFIG_PREEMPTION))
if (IS_ENABLED(CONFIG_TASKS_RCU))
call_rcu_tasks(&im->rcu, __bpf_tramp_image_put_rcu_tasks);
else
percpu_ref_kill(&im->pcref);
......
......@@ -31,7 +31,7 @@ config PREEMPT_RCU
config TINY_RCU
bool
default y if !PREEMPTION && !SMP
default y if !PREEMPT_RCU && !SMP
help
This option selects the RCU implementation that is
designed for UP systems from which real-time response
......@@ -85,9 +85,13 @@ config FORCE_TASKS_RCU
idle, and user-mode execution as quiescent states. Not for
manual selection in most cases.
config TASKS_RCU
config NEED_TASKS_RCU
bool
default n
config TASKS_RCU
bool
default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
select IRQ_WORK
config FORCE_TASKS_RUDE_RCU
......
......@@ -522,12 +522,18 @@ static inline void show_rcu_tasks_gp_kthreads(void) {}
#ifdef CONFIG_TASKS_RCU
struct task_struct *get_rcu_tasks_gp_kthread(void);
void rcu_tasks_get_gp_data(int *flags, unsigned long *gp_seq);
#endif // # ifdef CONFIG_TASKS_RCU
#ifdef CONFIG_TASKS_RUDE_RCU
struct task_struct *get_rcu_tasks_rude_gp_kthread(void);
void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq);
#endif // # ifdef CONFIG_TASKS_RUDE_RCU
#ifdef CONFIG_TASKS_TRACE_RCU
void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq);
#endif
#ifdef CONFIG_TASKS_RCU_GENERIC
void tasks_cblist_init_generic(void);
#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
......@@ -557,8 +563,7 @@ static inline void rcu_set_jiffies_lazy_flush(unsigned long j) { }
#endif
#if defined(CONFIG_TREE_RCU)
void rcutorture_get_gp_data(enum rcutorture_type test_type, int *flags,
unsigned long *gp_seq);
void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq);
void do_trace_rcu_torture_read(const char *rcutorturename,
struct rcu_head *rhp,
unsigned long secs,
......@@ -566,8 +571,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
unsigned long c);
void rcu_gp_set_torture_wait(int duration);
#else
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
int *flags, unsigned long *gp_seq)
static inline void rcutorture_get_gp_data(int *flags, unsigned long *gp_seq)
{
*flags = 0;
*gp_seq = 0;
......@@ -587,20 +591,16 @@ static inline void rcu_gp_set_torture_wait(int duration) { }
#ifdef CONFIG_TINY_SRCU
static inline void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags,
static inline void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
unsigned long *gp_seq)
{
if (test_type != SRCU_FLAVOR)
return;
*flags = 0;
*gp_seq = sp->srcu_idx;
}
#elif defined(CONFIG_TREE_SRCU)
void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *sp, int *flags,
void srcutorture_get_gp_data(struct srcu_struct *sp, int *flags,
unsigned long *gp_seq);
#endif
......
This diff is collapsed.
......@@ -96,9 +96,12 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
*/
void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
{
int newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
int newval;
preempt_disable(); // Needed for PREEMPT_AUTO
newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
preempt_enable();
if (!newval && READ_ONCE(ssp->srcu_gp_waiting) && in_task())
swake_up_one(&ssp->srcu_wq);
}
......@@ -117,8 +120,11 @@ void srcu_drive_gp(struct work_struct *wp)
struct srcu_struct *ssp;
ssp = container_of(wp, struct srcu_struct, srcu_work);
if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
preempt_disable(); // Needed for PREEMPT_AUTO
if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) {
return; /* Already running or nothing to do. */
preempt_enable();
}
/* Remove recently arrived callbacks and wait for readers. */
WRITE_ONCE(ssp->srcu_gp_running, true);
......@@ -130,9 +136,12 @@ void srcu_drive_gp(struct work_struct *wp)
idx = (ssp->srcu_idx & 0x2) / 2;
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
preempt_enable();
swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
preempt_disable(); // Needed for PREEMPT_AUTO
WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
preempt_enable();
/* Invoke the callbacks we removed above. */
while (lh) {
......@@ -150,8 +159,11 @@ void srcu_drive_gp(struct work_struct *wp)
* at interrupt level, but the ->srcu_gp_running checks will
* straighten that out.
*/
preempt_disable(); // Needed for PREEMPT_AUTO
WRITE_ONCE(ssp->srcu_gp_running, false);
if (ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max)))
idx = ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max));
preempt_enable();
if (idx)
schedule_work(&ssp->srcu_work);
}
EXPORT_SYMBOL_GPL(srcu_drive_gp);
......@@ -160,9 +172,12 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
{
unsigned long cookie;
preempt_disable(); // Needed for PREEMPT_AUTO
cookie = get_state_synchronize_srcu(ssp);
if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie))
if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie)) {
preempt_enable();
return;
}
WRITE_ONCE(ssp->srcu_idx_max, cookie);
if (!READ_ONCE(ssp->srcu_gp_running)) {
if (likely(srcu_init_done))
......@@ -170,6 +185,7 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
else if (list_empty(&ssp->srcu_work.entry))
list_add(&ssp->srcu_work.entry, &srcu_boot_list);
}
preempt_enable();
}
/*
......@@ -183,11 +199,13 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
rhp->func = func;
rhp->next = NULL;
preempt_disable(); // Needed for PREEMPT_AUTO
local_irq_save(flags);
*ssp->srcu_cb_tail = rhp;
ssp->srcu_cb_tail = &rhp->next;
local_irq_restore(flags);
srcu_gp_start_if_needed(ssp);
preempt_enable();
}
EXPORT_SYMBOL_GPL(call_srcu);
......@@ -241,9 +259,12 @@ EXPORT_SYMBOL_GPL(get_state_synchronize_srcu);
*/
unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
{
unsigned long ret = get_state_synchronize_srcu(ssp);
unsigned long ret;
preempt_disable(); // Needed for PREEMPT_AUTO
ret = get_state_synchronize_srcu(ssp);
srcu_gp_start_if_needed(ssp);
preempt_enable();
return ret;
}
EXPORT_SYMBOL_GPL(start_poll_synchronize_srcu);
......
......@@ -1826,12 +1826,9 @@ static void process_srcu(struct work_struct *work)
srcu_reschedule(ssp, curdelay);
}
void srcutorture_get_gp_data(enum rcutorture_type test_type,
struct srcu_struct *ssp, int *flags,
void srcutorture_get_gp_data(struct srcu_struct *ssp, int *flags,
unsigned long *gp_seq)
{
if (test_type != SRCU_FLAVOR)
return;
*flags = 0;
*gp_seq = rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq);
}
......
......@@ -122,7 +122,7 @@ void rcu_sync_enter(struct rcu_sync *rsp)
* we are called at early boot time but this shouldn't happen.
*/
}
rsp->gp_count++;
WRITE_ONCE(rsp->gp_count, rsp->gp_count + 1);
spin_unlock_irq(&rsp->rss_lock);
if (gp_state == GP_IDLE) {
......@@ -151,11 +151,15 @@ void rcu_sync_enter(struct rcu_sync *rsp)
*/
void rcu_sync_exit(struct rcu_sync *rsp)
{
int gpc;
WARN_ON_ONCE(READ_ONCE(rsp->gp_state) == GP_IDLE);
WARN_ON_ONCE(READ_ONCE(rsp->gp_count) == 0);
spin_lock_irq(&rsp->rss_lock);
if (!--rsp->gp_count) {
gpc = rsp->gp_count - 1;
WRITE_ONCE(rsp->gp_count, gpc);
if (!gpc) {
if (rsp->gp_state == GP_PASSED) {
WRITE_ONCE(rsp->gp_state, GP_EXIT);
rcu_sync_call(rsp);
......
......@@ -74,6 +74,7 @@ struct rcu_tasks_percpu {
* @holdouts_func: This flavor's holdout-list scan function (optional).
* @postgp_func: This flavor's post-grace-period function (optional).
* @call_func: This flavor's call_rcu()-equivalent function.
* @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE).
* @rtpcpu: This flavor's rcu_tasks_percpu structure.
* @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks.
* @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing.
......@@ -107,6 +108,7 @@ struct rcu_tasks {
holdouts_func_t holdouts_func;
postgp_func_t postgp_func;
call_rcu_func_t call_func;
unsigned int wait_state;
struct rcu_tasks_percpu __percpu *rtpcpu;
int percpu_enqueue_shift;
int percpu_enqueue_lim;
......@@ -134,6 +136,7 @@ static struct rcu_tasks rt_name = \
.tasks_gp_mutex = __MUTEX_INITIALIZER(rt_name.tasks_gp_mutex), \
.gp_func = gp, \
.call_func = call, \
.wait_state = TASK_UNINTERRUPTIBLE, \
.rtpcpu = &rt_name ## __percpu, \
.lazy_jiffies = DIV_ROUND_UP(HZ, 4), \
.name = n, \
......@@ -147,7 +150,7 @@ static struct rcu_tasks rt_name = \
#ifdef CONFIG_TASKS_RCU
/* Report delay in synchronize_srcu() completion in rcu_tasks_postscan(). */
/* Report delay of scan exiting tasklist in rcu_tasks_postscan(). */
static void tasks_rcu_exit_srcu_stall(struct timer_list *unused);
static DEFINE_TIMER(tasks_rcu_exit_srcu_stall_timer, tasks_rcu_exit_srcu_stall);
#endif
......@@ -638,7 +641,7 @@ static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
// If the grace-period kthread is running, use it.
if (READ_ONCE(rtp->kthread_ptr)) {
wait_rcu_gp(rtp->call_func);
wait_rcu_gp_state(rtp->wait_state, rtp->call_func);
return;
}
rcu_tasks_one_gp(rtp, true);
......@@ -1160,6 +1163,7 @@ static int __init rcu_spawn_tasks_kthread(void)
rcu_tasks.postscan_func = rcu_tasks_postscan;
rcu_tasks.holdouts_func = check_all_holdout_tasks;
rcu_tasks.postgp_func = rcu_tasks_postgp;
rcu_tasks.wait_state = TASK_IDLE;
rcu_spawn_tasks_kthread_generic(&rcu_tasks);
return 0;
}
......@@ -1178,6 +1182,13 @@ struct task_struct *get_rcu_tasks_gp_kthread(void)
}
EXPORT_SYMBOL_GPL(get_rcu_tasks_gp_kthread);
void rcu_tasks_get_gp_data(int *flags, unsigned long *gp_seq)
{
*flags = 0;
*gp_seq = rcu_seq_current(&rcu_tasks.tasks_gp_seq);
}
EXPORT_SYMBOL_GPL(rcu_tasks_get_gp_data);
/*
* Protect against tasklist scan blind spot while the task is exiting and
* may be removed from the tasklist. Do this by adding the task to yet
......@@ -1199,8 +1210,7 @@ void exit_tasks_rcu_start(void)
rtpcp = this_cpu_ptr(rcu_tasks.rtpcpu);
t->rcu_tasks_exit_cpu = smp_processor_id();
raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
if (!rtpcp->rtp_exit_list.next)
INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
WARN_ON_ONCE(!rtpcp->rtp_exit_list.next);
list_add(&t->rcu_tasks_exit_list, &rtpcp->rtp_exit_list);
raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
preempt_enable();
......@@ -1358,6 +1368,13 @@ struct task_struct *get_rcu_tasks_rude_gp_kthread(void)
}
EXPORT_SYMBOL_GPL(get_rcu_tasks_rude_gp_kthread);
void rcu_tasks_rude_get_gp_data(int *flags, unsigned long *gp_seq)
{
*flags = 0;
*gp_seq = rcu_seq_current(&rcu_tasks_rude.tasks_gp_seq);
}
EXPORT_SYMBOL_GPL(rcu_tasks_rude_get_gp_data);
#endif /* #ifdef CONFIG_TASKS_RUDE_RCU */
////////////////////////////////////////////////////////////////////////
......@@ -1457,6 +1474,7 @@ static void rcu_st_need_qs(struct task_struct *t, u8 v)
/*
* Do a cmpxchg() on ->trc_reader_special.b.need_qs, allowing for
* the four-byte operand-size restriction of some platforms.
*
* Returns the old value, which is often ignored.
*/
u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
......@@ -1468,7 +1486,14 @@ u8 rcu_trc_cmpxchg_need_qs(struct task_struct *t, u8 old, u8 new)
if (trs_old.b.need_qs != old)
return trs_old.b.need_qs;
trs_new.b.need_qs = new;
ret.s = cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s);
// Although cmpxchg() appears to KCSAN to update all four bytes,
// only the .b.need_qs byte actually changes.
instrument_atomic_read_write(&t->trc_reader_special.b.need_qs,
sizeof(t->trc_reader_special.b.need_qs));
// Avoid false-positive KCSAN failures.
ret.s = data_race(cmpxchg(&t->trc_reader_special.s, trs_old.s, trs_new.s));
return ret.b.need_qs;
}
EXPORT_SYMBOL_GPL(rcu_trc_cmpxchg_need_qs);
......@@ -1994,7 +2019,7 @@ void show_rcu_tasks_trace_gp_kthread(void)
{
char buf[64];
sprintf(buf, "N%lu h:%lu/%lu/%lu",
snprintf(buf, sizeof(buf), "N%lu h:%lu/%lu/%lu",
data_race(n_trc_holdouts),
data_race(n_heavy_reader_ofl_updates),
data_race(n_heavy_reader_updates),
......@@ -2010,6 +2035,13 @@ struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
}
EXPORT_SYMBOL_GPL(get_rcu_tasks_trace_gp_kthread);
void rcu_tasks_trace_get_gp_data(int *flags, unsigned long *gp_seq)
{
*flags = 0;
*gp_seq = rcu_seq_current(&rcu_tasks_trace.tasks_gp_seq);
}
EXPORT_SYMBOL_GPL(rcu_tasks_trace_get_gp_data);
#else /* #ifdef CONFIG_TASKS_TRACE_RCU */
static void exit_tasks_rcu_finish_trace(struct task_struct *t) { }
#endif /* #else #ifdef CONFIG_TASKS_TRACE_RCU */
......
......@@ -130,9 +130,7 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
next = list->next;
prefetch(next);
debug_rcu_head_unqueue(list);
local_bh_disable();
rcu_reclaim_tiny(list);
local_bh_enable();
list = next;
}
}
......@@ -155,7 +153,9 @@ void synchronize_rcu(void)
lock_is_held(&rcu_lock_map) ||
lock_is_held(&rcu_sched_lock_map),
"Illegal synchronize_rcu() in RCU read-side critical section");
preempt_disable();
WRITE_ONCE(rcu_ctrlblk.gp_seq, rcu_ctrlblk.gp_seq + 2);
preempt_enable();
}
EXPORT_SYMBOL_GPL(synchronize_rcu);
......
This diff is collapsed.
......@@ -273,9 +273,9 @@ struct rcu_data {
bool rcu_iw_pending; /* Is ->rcu_iw pending? */
unsigned long rcu_iw_gp_seq; /* ->gp_seq associated with ->rcu_iw. */
unsigned long rcu_ofl_gp_seq; /* ->gp_seq at last offline. */
short rcu_ofl_gp_flags; /* ->gp_flags at last offline. */
short rcu_ofl_gp_state; /* ->gp_state at last offline. */
unsigned long rcu_onl_gp_seq; /* ->gp_seq at last online. */
short rcu_onl_gp_flags; /* ->gp_flags at last online. */
short rcu_onl_gp_state; /* ->gp_state at last online. */
unsigned long last_fqs_resched; /* Time of last rcu_resched(). */
unsigned long last_sched_clock; /* Jiffies of last rcu_sched_clock_irq(). */
struct rcu_snap_record snap_record; /* Snapshot of core stats at half of */
......@@ -315,6 +315,19 @@ do { \
__set_current_state(TASK_RUNNING); \
} while (0)
/*
* A max threshold for synchronize_rcu() users which are
* awaken directly by the rcu_gp_kthread(). Left part is
* deferred to the main worker.
*/
#define SR_MAX_USERS_WAKE_FROM_GP 5
#define SR_NORMAL_GP_WAIT_HEAD_MAX 5
struct sr_wait_node {
atomic_t inuse;
struct llist_node node;
};
/*
* RCU global state, including node hierarchy. This hierarchy is
* represented in "heap" form in a dense array. The root (first level)
......@@ -400,6 +413,13 @@ struct rcu_state {
/* Synchronize offline with */
/* GP pre-initialization. */
int nocb_is_setup; /* nocb is setup from boot */
/* synchronize_rcu() part. */
struct llist_head srs_next; /* request a GP users. */
struct llist_node *srs_wait_tail; /* wait for GP users. */
struct llist_node *srs_done_tail; /* ready for GP users. */
struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX];
struct work_struct srs_cleanup_work;
};
/* Values for rcu_state structure's gp_flags field. */
......
......@@ -930,7 +930,7 @@ void synchronize_rcu_expedited(void)
/* If expedited grace periods are prohibited, fall back to normal. */
if (rcu_gp_is_normal()) {
wait_rcu_gp(call_rcu_hurry);
synchronize_rcu_normal();
return;
}
......
......@@ -805,8 +805,8 @@ dump_blkd_tasks(struct rcu_node *rnp, int ncheck)
rdp = per_cpu_ptr(&rcu_data, cpu);
pr_info("\t%d: %c online: %ld(%d) offline: %ld(%d)\n",
cpu, ".o"[rcu_rdp_cpu_online(rdp)],
(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_flags,
(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_flags);
(long)rdp->rcu_onl_gp_seq, rdp->rcu_onl_gp_state,
(long)rdp->rcu_ofl_gp_seq, rdp->rcu_ofl_gp_state);
}
}
......
......@@ -504,7 +504,8 @@ static void print_cpu_stall_info(int cpu)
rcu_dynticks_in_eqs(rcu_dynticks_snap(cpu));
rcuc_starved = rcu_is_rcuc_kthread_starving(rdp, &j);
if (rcuc_starved)
sprintf(buf, " rcuc=%ld jiffies(starved)", j);
// Print signed value, as negative values indicate a probable bug.
snprintf(buf, sizeof(buf), " rcuc=%ld jiffies(starved)", j);
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%04x/%ld/%#lx softirq=%u/%u fqs=%ld%s%s\n",
cpu,
"O."[!!cpu_online(cpu)],
......@@ -579,7 +580,7 @@ static void rcu_check_gp_kthread_expired_fqs_timer(void)
pr_err("%s kthread timer wakeup didn't happen for %ld jiffies! g%ld f%#x %s(%d) ->state=%#x\n",
rcu_state.name, (jiffies - jiffies_fqs),
(long)rcu_seq_current(&rcu_state.gp_seq),
data_race(rcu_state.gp_flags),
data_race(READ_ONCE(rcu_state.gp_flags)), // Diagnostic read
gp_state_getname(RCU_GP_WAIT_FQS), RCU_GP_WAIT_FQS,
data_race(READ_ONCE(gpk->__state)));
pr_err("\tPossible timer handling issue on cpu=%d timer-softirq=%u\n",
......@@ -628,7 +629,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_err("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu ncpus=%d)\n",
smp_processor_id(), (long)(jiffies - gps),
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
data_race(rcu_state.n_online_cpus)); // Diagnostic read
if (ndetected) {
rcu_dump_cpu_stacks();
......@@ -689,7 +691,8 @@ static void print_cpu_stall(unsigned long gps)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_err("\t(t=%lu jiffies g=%ld q=%lu ncpus=%d)\n",
jiffies - gps,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen, rcu_state.n_online_cpus);
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen,
data_race(rcu_state.n_online_cpus)); // Diagnostic read
rcu_check_gp_kthread_expired_fqs_timer();
rcu_check_gp_kthread_starvation();
......
......@@ -408,7 +408,7 @@ void wakeme_after_rcu(struct rcu_head *head)
}
EXPORT_SYMBOL_GPL(wakeme_after_rcu);
void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
void __wait_rcu_gp(bool checktiny, unsigned int state, int n, call_rcu_func_t *crcu_array,
struct rcu_synchronize *rs_array)
{
int i;
......@@ -440,7 +440,7 @@ void __wait_rcu_gp(bool checktiny, int n, call_rcu_func_t *crcu_array,
if (crcu_array[j] == crcu_array[i])
break;
if (j == i) {
wait_for_completion(&rs_array[i].completion);
wait_for_completion_state(&rs_array[i].completion, state);
destroy_rcu_head_on_stack(&rs_array[i].head);
}
}
......
......@@ -163,7 +163,7 @@ config TRACING
select BINARY_PRINTF
select EVENT_TRACING
select TRACE_CLOCK
select TASKS_RCU if PREEMPTION
select NEED_TASKS_RCU
config GENERIC_TRACER
bool
......@@ -204,7 +204,7 @@ config FUNCTION_TRACER
select GENERIC_TRACER
select CONTEXT_SWITCH_TRACER
select GLOB
select TASKS_RCU if PREEMPTION
select NEED_TASKS_RCU
select TASKS_RUDE_RCU
help
Enable the kernel to trace every kernel function. This is done
......
......@@ -3157,8 +3157,7 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
* synchronize_rcu_tasks() will wait for those tasks to
* execute and either schedule voluntarily or enter user space.
*/
if (IS_ENABLED(CONFIG_PREEMPTION))
synchronize_rcu_tasks();
synchronize_rcu_tasks();
ftrace_trampoline_free(ops);
}
......
......@@ -391,7 +391,7 @@ __EOF__
forceflavor="`echo $flavor | sed -e 's/^CONFIG/CONFIG_FORCE/'`"
deselectedflavors="`grep -v $flavor $T/rcutasksflavors | tr '\012' ' ' | tr -s ' ' | sed -e 's/ *$//'`"
echo " --- Running RCU Tasks Trace flavor $flavor `date`" >> $rtfdir/log
tools/testing/selftests/rcutorture/bin/kvm.sh --datestamp "$ds/results-rcutasksflavors/$flavor" --buildonly --configs "TINY01 TREE04" --kconfig "CONFIG_RCU_EXPERT=y CONFIG_RCU_SCALE_TEST=y $forceflavor=y $deselectedflavors" --trust-make > $T/$flavor.out 2>&1
tools/testing/selftests/rcutorture/bin/kvm.sh --datestamp "$ds/results-rcutasksflavors/$flavor" --buildonly --configs "TINY01 TREE04" --kconfig "CONFIG_RCU_EXPERT=y CONFIG_RCU_SCALE_TEST=y CONFIG_KPROBES=n CONFIG_RCU_TRACE=n CONFIG_TRACING=n CONFIG_BLK_DEV_IO_TRACE=n CONFIG_UPROBE_EVENTS=n $forceflavor=y $deselectedflavors" --trust-make > $T/$flavor.out 2>&1
retcode=$?
if test "$retcode" -ne 0
then
......@@ -425,7 +425,7 @@ fi
if test "$do_scftorture" = "yes"
then
# Scale memory based on the number of CPUs.
scfmem=$((2+HALF_ALLOTED_CPUS/16))
scfmem=$((3+HALF_ALLOTED_CPUS/16))
torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
fi
......@@ -559,7 +559,7 @@ do_kcsan="$do_kcsan_save"
if test "$do_kvfree" = "yes"
then
torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
fi
if test "$do_clocksourcewd" = "yes"
......
......@@ -10,8 +10,9 @@ CONFIG_NO_HZ_FULL=n
CONFIG_RCU_TRACE=n
CONFIG_RCU_NOCB_CPU=n
CONFIG_DEBUG_LOCK_ALLOC=n
CONFIG_RCU_BOOST=n
CONFIG_RCU_BOOST=y
CONFIG_RCU_BOOST_DELAY=100
CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
#CHECK#CONFIG_RCU_EXPERT=n
CONFIG_RCU_EXPERT=y
CONFIG_KPROBES=n
CONFIG_FTRACE=n
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment