1. 17 Dec, 2010 8 commits
    • rculist: fix borked __list_for_each_rcu() macro · 8a9c1cee
      Mariusz Kozlowski authored
      This restores parenthesis balance.
      Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: reduce __call_rcu()-induced contention on rcu_node structures · b52573d2
      Paul E. McKenney authored
      When the current __call_rcu() function was written, the expedited
      APIs did not exist.  The __call_rcu() implementation therefore went
      to great lengths to detect the end of old grace periods and to start
      new ones, all in the name of reducing grace-period latency.  Now the
      expedited APIs do exist, and the usage of __call_rcu() has increased
      considerably.  This commit therefore causes __call_rcu() to avoid
      worrying about grace periods unless there are a large number of
      RCU callbacks stacked up on the current CPU.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: limit rcu_node leaf-level fanout · 0209f649
      Paul E. McKenney authored
      Some recent benchmarks have indicated possible lock contention on the
      leaf-level rcu_node locks.  This commit therefore limits the number of
      CPUs per leaf-level rcu_node structure to 16; in other words, at
      most 16 rcu_data structures can fan into a given rcu_node
      structure.  Previously, the limit was 32 on 32-bit systems and 64
      on 64-bit systems.
      
      Note that the fanout of non-leaf rcu_node structures is unchanged.  The
      organization of accesses to the rcu_node tree is such that references
      to non-leaf rcu_node structures are much less frequent than to the
      leaf structures.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: fine-tune grace-period begin/end checks · 121dfc4b
      Paul E. McKenney authored
      Use the CPU's bit in rnp->qsmask to determine whether or not the CPU
      should try to report a quiescent state.  Handle overflow in the check
      for rdp->gpnum having fallen behind.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Keep gpnum and completed fields synchronized · 5ff8e6f0
      Frederic Weisbecker authored
      When a CPU that was in an extended quiescent state wakes
      up and catches up with grace periods that remote CPUs
      completed on its behalf, we update the completed field
      but not gpnum, which keeps the stale value of an earlier
      grace-period ID.
      
      Later, note_new_gpnum() will interpret the gap between
      the local CPU's and the node's grace-period IDs as a new
      grace period to handle, and will start hunting for a
      quiescent state.
      
      But if every grace period has already been completed, this
      interpretation is wrong, and we get stuck in clusters of
      spurious softirqs because rcu_report_qs_rdp() drives this
      broken state into an infinite loop.
      
      The solution, as suggested by Lai Jiangshan, is to ensure that
      the gpnum and completed fields are kept synchronized when we
      catch up with grace periods that other CPUs completed on our
      behalf.  This way we won't start noting spurious new grace
      periods.
      Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: Stop chasing QS if another CPU did it for us · 20377f32
      Frederic Weisbecker authored
      When a CPU is idle and other CPUs handle its extended
      quiescent state to complete grace periods on its behalf,
      it catches up with the completed grace-period numbers
      when it wakes up.
      
      But at that point there may be no more grace periods to
      complete; the woken CPU nevertheless keeps its stale
      qs_pending value and continues to chase quiescent states
      even though that is no longer needed.
      
      This results in clusters of spurious softirqs until a new
      real grace period is started, because if we keep chasing
      quiescent states after every grace period has completed,
      rcu_report_qs_rdp() is puzzled and drives that state into
      an infinite loop.
      
      As suggested by Lai Jiangshan, simply reset qs_pending if
      someone completed every grace period on our behalf.
      Suggested-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: increase synchronize_sched_expedited() batching · e27fc964
      Tejun Heo authored
      The fix in commit #6a0cc49 requires more than three concurrent instances
      of synchronize_sched_expedited() before batching is possible.  This
      patch uses a ticket-counter-like approach that is also not unrelated to
      Lai Jiangshan's Ring RCU to allow sharing of expedited grace periods even
      when there are only two concurrent instances of synchronize_sched_expedited().
      
      This commit builds on Tejun's original posting, which may be found at
      http://lkml.org/lkml/2010/11/9/204, adding memory barriers, avoiding
      overflow of signed integers (other than via atomic_t), and fixing the
      detection of batching.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  2. 30 Nov, 2010 10 commits
  3. 17 Nov, 2010 1 commit
    • rcu: move TINY_RCU from softirq to kthread · b2c0710c
      Paul E. McKenney authored
      If RCU priority boosting is to be meaningful, callback invocation must
      be boosted in addition to preempted RCU readers.  Otherwise, in the
      presence of CPU-bound real-time threads, the grace period ends, but
      the callbacks don't
      get invoked.  If the callbacks don't get invoked, the associated memory
      doesn't get freed, so the system is still subject to OOM.
      
      But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
      moves the callback invocations to a kthread, which can be boosted easily.
      Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  4. 07 Oct, 2010 7 commits
    • rcu: add priority-inversion testing to rcutorture · 8e8be45e
      Paul E. McKenney authored
      Add an optional test to force long-term preemption of RCU read-side
      critical sections, controlled by new test_boost, test_boost_interval,
      and test_boost_duration module parameters.  This is to be used to
      test RCU priority boosting.
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • sched: fix RCU lockdep splat from task_group() · 6506cf6c
      Peter Zijlstra authored
      This addresses the following RCU lockdep splat:
      
      [0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
      [0.052999] lockdep: fixing up alternatives.
      [0.054105]
      [0.054106] ===================================================
      [0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
      [0.054999] ---------------------------------------------------
      [0.054999] kernel/sched.c:616 invoked rcu_dereference_check() without protection!
      [0.054999]
      [0.054999] other info that might help us debug this:
      [0.054999]
      [0.054999]
      [0.054999] rcu_scheduler_active = 1, debug_locks = 1
      [0.054999] 3 locks held by swapper/1:
      [0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff814be933>] cpu_up+0x42/0x6a
      [0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff810400d8>] cpu_hotplug_begin+0x2a/0x51
      [0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>] init_idle+0x2f/0x113
      [0.054999]
      [0.054999] stack backtrace:
      [0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
      [0.054999] Call Trace:
      [0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
      [0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
      [0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
      [0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
      [0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
      [0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
      [0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
      [0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
      [0.054999]  [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
      [0.054999]  [<ffffffff814be876>] _cpu_up+0xac/0x127
      [0.054999]  [<ffffffff814be946>] cpu_up+0x55/0x6a
      [0.054999]  [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
      [0.054999]  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
      [0.054999]  [<ffffffff814c353c>] ? restore_args+0x0/0x30
      [0.054999]  [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
      [0.054999]  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
      [0.056074] Booting Node   0, Processors  #1lockdep: fixing up alternatives.
      [0.130045]  #2lockdep: fixing up alternatives.
      [0.203089]  #3 Ok.
      [0.275286] Brought up 4 CPUs
      [0.276005] Total of 4 processors activated (16017.17 BogoMIPS).
      
      The cgroup_subsys_state structures referenced by idle tasks are never
      freed, because the idle tasks should be part of the root cgroup,
      which is not removable.
      
      The problem is that while we do in-fact hold rq->lock, the newly spawned
      idle thread's cpu is not yet set to the correct cpu so the lockdep check
      in task_group():
      
        lockdep_is_held(&task_rq(p)->lock)
      
      will fail.
      
      But this is a chicken and egg problem.  Setting the CPU's runqueue requires
      that the CPU's runqueue already be set.  ;-)
      
      So insert an RCU read-side critical section to avoid the complaint.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value · 4ee0a603
      Dongdong Deng authored
      Use ACCESS_ONCE() to observe the jiffies_stall and rnp->qsmask
      values, because the caller does not hold the root rcu_node
      structure's lock.  Although use without ACCESS_ONCE() is safe,
      since each loaded value is used only once, ACCESS_ONCE() is a good
      documentation aid -- the variables are being loaded without the
      services of a lock.
      Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
      CC: Dipankar Sarma <dipankar@in.ibm.com>
      CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • sched: suppress RCU lockdep splat in task_fork_fair · b0a0f667
      Paul E. McKenney authored
      > ===================================================
      > [ INFO: suspicious rcu_dereference_check() usage. ]
      > ---------------------------------------------------
      > /home/greearb/git/linux.wireless-testing/kernel/sched.c:618 invoked rcu_dereference_check() without protection!
      >
      > other info that might help us debug this:
      >
      > rcu_scheduler_active = 1, debug_locks = 1
      > 1 lock held by ifup/23517:
      >   #0:  (&rq->lock){-.-.-.}, at: [<c042f782>] task_fork_fair+0x3b/0x108
      >
      > stack backtrace:
      > Pid: 23517, comm: ifup Not tainted 2.6.36-rc6-wl+ #5
      > Call Trace:
      >   [<c075e219>] ? printk+0xf/0x16
      >   [<c0455842>] lockdep_rcu_dereference+0x74/0x7d
      >   [<c0426854>] task_group+0x6d/0x79
      >   [<c042686e>] set_task_rq+0xe/0x57
      >   [<c042f79e>] task_fork_fair+0x57/0x108
      >   [<c042e965>] sched_fork+0x82/0xf9
      >   [<c04334b3>] copy_process+0x569/0xe8e
      >   [<c0433ef0>] do_fork+0x118/0x262
      >   [<c076302f>] ? do_page_fault+0x16a/0x2cf
      >   [<c044b80c>] ? up_read+0x16/0x2a
      >   [<c04085ae>] sys_clone+0x1b/0x20
      >   [<c04030a5>] ptregs_clone+0x15/0x30
      >   [<c0402f1c>] ? sysenter_do_call+0x12/0x38
      
      Here a newly created task is having its runqueue assigned.  The new
      task is not yet on the tasklist, so it cannot go away.  This is
      therefore a false positive; suppress it with an RCU read-side
      critical section.
      
      Reported-by: Ben Greear <greearb@candelatech.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Ben Greear <greearb@candelatech.com>
    • net: suppress RCU lockdep false positive in sock_update_classid · 1144182a
      Paul E. McKenney authored
      > ===================================================
      > [ INFO: suspicious rcu_dereference_check() usage. ]
      > ---------------------------------------------------
      > include/linux/cgroup.h:542 invoked rcu_dereference_check() without protection!
      >
      > other info that might help us debug this:
      >
      >
      > rcu_scheduler_active = 1, debug_locks = 0
      > 1 lock held by swapper/1:
      >  #0:  (net_mutex){+.+.+.}, at: [<ffffffff813e9010>]
      > register_pernet_subsys+0x1f/0x47
      >
      > stack backtrace:
      > Pid: 1, comm: swapper Not tainted 2.6.35.4-28.fc14.x86_64 #1
      > Call Trace:
      >  [<ffffffff8107bd3a>] lockdep_rcu_dereference+0xaa/0xb3
      >  [<ffffffff813e04b9>] sock_update_classid+0x7c/0xa2
      >  [<ffffffff813e054a>] sk_alloc+0x6b/0x77
      >  [<ffffffff8140b281>] __netlink_create+0x37/0xab
      >  [<ffffffff813f941c>] ? rtnetlink_rcv+0x0/0x2d
      >  [<ffffffff8140cee1>] netlink_kernel_create+0x74/0x19d
      >  [<ffffffff8149c3ca>] ? __mutex_lock_common+0x339/0x35b
      >  [<ffffffff813f7e9c>] rtnetlink_net_init+0x2e/0x48
      >  [<ffffffff813e8d7a>] ops_init+0xe9/0xff
      >  [<ffffffff813e8f0d>] register_pernet_operations+0xab/0x130
      >  [<ffffffff813e901f>] register_pernet_subsys+0x2e/0x47
      >  [<ffffffff81db7bca>] rtnetlink_init+0x53/0x102
      >  [<ffffffff81db835c>] netlink_proto_init+0x126/0x143
      >  [<ffffffff81db8236>] ? netlink_proto_init+0x0/0x143
      >  [<ffffffff810021b8>] do_one_initcall+0x72/0x186
      >  [<ffffffff81d78ebc>] kernel_init+0x23b/0x2c9
      >  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
      >  [<ffffffff8149e2d0>] ? restore_args+0x0/0x30
      >  [<ffffffff81d78c81>] ? kernel_init+0x0/0x2c9
      >  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
      
      The sock_update_classid() function calls task_cls_classid(current),
      but the calling task cannot go away, so there is no danger of
      the associated structures disappearing.  Insert an RCU read-side
      critical section to suppress the false positive.
      Reported-by: Subrata Modak <subrata@linux.vnet.ibm.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    • Merge commit 'v2.6.36-rc7' into core/rcu · 556ef632
      Ingo Molnar authored
      Merge reason: Update from -rc3 to -rc7.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu · d4f8f217
      Ingo Molnar authored
  5. 06 Oct, 2010 6 commits
  6. 05 Oct, 2010 7 commits
  7. 04 Oct, 2010 1 commit