14 Feb, 2024 (6 commits)
    • rcu/nocb: Check rdp_gp->nocb_timer in __call_rcu_nocb_wake() · f3c4c007
      Zqiang authored
      Currently, only rdp_gp->nocb_timer is used. For the nocb_timer of a
      non-rdp_gp structure, timer_pending() always returns false. This
      commit therefore checks rdp_gp->nocb_timer in
      __call_rcu_nocb_wake().
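      A minimal sketch of the change, assuming the field and helper names
      of kernel/rcu/tree_nocb.h (the surrounding deferred-wakeup logic is
      heavily simplified):

      	/*
      	 * Sketch: only the rdp_gp leader's ->nocb_timer is ever armed, so
      	 * pending-ness must be tested on rdp_gp rather than on the local
      	 * rdp, whose timer_pending() is always false.
      	 */
      	struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;

      	if ((rdp->nocb_cb_sleep || !rcu_segcblist_ready_cbs(&rdp->cblist)) &&
      	    !timer_pending(&rdp_gp->nocb_timer))	/* was: &rdp->nocb_timer */
      		wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
      				   TPS("WakeOvfIsDeferred"));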
      Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
    • rcu/nocb: Fix WARN_ON_ONCE() in the rcu_nocb_bypass_lock() · dda98810
      Zqiang authored
      For kernels built with CONFIG_RCU_NOCB_CPU_DEFAULT_ALL=y and
      CONFIG_RCU_LAZY=y, the following scenario triggers the WARN_ON_ONCE()
      checks in rcu_nocb_bypass_lock() and rcu_nocb_wait_contended():
      
              CPU2                                               CPU11
      kthread
      rcu_nocb_cb_kthread                                       ksys_write
      rcu_do_batch                                              vfs_write
      rcu_torture_timer_cb                                      proc_sys_write
      __kmem_cache_free                                         proc_sys_call_handler
      kmemleak_free                                             drop_caches_sysctl_handler
      delete_object_full                                        drop_slab
      __delete_object                                           shrink_slab
      put_object                                                lazy_rcu_shrink_scan
      call_rcu                                                  rcu_nocb_flush_bypass
      __call_rcu_common                                           rcu_nocb_bypass_lock
                                                                  raw_spin_trylock(&rdp->nocb_bypass_lock) fail
                                                                  atomic_inc(&rdp->nocb_lock_contended);
      rcu_nocb_wait_contended                                     WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
       WARN_ON_ONCE(atomic_read(&rdp->nocb_lock_contended))                                          |
                                  |_ _ _ _ _ _ _ _ _ _same rdp and rdp->cpu != 11_ _ _ _ _ _ _ _ _ __|
      
      This bug can be reproduced by running "echo 3 > /proc/sys/vm/drop_caches".
      
      This commit therefore uses rcu_nocb_try_flush_bypass() instead of
      rcu_nocb_flush_bypass() in lazy_rcu_shrink_scan().  If the nocb_bypass
      queue is already being flushed, rcu_nocb_try_flush_bypass() returns
      immediately rather than contending for the lock.
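      A minimal sketch of the fix, assuming the loop shape of
      lazy_rcu_shrink_scan() in kernel/rcu/tree_nocb.h (error handling and
      accounting elided):

      	/*
      	 * Sketch: rcu_nocb_try_flush_bypass() trylocks ->nocb_bypass_lock
      	 * and bails out if the lock is already held, instead of spinning
      	 * on a remote rdp's lock and tripping the owner-CPU checks in
      	 * rcu_nocb_bypass_lock() and rcu_nocb_wait_contended().
      	 */
      	if (!rcu_nocb_try_flush_bypass(rdp, jiffies)) {
      		rcu_nocb_unlock_irqrestore(rdp, flags);
      		continue;	/* bypass list busy; skip this rdp */
      	}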
      Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
    • rcu/nocb: Re-arrange call_rcu() NOCB specific code · afd4e696
      Frederic Weisbecker authored
      Currently the call_rcu() function interleaves NOCB and !NOCB enqueue
      code in a complicated way such that:
      
      * The bypass enqueue code may or may not have enqueued and may or may
        not have locked the ->nocb_lock. Everything that follows is in a
        Schrödinger locking state for the unwary reviewer's eyes.
      
      * The was_alldone variable is always set but is used only in
        NOCB-related code.
      
      * The NOCB wakeup is only distantly related to the locking hopefully
        performed by the bypass enqueue code when it did not enqueue on the
        bypass list.
      
      Untangle the whole and gather the NOCB and !NOCB specific enqueue
      code into their own functions, as sketched below.
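      A minimal sketch of the resulting shape; the helper names
      call_rcu_nocb() and call_rcu_core() are illustrative and their bodies
      are elided:

      	static void __call_rcu_common(struct rcu_head *head,
      				      rcu_callback_t func, bool lazy)
      	{
      		unsigned long flags;
      		struct rcu_data *rdp;

      		local_irq_save(flags);
      		rdp = this_cpu_ptr(&rcu_data);
      		head->func = func;

      		if (unlikely(rcu_rdp_is_offloaded(rdp)))
      			call_rcu_nocb(rdp, head, func, flags, lazy);	/* all NOCB enqueue + wakeup */
      		else
      			call_rcu_core(rdp, head, func, flags);		/* all !NOCB enqueue */
      		local_irq_restore(flags);
      	}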
      Signed-off-by: default avatarFrederic Weisbecker <frederic@kernel.org>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@kernel.org>
      Signed-off-by: default avatarBoqun Feng <boqun.feng@gmail.com>
      afd4e696
    • rcu/nocb: Make IRQs disablement symmetric · b913c3fe
      Frederic Weisbecker authored
      Currently IRQs are disabled in call_rcu() and then, depending on the
      context:
      
      * If the CPU is in nocb mode:
      
         - If the callback is enqueued in the bypass list, IRQs are re-enabled
           implicitly by rcu_nocb_try_bypass()
      
         - If the callback is enqueued in the normal list, IRQs are re-enabled
           implicitly by __call_rcu_nocb_wake()
      
      * If the CPU is NOT in nocb mode, IRQs are re-enabled explicitly from call_rcu()
      
      This makes the code a bit hard to follow, especially as it interleaves
      with nocb locking.
      
      To make the IRQ flags coverage clearer, and to prepare for moving all
      the nocb enqueue code to its own function, always re-enable the IRQ
      flags explicitly from call_rcu(), as sketched below.
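      A minimal before/after sketch, assuming the helper signatures of
      kernel/rcu/tree_nocb.h (control flow heavily simplified):

      	/* Before (asymmetric): helpers re-enable IRQs internally. */
      	local_irq_save(flags);
      	if (rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
      		return;		/* IRQs already re-enabled by the helper */
      	__call_rcu_nocb_wake(rdp, was_alldone, flags);	/* ditto */

      	/*
      	 * After (symmetric): helpers keep IRQs off; call_rcu() performs
      	 * the one explicit re-enable.
      	 */
      	local_irq_save(flags);
      	if (!rcu_nocb_try_bypass(rdp, head, &was_alldone, flags, lazy))
      		__call_rcu_nocb_wake(rdp, was_alldone, flags);
      	local_irq_restore(flags);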
      Reviewed-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
    • rcu/nocb: Remove needless full barrier after callback advancing · 1e8e6951
      Frederic Weisbecker authored
      A full barrier is issued from nocb_gp_wait() after callbacks are
      advanced, so as to order grace-period completion against callback
      execution.

      However these two events are already ordered by the
      smp_mb__after_unlock_lock() barrier within the call to
      raw_spin_lock_rcu_node() that is necessary for callback advancing to
      happen.
      
      The following litmus test shows the kind of guarantee that this barrier
      provides:
      
      	C smp_mb__after_unlock_lock
      
      	{}
      
      	// rcu_gp_cleanup()
      	P0(spinlock_t *rnp_lock, int *gpnum)
      	{
      		// Grace-period cleanup increases the gp sequence number
      		spin_lock(rnp_lock);
      		WRITE_ONCE(*gpnum, 1);
      		spin_unlock(rnp_lock);
      	}
      
      	// nocb_gp_wait()
      	P1(spinlock_t *rnp_lock, spinlock_t *nocb_lock, int *gpnum, int *cb_ready)
      	{
      		int r1;
      
      		// Call rcu_advance_cbs() from nocb_gp_wait()
      		spin_lock(nocb_lock);
      		spin_lock(rnp_lock);
      		smp_mb__after_unlock_lock();
      		r1 = READ_ONCE(*gpnum);
      		WRITE_ONCE(*cb_ready, 1);
      		spin_unlock(rnp_lock);
      		spin_unlock(nocb_lock);
      	}
      
      	// nocb_cb_wait()
      	P2(spinlock_t *nocb_lock, int *cb_ready, int *cb_executed)
      	{
      		int r2;
      
      		// rcu_do_batch() -> rcu_segcblist_extract_done_cbs()
      		spin_lock(nocb_lock);
      		r2 = READ_ONCE(*cb_ready);
      		spin_unlock(nocb_lock);
      
      		// Actual callback execution
      		WRITE_ONCE(*cb_executed, 1);
      	}
      
      	P3(int *cb_executed, int *gpnum)
      	{
      		int r3;
      
      		WRITE_ONCE(*cb_executed, 2);
      		smp_mb();
      		r3 = READ_ONCE(*gpnum);
      	}
      
      	exists (1:r1=1 /\ 2:r2=1 /\ cb_executed=2 /\ 3:r3=0) (* Bad outcome. *)
      
      Here the bad outcome only occurs if the smp_mb__after_unlock_lock()
      is removed. This barrier orders grace-period completion against
      callback advancing, and even against later callback invocation,
      thanks to the opportunistic propagation via ->nocb_lock to
      nocb_cb_wait().

      Therefore the smp_mb() placed after callback advancing can safely be
      removed, as sketched below.
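      A minimal sketch of the removal in nocb_gp_wait(), assuming the
      surrounding code shape (simplified):

      	if (rcu_segcblist_ready_cbs(&rdp->cblist)) {
      		needwake = rdp->nocb_cb_sleep;
      		WRITE_ONCE(rdp->nocb_cb_sleep, false);
      		/*
      		 * smp_mb() removed here: grace-period completion is already
      		 * ordered against callback advancing by the
      		 * smp_mb__after_unlock_lock() in raw_spin_lock_rcu_node().
      		 */
      	}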
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
    • rcu/nocb: Remove needless LOAD-ACQUIRE · ca16265a
      Frederic Weisbecker authored
      The LOAD-ACQUIRE access performed on rdp->nocb_cb_sleep advertises
      ordering of callback execution against grace-period completion.
      However this is contradicted by the following:
      
      * This LOAD-ACQUIRE doesn't pair with anything. The only counterpart
        barrier that can be found is the smp_mb() placed after callback
        advancing in nocb_gp_wait(). However that barrier is placed _after_
        the ->nocb_cb_sleep write.
      
      * Callbacks can be concurrently advanced between the LOAD-ACQUIRE on
        ->nocb_cb_sleep and the call to rcu_segcblist_extract_done_cbs() in
        rcu_do_batch(), making any ordering based on ->nocb_cb_sleep broken.
      
      * Both rcu_segcblist_extract_done_cbs() and rcu_advance_cbs() are
        called under ->nocb_lock, whose acquisition already provides the
        desired ACQUIRE semantics.
      
      Therefore it is safe to access ->nocb_cb_sleep with a simple compiler
      barrier.
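      A minimal sketch of the change, assuming the access sits in the
      wakeup-recheck loop of nocb_cb_wait() (simplified):

      	/*
      	 * ACQUIRE dropped: rcu_advance_cbs() and
      	 * rcu_segcblist_extract_done_cbs() both run under ->nocb_lock,
      	 * whose acquisition already provides the needed ordering.
      	 */
      	if (READ_ONCE(rdp->nocb_cb_sleep)) {	/* was: smp_load_acquire() */
      		WARN_ON(signal_pending(current));
      		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
      	}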
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>