    rcu/nocb: Fix RT throttling hrtimer armed from offline CPU · 9139f932
    Frederic Weisbecker authored
    After a CPU is marked offline and until it reaches its final trip to
    idle, rcuo has several opportunities to be woken up, either because
    a callback has been queued in the meantime or because
    rcutree_report_cpu_dead() has issued the final deferred NOCB wake up.
    
    If RCU-boosting is enabled, RCU kthreads are set to SCHED_FIFO policy.
    And if RT-bandwidth is enabled, the related hrtimer might be armed.
    However, this then happens after hrtimers have been migrated at the
    CPUHP_AP_HRTIMERS_DYING stage, which is broken as reported by the
    following warning:
    
     Call trace:
      enqueue_hrtimer+0x7c/0xf8
      hrtimer_start_range_ns+0x2b8/0x300
      enqueue_task_rt+0x298/0x3f0
      enqueue_task+0x94/0x188
      ttwu_do_activate+0xb4/0x27c
      try_to_wake_up+0x2d8/0x79c
      wake_up_process+0x18/0x28
      __wake_nocb_gp+0x80/0x1a0
      do_nocb_deferred_wakeup_common+0x3c/0xcc
      rcu_report_dead+0x68/0x1ac
      cpuhp_report_idle_dead+0x48/0x9c
      do_idle+0x288/0x294
      cpu_startup_entry+0x34/0x3c
      secondary_start_kernel+0x138/0x158
    
    Fix this by waking up rcuo using an IPI if necessary. Since the
    existing API to deal with this situation only handles swait queues, rcuo
    is only woken up from offline CPUs if it's not already waiting on a
    grace period. In the worst case some callbacks will just wait for a
    grace period to complete before being assigned to a subsequent one.
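
The wake-up decision described above can be sketched as a small userspace
model. This is only an illustration of the logic in this message, not the
actual patch: every name below (model_cpu, model_rcuo, model_wake_rcuo, the
wake_path values) is hypothetical and none of them is a kernel API. It also
assumes that an online CPU keeps waking rcuo directly, as before.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum wake_path {
	WAKE_NONE,	/* leave rcuo alone, callbacks wait for the next GP */
	WAKE_DIRECT,	/* wake from the current CPU, may arm the RT hrtimer */
	WAKE_VIA_IPI,	/* ask an online CPU to perform the wake-up */
};

struct model_cpu {
	bool online;		/* false once the CPU has been marked offline */
};

struct model_rcuo {
	bool waiting_on_gp;	/* already sleeping on a grace period */
};

/*
 * Decide how the current CPU should wake rcuo.
 *
 * An offline CPU must not enqueue an RT task itself, since that could arm
 * the RT-bandwidth hrtimer after hrtimers were migrated away. Per the
 * message above, it only delegates the wake-up when rcuo is not already
 * waiting on a grace period.
 */
static enum wake_path model_wake_rcuo(const struct model_cpu *cpu,
				      const struct model_rcuo *rcuo)
{
	if (cpu->online)
		return WAKE_DIRECT;	/* assumption: unchanged online path */

	if (rcuo->waiting_on_gp)
		return WAKE_NONE;	/* worst case: wait for that GP to end */

	return WAKE_VIA_IPI;
}
```

In this model an offline CPU never returns WAKE_DIRECT, which is the
property the fix is after: the enqueue that may arm the hrtimer always runs
on a CPU that is still online.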
    
    Reported-by: "Cheng-Jui Wang (王正睿)" <Cheng-Jui.Wang@mediatek.com>
    Fixes: 5c0930cc ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
tree_nocb.h 50.3 KB