1. 04 Nov, 2019 1 commit
  2. 27 Sep, 2019 1 commit
    • Balasubramani Vivekanandan's avatar
      tick: broadcast-hrtimer: Fix a race in bc_set_next · b9023b91
      Balasubramani Vivekanandan authored
      When a cpu requests broadcasting, before starting the tick broadcast
      hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active
      using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide
      the required synchronization when the callback is active on other core.
      
      The callback could have already executed tick_handle_oneshot_broadcast()
      and could have also returned. But still there is a small time window where
      the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns
      without doing anything, but the next_event of the tick broadcast clock
      device is already set to a timeout value.
      
      In the race condition diagram below, CPU #1 is running the timer callback
      and CPU #2 is entering idle state and so calls bc_set_next().
      
      In the worst case, the next_event will contain an expiry time, but the
      hrtimer will not be started which happens when the racing callback returns
      HRTIMER_NORESTART. The hrtimer might never recover if all further requests
      from the CPUs to subscribe to tick broadcast have timeout greater than the
      next_event of tick broadcast clock device. This leads to cascading of
      failures and finally noticed as rcu stall warnings
      
      Here is a depiction of the race condition
      
      CPU #1 (Running timer callback)                   CPU #2 (Enter idle
                                                        and subscribe to
                                                        tick broadcast)
      ---------------------                             ---------------------
      
      __run_hrtimer()                                   tick_broadcast_enter()
      
        bc_handler()                                      __tick_broadcast_oneshot_control()
      
          tick_handle_oneshot_broadcast()
      
            raw_spin_lock(&tick_broadcast_lock);
      
            dev->next_event = KTIME_MAX;                  //wait for tick_broadcast_lock
            //next_event for tick broadcast clock
            set to KTIME_MAX since no other cores
            subscribed to tick broadcasting
      
            raw_spin_unlock(&tick_broadcast_lock);
      
          if (dev->next_event == KTIME_MAX)
            return HRTIMER_NORESTART
          // callback function exits without
             restarting the hrtimer                      //tick_broadcast_lock acquired
                                                         raw_spin_lock(&tick_broadcast_lock);
      
                                                         tick_broadcast_set_event()
      
                                                           clockevents_program_event()
      
                                                             dev->next_event = expires;
      
                                                             bc_set_next()
      
                                                               hrtimer_try_to_cancel()
                                                               //returns -1 since the timer
                                                               callback is active. Exits without
                                                               restarting the timer
        cpu_base->running = NULL;
      
      The comment that hrtimer cannot be armed from within the callback is
      wrong. It is fine to start the hrtimer from within the callback. Also it is
      safe to start the hrtimer from the enter/exit idle code while the broadcast
      handler is active. The enter/exit idle code and the broadcast handler are
      synchronized using tick_broadcast_lock. So there is no need for the
      existing try to cancel logic. All this can be removed which will eliminate
      the race condition as well.
      
      Fixes: 5d1638ac ("tick: Introduce hrtimer based broadcast")
      Originally-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarBalasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com
      b9023b91
  3. 26 Sep, 2019 38 commits