    ntp: Fix leap-second hrtimer livelock · a57ccabe
    John Stultz authored
    This is a backport of 6b43ae8a
    
    This should have been backported when it was committed, but I
    mistook the problem as requiring the ntp_lock changes
    that landed in 3.4 in order for it to occur.
    
    Unfortunately the same issue can happen (with only one cpu)
    as follows:
    do_adjtimex()
     write_seqlock_irq(&xtime_lock);
      process_adjtimex_modes()
       process_adj_status()
        ntp_start_leap_timer()
         hrtimer_start()
          hrtimer_reprogram()
           tick_program_event()
            clockevents_program_event()
             ktime_get()
              seq = read_seqbegin(xtime_lock); [DEADLOCK]
    
    This deadlock will not always occur, as it requires the
    leap_timer to force a hrtimer_reprogram, which only happens
    if it's set and there's no sooner timer to expire.
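
The single-CPU case above can be illustrated with a toy userspace model of a
sequence lock. This is only a sketch under stated assumptions: the `toy_*`
names are hypothetical, the counter is not the kernel's seqlock
implementation, and the reader gives up after a bounded number of spins so
the example terminates instead of hanging the way a real self-deadlock would.

```c
#include <stdbool.h>

/* Toy sequence lock: the counter is odd while a writer holds the lock. */
struct toy_seqlock {
    unsigned seq;
};

static void toy_write_seqlock(struct toy_seqlock *sl)   { sl->seq++; } /* becomes odd  */
static void toy_write_sequnlock(struct toy_seqlock *sl) { sl->seq++; } /* becomes even */

/*
 * Bounded stand-in for a reader's "begin" step: wait for an even sequence.
 * Returns false if the sequence stayed odd for max_spins iterations, which
 * in this model means the reader would have spun forever.
 */
static bool toy_read_seqbegin_bounded(const struct toy_seqlock *sl,
                                      unsigned max_spins, unsigned *seq_out)
{
    for (unsigned i = 0; i < max_spins; i++) {
        unsigned seq = sl->seq;
        if ((seq & 1) == 0) {
            *seq_out = seq;
            return true;
        }
    }
    return false;
}

/*
 * Models the call chain: the write side is taken, then a nested call
 * (standing in for ktime_get()) tries to start a read on the same lock.
 * Nothing else can release the write side, so the read never succeeds.
 */
static bool toy_adjtimex_with_hrtimer(struct toy_seqlock *xtime_lock)
{
    unsigned seq;
    bool read_ok;

    toy_write_seqlock(xtime_lock);
    read_ok = toy_read_seqbegin_bounded(xtime_lock, 1000, &seq);
    toy_write_sequnlock(xtime_lock);
    return read_ok;
}
```

In this model `toy_adjtimex_with_hrtimer()` returns false, while the same read
attempted after the write side is released succeeds immediately.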
    
    NOTE: This patch, being faithful to the original commit,
    introduces a bug (we don't update wall_to_monotonic),
    which will be resolved by backporting a following fix.
    
    Original commit message below:
    
    Since commit 7dffa3c6 the ntp
    subsystem has used an hrtimer for triggering the leapsecond
    adjustment. However, this can cause a potential livelock.
    
    Thomas diagnosed this as the following pattern:
    CPU 0                                   CPU 1
    do_adjtimex()
      spin_lock_irq(&ntp_lock);
        process_adjtimex_modes();           timer_interrupt()
          process_adj_status();               do_timer()
            ntp_start_leap_timer();             write_lock(&xtime_lock);
              hrtimer_start();                    update_wall_time();
                hrtimer_reprogram();                ntp_tick_length()
                  tick_program_event()                spin_lock(&ntp_lock);
                    clockevents_program_event()
                      ktime_get()
                        seq = read_seqbegin(xtime_lock);
    
    This patch tries to avoid the problem by reverting back to not using
    an hrtimer to inject leapseconds, and instead we handle the leapsecond
    processing in the second_overflow() function.
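
As a rough illustration of handling the leap second from the per-second
overflow path instead of a timer, the following is a simplified sketch. The
`toy_*` names, the boolean request flags and the exact state transitions are
assumptions made for illustration; this is not the code in ntp.c.

```c
#include <stdbool.h>

/* Toy leap-second states; names are illustrative only. */
enum toy_leap_state { TOY_OK, TOY_INS, TOY_DEL, TOY_OOP, TOY_WAIT };

struct toy_ntp {
    enum toy_leap_state state;
    bool ins_requested;   /* stands in for an "insert leap second" request */
    bool del_requested;   /* stands in for a "delete leap second" request */
};

/*
 * Called once per elapsed second from the periodic tick path.
 * Returns the adjustment (in seconds) to apply to wall time:
 * -1 inserts a leap second, +1 deletes one, 0 means no change.
 */
static int toy_second_overflow(struct toy_ntp *ntp, unsigned long secs)
{
    int leap = 0;

    switch (ntp->state) {
    case TOY_OK:
        if (ntp->ins_requested)
            ntp->state = TOY_INS;
        else if (ntp->del_requested)
            ntp->state = TOY_DEL;
        break;
    case TOY_INS:
        if (secs % 86400 == 0) {        /* day boundary: repeat one second */
            leap = -1;
            ntp->state = TOY_OOP;
        }
        break;
    case TOY_DEL:
        if ((secs + 1) % 86400 == 0) {  /* last second of the day: skip it */
            leap = 1;
            ntp->state = TOY_WAIT;
        }
        break;
    case TOY_OOP:                       /* leap second in progress is over */
        ntp->state = TOY_WAIT;
        break;
    case TOY_WAIT:                      /* wait for the request to be cleared */
        if (!ntp->ins_requested && !ntp->del_requested)
            ntp->state = TOY_OK;
        break;
    }
    return leap;
}
```

Because the function only runs when a second rolls over in the tick path, no
timer needs to be armed or reprogrammed while a timekeeping lock is held.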
    
    The downside to this change is that on systems that support highres
    timers, the leap second processing will occur on a HZ tick boundary
    (i.e. ~1-10ms, depending on HZ) after the leap second instead of
    possibly sooner (~34us in my tests w/ x86_64 lapic).
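
The ~1-10ms figure follows from the tick period, which is one second divided
by HZ. A minimal sketch of that arithmetic (the helper name is hypothetical):

```c
/* Tick period in microseconds for a given HZ (ticks per second). */
static unsigned long toy_tick_period_us(unsigned long hz)
{
    return 1000000UL / hz;
}
```

With HZ=100 this gives 10000us (10ms) and with HZ=1000 it gives 1000us (1ms),
so the leap second is applied at most about one tick period late.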
    
    This patch applies on top of tip/timers/core.
    
    CC: Sasha Levin <levinsasha928@gmail.com>
    CC: Thomas Gleixner <tglx@linutronix.de>
    Reported-by: Sasha Levin <levinsasha928@gmail.com>
    Diagnosed-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Sasha Levin <levinsasha928@gmail.com>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ntp.c 23.1 KB