1. 16 Oct, 2019 1 commit
    • Fabien Parent's avatar
      clocksource/drivers/mediatek: Fix error handling · 41d49e79
      Fabien Parent authored
      When timer_of_init fails, it cleans up after itself by undoing
      everything it did during the initialization function.
      
      mtk_syst_init and mtk_gpt_init both call timer_of_cleanup if
      timer_of_init fails. timer_of_cleanup try to release the resource
      taken.  Since these resources have already been cleaned up by
      timer_of_init, we end up getting a few warnings printed:
      
      [    0.001935] WARNING: CPU: 0 PID: 0 at __clk_put+0xe8/0x128
      [    0.002650] Modules linked in:
      [    0.003058] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.67+ #1
      [    0.003852] Hardware name: MediaTek MT8183 (DT)
      [    0.004446] pstate: 20400085 (nzCv daIf +PAN -UAO)
      [    0.005073] pc : __clk_put+0xe8/0x128
      [    0.005555] lr : clk_put+0xc/0x14
      [    0.005988] sp : ffffff80090b3ea0
      [    0.006422] x29: ffffff80090b3ea0 x28: 0000000040e20018
      [    0.007121] x27: ffffffc07bfff780 x26: 0000000000000001
      [    0.007819] x25: ffffff80090bda80 x24: ffffff8008ec5828
      [    0.008517] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
      [    0.009216] x21: 0000000000000001 x20: fffffffffffffdfb
      [    0.009914] x19: ffffff8009166180 x18: 00000000002bffa8
      [    0.010612] x17: ffffffc012996980 x16: 0000000000000000
      [    0.011311] x15: ffffffbf004a6800 x14: 3536343038393334
      [    0.012009] x13: 2079726576652073 x12: 7eb9c62c5c38f100
      [    0.012707] x11: ffffff80090b3ba0 x10: ffffff80090b3ba0
      [    0.013405] x9 : 0000000000000004 x8 : 0000000000000040
      [    0.014103] x7 : ffffffc079400270 x6 : 0000000000000000
      [    0.014801] x5 : ffffffc079400248 x4 : 0000000000000000
      [    0.015499] x3 : 0000000000000000 x2 : 0000000000000000
      [    0.016197] x1 : ffffff80091661c0 x0 : fffffffffffffdfb
      [    0.016896] Call trace:
      [    0.017218]  __clk_put+0xe8/0x128
      [    0.017654]  clk_put+0xc/0x14
      [    0.018048]  timer_of_cleanup+0x60/0x7c
      [    0.018551]  mtk_syst_init+0x8c/0x9c
      [    0.019020]  timer_probe+0x6c/0xe0
      [    0.019469]  time_init+0x14/0x44
      [    0.019893]  start_kernel+0x2d0/0x46c
      [    0.020378] ---[ end trace 8c1efabea1267649 ]---
      [    0.020982] ------------[ cut here ]------------
      [    0.021586] Trying to vfree() nonexistent vm area ((____ptrval____))
      [    0.022427] WARNING: CPU: 0 PID: 0 at __vunmap+0xd0/0xd8
      [    0.023119] Modules linked in:
      [    0.023524] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         4.19.67+ #1
      [    0.024498] Hardware name: MediaTek MT8183 (DT)
      [    0.025091] pstate: 60400085 (nZCv daIf +PAN -UAO)
      [    0.025718] pc : __vunmap+0xd0/0xd8
      [    0.026176] lr : __vunmap+0xd0/0xd8
      [    0.026632] sp : ffffff80090b3e90
      [    0.027066] x29: ffffff80090b3e90 x28: 0000000040e20018
      [    0.027764] x27: ffffffc07bfff780 x26: 0000000000000001
      [    0.028462] x25: ffffff80090bda80 x24: ffffff8008ec5828
      [    0.029160] x23: ffffff80090bd000 x22: ffffff8008d8b2e8
      [    0.029858] x21: 0000000000000000 x20: 0000000000000000
      [    0.030556] x19: ffffff800800d000 x18: 00000000002bffa8
      [    0.031254] x17: 0000000000000000 x16: 0000000000000000
      [    0.031952] x15: ffffffbf004a6800 x14: 3536343038393334
      [    0.032651] x13: 2079726576652073 x12: 7eb9c62c5c38f100
      [    0.033349] x11: ffffff80090b3b40 x10: ffffff80090b3b40
      [    0.034047] x9 : 0000000000000005 x8 : 5f5f6c6176727470
      [    0.034745] x7 : 5f5f5f5f28282061 x6 : ffffff80091c86ef
      [    0.035443] x5 : ffffff800852b690 x4 : 0000000000000000
      [    0.036141] x3 : 0000000000000002 x2 : 0000000000000002
      [    0.036839] x1 : 7eb9c62c5c38f100 x0 : 7eb9c62c5c38f100
      [    0.037536] Call trace:
      [    0.037859]  __vunmap+0xd0/0xd8
      [    0.038271]  vunmap+0x24/0x30
      [    0.038664]  __iounmap+0x2c/0x34
      [    0.039088]  timer_of_cleanup+0x70/0x7c
      [    0.039591]  mtk_syst_init+0x8c/0x9c
      [    0.040060]  timer_probe+0x6c/0xe0
      [    0.040507]  time_init+0x14/0x44
      [    0.040932]  start_kernel+0x2d0/0x46c
      
      This commit remove the calls to timer_of_cleanup when timer_of_init
      fails since it is unnecessary and actually cause warnings to be printed.
      
      Fixes: a0858f93 ("mediatek: Convert the driver to timer-of")
      Signed-off-by: default avatarFabien Parent <fparent@baylibre.com>
      Signed-off-by: default avatarDaniel Lezcano <daniel.lezcano@linaro.org>
      Link: https://lore.kernel.org/linux-arm-kernel/20190919191315.25190-1-fparent@baylibre.com/
      41d49e79
  2. 27 Sep, 2019 1 commit
    • Balasubramani Vivekanandan's avatar
      tick: broadcast-hrtimer: Fix a race in bc_set_next · b9023b91
      Balasubramani Vivekanandan authored
      When a cpu requests broadcasting, before starting the tick broadcast
      hrtimer, bc_set_next() checks if the timer callback (bc_handler) is active
      using hrtimer_try_to_cancel(). But hrtimer_try_to_cancel() does not provide
      the required synchronization when the callback is active on other core.
      
      The callback could have already executed tick_handle_oneshot_broadcast()
      and could have also returned. But still there is a small time window where
      the hrtimer_try_to_cancel() returns -1. In that case bc_set_next() returns
      without doing anything, but the next_event of the tick broadcast clock
      device is already set to a timeout value.
      
      In the race condition diagram below, CPU #1 is running the timer callback
      and CPU #2 is entering idle state and so calls bc_set_next().
      
      In the worst case, the next_event will contain an expiry time, but the
      hrtimer will not be started which happens when the racing callback returns
      HRTIMER_NORESTART. The hrtimer might never recover if all further requests
      from the CPUs to subscribe to tick broadcast have timeout greater than the
      next_event of tick broadcast clock device. This leads to cascading of
      failures and finally noticed as rcu stall warnings
      
      Here is a depiction of the race condition
      
      CPU #1 (Running timer callback)                   CPU #2 (Enter idle
                                                        and subscribe to
                                                        tick broadcast)
      ---------------------                             ---------------------
      
      __run_hrtimer()                                   tick_broadcast_enter()
      
        bc_handler()                                      __tick_broadcast_oneshot_control()
      
          tick_handle_oneshot_broadcast()
      
            raw_spin_lock(&tick_broadcast_lock);
      
            dev->next_event = KTIME_MAX;                  //wait for tick_broadcast_lock
            //next_event for tick broadcast clock
            set to KTIME_MAX since no other cores
            subscribed to tick broadcasting
      
            raw_spin_unlock(&tick_broadcast_lock);
      
          if (dev->next_event == KTIME_MAX)
            return HRTIMER_NORESTART
          // callback function exits without
             restarting the hrtimer                      //tick_broadcast_lock acquired
                                                         raw_spin_lock(&tick_broadcast_lock);
      
                                                         tick_broadcast_set_event()
      
                                                           clockevents_program_event()
      
                                                             dev->next_event = expires;
      
                                                             bc_set_next()
      
                                                               hrtimer_try_to_cancel()
                                                               //returns -1 since the timer
                                                               callback is active. Exits without
                                                               restarting the timer
        cpu_base->running = NULL;
      
      The comment that hrtimer cannot be armed from within the callback is
      wrong. It is fine to start the hrtimer from within the callback. Also it is
      safe to start the hrtimer from the enter/exit idle code while the broadcast
      handler is active. The enter/exit idle code and the broadcast handler are
      synchronized using tick_broadcast_lock. So there is no need for the
      existing try to cancel logic. All this can be removed which will eliminate
      the race condition as well.
      
      Fixes: 5d1638ac ("tick: Introduce hrtimer based broadcast")
      Originally-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarBalasubramani Vivekanandan <balasubramani_vivekanandan@mentor.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190926135101.12102-2-balasubramani_vivekanandan@mentor.com
      b9023b91
  3. 26 Sep, 2019 38 commits