    x86/kconfig: Fall back to ticket spinlocks
    BugLink: https://bugs.launchpad.net/bugs/1810947
    
    Sebastian writes:
    
    """
    We reproducibly observe cache line starvation on a Core2Duo E6850 (2
    cores), a i5-6400 SKL (4 cores) and on a NXP LS2044A ARM Cortex-A72 (4
    cores).
    
    The problem can be triggered with a v4.9-RT kernel by starting
    
        cyclictest -S -p98 -m  -i2000 -b 200
    
    and as "load"
    
        stress-ng --ptrace 4
    
    The reported maximal latency is usually less than 60us. If the problem
    triggers, values around 400us, 800us or even more are reported. The
    upper limit is set by the -i parameter.
    
    Reproduction with 4.9-RT is almost immediate on Core2Duo, ARM64 and SKL,
    but it took 7.5 hours to trigger on v4.14-RT on the Core2Duo.
    
    Instrumentation always shows the same picture:
    
    CPU0                                         CPU1
    => do_syscall_64                              => do_syscall_64
    => SyS_ptrace                                   => syscall_slow_exit_work
    => ptrace_check_attach                          => ptrace_do_notify / rt_read_unlock
    => wait_task_inactive                              rt_spin_lock_slowunlock()
       -> while task_running()                         __rt_mutex_unlock_common()
      /   check_task_state()                           mark_wakeup_next_waiter()
     |     raw_spin_lock_irq(&p->pi_lock);             raw_spin_lock(&current->pi_lock);
     |     .                                               .
     |     raw_spin_unlock_irq(&p->pi_lock);               .
      \  cpu_relax()                                       .
       -                                                   .
        *IRQ*                                          <lock acquired>
    
    In the error case we observe that the while() loop is repeated more
    than 5000 times, which indicates that CPU0 can acquire pi_lock on
    every iteration. CPU1, on the other side, makes no progress waiting
    for the same lock with interrupts disabled.
    
    This continues until an IRQ hits CPU0. Once CPU0 starts processing the
    IRQ, the other CPU is able to acquire pi_lock and the situation relaxes.
    """
    
    This matches the observation for v4.4-rt on a Core2Duo E6850:
    
    CPU 0:
    
    - no progress for a very long time in rt_mutex_dequeue_pi():
    
    stress-n-1931    0d..11  5060.891219: function:             __try_to_take_rt_mutex
    stress-n-1931    0d..11  5060.891219: function:                rt_mutex_dequeue
    stress-n-1931    0d..21  5060.891220: function:                rt_mutex_enqueue_pi
    stress-n-1931    0....2  5060.891220: signal_generate:      sig=17 errno=0 code=262148 comm=stress-ng-ptrac pid=1928 grp=1 res=1
    stress-n-1931    0d..21  5060.894114: function:             rt_mutex_dequeue_pi
    stress-n-1931    0d.h11  5060.894115: local_timer_entry:    vector=239
    
    CPU 1:
    
    - IRQ at 5060.894114 on CPU 1 followed by the IRQ on CPU 0
    
    stress-n-1928    1....0  5060.891215: sys_enter:            NR 101 (18, 78b, 0, 0, 17, 788)
    stress-n-1928    1d..11  5060.891216: function:             __try_to_take_rt_mutex
    stress-n-1928    1d..21  5060.891216: function:                rt_mutex_enqueue_pi
    stress-n-1928    1d..21  5060.891217: function:             rt_mutex_dequeue_pi
    stress-n-1928    1....1  5060.891217: function:             rt_mutex_adjust_prio
    stress-n-1928    1d..11  5060.891218: function:                __rt_mutex_adjust_prio
    stress-n-1928    1d.h10  5060.894114: local_timer_entry:    vector=239
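
    Both traces show the same access pattern: one CPU takes and drops
    pi_lock in a tight loop, while the other CPU attempts a single
    acquisition with interrupts disabled. The following userspace sketch
    mimics that pattern (an illustration only, not kernel code: pthread
    spinlocks stand in for the kernel's raw spinlocks, and whether the
    second thread actually starves depends on the CPU and on the lock
    implementation):

        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>

        static pthread_spinlock_t pi_lock;  /* stands in for p->pi_lock */
        static atomic_int done;

        /* CPU0's role: a wait_task_inactive()-style polling loop that
         * re-acquires the lock on every iteration. */
        static void *poller(void *arg)
        {
            (void)arg;
            while (!atomic_load(&done)) {
                pthread_spin_lock(&pi_lock);    /* pulls the cache line exclusive */
                /* check_task_state() would run here */
                pthread_spin_unlock(&pi_lock);  /* drops the lock ... */
                /* ... and the next iteration immediately re-takes it */
            }
            return NULL;
        }

        /* CPU1's role: a single acquisition, as in mark_wakeup_next_waiter().
         * In the kernel this happens with interrupts disabled, so nothing
         * interrupts the wait while it starves. */
        static void *waker(void *arg)
        {
            (void)arg;
            pthread_spin_lock(&pi_lock);    /* may wait while the poller ping-pongs */
            pthread_spin_unlock(&pi_lock);
            atomic_store(&done, 1);         /* let the poller terminate */
            return NULL;
        }

        int main(void)
        {
            pthread_t t0, t1;

            pthread_spin_init(&pi_lock, PTHREAD_PROCESS_PRIVATE);
            pthread_create(&t0, NULL, poller, NULL);
            pthread_create(&t1, NULL, waker, NULL);
            pthread_join(t1, NULL);
            pthread_join(t0, NULL);
            pthread_spin_destroy(&pi_lock);
            puts("waker got the lock");
            return 0;
        }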
    
    Thomas writes:
    
    """
    This has nothing to do with RT. RT is merely exposing the problem in
    an observable way. The same issue happens with upstream; it's just
    harder to trigger and harder to observe for obvious reasons.
    
    If you read through the discussions [see the links below] then you
    really see that there is an upstream issue with the x86 qspinlock
    implementation, and Peter has posted fixes which resolve it, both at
    the practical and the theoretical level.
    """
    
    Backporting all qspinlock-related patches is very likely to introduce
    regressions on v4.4. Therefore, the solution recommended by Peter and
    Thomas is to fall back to ticket spinlocks for v4.4.
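
    Concretely, the fallback is a Kconfig change: x86 stops selecting the
    queued spinlock implementation, so the generic ticket spinlock code is
    used again. Sketch of the hunk against arch/x86/Kconfig (context lines
    abbreviated; see the commit diff for the exact change):

        --- a/arch/x86/Kconfig
        +++ b/arch/x86/Kconfig
        @@ config X86
         	select ARCH_USE_QUEUED_RWLOCKS
        -	select ARCH_USE_QUEUED_SPINLOCKS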
    
    Link: https://lkml.kernel.org/r/20180921120226.6xjgr4oiho22ex75@linutronix.de
    Link: https://lkml.kernel.org/r/20180926110117.405325143@infradead.org
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Acked-by: Peter Zijlstra <peterz@infradead.org>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Juerg Haefliger <juergh@canonical.com>
    Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>