    x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails · 747d5a1b
    Grzegorz Halat authored
    A reboot request sends an IPI via the reboot vector and waits for all other
    CPUs to stop. If one or more CPUs are in critical regions with interrupts
    disabled then the IPI is not handled on those CPUs and the shutdown hangs
    if native_stop_other_cpus() is called with the wait argument set.
    
    Such a situation can happen when one CPU was stopped within a lock held
    section and another CPU is trying to acquire that lock with interrupts
    disabled. There are other scenarios which can cause such a lockup as well.
    
    In theory the shutdown should be attempted by an NMI IPI after the timeout
    period has elapsed. However, the wait loop after sending the reboot vector
    IPI prevents this. It checks both the wait request argument and the
    timeout. If wait is set, which is true for sys_reboot(), the loop never
    terminates on timeout and so never falls through to the NMI shutdown
    method.
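
    The loop condition described above can be modelled in userspace. This is
    a minimal sketch, not the kernel code: the function name, the iteration
    cap and the constant "online" count are assumptions made for
    illustration. It only shows why a predicate of the form
    (wait || timeout--) can never expire once wait is set.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the pre-fix wait loop.
 * One other CPU is assumed to be stuck, so the online count never drops.
 * Returns true if the loop ends by timeout (and could fall through to a
 * fallback), false if it is still spinning when the iteration cap is hit.
 * max_iters must be larger than timeout for the result to be meaningful.
 */
static bool old_wait_loop_times_out(bool wait, unsigned int timeout,
                                    unsigned long max_iters)
{
    int online = 2;           /* this CPU plus one CPU that never stops */
    unsigned long iters = 0;

    /* When wait is true, timeout-- is never evaluated (short circuit). */
    while (online > 1 && (wait || timeout--)) {
        if (++iters >= max_iters)
            return false;     /* the real loop would keep spinning here */
    }
    return true;              /* timeout expired, fallback is reachable */
}
```

    With wait cleared, old_wait_loop_times_out(false, 1000, 1000000) returns
    true. With wait set, the same call returns false because the timeout is
    never consulted.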
    
    This was an oversight when the NMI shutdown mechanism was added to handle
    the 'reboot IPI is not working' situation. The mechanism was added to deal
    with stuck panic shutdowns, which do not have the wait request set, so the
    'wait request' case was probably not considered.
    
    Remove the wait check from the post reboot vector IPI wait loop and enforce
    that the wait loop in the NMI fallback path is invoked even if NMI IPIs are
    disabled or the registration of the NMI handler fails. That second wait
    loop will then hang if not all CPUs shut down and the wait argument is set.
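
    The resulting two-stage flow can be summarised as a small decision model.
    This is a sketch under stated assumptions: the enum, the struct and the
    function below are hypothetical names, and timeouts are collapsed into
    simple yes/no outcomes. It is not the code in smp.c.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the modelled shutdown sequence. */
enum stop_outcome {
    STOPPED_BY_REBOOT_IPI,  /* other CPU handled the reboot vector IPI */
    STOPPED_BY_NMI,         /* other CPU was stopped by the NMI fallback */
    GAVE_UP_AFTER_TIMEOUT,  /* wait not set: second loop timed out */
    WOULD_HANG              /* wait set: second loop spins forever */
};

/* Hypothetical description of the one remaining CPU. */
struct other_cpu {
    bool handles_reboot_ipi;  /* false if stuck with interrupts disabled */
    bool handles_nmi;         /* false if even the NMI does not stop it */
};

static enum stop_outcome stop_other_cpu_model(struct other_cpu cpu, bool wait,
                                              bool nmi_usable)
{
    /* Stage 1: reboot vector IPI, bounded wait regardless of 'wait'. */
    if (cpu.handles_reboot_ipi)
        return STOPPED_BY_REBOOT_IPI;

    /*
     * Stage 2: NMI fallback. The NMI is only sent if NMI IPIs are enabled
     * and the handler registered (modelled as nmi_usable), but the second
     * wait loop is entered in either case.
     */
    if (nmi_usable && cpu.handles_nmi)
        return STOPPED_BY_NMI;

    /* Only the second wait loop honours the wait argument. */
    return wait ? WOULD_HANG : GAVE_UP_AFTER_TIMEOUT;
}
```

    In this model a CPU that ignores the reboot IPI but responds to the NMI
    is stopped even when wait is set, which is the case the old loop could
    not reach.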
    
    [ tglx: Avoid the hard to parse line break in the NMI fallback path,
            add comments and massage the changelog ]
    
    Fixes: 7d007d21 ("x86/reboot: Use NMI to assist in shutting down if IRQ fails")
    Signed-off-by: Grzegorz Halat <ghalat@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Don Zickus <dzickus@redhat.com>
    Link: https://lkml.kernel.org/r/20190628122813.15500-1-ghalat@redhat.com
smp.c 9.5 KB