Commit 747d5a1b authored by Grzegorz Halat, committed by Thomas Gleixner

x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails

A reboot request sends an IPI via the reboot vector and waits for all other
CPUs to stop. If one or more CPUs are in critical regions with interrupts
disabled then the IPI is not handled on those CPUs and the shutdown hangs
if native_stop_other_cpus() is called with the wait argument set.

Such a situation can happen when one CPU was stopped within a lock held
section and another CPU is trying to acquire that lock with interrupts
disabled. There are other scenarios which can cause such a lockup as well.

In theory the shutdown should be attempted via an NMI IPI after the timeout
period has elapsed, but the wait loop which follows the reboot vector IPI
prevents this. The loop condition checks the wait request argument as well
as the timeout. If wait is set, which is true for sys_reboot(), the loop
never falls through to the NMI shutdown method after the timeout period has
finished.

This was an oversight when the NMI shutdown mechanism was added to handle
the 'reboot IPI is not working' situation. The mechanism was added to deal
with stuck panic shutdowns, which do not have the wait request set, so the
'wait request' case was probably not considered.

Remove the wait check from the post reboot vector IPI wait loop and enforce
that the wait loop in the NMI fallback path is invoked even if NMI IPIs are
disabled or the registration of the NMI handler fails. That second wait
loop will then hang if not all CPUs shutdown and the wait argument is set.
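The resulting two-phase wait can be sketched in plain C. This is a userspace simulation under stated assumptions, not the kernel code: cpus_online, nmi_ipi_works, send_reboot_ipi() and send_nmi_ipi() are made-up stand-ins for num_online_cpus(), smp_no_nmi_ipi and the APIC IPI calls. It shows the shape of the fix: the first wait is always bounded so the NMI fallback is always reached, and only the second wait honors the wait request.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real kernel state and IPI machinery. */
static int cpus_online;      /* plays the role of num_online_cpus() */
static bool nmi_ipi_works;   /* whether the simulated NMI IPI stops CPUs */

static void send_reboot_ipi(void)
{
	/* Models the failure case: the reboot vector IPI is never
	 * handled because the other CPUs have interrupts disabled. */
}

static void send_nmi_ipi(void)
{
	if (nmi_ipi_works)
		cpus_online = 1;	/* NMI is not maskable: CPUs stop */
}

/* Returns true if the NMI fallback path was reached. */
static bool stop_other_cpus(int wait)
{
	long timeout;
	bool tried_nmi = false;

	send_reboot_ipi();

	/*
	 * First wait: bounded regardless of 'wait', so a CPU stuck with
	 * interrupts disabled can no longer prevent the NMI fallback.
	 */
	timeout = 1000;
	while (cpus_online > 1 && timeout--)
		;	/* udelay(1) in the real code */

	if (cpus_online > 1) {
		tried_nmi = true;
		send_nmi_ipi();

		/*
		 * Second wait: only this loop checks the wait request.
		 * With wait set and CPUs still online it spins forever,
		 * mirroring the intended hang in the kernel.
		 */
		timeout = 10;
		while (cpus_online > 1 && (wait || timeout--))
			;
	}
	return tried_nmi;
}
```

With the old code, wait=1 would keep the first loop spinning forever and the fallback would never run; here the fallback is reached even when the caller asked to wait.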

[ tglx: Avoid the hard to parse line break in the NMI fallback path,
  add comments and massage the changelog ]

Fixes: 7d007d21 ("x86/reboot: Use NMI to assist in shutting down if IRQ fails")
Signed-off-by: Grzegorz Halat <ghalat@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Don Zickus <dzickus@redhat.com>
Link: https://lkml.kernel.org/r/20190628122813.15500-1-ghalat@redhat.com
parent 83b584d9
@@ -179,6 +179,12 @@ asmlinkage __visible void smp_reboot_interrupt(void)
 	irq_exit();
 }
 
+static int register_stop_handler(void)
+{
+	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
+				    NMI_FLAG_FIRST, "smp_stop");
+}
+
 static void native_stop_other_cpus(int wait)
 {
 	unsigned long flags;
@@ -212,39 +218,41 @@ static void native_stop_other_cpus(int wait)
 		apic->send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*
-		 * Don't wait longer than a second if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than a second for IPI completion. The
+		 * wait request is not checked here because that would
+		 * prevent an NMI shutdown attempt in case that not all
+		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
-		while (num_online_cpus() > 1 && (wait || timeout--))
+		while (num_online_cpus() > 1 && timeout--)
 			udelay(1);
 	}
 
 	/* if the REBOOT_VECTOR didn't work, try with the NMI */
-	if ((num_online_cpus() > 1) && (!smp_no_nmi_ipi)) {
-		if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
-					 NMI_FLAG_FIRST, "smp_stop"))
-			/* Note: we ignore failures here */
-			/* Hope the REBOOT_IRQ is good enough */
-			goto finish;
-
-		/* sync above data before sending IRQ */
-		wmb();
-
-		pr_emerg("Shutting down cpus with NMI\n");
+	if (num_online_cpus() > 1) {
+		/*
+		 * If NMI IPI is enabled, try to register the stop handler
+		 * and send the IPI. In any case try to wait for the other
+		 * CPUs to stop.
+		 */
+		if (!smp_no_nmi_ipi && !register_stop_handler()) {
+			/* Sync above data before sending IRQ */
+			wmb();
 
-		apic->send_IPI_allbutself(NMI_VECTOR);
+			pr_emerg("Shutting down cpus with NMI\n");
+
+			apic->send_IPI_allbutself(NMI_VECTOR);
+		}
 
 		/*
-		 * Don't wait longer than a 10 ms if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than 10 ms if the caller didn't
+		 * request it. If wait is true, the machine hangs here if
+		 * one or more CPUs do not reach shutdown state.
		 */
 		timeout = USEC_PER_MSEC * 10;
 		while (num_online_cpus() > 1 && (wait || timeout--))
 			udelay(1);
 	}
 
-finish:
 	local_irq_save(flags);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));