    lkdtm/bugs: add test for panic() with stuck secondary CPUs · eac80dd4
    Mark Rutland authored
    Upon a panic(), the kernel will use either smp_send_stop() or
    crash_smp_send_stop() to attempt to stop secondary CPUs via an IPI,
    which may or may not be an NMI. Generally it's preferable that this is an
    NMI so that CPUs can be stopped in as many situations as possible, but
    it's not always possible to provide an NMI, and there are cases where
    CPUs may be unable to handle the NMI regardless.
    
    This patch adds a test for panic() where all other CPUs are stuck with
    interrupts disabled, which can be used to check whether the kernel
    gracefully handles CPUs failing to respond to a stop, and whether NMIs
    actually work to stop CPUs.
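
One way to arrange for every other CPU to be stuck is to have each CPU bump a shared counter on entry and let only the last arrival call panic() while the rest spin. The userspace model below sketches that counting rule only. It is an assumption for illustration, not the code added by this patch, and it does not model interrupt masking; `cpu_arrive()` and `NR_CPUS_MODEL` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical CPU count for this userspace model. */
#define NR_CPUS_MODEL 4

/*
 * Record that one more CPU has entered the stuck region.
 * Returns true only for the final arrival, which is the CPU that
 * would go on to call panic(); every earlier arrival returns false
 * and would spin with interrupts disabled in a real test.
 */
static bool cpu_arrive(atomic_int *arrived, int nr_cpus)
{
	/* atomic_fetch_add() returns the value before the increment. */
	return atomic_fetch_add(arrived, 1) + 1 == nr_cpus;
}
```

Letting the last arrival panic means every other CPU is already spinning by the time the stop request is sent, so the outcome depends only on whether that request can reach a CPU with interrupts masked.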
    
    For example, on arm64 *without* an NMI, this results in:
    
    | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
    | lkdtm: Performing direct entry PANIC_STOP_IRQOFF
    | Kernel panic - not syncing: panic stop irqoff test
    | CPU: 2 PID: 24 Comm: migration/2 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
    | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
    | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
    | Call trace:
    |  dump_backtrace+0x94/0xec
    |  show_stack+0x18/0x24
    |  dump_stack_lvl+0x74/0xc0
    |  dump_stack+0x18/0x24
    |  panic+0x358/0x3e8
    |  lkdtm_PANIC+0x0/0x18
    |  multi_cpu_stop+0x9c/0x1a0
    |  cpu_stopper_thread+0x84/0x118
    |  smpboot_thread_fn+0x224/0x248
    |  kthread+0x114/0x118
    |  ret_from_fork+0x10/0x20
    | SMP: stopping secondary CPUs
    | SMP: failed to stop secondary CPUs 0-3
    | Kernel Offset: 0x401cf3490000 from 0xffff800080000000
    | PHYS_OFFSET: 0x40000000
    | CPU features: 0x00000000,68c167a1,cce6773f
    | Memory Limit: none
    | ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
    
    Note the "failed to stop secondary CPUs 0-3" message.
    
    On arm64 *with* an NMI, this results in:
    
    | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
    | lkdtm: Performing direct entry PANIC_STOP_IRQOFF
    | Kernel panic - not syncing: panic stop irqoff test
    | CPU: 1 PID: 19 Comm: migration/1 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
    | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
    | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
    | Call trace:
    |  dump_backtrace+0x94/0xec
    |  show_stack+0x18/0x24
    |  dump_stack_lvl+0x74/0xc0
    |  dump_stack+0x18/0x24
    |  panic+0x358/0x3e8
    |  lkdtm_PANIC+0x0/0x18
    |  multi_cpu_stop+0x9c/0x1a0
    |  cpu_stopper_thread+0x84/0x118
    |  smpboot_thread_fn+0x224/0x248
    |  kthread+0x114/0x118
    |  ret_from_fork+0x10/0x20
    | SMP: stopping secondary CPUs
    | Kernel Offset: 0x55a9c0bc0000 from 0xffff800080000000
    | PHYS_OFFSET: 0x40000000
    | CPU features: 0x00000000,68c167a1,fce6773f
    | Memory Limit: none
    | ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
    
    Note the absence of a "failed to stop secondary CPUs" message, since we
    don't log anything when secondary CPUs are successfully stopped.
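
Since success is silent, a script checking the result has to look for the failure line and treat its absence as success. A minimal sketch in userspace C, assuming the console log has already been captured into a string; `stop_failed()` is a hypothetical helper, and the marker text is taken from the arm64 output above.

```c
#include <stdbool.h>
#include <string.h>

/* Marker seen in the arm64 output when secondary CPUs fail to stop. */
#define STOP_FAILED_MSG "SMP: failed to stop secondary CPUs"

/*
 * Hypothetical helper: returns true if the captured console log
 * reports that secondary CPUs could not be stopped.
 */
static bool stop_failed(const char *log)
{
	return strstr(log, STOP_FAILED_MSG) != NULL;
}
```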
    
    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: Douglas Anderson <dianders@chromium.org>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Stephen Boyd <swboyd@chromium.org>
    Cc: Sumit Garg <sumit.garg@linaro.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Reviewed-by: Stephen Boyd <swboyd@chromium.org>
    Link: https://lore.kernel.org/r/20230921161634.4063233-1-mark.rutland@arm.com
    Signed-off-by: Kees Cook <keescook@chromium.org>