    x86, NMI watchdog: when booting with reset_devices, clear the performance counters · 28b166a7
    Aristeu Rozanski authored
    P4s have a quirk that makes it necessary to clear the P4_CCCR_OVF bit in the
    CCCR every time the PMI is triggered. When booting the kernel with
    reset_devices (more specifically, the kdump case), the counters reach zero
    and the PMI is generated. This is not a problem on other processors, but
    on P4s it will continue to generate NMIs until that bit is cleared. Since
    there may be other users of the performance counters, clear and disable
    all of them when booting with the reset_devices option.
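
The clear-and-disable step described above can be sketched as a small user-space simulation. This is not the patch itself: the array of fake registers, the helper names (sim_rdmsr, sim_wrmsr, clear_all_cccrs, pmi_pending) and the register count are hypothetical, and the bit positions are assumptions based on how the kernel defines the CCCR enable and overflow bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions for the CCCR enable and overflow flags. */
#define SIM_CCCR_ENABLE (1u << 12)
#define SIM_CCCR_OVF    (1u << 31)

/* Hypothetical number of CCCRs modelled by this simulation. */
#define SIM_NUM_CCCRS 18

/* Stand-in for the low 32 bits of each CCCR MSR. */
static uint32_t fake_cccr[SIM_NUM_CCCRS];

/* Simulated MSR accessors; real kernel code would use rdmsr()/wrmsr(). */
static uint32_t sim_rdmsr(size_t idx)
{
    return fake_cccr[idx];
}

static void sim_wrmsr(size_t idx, uint32_t value)
{
    fake_cccr[idx] = value;
}

/*
 * Acknowledge any pending overflow and disable every counter, leaving
 * the other configuration bits untouched.
 */
static void clear_all_cccrs(void)
{
    for (size_t i = 0; i < SIM_NUM_CCCRS; i++) {
        uint32_t low = sim_rdmsr(i);

        low &= ~(SIM_CCCR_ENABLE | SIM_CCCR_OVF);
        sim_wrmsr(i, low);
    }
}

/* Returns 1 if any simulated CCCR still has its overflow bit set. */
static int pmi_pending(void)
{
    for (size_t i = 0; i < SIM_NUM_CCCRS; i++) {
        if (sim_rdmsr(i) & SIM_CCCR_OVF)
            return 1;
    }
    return 0;
}
```

Walking every CCCR, rather than only the one the watchdog programs, is what covers registers left armed by a previous kernel or by another user of the performance counters.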
    
    We have a P4 box here that crashes because of this problem. Since the kdump
    kernel usually boots with only one processor active, the second logical
    unit won't be set up; therefore, MSR_P4_IQ_CCCR1 (and the other performance
    counter registers) won't be cleared, and P4_CCCR_OVF may still be set because
    the previous kernel was using this register. An NMI is triggered because of
    MSR_P4_IQ_CCCR1 right after NMI delivery is enabled, triggering the
    race fixed in my previous email.
    Signed-off-by: Aristeu Rozanski <aris@redhat.com>
    Acked-by: Don Zickus <dzickus@redhat.com>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Acked-by: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
perfctr-watchdog.c 19.1 KB