    perf/x86/p4: Block PMIs on init to prevent a stream of unknown NMIs · 90ed5b0f
    Don Zickus authored
    A bunch of unknown NMIs have popped up on a Pentium4 recently when booting
    into a kdump kernel.  This was exposed because the watchdog timer went
    from 60 seconds down to 10 seconds (increasing the ability to reproduce
    this problem).
    
    What is happening is that when the second kernel (the kdump one) boots,
    the nmi_watchdogs from the previous kernel are still enabled on thread 0
    and thread 1.  The second kernel only initializes one cpu, but the perf
    counter on thread 1 still counts.
    
    Normally in a kdump scenario, the other cpus are blocking in an NMI loop,
    but more importantly their local apics have the performance counters disabled
    (iow LVTPC is masked).  So any counters that fire are masked and never get
    through to the second kernel.
    
    However, on a P4 the local apic is shared by both threads and thread1's PMI
    (despite being configured to only interrupt thread1) will generate an NMI on
    thread0.  Because thread0 knows nothing about this NMI, it is seen as an
    unknown NMI.
    
    On its own this would be fine: it is a kdump kernel, strange things
    happen, and a single unknown NMI is not a big deal.
    
    Unfortunately, the P4 comes with another quirk: the overflow bit has to be
    cleared to prevent a stream of NMIs.  Because thread0 knows nothing about
    thread1's counter, it never clears that bit.  This is the problem.
    
    The kdump kernel can not execute because of the endless NMIs that happen.
    
    To solve this, I changed the p4 perf init code to walk all the counters
    and zero them out (just like a normal reset would).
    
    Now when the counters go off, they do not generate anything and no unknown
    NMIs are seen.
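    
    As a rough illustration only (this is a userspace model, not the patch
    itself), the sketch below shows why zeroing every counter's configuration
    stops the stream.  All names here (model_pmu, MODEL_ENABLE, MODEL_OVF,
    MODEL_NUM_COUNTERS and the model_* helpers) are hypothetical, and the bit
    positions and counter count are arbitrary stand-ins rather than the real
    P4 register layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model values; they do not describe the real P4 layout. */
#define MODEL_NUM_COUNTERS 4
#define MODEL_ENABLE       (1ULL << 0)  /* counter is running */
#define MODEL_OVF          (1ULL << 1)  /* overflow pending, never cleared */

/* Stand-in for the per-counter configuration registers. */
struct model_pmu {
	uint64_t config[MODEL_NUM_COUNTERS];
};

/* A counter keeps raising a PMI while it is enabled and its overflow
 * bit is still set. */
static int model_counter_raises_pmi(uint64_t config)
{
	return (config & MODEL_ENABLE) && (config & MODEL_OVF);
}

/* Count the NMIs seen over 'ticks' steps when nothing clears the
 * overflow bit, as happens when thread0 does not own the counter. */
static unsigned int model_count_unknown_nmis(const struct model_pmu *pmu,
					     unsigned int ticks)
{
	unsigned int nmis = 0;

	for (unsigned int t = 0; t < ticks; t++)
		for (size_t i = 0; i < MODEL_NUM_COUNTERS; i++)
			if (model_counter_raises_pmi(pmu->config[i]))
				nmis++;
	return nmis;
}

/* The idea of the fix: walk all counters and zero their configuration,
 * mimicking what a normal reset would leave behind. */
static void model_reset_counters(struct model_pmu *pmu)
{
	for (size_t i = 0; i < MODEL_NUM_COUNTERS; i++)
		pmu->config[i] = 0;
}
```

    In this model a stale counter left enabled with its overflow bit set
    produces one NMI per tick indefinitely, and calling
    model_reset_counters() at init drops that to zero.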
    
    I tested this on a P4 we have in our lab.  After two or three crashes, I could
    normally reproduce the problem.  Now after 10 crashes, everything continues
    to boot correctly.
    Signed-off-by: Don Zickus <dzickus@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/20140120154115.GZ25953@redhat.com
    [ Fixed a stylistic detail. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
perf_event_p4.c 44.3 KB