1. 05 Dec, 2011 5 commits
    • x86, reboot: Use NMI instead of REBOOT_VECTOR to stop cpus · 3603a251
      Don Zickus authored
      A recent discussion started talking about the locking on the
      pstore fs and how it relates to the kmsg infrastructure.  We
      noticed it was possible for userspace to r/w to the pstore fs
      (grabbing the locks in the process) and block the panic path
      from r/w to the same fs.
      
      The reason was the cpu with the lock could be doing work while
      the crashing cpu is panic'ing.  Busting those spinlocks might
      cause those cpus to step on each other's data.  Fine, fair
      enough.
      
      It was suggested it would be nice to serialize the panic path
      (ie stop the other cpus) and have only one cpu running.  This
      would allow us to bust the spinlocks and not worry about another
      cpu stepping on the data.
      
      Of course, smp_send_stop() does this in the panic case.
      kmsg_dump() would have to be moved to be called after it.  Easy
      enough.
      
      The only problem is that on x86 the smp_send_stop() function
      uses the REBOOT_VECTOR.  Any cpu with irqs disabled (as pstore
      and its backend ERST would have) blocks this IPI and thus does
      not stop.  This makes it difficult to reliably log data to the
      pstore fs.
      
      The patch below switches from the REBOOT_VECTOR to NMI (and
      mimics what kdump does).  Switching to NMI allows us to deliver
      the IPI when irqs are disabled, increasing the reliability of
      this function.
      
      However, Andi carefully noted that on some machines this
      approach does not work because of broken BIOSes or whatever.
      
      To help accommodate this, the next couple of patches will run a
      selftest and provide a knob to disable it.
      
      V2:
        uses atomic ops to serialize the cpu that shuts everyone down
      V3:
        comment cleanup
      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: seiji.aguchi@hds.com
      Cc: vgoyal@redhat.com
      Cc: mjg@redhat.com
      Cc: tony.luck@intel.com
      Cc: gong.chen@intel.com
      Cc: satoru.moriya@hds.com
      Cc: avi@redhat.com
      Cc: Andi Kleen <andi@firstfloor.org>
      Link: http://lkml.kernel.org/r/1318533267-18880-2-git-send-email-dzickus@redhat.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
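The V2 note in this commit mentions atomic ops to serialize the cpu that shuts everyone down. A minimal user-space sketch of that idea follows, assuming C11 atomics rather than the kernel's own primitives. The names `stopping_cpu` and `claim_stop_role()` are hypothetical and chosen for illustration; they are not taken from the patch.

```c
#include <stdatomic.h>

/* -1 means no cpu has claimed the shutdown role yet (illustrative convention). */
static atomic_int stopping_cpu = -1;

/*
 * Try to become the one cpu that stops all others.
 * Returns 1 if `cpu` won the role, 0 if another cpu already holds it.
 */
static int claim_stop_role(int cpu)
{
    int expected = -1;

    /* Succeeds only for the first caller; later callers see a non -1 value. */
    return atomic_compare_exchange_strong(&stopping_cpu, &expected, cpu);
}
```

With a compare-and-exchange, only the first caller proceeds to stop the other cpus, so two cpus that panic at the same time cannot both try to shut each other down.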
    • x86: Clean up the range of stack overflow checking · 467e6b7a
      Mitsuo Hayasaka authored
      The kernel stack overflow check tests whether the stack
      pointer lies within the available kernel stack range, a test
      derived from the original overflow check.
      
      The curbase address is always less than the low boundary of
      the available kernel stack, so this patch removes the first
      condition, which checks whether the pointer is higher than
      curbase.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060845.11076.40916.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
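The reasoning in this commit can be illustrated with a small range check. This is a sketch under stated assumptions, not the kernel's code: `sp_in_usable_stack()`, `reserved`, and `thread_size` are hypothetical names, with `reserved` standing in for whatever sits at the bottom of the stack area above curbase.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if sp lies in the usable part of the stack.
 * The low boundary is curbase plus a reserved area, so it is always
 * above curbase; a separate "sp >= curbase" test would be redundant.
 */
static bool sp_in_usable_stack(uintptr_t sp, uintptr_t curbase,
                               size_t thread_size, size_t reserved)
{
    uintptr_t low = curbase + reserved;     /* lowest usable address */
    uintptr_t high = curbase + thread_size; /* top of the stack area */

    return sp >= low && sp <= high;
}
```

Any sp that passes `sp >= low` is already above curbase, which is why dropping the extra comparison does not change the result.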
    • x86: Panic on detection of stack overflow · 55af7796
      Mitsuo Hayasaka authored
      Currently, a stack overflow only produces a message, which is
      not sufficient for systems that need high reliability. In
      general an overflow may corrupt data, and further corruption
      may occur when that data is read unless the system stops.
      
      This patch adds the sysctl parameter
      kernel.panic_on_stackoverflow, which causes a panic when an
      overflow of the kernel, IRQ, or exception stacks (but not the
      user stack) is detected. It is disabled by default.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/20111129060836.11076.12323.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Check stack overflow in detail · 37fe6a42
      Mitsuo Hayasaka authored
      Currently, only the kernel stack is checked for overflow, which
      is not sufficient for systems that need high reliability. The
      IRQ and exception stacks need to be checked as well.
      
      This patch checks all stack types except the user stack and
      prints detailed stack messages when free stack space drops
      below a certain limit.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060829.11076.51733.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
    • x86: Add user_mode_vm check in stack_overflow_check · 69682b62
      Mitsuo Hayasaka authored
      Kernel stack overflow is checked in stack_overflow_check(),
      which may wrongly report an overflow if the user-space stack
      pointer points into the kernel stack, intentionally or
      accidentally. Because the check uses WARN_ONCE(), a real
      overflow is never reported after such a false detection.
      
      This patch adds a user_mode_vm() check before it to avoid this
      problem, bailing out early if the user stack is in use.
      Signed-off-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: yrl.pp-manager.tt@hitachi.com
      Cc: Randy Dunlap <rdunlap@xenotime.net>
      Link: http://lkml.kernel.org/r/20111129060821.11076.55315.stgit@ltc219.sdl.hitachi.co.jp
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
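The interaction between a warn-once latch and the user-mode bail-out described in this commit can be modeled in a few lines. This is a simplified user-space sketch: `struct fake_regs`, its `user_mode` flag, `warned_once`, and `check_stack()` are all hypothetical stand-ins, and the real kernel check inspects register state rather than a boolean.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register snapshot. */
struct fake_regs {
    uintptr_t sp;   /* stack pointer at interrupt time */
    bool user_mode; /* true if the interrupt arrived from user space */
};

/* Latch imitating warn-once behavior: only the first report is emitted. */
static bool warned_once;

/* Returns true if this call emitted an overflow warning. */
static bool check_stack(const struct fake_regs *regs,
                        uintptr_t low, uintptr_t high)
{
    /* A user-space sp says nothing about the kernel stack: bail out early. */
    if (regs->user_mode)
        return false;

    if (regs->sp >= low && regs->sp <= high)
        return false; /* within the usable range */

    if (warned_once)
        return false; /* already reported once, stay silent */

    warned_once = true;
    return true;
}
```

Without the early return for user mode, a stray user-space sp could consume the single warning, and a later genuine kernel overflow would go unreported.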
  2. 04 Dec, 2011 1 commit
    • x86: Fix boot failures on older AMD CPU's · 8e8da023
      Linus Torvalds authored
      People with old AMD chips are getting hung boots, because commit
      bcb80e53 ("x86, microcode, AMD: Add microcode revision to
      /proc/cpuinfo") moved the microcode detection too early into
      "early_init_amd()".
      
      At that point we are *so* early in the boot that the exception tables
      haven't even been set up yet, so the whole
      
      	rdmsr_safe(MSR_AMD64_PATCH_LEVEL, &c->microcode, &dummy);
      
      doesn't actually work: if the rdmsr does a GP fault (due to non-existent
      MSR register on older CPU's), we can't fix it up yet, and the boot fails.
      
      Fix it by simply moving the code to a slightly later point in the boot
      (init_amd() instead of early_init_amd()), since the kernel itself
      doesn't even really care about the microcode patchlevel at this point
      (or really ever: it's made available to user space in /proc/cpuinfo, and
      updated if you do a microcode load).
      Reported-tested-and-bisected-by: Larry Finger <Larry.Finger@lwfinger.net>
      Tested-by: Bob Tracy <rct@gherkin.frus.com>
      Acked-by: Borislav Petkov <borislav.petkov@amd.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 03 Dec, 2011 1 commit
    • xen/pm_idle: Make pm_idle be default_idle under Xen. · e5fd47bf
      Konrad Rzeszutek Wilk authored
      The idea behind commit d91ee586 ("cpuidle: replace xen access to x86
      pm_idle and default_idle") was to have one call - disable_cpuidle()
      which would make pm_idle not be molested by other code.  It disallows
      cpuidle_idle_call to be set to pm_idle (which is excellent).
      
      But in the select_idle_routine() and idle_setup(), the pm_idle can still
      be set to either: amd_e400_idle, mwait_idle or default_idle.  This
      depends on some CPU flags (MWAIT) and in AMD case on the type of CPU.
      
      In case of mwait_idle we can hit some instances where the hypervisor
      (Amazon EC2 specifically) sets the MWAIT and we get:
      
        Brought up 2 CPUs
        invalid opcode: 0000 [#1] SMP
      
        Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
        RIP: e030:[<ffffffff81015d1d>]  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
        ...
        Call Trace:
         [<ffffffff8100e2ed>] cpu_idle+0xae/0xe8
         [<ffffffff8149ee78>] cpu_bringup_and_idle+0xe/0x10
        RIP  [<ffffffff81015d1d>] mwait_idle+0x6f/0xb4
         RSP <ffff8801d28ddf10>
      
      In the case of amd_e400_idle we don't get such spectacular crashes, but
      we do end up making an MSR access which is trapped in the hypervisor,
      and then follow it up with a yield hypercall.  Meaning we end up going
      to the hypervisor twice instead of just once.
      
      The previous behavior before v3.0 was that pm_idle was set to
      default_idle regardless of select_idle_routine/idle_setup.
      
      We want to do that, but only for one specific case: Xen.  This patch
      does that.
      
      Fixes RH BZ #739499 and Ubuntu #881076
      Reported-by: Stefan Bader <stefan.bader@canonical.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
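The selection logic this commit describes can be modeled as a small decision function. This is only an illustrative sketch: `enum idle_routine`, `pick_idle()`, and its boolean parameters are hypothetical, and the order in which the MWAIT and AMD E400 cases are tested is an assumption rather than something stated in the commit message.

```c
#include <stdbool.h>

/* Hypothetical labels for the idle routines named in the commit message. */
enum idle_routine {
    IDLE_DEFAULT,  /* default_idle */
    IDLE_MWAIT,    /* mwait_idle */
    IDLE_AMD_E400  /* amd_e400_idle */
};

/*
 * Pick an idle routine from cpu properties, but force the default
 * routine when running under Xen, regardless of the cpu flags.
 */
static enum idle_routine pick_idle(bool on_xen, bool has_mwait, bool amd_e400)
{
    if (on_xen)
        return IDLE_DEFAULT; /* a hypervisor-advertised MWAIT flag is ignored */
    if (has_mwait)
        return IDLE_MWAIT;
    if (amd_e400)
        return IDLE_AMD_E400;
    return IDLE_DEFAULT;
}
```

Checking for Xen first means a guest that sees the MWAIT flag still ends up with the default routine, which matches the behavior the commit says existed before v3.0.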
  4. 02 Dec, 2011 14 commits
  5. 01 Dec, 2011 19 commits