1. 15 Dec, 2011 1 commit
    • Fenghua Yu's avatar
      x86, mce, therm_throt: Don't report power limit and package level thermal throttle events in mcelog · 29e9bf18
      Fenghua Yu authored
      Thermal throttle and power limit events are not defined as MCE errors in x86
      architecture and should not generate MCE errors in mcelog.
      
      Current kernel generates fake software defined MCE errors for these events.
      This may confuse users because they may think the machine has real MCE errors
      while actually only thermal throttle or power limit events happen.
      
      To make it worse, buggy firmware on some platforms may falsely generate
      the events. Therefore, kernel reports MCE errors which users think as real
      hardware errors. Although the firmware bugs should be fixed, on the other hand,
      kernel should not report MCE errors either.
      
      So mcelog is not a good mechanism to report these events. To report the events, we count them in respective counters (core_power_limit_count,
      package_power_limit_count, core_throttle_count, and package_throttle_count) in
      /sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the counters
      for each event on each CPU. Please note that all CPU's on one package report
      duplicate counters. It's user application's responsibity to retrieve a package
      level counter for one package.
      
      This patch doesn't report package level power limit, core level power limit, and
      package level thermal throttle events in mcelog. When the events happen, only
      report them in respective counters in sysfs.
      
      Since core level thermal throttle has been legacy code in kernel for a while and
      users accepted it as MCE error in mcelog, core level thermal throttle is still
      reported in mcelog. In the mean time, the event is counted in a counter in sysfs
      as well.
      Signed-off-by: default avatarFenghua Yu <fenghua.yu@intel.com>
      Acked-by: default avatarBorislav Petkov <bp@amd64.org>
      Acked-by: default avatarTony Luck <tony.luck@intel.com>
      Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.comSigned-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      29e9bf18
  2. 09 Dec, 2011 34 commits
  3. 08 Dec, 2011 5 commits