1. 17 Sep, 2009 1 commit
    • x86, pat: don't use rb-tree based lookup in reserve_memtype() · dcb73bf4
      Suresh Siddha authored
      Recent enhancement of rb-tree based lookup exposed a bug with the lookup
      mechanism in reserve_memtype(), which ensures that there are no conflicting
      memtype requests for the memory range.
      
      memtype_rb_search() returns an entry which has a start address <= new start
      address. And from here we traverse the linear linked list to check if there
      are any conflicts with the existing mappings. As the rbtree is based on the
      start address of the memory range, it is quite possible that we have several
      overlapped mappings whose start address is much less than the new requested
      start but whose end is >= the new requested end. This results in conflicting
      memtype mappings.
      
      Same bug exists with the old code which uses cached_entry from where
      we traverse the linear linked list. But the new rb-tree code exposes this
      bug fairly easily.
      
      For now, don't use memtype_rb_search() and always start the search from
      the head of the linear linked list in reserve_memtype(). The linear linked
      list on most systems grows to a few tens of entries (as we track the memory
      type of RAM pages using struct page). So we should be ok for now.
      
      We still retain the rbtree and use it to speed up free_memtype(), which
      doesn't have the same bug (as we know exactly what we are searching for
      in free_memtype).
      
      Also use list_for_each_entry_from() in free_memtype() so that we start
      the search from the rb-tree lookup result.
      Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      LKML-Reference: <1253136483.4119.12.camel@sbs-t61.sc.intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
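The lookup problem described in this commit can be reproduced in a small user-space model. This is only a sketch: `struct range`, `lookup_by_start()` and `has_conflict()` are hypothetical names, a sorted array stands in for the kernel's linked list, and the type values are illustrative. It shows why a scan that begins at the entry found by a start-keyed lookup can miss an earlier entry that still overlaps the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a tracked memtype entry. */
struct range {
    uint64_t start;
    uint64_t end;   /* exclusive */
    int type;       /* illustrative: 0 = WB, 1 = UC */
};

/*
 * Models a lookup keyed on the start address: returns the index of the
 * last entry whose start is <= new_start (0 if there is none).
 * The array must be sorted by start.
 */
static size_t lookup_by_start(const struct range *list, size_t n,
                              uint64_t new_start)
{
    size_t best = 0;

    for (size_t i = 0; i < n; i++)
        if (list[i].start <= new_start)
            best = i;
    return best;
}

/*
 * Scans list[from..n) for an entry that overlaps [start, end) and has a
 * different type. Entries before 'from' are never examined.
 */
static bool has_conflict(const struct range *list, size_t n, size_t from,
                         uint64_t start, uint64_t end, int type)
{
    assert(start < end);

    for (size_t i = from; i < n; i++) {
        if (list[i].start >= end)
            break;  /* sorted by start: later entries cannot overlap */
        if (list[i].end > start && list[i].type != type)
            return true;
    }
    return false;
}
```

With entries [0x1000, 0x9000) WB, [0x2000, 0x3000) WB and [0x4000, 0x4800) UC, a UC request for [0x5000, 0x6000) looks conflict-free when the scan starts at the lookup result, but a scan from the head finds the overlap with the first entry.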
  2. 09 Sep, 2009 1 commit
  3. 31 Aug, 2009 1 commit
    • mm: remove !NUMA condition from PAGEFLAGS_EXTENDED condition set · a269cca9
      H. Peter Anvin authored
      CONFIG_PAGEFLAGS_EXTENDED disables a trick to conserve pageflags.
      This trick is intended to be enabled when the pressure on page flags
      is very high.
      
      The previous condition was:
      
      -       depends on 64BIT || SPARSEMEM_VMEMMAP || !NUMA || !SPARSEMEM
      
      ... however, the sparsemem code already has a way to crowd out the
      node number from the pageflags, which means that !NUMA actually
      doesn't contribute to hard pageflags exhaustion.
      
      This is required for the new PG_uncached flag to not cause pageflags
      exhaustion on x86_32 + PAE + SPARSEMEM + !NUMA.
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      LKML-Reference: <4A9828F4.4040905@zytor.com>
      Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Cc: Suresh Siddha <suresh.siddha@intel.com>
  4. 29 Aug, 2009 1 commit
    • x86: Fix earlyprintk=dbgp for machines without NX · 47d25003
      Jan Beulich authored
      Since parse_early_param() may (e.g. for earlyprintk=dbgp)
      involve calls to page table manipulation functions (here
      set_fixmap_nocache()), NX hardware support must be determined
      before calling that function (so that __supported_pte_mask gets
      properly set up).
      
      But the call after parse_early_param() cannot go away either, as
      it honors any disabling of the NX functionality specified on the
      command line.
      
      ( This will then just result in whatever mappings got
        established during parse_early_param() having the NX bit set
        despite it being disabled on the command line, but I think
        that's tolerable).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      LKML-Reference: <4A97F3BD02000078000121B9@vpn.id2.novell.com>
      [ merged to x86/pat to resolve a conflict. ]
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 27 Aug, 2009 1 commit
  6. 26 Aug, 2009 10 commits
  7. 22 Aug, 2009 1 commit
  8. 21 Aug, 2009 4 commits
    • x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init · d0af9eed
      Suresh Siddha authored
      SDM Vol 3a section titled "MTRR considerations in MP systems" specifies
      the need for synchronizing the logical cpus while initializing/updating
      MTRR.
      
      Currently the Linux kernel synchronizes all cpus only when
      a single MTRR register is programmed/updated. During an AP online
      (during boot/cpu-online/resume), where we initialize all the MTRR/PAT registers,
      we don't follow this synchronization algorithm.
      
      This can lead to scenarios where, during a dynamic cpu online, that logical cpu
      is initializing MTRR/PAT with cache disabled (cr0.cd=1) etc. while the other logical
      HT sibling continues to run (also with cache disabled because of cr0.cd=1
      on its sibling).
      
      Starting from Westmere, VMX transitions with cr0.cd=1 don't work properly
      (because of some VMX performance optimizations) and the above scenario
      (with one logical cpu doing VMX activity and another logical cpu coming online)
      can result in a system crash.
      
      Fix the MTRR initialization by doing a rendezvous of all the cpus. During
      boot and resume, we delay the MTRR/PAT init for APs till all the
      logical cpus come online, and the rendezvous process at the end of the APs'
      bringup will initialize the MTRR/PAT for all APs.
      
      For dynamic single cpu online, we synchronize all the logical cpus and
      do the MTRR/PAT init on the AP that is coming online.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • generic-ipi: Allow cpus not yet online to call smp_call_function with irqs disabled · 269c861b
      Suresh Siddha authored
      Because of deadlock possibilities, smp_call_function() is not allowed to
      be called with interrupts disabled. Add an exception for a cpu that is not
      yet online: no one else can send an smp call function interrupt to a
      cpu that is not yet online, so the deadlock condition is not possible.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Acked-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: Fix an incorrect argument of reserve_bootmem() · 3e0e1e9c
      Amerigo Wang authored
      This line looks suspicious, because if this is true, then the
      'flags' parameter of function reserve_bootmem_generic() will be
      unused when !CONFIG_NUMA. I don't think this is what we want.
      Signed-off-by: WANG Cong <amwang@redhat.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: akpm@linux-foundation.org
      LKML-Reference: <20090821083709.5098.52505.sendpatchset@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: Fix system crash when loading with "reservetop" parameter · 8126dec3
      Xiao Guangrong authored
      The system will die if the kernel is booted with the "reservetop"
      parameter. The present code parses the "reservetop" parameter after
      early_ioremap_init(), and some functions still use
      early_ioremap() after it.
      
      The problem is that the "reservetop" parameter can modify
      'FIXADDR_TOP': the virtual address returned by early_ioremap()
      is then based on the old 'FIXADDR_TOP', but the page mapping is based on
      the new 'FIXADDR_TOP'. This causes a page fault, and since the IDT is not
      prepared yet, the system dies.
      
      So, call parse_early_param() before
      early_ioremap_init() in this patch.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: yinghai@kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <4A8D402F.4080805@cn.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
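The ordering hazard in this commit comes from fixmap addresses being computed downward from FIXADDR_TOP. The model below is a simplified user-space sketch: the initial top value, `fix_to_virt()` as a function, and `reserve_top_address()` simply subtracting the reserved size are assumptions made for illustration. It shows that the same fixmap slot resolves to a different virtual address once the top has moved.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* Illustrative starting value; the real layout is configuration dependent. */
static uintptr_t fixaddr_top = 0xfffff000u;

/* Fixmap slots are laid out downward from the top address, one page each. */
static uintptr_t fix_to_virt(unsigned int idx)
{
    return fixaddr_top - ((uintptr_t)idx << PAGE_SHIFT);
}

/*
 * Simplified model of what "reservetop=" does: lower the top address by
 * the reserved amount. The exact arithmetic in the kernel may differ.
 */
static void reserve_top_address(uintptr_t reserve)
{
    assert(reserve < fixaddr_top);
    fixaddr_top -= reserve;
}
```

If page tables are set up for a slot before the top address changes and the address is computed afterwards (or the other way round), the two no longer agree, which matches the page fault described in the message. Parsing the parameter first makes both steps use the same top.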
  9. 20 Aug, 2009 4 commits
  10. 18 Aug, 2009 1 commit
    • i386: Fix section mismatches for init code with !HOTPLUG_CPU · 78b89ecd
      Jan Beulich authored
      Commit 0e83815b changed the
      section the initial_code variable gets allocated in, in an
      attempt to address a section conflict warning. This, however,
      created a new section conflict when building without
      HOTPLUG_CPU. Apparently the only (reasonable) way to address
      this is to always use __REFDATA.
      
      While at it, also fix a second section mismatch when not using
      HOTPLUG_CPU.
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <4A8AE7CD020000780001054B@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  11. 17 Aug, 2009 3 commits
    • x86, pat: Allow ISA memory range uncacheable mapping requests · 1adcaafe
      Suresh Siddha authored
      Max Vozeler reported:
      >  Bug 13877 -  bogl-term broken with CONFIG_X86_PAT=y, works with =n
      >
      >  strace of bogl-term:
      >  814   mmap2(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0)
      >				 = -1 EAGAIN (Resource temporarily unavailable)
      >  814   write(2, "bogl: mmaping /dev/fb0: Resource temporarily unavailable\n",
      >	       57) = 57
      
      PAT code maps the ISA memory range as WB in the PAT attribute, so that
      fixed range MTRR registers define the actual memory type (UC/WC/WT etc).
      
      But the upper level is_new_memtype_allowed() API checks are failing,
      as the request here is for UC and the return tracked type is WB (Tracked type is
      WB as MTRR type for this legacy range potentially will be different for each
      4k page).
      
      Fix is_new_memtype_allowed() by always succeeding the ISA address range
      checks, as the null PAT (WB) and default MTRR fixed range register settings
      satisfy the memory type needs of the applications that map the ISA address
      range.
      Reported-and-Tested-by: Max Vozeler <xam@debian.org>
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
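The check this commit changes can be sketched as follows. This is a user-space model, not the kernel function: `enum memtype`, `is_isa_range()` and `new_memtype_allowed()` are hypothetical names, the ISA window is assumed to be the conventional 640K to 1M legacy range, and only the UC-requested/WB-tracked rejection named in the message is modelled for addresses outside that window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bounds of the legacy ISA window (640K up to, not including, 1M). */
#define ISA_START_ADDRESS 0xa0000ULL
#define ISA_END_ADDRESS   0x100000ULL

/* Illustrative memory types; not the kernel's flag encoding. */
enum memtype { MT_WB, MT_WC, MT_UC };

/* True when [start, last] lies entirely inside the ISA window. */
static bool is_isa_range(uint64_t start, uint64_t last)
{
    return start >= ISA_START_ADDRESS && last < ISA_END_ADDRESS;
}

/*
 * Returns true when a mapping request may proceed given the type the
 * tracking code returned for the range.
 */
static bool new_memtype_allowed(uint64_t paddr, uint64_t size,
                                enum memtype requested, enum memtype tracked)
{
    assert(size > 0);

    /* PAT is WB for the ISA window and the fixed-range MTRRs decide the
     * effective type, so the check always succeeds there. */
    if (is_isa_range(paddr, paddr + size - 1))
        return true;

    /* Elsewhere, an uncacheable request must not be answered with WB.
     * Other combinations are not modelled in this sketch. */
    if (requested == MT_UC && tracked == MT_WB)
        return false;

    return true;
}
```

A 64 KiB uncacheable request at 0xa0000, similar in size to the failing mmap2() call in the report, passes in this model even though the tracked type is WB, while the same request at an address outside the window is still refused.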
    • x86, mce: Don't initialize MCEs on unknown CPUs · e412cd25
      Ingo Molnar authored
      An older test-box started hanging at the following point during
      bootup:
      
       [    0.022996] Mount-cache hash table entries: 512
       [    0.024996] Initializing cgroup subsys debug
       [    0.025996] Initializing cgroup subsys cpuacct
       [    0.026995] Initializing cgroup subsys devices
       [    0.027995] Initializing cgroup subsys freezer
       [    0.028995] mce: CPU supports 5 MCE banks
      
      I've bisected it down to commit 4efc0670 ("x86, mce: use 64bit
      machine check code on 32bit"), which utilizes the MCE code on
      32-bit systems too.
      
      The problem is caused by this detail in my config:
      
        # CONFIG_CPU_SUP_INTEL is not set
      
      This disables the quirks in mce_cpu_quirks() but still enables
      MCE support - which then hangs due to the missing quirk
      workaround needed on this CPU:
      
      	if (c->x86 == 6 && c->x86_model < 0x1A && banks > 0)
      		mce_banks[0].init = 0;
      
      The safe solution is to not initialize MCEs if we don't know on
      what CPU we are running (or if that CPU's support code got
      disabled in the config).
      
      Also be a bit more defensive on 32-bit systems: don't do a
      boot-time dump of pending MCEs not just on the specific system
      that we found a problem with (Pentium-M), but earlier ones as
      well.
      
      Now this problem is probably not common and disabling CPU
      support is rare - but still being more defensive in something
      we turned on for a wide range of CPUs is prudent.
      
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      LKML-Reference: Message-ID: <4A88E3E4.40506@jp.fujitsu.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, mce: don't log boot MCEs on Pentium M (model == 13) CPUs · c7f6fa44
      Bartlomiej Zolnierkiewicz authored
      On my legacy Pentium M laptop (Acer Extensa 2900) I get bogus MCE on a cold
      boot with CONFIG_X86_NEW_MCE enabled, i.e. (after decoding it with mcelog):
      
      MCE 0
      HARDWARE ERROR. This is *NOT* a software problem!
      Please contact your hardware vendor
      CPU 0 BANK 1 MCG status:
      MCi status:
      Error overflow
      Uncorrected error
      Error enabled
      Processor context corrupt
      MCA: Data CACHE Level-1 UNKNOWN Error
      STATUS f200000000000195 MCGSTATUS 0
      
      [ The other STATUS values observed: f2000000000001b5 (... UNKNOWN error)
        and f200000000000115 (... READ Error).
      
        To verify that this is not a CONFIG_X86_NEW_MCE bug I also modified
        the CONFIG_X86_OLD_MCE code (which doesn't log any MCEs) to dump
        content of STATUS MSR before it is cleared during initialization. ]
      
      Since the bogus MCE results in a kernel taint (which in turn disables
      lockdep support) don't log boot MCEs on Pentium M (model == 13) CPUs
      by default ("mce=bootlog" boot parameter can be used to get the old
      behavior).
      Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Reviewed-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
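The default described here, with "mce=bootlog" as the override, can be modelled as a tri-state setting. This is a sketch under assumptions: `boot_mces_logged()` is a hypothetical helper, and the encoding of the setting (-1 for "not given on the command line", 0 for off, 1 for on) is chosen for illustration. Family 6 is assumed for Pentium M; only model == 13 comes from the message.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decides whether MCEs left over from before boot are logged.
 * bootlog: -1 = not specified, 0 = disabled, 1 = enabled explicitly
 * (for example via "mce=bootlog").
 */
static bool boot_mces_logged(unsigned int family, unsigned int model,
                             int bootlog)
{
    assert(bootlog >= -1 && bootlog <= 1);

    /* An explicit setting always wins. */
    if (bootlog >= 0)
        return bootlog == 1;

    /* Default: log, except on Pentium M where the boot MCE is bogus. */
    if (family == 6 && model == 13)
        return false;

    return true;
}
```

Keeping the "not specified" state separate from "off" is what allows the quirk to change only the default while leaving the explicit boot parameter in control.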
  12. 16 Aug, 2009 2 commits
    • x86: Annotate section mismatch warnings in kernel/apic/x2apic_uv_x.c · 52459ab9
      Leonardo Potenza authored
      The function uv_acpi_madt_oem_check() has been marked __init,
      the struct apic_x2apic_uv_x has been marked __refdata.
      
      The aim is to address the following section mismatch messages:
      
      WARNING: arch/x86/kernel/apic/built-in.o(.data+0x1368): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
      The variable apic_x2apic_uv_x references
      the function __cpuinit uv_wakeup_secondary()
      If the reference is valid then annotate the
      variable with __init* or __refdata (see linux/init.h) or name the variable:
      *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
      
      WARNING: arch/x86/kernel/built-in.o(.data+0x68e8): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
      The variable apic_x2apic_uv_x references
      the function __cpuinit uv_wakeup_secondary()
      If the reference is valid then annotate the
      variable with __init* or __refdata (see linux/init.h) or name the variable:
      *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
      
      WARNING: arch/x86/built-in.o(.text+0x7b36f): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_ioremap()
      The function uv_acpi_madt_oem_check() references
      the function __init early_ioremap().
      This is often because uv_acpi_madt_oem_check lacks a __init
      annotation or the annotation of early_ioremap is wrong.
      
      WARNING: arch/x86/built-in.o(.text+0x7b38d): Section mismatch in reference from the function uv_acpi_madt_oem_check() to the function .init.text:early_iounmap()
      The function uv_acpi_madt_oem_check() references
      the function __init early_iounmap().
      This is often because uv_acpi_madt_oem_check lacks a __init
      annotation or the annotation of early_iounmap is wrong.
      
      WARNING: arch/x86/built-in.o(.data+0x8668): Section mismatch in reference from the variable apic_x2apic_uv_x to the function .cpuinit.text:uv_wakeup_secondary()
      The variable apic_x2apic_uv_x references
      the function __cpuinit uv_wakeup_secondary()
      If the reference is valid then annotate the
      variable with __init* or __refdata (see linux/init.h) or name the variable:
      *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
      Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
      LKML-Reference: <200908161855.48302.lpotenza@inwind.it>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86, mce: therm_throt: Don't log redundant normality · 4e5c25d4
      Hugh Dickins authored
      0d01f314 "x86, mce: therm_throt - change when we print messages"
      removed redundant announcements of "Temperature/speed normal".
      
      They're not worth logging; also remove their accompanying
      "Machine check events logged" messages from the console.
      Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
      Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      LKML-Reference: <Pine.LNX.4.64.0908161544100.7929@sister.anvils>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 15 Aug, 2009 1 commit
  14. 13 Aug, 2009 9 commits