1. 02 Mar, 2010 1 commit
  2. 01 Mar, 2010 1 commit
    • Wu Fengguang's avatar
      resource: Fix generic page_is_ram() for partial RAM pages · 37b99dd5
      Wu Fengguang authored
      The System RAM walk shall skip partial RAM pages and avoid calling
      func() on them. So that page_is_ram() return 0 for a partial RAM page.
      
      In particular, it shall not call func() with len=0.
      This fixes a boot time bug reported by Sachin and root caused by Thomas:
      
      > >>> WARNING: at arch/x86/mm/ioremap.c:111 __ioremap_caller+0x169/0x2f1()
      > >>> Hardware name: BladeCenter LS21 -[79716AA]-
      > >>> Modules linked in:
      > >>> Pid: 0, comm: swapper Not tainted 2.6.33-git6-autotest #1
      > >>> Call Trace:
      > >>> [<ffffffff81047cff>] ? __ioremap_caller+0x169/0x2f1
      > >>> [<ffffffff81063b7d>] warn_slowpath_common+0x77/0xa4
      > >>> [<ffffffff81063bb9>] warn_slowpath_null+0xf/0x11
      > >>> [<ffffffff81047cff>] __ioremap_caller+0x169/0x2f1
      > >>> [<ffffffff813747a3>] ? acpi_os_map_memory+0x12/0x1b
      > >>> [<ffffffff81047f10>] ioremap_nocache+0x12/0x14
      > >>> [<ffffffff813747a3>] acpi_os_map_memory+0x12/0x1b
      > >>> [<ffffffff81282fa0>] acpi_tb_verify_table+0x29/0x5b
      > >>> [<ffffffff812827f0>] acpi_load_tables+0x39/0x15a
      > >>> [<ffffffff8191c8f8>] acpi_early_init+0x60/0xf5
      > >>> [<ffffffff818f2cad>] start_kernel+0x397/0x3a7
      > >>> [<ffffffff818f2295>] x86_64_start_reservations+0xa5/0xa9
      > >>> [<ffffffff818f237a>] x86_64_start_kernel+0xe1/0xe8
      > >>> ---[ end trace 4eaa2a86a8e2da22 ]---
      > >>> ioremap reserve_memtype failed -22
      
      The return code is -EINVAL, so it failed in the is_ram check, which is
      not too surprising
      
      > BIOS-provided physical RAM map:
      >  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
      >  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
      >  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
      >  BIOS-e820: 0000000000100000 - 00000000cffa3900 (usable)
      >  BIOS-e820: 00000000cffa3900 - 00000000cffa7400 (ACPI data)
      
      The ACPI data is not starting on a page boundary and neither does the
      usable RAM area end on a page boundary. Very useful !
      
      > ACPI: DSDT 00000000cffa3900 036CE (v01 IBM    SERLEWIS 00001000 INTL 20060912)
      
      ACPI is trying to map DSDT at cffa3900, which results in a check
      vs. cffa3000 which is the relevant page boundary. The generic is_ram
      check correctly identifies that as RAM because it's in the usable
      resource area. The old e820 based is_ram check does not take
      overlapping resource areas into account. That's why it works.
      
      CC: Sachin Sant <sachinp@in.ibm.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: default avatarWu Fengguang <fengguang.wu@intel.com>
      LKML-Reference: <20100301135551.GA9998@localhost>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      37b99dd5
  3. 27 Feb, 2010 3 commits
  4. 25 Feb, 2010 3 commits
    • Pekka Enberg's avatar
      x86, mm: Unify kernel_physical_mapping_init() API · c1fd1b43
      Pekka Enberg authored
      This patch changes the 32-bit version of kernel_physical_mapping_init() to
      return the last mapped address like the 64-bit one so that we can unify the
      call-site in init_memory_mapping().
      
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: default avatarPekka Enberg <penberg@cs.helsinki.fi>
      LKML-Reference: <alpine.DEB.2.00.1002241703570.1180@melkki.cs.helsinki.fi>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      c1fd1b43
    • Ian Campbell's avatar
      x86, mm: Allow highmem user page tables to be disabled at boot time · 14315592
      Ian Campbell authored
      Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
      enable CONFIG_HIGHPTE for any x86 configuration which has highmem
      enabled. This means that the overhead applies even to machines which
      have a fairly modest amount of high memory and which therefore do not
      really benefit from allocating PTEs in high memory but still pay the
      price of the additional mapping operations.
      
      Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
      no actual highptes being allocated there was a reduction in system
      time used from 59.737s to 55.9s.
      
      With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
        Average Optimal load -j 4 Run (std deviation):
        Elapsed Time 175.396 (0.238914)
        User Time 515.983 (5.85019)
        System Time 59.737 (1.26727)
        Percent CPU 263.8 (71.6796)
        Context Switches 39989.7 (4672.64)
        Sleeps 42617.7 (246.307)
      
      With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
        Average Optimal load -j 4 Run (std deviation):
        Elapsed Time 174.278 (0.831968)
        User Time 515.659 (6.07012)
        System Time 55.9 (1.07799)
        Percent CPU 263.8 (71.266)
        Context Switches 39929.6 (4485.13)
        Sleeps 42583.7 (373.039)
      
      This patch allows the user to control the allocation of PTEs in
      highmem from the command line ("userpte=nohigh") but retains the
      status-quo as the default.
      
      It is possible that some simple heuristic could be developed which
      allows auto-tuning of this option however I don't have a sufficiently
      large machine available to me to perform any particularly meaningful
      experiments. We could probably handwave up an argument for a threshold
      at 16G of total RAM.
      
      Assuming 768M of lowmem we have 196608 potential lowmem PTE
      pages. Each page can map 2M of RAM in a PAE-enabled configuration,
      meaning a maximum of 384G of RAM could potentially be mapped using
      lowmem PTEs.
      
      Even allowing generous factor of 10 to account for other required
      lowmem allocations, generous slop to account for page sharing (which
      reduces the total amount of RAM mappable by a given number of PT
      pages) and other innacuracies in the estimations it would seem that
      even a 32G machine would not have a particularly pressing need for
      highmem PTEs. I think 32G could be considered to be at the upper bound
      of what might be sensible on a 32 bit machine (although I think in
      practice 64G is still supported).
      
      It's seems questionable if HIGHPTE is even a win for any amount of RAM
      you would sensibly run a 32 bit kernel on rather than going 64 bit.
      Signed-off-by: default avatarIan Campbell <ian.campbell@citrix.com>
      LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      14315592
    • Thadeu Lima de Souza Cascardo's avatar
      x86: Do not reserve brk for DMI if it's not going to be used · e808bae2
      Thadeu Lima de Souza Cascardo authored
      This will save 64K bytes from memory when loading linux if DMI is
      disabled, which is good for embedded systems.
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
      LKML-Reference: <1265758732-19320-1-git-send-email-cascardo@holoscopio.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      e808bae2
  5. 17 Feb, 2010 6 commits
  6. 16 Feb, 2010 26 commits