Commits · 4060994c3e337b40e0f6fa8ce2cc178e021baf3d · nexedi / linux

15 Nov, 2005 40 commits

Merge x86-64 update from Andi · 4060994c
Linus Torvalds authored Nov 14, 2005

4060994c

[PATCH] x86_64: Fix sparse mem · d3ee871e

Bob Picco authored Nov 05, 2005

Fix up booting with sparse mem enabled. Otherwise it would just
cause an early PANIC at boot.
Signed-off-by: Bob Picco <bob.picco@hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

d3ee871e

[PATCH] x86_64: Increase the maximum number of local APICs to the maximum · 8893166f

Andi Kleen authored Nov 05, 2005

This is needed for large multinode IBM systems which have a sparse
APIC space in clustered mode, fully covering the available 8 bits.

The previous kernels would limit the local APIC number to 127,
which caused it to reject some of the CPUs at boot.

I increased the maximum and shrunk the apic_version array a bit
to make up for that (the version is only 8 bit, so don't need
an full int to store)

Cc:  Chris McDermott <lcm@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

8893166f

[PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing · 9e43e1b7

Andi Kleen authored Nov 05, 2005

CONFIG_CHECKING covered some debugging code used in the early times
of the port. But it wasn't even SMP safe for quite some time
and the bugs it checked for seem to be gone.

This patch removes all the code to verify GS at kernel entry. There
haven't been any new bugs in this area for a long time.

Previously it also covered the sysctl for the page fault tracing.
That didn't make much sense because that code was unconditionally
compiled in. I made that a boot option now because it is typically
only useful at boot.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

9e43e1b7

[PATCH] x86_64: Make node boundaries consistent · ffd10a2b

Magnus Damm authored Nov 05, 2005

The current x86_64 NUMA memory code is inconsequent when it comes to node
memory ranges. The exact behaviour varies depending on which config option
that is used.

setup_node_bootmem() has start and end as arguments and these are used to
calculate the size of the node like this: (end - start). This is all fine
if end is pointing to the first non-available byte. The problem is that the
current x86_64 code sometimes treats it as the last present byte and sometimes
as the first non-available byte. The result is that some configurations might
lose a page at the end of the range.

This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU
so they all treat the end variable as the first non-available byte. This is
the same way as the single node code.

The patch is boot tested on dual x86_64 hardware with the above configurations,
but maybe the removed code is needed as some workaround?
Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

ffd10a2b

[PATCH] x86_64: Log machine checks from boot on Intel systems · e583538f

Andi Kleen authored Nov 05, 2005

The logging for boot errors was turned off because it was broken
on some AMD systems. But give Intel EM64T systems a chance because they are
supposed to be correct there.

The advantage is that there is a chance to actually log uncorrected
machine checks after the reset.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e583538f

[PATCH] x86_64: Make ACPI NUMA and NUMA emulation peers of K8_NUMA in Kconfig · b0bd35e6

Ravikiran G Thirumalai authored Nov 05, 2005

On x86_64 arches, there is no way to choose ACPI_NUMA without having to choose
K8_NUMA.  CONFIG_K8_NUMA is not needed for Intel EM64T NUMA boxes.  It also
looks odd if you have to select ACPI_NUMA from the power management menu.
This patch fixes those oddities.  Patch does the following:

1. Makes NUMA a config option like other arches
2. Makes topology detection options like K8_NUMA dependent on NUMA
3. Choosing ACPI NUMA detection can be done from the standard
   "Processor type and features" menu

AK: I fixed up the dependencies and changed the help texts a bit
on top of Kiran's patch.
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

b0bd35e6

[PATCH] x86_64: Use common sys_time64 · efbbdce9

Paolo 'Blaisorblade' Giarrusso authored Nov 05, 2005

Keeping this function does not makes sense because it's a copied (and
buggy) copy of sys_time.  The only difference is that now.tv_sec (which is
a time_t, i.e.  a 64-bit long) is copied (and truncated) into a int
(32-bit).

The prototype is the same (they both take a long __user *), so let's drop
this and redirect it to sys_time (and make sure it exists by defining
__ARCH_WANT_SYS_TIME).

Only disadvantage is that the sys_stime definition is also compiled (may be
fixed if needed by adding a separate __ARCH_WANT_SYS_STIME macro, and
defining it for all arch's defining __ARCH_WANT_SYS_TIME except x86_64).
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

efbbdce9

[PATCH] x86_64: Set ____cacheline_maxaligned_in_smp alignment to 128 bytes · bf0f2e23

Paolo 'Blaisorblade' Giarrusso authored Nov 05, 2005

The current value was correct before the introduction of Intel EM64T support -
but now L1_CACHE_SHIFT_MAX can be less than L1_CACHE_SHIFT, which _is_ funny!

Between the few users of ____cacheline_maxaligned_in_smp, we also have (for
example) rcu_ctrlblk, and struct zone, with zone->{lru_,}lock.  I.e.  we have
a lot of excess cacheline bouncing on them.

No correctness issues, obviously.  So this could even be merged for 2.6.14
(I'm not a fan of this idea, though).

CC: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

bf0f2e23

[PATCH] x86_64: Remove asm-x86_64/rwsem.h · 8e0d4f4e

Andi Kleen authored Nov 05, 2005

Not needed since x86-64 always uses the spinlock based rwsems.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

8e0d4f4e

[PATCH] x86_64: Remove optimization for B stepping AMD K8 · a5b250a4

Andi Kleen authored Nov 05, 2005

B stepping were the first shipping Opterons. memcpy/memset/copy_page/
clear_page had special optimized version for them. These are really
old and in the minority now and the difference to the generic versions
(using rep microcode) is not that big anyways. So just remove them.

TODO: figure out optimized versions for Intel Netburst based EM64T
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a5b250a4

[PATCH] x86_64: Reduce number of retries for reset through keyboard controller · a6f5deb2

Andi Kleen authored Nov 05, 2005

Old code could retry for 10 seconds worst time. Only try it
for one second now.

Suggested by Yinghai Lu

Cc: Yinghai.Lu@amd.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a6f5deb2

[PATCH] x86_64: x86_64/i386 fix Intel cache detection code assumption about threads sharing · 2b091875

Siddha, Suresh B authored Nov 05, 2005

Fix the Intel cache detection code assumption that number of threads
sharing the cache will either be equal to number of HT or core siblings.

This also cleans up the code in general a bit.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2b091875

[PATCH] x86-64/i386: Intel HT, Multi core detection fixes · 94605eff

Siddha, Suresh B authored Nov 05, 2005

Fields obtained through cpuid vector 0x1(ebx[16:23]) and
vector 0x4(eax[14:25], eax[26:31]) indicate the maximum values and might not
always be the same as what is available and what OS sees.  So make sure
"siblings" and "cpu cores" values in /proc/cpuinfo reflect the values as seen
by OS instead of what cpuid instruction says. This will also fix the buggy BIOS
cases (for example where cpuid on a single core cpu says there are "2" siblings,
even when HT is disabled in the BIOS.
http://bugzilla.kernel.org/show_bug.cgi?id=4359)
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

94605eff

[PATCH] x86_64: Fix NUMA node lookup debug code which had bitrotted · e90f22ed
Andi Kleen authored Nov 05, 2005
```
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
```
e90f22ed

[PATCH] x86_64: Don't enable interrupt unconditionally in reboot path · 3506229f

Andi Kleen authored Nov 05, 2005

When they were disabled before (e.g. after a panic) it's better
to keep them off, otherwise followon panics can happen from timer
interrupt handlers etc.

Drawback is that pageup in the console won't work anymore though.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

3506229f

[PATCH] x86_64: Formatting fixes for arch/x86_64/kernel/process.c · a88cde13

Andi Kleen authored Nov 05, 2005

No functional changes.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

a88cde13

[PATCH] x86_64: Allow modular build of ia32 aout loader · ea0be473
Andi Kleen authored Nov 05, 2005
```
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
```
ea0be473

[PATCH] x86_64: Force correct address space size for MTRR on some 64bit Intel Xeons · af9c142d

Shaohua Li authored Nov 05, 2005

They report 40bit, but only have 36bits of physical address space.
This caused problems with setting up the correct masks for MTRR.

CPUID workaround for steppings 0F33h(supporting x86) and 0F34h(supporting x86
and EM64T). Detail info can be found at:
http://download.intel.com/design/Xeon/specupdt/30240216.pdf
http://download.intel.com/design/Pentium4/specupdt/30235221.pdf

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

af9c142d

[PATCH] AGP: Make gart iterator in K8 AGP driver SMP safe · 1d2e6bd8

Andi Kleen authored Nov 05, 2005

Ugh!

Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1d2e6bd8

[PATCH] AGP: Try unsupported AGP chipsets on x86-64 by default · 172efbb4

Andi Kleen authored Nov 05, 2005

So far all new ones have worked and there isn't much variation because
the CPU does all the interesting bits.

So enable try unsupported by default.

Can be still disabled with try_unsupported=0 (module) or
amd64.try_unsupported=0   (boot option)
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

172efbb4

[PATCH] AGP: Support ULI/ALI 1689 bridge on AMD64 · 870b7681

Andi Kleen authored Nov 05, 2005

(no name because I'm not sure of the correct name)

Cc: davej@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

870b7681

[PATCH] x86_64: Optimize NUMA node hash function · 529a3404

Eric Dumazet authored Nov 05, 2005

Compute the highest possible value for memnode_shift, in order to reduce
footprint of memnodemap[] to the minimum, thus making all users
(phys_to_nid(), kfree()), more cache friendly.

Before the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 23 for the hash shift. Max adder is 3ffffffff

After the patch :

 Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
 Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
 Using 33 for the hash shift.

In this case, only 2 bytes of memnodemap[] are used, instead of 2048
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

529a3404

[PATCH] x86_64: Save/restore CS in 64bit signal handlers and force __USER_CS for CS · e4e5d324

Bryan Ford authored Nov 05, 2005

This allows to run 64bit signal handlers in 64bit processes that run small
code snippets in compat mode.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e4e5d324

[PATCH] x86_64: New heuristics to find out hotpluggable CPUs. · 420f8f68

Andi Kleen authored Nov 05, 2005

With a NR_CPUS==128 kernel with CPU hotplug enabled we would waste 4MB
on per CPU data of all possible CPUs.  The reason was that HOTPLUG
always set up possible map to NR_CPUS cpus and then we need to allocate
that much (each per CPU data is roughly ~32k now)

The underlying problem is that ACPI didn't tell us how many hotplug CPUs
the platform supports.  So the old code just assumed all, which would
lead to this memory wastage.

This implements some new heuristics:

 - If the BIOS specified disabled CPUs in the ACPI/mptables assume they
   can be enabled later (this is bending the ACPI specification a bit,
   but seems like a obvious extension)
 - The user can overwrite it with a new additionals_cpus=NUM option
 - Otherwise use half of the available CPUs or 2, whatever is more.

Cc: ashok.raj@intel.com
Cc: len.brown@intel.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

420f8f68

[PATCH] x86_64: Use int operations in spinlocks to support more than 128 CPUs spinning. · 485832a5
Andi Kleen authored Nov 05, 2005
```
Pointed out by Eric Dumazet
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
```
485832a5

[PATCH] x86_64: Some clarifications for Documention/x86_64/mm.txt · 8315eca2

Andi Kleen authored Nov 05, 2005

I got some questions on this, so just fix up the documentation.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

8315eca2

[PATCH] x86_64: Replace swiotlb extern with include · 59170891

Andi Kleen authored Nov 05, 2005

Minor victory on the continuous quest against all stray extern.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

59170891

[PATCH] x86_64: Replace cpu_pda extern with include · 4d74dbd7

Andi Kleen authored Nov 05, 2005

Minor cleanup - remove obsolete extern
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

4d74dbd7

[PATCH] x86_64: Only use asm/sections.h to declare section symbols · 2bc0414e

Andi Kleen authored Nov 05, 2005

Adding __initdata_* to asm-generic/sections.h
Replaces a lot of open coded externs in arch/x86_64/*
I had to change __bss_end to __bss_stop to match the other architectures.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

2bc0414e

[PATCH] x86_64: Don't apply __PHYSICAL_MASK to page frame numbers · 6b75aeed

Andi Kleen authored Nov 05, 2005

It is for physical addresses, not for PFNs.

Pointed out by Tejun Heo.

Cc: htejun@gmail.com
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6b75aeed

[PATCH] x86_64: Unmap NULL during early bootup · f6c2e333

Siddha, Suresh B authored Nov 05, 2005

We should zap the low mappings, as soon as possible, so that we can catch
kernel bugs more effectively. Previously early boot had NULL mapped
and didn't trap on NULL references.

This patch introduces boot_level4_pgt, which will always have low identity
addresses mapped.  Druing boot, all the processors will use this as their
level4 pgt.  On BP, we will switch to init_level4_pgt as soon as we enter C
code and zap the low mappings as soon as we are done with the usage of
identity low mapped addresses.  On AP's we will zap the low mappings as
soon as we jump to C code.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f6c2e333

[PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA · 69d81fcd

Andi Kleen authored Nov 05, 2005

Not go from the CPU number to an mapping array.
Mode number is often used now in fast paths.

This also adds a generic numa_node_id to all the topology includes

Suggested by Eric Dumazet
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

69d81fcd

[PATCH] x86_64: Fix gcc 4 warning in aperture.c · 50895c5d

Andi Kleen authored Nov 05, 2005

Fix

  arch/x86_64/kernel/aperture.c: In function #iommu_hole_init#:
  arch/x86_64/kernel/aperture.c:199: warning: #aper_order# may be used uninitialized in this function
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

50895c5d

[PATCH] x86-64/i386: Fix CPU model for family 6 · f5f786d0

Suresh Siddha authored Nov 05, 2005

According to cpuid instruction in IA32 SDM-Vol2, when computing cpu model,
we need to consider extended model ID for family 0x6 also.

AK: Also added fixes/simplifcation from Petr Vandrovec
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f5f786d0

[PATCH] x86_64: Remove duplicate __cpuinit define · e9b59d83

Ashok Raj authored Nov 05, 2005

Remove duplicate __cpuinit in smp.c. Already defined in init.h which is
already included.
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e9b59d83

[PATCH] x86_64: Use the DMA32 zone for dma_alloc_coherent()/pci_alloc_consistent · 47492d36
Andi Kleen authored Nov 05, 2005
```
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
```
47492d36

[PATCH] x86_64: Remove obsolete ARCH_HAS_ATOMIC_UNSIGNED and page_flags_t · 07808b74

Andi Kleen authored Nov 05, 2005

Has been introduced for x86-64 at some point to save memory
in struct page, but has been obsolete for some time. Just
remove it.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

07808b74

[PATCH] x86_64: Fix up outdated pfn_to_page comment · 1dff7f3d

Andi Kleen authored Nov 05, 2005

pfn_to_page really requires pfn_valid to be true now, no question.
Some people stumbled over it, but it was misleading and wrong.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

1dff7f3d

[PATCH] i386/x86-64: Share interrupt vectors when there is a large number of interrupt sources · 6004e1b7

James Cleverdon authored Nov 05, 2005

Here's a patch that builds on Natalie Protasevich's IRQ compression
patch and tries to work for MPS boots as well as ACPI. It is meant for
a 4-node IBM x460 NUMA box, which was dying because it had interrupt
pins with GSI numbers > NR_IRQS and thus overflowed irq_desc.

The problem is that this system has 270 GSIs (which are 1:1 mapped with
I/O APIC RTEs) and an 8-node box would have 540. This is much bigger
than NR_IRQS (224 for both i386 and x86_64). Also, there aren't enough
vectors to go around. There are about 190 usable vectors, not counting
the reserved ones and the unused vectors at 0x20 to 0x2F. So, my patch
attempts to compress the GSI range and share vectors by sharing IRQs.

Cc: "Protasevich, Natalie" <Natalie.Protasevich@unisys.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6004e1b7