Commits · e859dc553c857f4672b3bbb73ee9170a901f8712 · Kirill Smelkov / linux

02 May, 2007 40 commits

[PATCH] i386: Implement alternative_io for i386 · e859dc55
Andi Kleen authored May 02, 2007
```
Ported from x86-64.
Signed-off-by: Andi Kleen <ak@suse.de>
```

[PATCH] i386: Evaluate constant cpu features at runtime · 3671df85

Andi Kleen authored May 02, 2007

Redefine cpu_has() to evaluate cpu features already checked in early
boot at compile time.  This way the compiler might eliminate some dead code.
Signed-off-by: Andi Kleen <ak@suse.de>
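
The idea can be illustrated with a minimal user-space sketch. The feature numbers, `REQUIRED_MASK0`, `struct cpuinfo` and `test_feature()` are hypothetical stand-ins, not the kernel's definitions, and `__builtin_constant_p` is a GCC/Clang extension. When the bit is a compile-time constant that belongs to the set already verified in early boot, the macro folds to 1 and the compiler can drop the dead branch; otherwise it falls back to a runtime test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature numbers and "required" mask, for illustration only. */
enum { FEAT_FPU = 0, FEAT_PAE = 6, FEAT_SSE2 = 26 };
#define REQUIRED_MASK0 ((1u << FEAT_FPU) | (1u << FEAT_PAE))

struct cpuinfo {
    uint32_t capability[1];   /* one 32-bit word of feature bits */
};

/* Runtime test of a single feature bit. */
static inline int test_feature(const struct cpuinfo *c, unsigned bit)
{
    assert(bit < 32);         /* this sketch only has one capability word */
    return (c->capability[bit >> 5] >> (bit & 31)) & 1u;
}

/*
 * If `bit` is a compile-time constant and part of the required set,
 * the whole expression is the constant 1; otherwise test at runtime.
 */
#define cpu_has(c, bit)                                      \
    ((__builtin_constant_p(bit) && (bit) < 32 &&             \
      ((1u << ((bit) & 31)) & REQUIRED_MASK0))               \
         ? 1                                                 \
         : test_feature((c), (bit)))
```

With this shape, a check such as `if (!cpu_has(c, FEAT_FPU)) { ... }` has a constant condition, so the guarded block may be eliminated entirely.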


[PATCH] i386: Verify important CPUID bits in real mode · c7f81c94

Andi Kleen authored May 02, 2007

Check some CPUID bits that are needed for compiler-generated code early in boot.
While the system is still in real mode, before changing the VESA BIOS mode,
it is still possible to display a visible error message on the screen.

Similar to x86-64.

Includes cleanups from Eric Biederman
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: Drop -traditional in arch/i386/boot · 484ad393
Andi Kleen authored May 02, 2007
```
Needed for follow-on patch
Signed-off-by: Andi Kleen <ak@suse.de>
```
[PATCH] x86-64: Drop -traditional for arch/x86_64/boot · fa0a0091
Andi Kleen authored May 02, 2007
```
Follows i386 and useful cleanup.
Signed-off-by: Andi Kleen <ak@suse.de>
```
[PATCH] x86-64: Use symbolic CPU features in early CPUID check · 72b1b1d0
Andi Kleen authored May 02, 2007
```
Death to magic numbers!

Generated code is the same.
Signed-off-by: Andi Kleen <ak@suse.de>
```

[PATCH] x86-64: Avoid overflows during apic timer calibration · 4637a74c

David P. Reed authored May 02, 2007

- Use 64bit TSC calculations to avoid handling overflow
- Use 32bit unsigned arithmetic for the APIC timer. This
way overflows are handled correctly.
- Fix exit check of loop to account for apic timer counting down

Signed-off-by: dpreed@reed.com
Signed-off-by: Andi Kleen <ak@suse.de>
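
A minimal sketch of why unsigned 32-bit arithmetic copes with wrap-around on a timer that counts down. The function names are hypothetical and nothing here touches real hardware; it only shows the arithmetic the bullet points describe.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks elapsed on a 32-bit counter that counts DOWN from `start` to `now`.
 * The subtraction is done modulo 2^32, so the result stays correct even if
 * the counter wrapped past zero between the two reads.
 */
static uint32_t apic_ticks_elapsed(uint32_t start, uint32_t now)
{
    return start - now;
}

/* TSC deltas are kept in 64 bits, so no overflow handling is needed. */
static uint64_t tsc_ticks_elapsed(uint64_t start, uint64_t now)
{
    return now - start;
}

/* Loop exit check for a down-counting timer: stop once `wanted` ticks passed. */
static int calibration_done(uint32_t start, uint32_t now, uint32_t wanted)
{
    assert(wanted != 0);
    return apic_ticks_elapsed(start, now) >= wanted;
}
```

For example, a counter read as 5 and later as 0xFFFFFFFB has wrapped once, yet `apic_ticks_elapsed(5, 0xFFFFFFFB)` still yields 10.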


[PATCH] x86-64: Shut up 32bit emulation for SIOCGIFCOUNT · 9d016dd4

Andi Kleen authored May 02, 2007

The kernel doesn't implement it, but some programs like java use it
anyways. Shut the code up.
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: Define IGNORE_IOCTL() macro for compat_ioctls · 421f0281

Andi Kleen authored May 02, 2007

Define a new IGNORE_IOCTL() macro so that a compat ioctl does not trigger a
warning even when it is not implemented.

This is the same as COMPATIBLE_IOCTL internally, but better self-documenting.

Valid reasons to use this:
- It is implemented with ->compat_ioctl on some device, but programs
  call it on others too.
- The ioctl is not implemented in the native kernel, but programs
  call it commonly anyways.
Most other reasons are not valid.
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: Use the 32bit wd_ops for 64bit too. · 05cb007d

Andi Kleen authored May 02, 2007

This mainly removes a lot of code, replacing it with calls into the new 32bit
perfctr-watchdog.c.
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: Clean up NMI watchdog code · 09198e68

Andi Kleen authored May 02, 2007

- Introduce a wd_ops structure
- Convert the various nmi watchdogs over to it
- This allows the perfctr reservation to be split cleanly from the
watchdog setup.
- Do perfctr reservation globally as it should have always been
- Remove dead code referenced only by unused EXPORT_SYMBOLs
Signed-off-by: Andi Kleen <ak@suse.de>
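
A rough sketch of the ops-structure pattern described above. The field names, the `fake_*` backend and the `watchdog_start()`/`watchdog_stop()` helpers are assumptions made for illustration; the real structure in the patch may differ. The point is that reservation and setup are separate callbacks, and reservation state lives in one global place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU-type watchdog operations. */
struct wd_ops {
    int  (*reserve)(void);           /* claim the perfctr, 1 on success */
    void (*unreserve)(void);         /* release the perfctr */
    int  (*setup)(unsigned nmi_hz);  /* program the watchdog, 1 on success */
    void (*stop)(void);              /* disable the watchdog */
};

/* Global reservation state, shared by every user of the counter. */
static int perfctr_reserved;

static int fake_reserve(void)
{
    if (perfctr_reserved)
        return 0;
    perfctr_reserved = 1;
    return 1;
}

static void fake_unreserve(void) { perfctr_reserved = 0; }
static int  fake_setup(unsigned nmi_hz) { return nmi_hz > 0; }
static void fake_stop(void) { }

static const struct wd_ops fake_wd_ops = {
    .reserve   = fake_reserve,
    .unreserve = fake_unreserve,
    .setup     = fake_setup,
    .stop      = fake_stop,
};

/* Reserve first, then set up; undo the reservation if setup fails. */
static int watchdog_start(const struct wd_ops *ops, unsigned nmi_hz)
{
    assert(ops != NULL);
    if (!ops->reserve())
        return 0;
    if (!ops->setup(nmi_hz)) {
        ops->unreserve();
        return 0;
    }
    return 1;
}

static void watchdog_stop(const struct wd_ops *ops)
{
    assert(ops != NULL);
    ops->stop();
    ops->unreserve();
}
```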


[PATCH] x86-64: set node_possible_map at runtime - try 2 · e3f1caee

Suresh Siddha authored May 02, 2007

Set the node_possible_map at runtime on x86_64.  On a non NUMA system,
num_possible_nodes() will now say '1'.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>


[PATCH] x86-64: Dynamically adjust machine check interval · 8a336b0a

Tim Hockin authored May 02, 2007

Background:
 We've found that MCEs (specifically DRAM SBEs) tend to come in bunches,
 especially when we are trying really hard to stress the system out.  The
 current MCE poller uses a static interval which does not care whether it
 has or has not found MCEs recently.

Description:
 This patch makes the MCE poller adjust the polling interval dynamically.
 If we find an MCE, poll 2x faster (down to 10 ms).  When we stop finding
 MCEs, poll 2x slower (up to check_interval seconds).  The check_interval
 tunable becomes the max polling interval.  The "Machine check events
 logged" printk() is rate limited to the check_interval, which should be
 identical behavior to the old functionality.

Result:
 If you start to take a lot of correctable errors (not exceptions), you
 log them faster and more accurately (less chance of overflowing the MCA
 registers).  If you don't take a lot of errors, you will see no change.

Alternatives:
 I considered simply reducing the polling interval to 10 ms immediately
 and keeping it there as long as we continue to find errors.  This felt a
 bit heavy handed, but does perform significantly better for the default
 check_interval of 5 minutes (we're using a few seconds when testing for
 DRAM errors).  I could be convinced to go with this, if anyone felt it
 was not too aggressive.

Testing:
 I used an error-injecting DIMM to create lots of correctable DRAM errors
 and verified that the polling interval accelerates.  The printk() only
 happens once per check_interval seconds.

Patch:
 This patch is against 2.6.21-rc7.
Signed-Off-By: Tim Hockin <thockin@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
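
The interval adjustment in the Description can be sketched as a small pure function. This is an illustration under stated assumptions, not the patch's code: the function name is hypothetical, and intervals are expressed in milliseconds here, whereas check_interval is given in seconds in the text.

```c
#include <assert.h>

#define MIN_INTERVAL_MS 10UL   /* "down to 10 ms" from the description */

/*
 * Return the next polling interval in milliseconds.
 * cur_ms:    interval used for the poll that just finished
 * max_ms:    upper bound (check_interval converted to ms)
 * found_mce: nonzero if that poll found at least one event
 */
static unsigned long next_poll_interval(unsigned long cur_ms,
                                        unsigned long max_ms,
                                        int found_mce)
{
    assert(max_ms >= MIN_INTERVAL_MS);

    if (found_mce) {
        /* Events seen: poll twice as often, but not below the floor. */
        cur_ms /= 2;
        if (cur_ms < MIN_INTERVAL_MS)
            cur_ms = MIN_INTERVAL_MS;
    } else {
        /* Quiet: back off by doubling, capped at the maximum. */
        cur_ms = (cur_ms > max_ms / 2) ? max_ms : cur_ms * 2;
    }
    return cur_ms;
}
```

Starting from a 5000 ms maximum, a burst of events walks the interval down 5000, 2500, 1250 and so on until it reaches 10 ms; once polls come back empty it doubles back up to the maximum.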


[PATCH] x86-64: ignore vgacon if hardware not present · f82af20e

Gerd Hoffmann authored May 02, 2007

Avoid trying to set up vgacon if there's no vga hardware present.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Alan <alan@lxorguk.ukuu.org.uk>
Acked-by: Ingo Molnar <mingo@elte.hu>


[PATCH] i386: fix wrong comment for syscall stack layout · 889f21ce

Andi Kleen authored May 02, 2007

The `ret_from_sys_call' label no longer exists; the `syscall_exit' label was
introduced instead.
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: unexport cpu_llc_id · 425001fe

Andrew Morton authored May 02, 2007

WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data:cpu_llc_id from __ksymtab between '__ksymtab_cpu_llc_id' (at offset 0x4a0) and '__ksymtab_smp_num_siblings'

It is strange to export a __cpuinitdata symbol to modules, and no module
appears to use it anyway.

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: convert to the kthread API · f26d6a2b

Eric W. Biederman authored May 02, 2007

This patch just trivially converts from calling kernel_thread and daemonize
to just calling kthread_run.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] i386: pte simplify ops · 9e5e3162

Zachary Amsden authored May 02, 2007

Add comment and condense code to make use of native_local_ptep_get_and_clear
function.  Also, it turns out the 2-level and 3-level paging definitions were
identical, so move the common definition into pgtable.h
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: pte xchg optimization · 142dd975

Zachary Amsden authored May 02, 2007

In situations where page table updates need only be made locally, and there is
no cross-processor A/D bit races involved, we need not use the heavyweight
xchg instruction to atomically fetch and clear page table entries.  Instead,
we can just read and clear them directly.

This introduces a neat optimization for non-SMP kernels; drop the atomic xchg
operations from page table updates.

Thanks to Michel Lespinasse for noting this potential optimization.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
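
To make the difference concrete, here is a user-space sketch contrasting the two ways of fetching and clearing an entry. The `pte_val` type and both function names are hypothetical, and C11 `atomic_exchange` stands in for the xchg instruction; this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t pte_val;   /* hypothetical page table entry value */

/*
 * Atomic variant: fetch and clear in one indivisible step. Needed when
 * another processor could update the entry (for example its A/D bits)
 * between the read and the write.
 */
static pte_val ptep_get_and_clear_atomic(_Atomic pte_val *ptep)
{
    assert(ptep);
    return atomic_exchange(ptep, 0);
}

/*
 * Local variant: a plain read followed by a plain store. Only safe when
 * no other processor can touch the entry, as on a non-SMP kernel.
 */
static pte_val ptep_get_and_clear_local(pte_val *ptep)
{
    pte_val old;

    assert(ptep);
    old = *ptep;
    *ptep = 0;
    return old;
}
```

Both return the old value and leave the entry cleared; the local variant simply avoids the cost of the locked exchange.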


[PATCH] i386: pte clear optimization · c2c1accd

Zachary Amsden authored May 02, 2007

When exiting from an address space, no special hypervisor notification of page
table updates needs to occur; direct page table hypervisors, such as Xen,
switch to another address space first (init_mm) and unprotect the page tables
to avoid the cost of trapping to the hypervisor for each pte_clear. Shadow
mode hypervisors, such as VMI and lhype, don't need to do the extra work of
calling through paravirt-ops, and can just directly clear the page table
entries without notifying the hypervisor, since all the page tables are about
to be freed.

So introduce native_pte_clear functions which bypass any paravirt-ops
notification. This results in a significant performance win for VMI and
removes some indirect calls from zap_pte_range.

Note the 3-level paging already had a native_pte_clear function, thus
demanding argument conformance and extra args for the 2-level definition.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: remove xtime_lock'ing around cpufreq notifier · df3624aa

Daniel Walker authored May 02, 2007

The locking of the xtime_lock around the cpu notifier is unnecessary now.
At one time the tsc was used after a frequency change for timekeeping, but
the re-write of timekeeping no longer uses the TSC unless the frequency is
constant.

The variables that are changed in this section of code had also once been
used for timekeeping, but no longer are.
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] x86-64: skip cache_free_alien() on non NUMA · 62918a03

Siddha, Suresh B authored May 02, 2007

Set use_alien_caches to 0 on non NUMA platforms.  And avoid calling the
cache_free_alien() when use_alien_caches is not set.  This will avoid the
cache miss that happens while dereferencing slabp to get nodeid.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] x86-64: Auto compute __NR_syscall_max at compile time · 57a4f91a
Andi Kleen authored May 02, 2007
```
No need to maintain it anymore
Signed-off-by: Andi Kleen <ak@suse.de>
```
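
One way to derive such a maximum at compile time is to let the compiler size an array from designated initializers. The sketch below is an assumption about the general technique, not a copy of the patch: `SYSCALL_LIST`, `SYSCALL_ENTRY`, `syscall_slots` and `NR_syscall_max` are hypothetical names, and the four entries stand in for the real syscall table.

```c
#include <assert.h>

/* Hypothetical syscall list: (number, handler name). Gaps are allowed. */
#define SYSCALL_LIST            \
    SYSCALL_ENTRY(0, sys_read)  \
    SYSCALL_ENTRY(1, sys_write) \
    SYSCALL_ENTRY(2, sys_open)  \
    SYSCALL_ENTRY(5, sys_fstat)

/*
 * Expand every entry to a designated initializer. The array has no
 * explicit size, so the compiler makes it (highest index + 1) long.
 */
#define SYSCALL_ENTRY(nr, sym) [nr] = 1,
static const char syscall_slots[] = { SYSCALL_LIST };
#undef SYSCALL_ENTRY

/* The highest syscall number falls out of the array size. */
enum { NR_syscall_max = sizeof(syscall_slots) - 1 };

static_assert(NR_syscall_max == 5, "max follows the highest listed number");
```

Adding a new entry to the list updates the maximum automatically, which is the maintenance burden the commit message refers to.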

[PATCH] i386: check capability · 2f3c30e6

Joachim Deguara authored May 02, 2007

Currently the i386 architecture checks the processor family to determine MCE
capability.  This patch removes that check and uses the CPUID information
instead.  Tested on a K8 revE and a family10h processor.

This eliminates the check against a fixed set of AMD processor families to
decide whether MCE is allowed, and relies on the information being in CPUID.
Signed-off-by: Joachim Deguara <joachim.deguara@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] i386: clean up flush_tlb_others fn · 1bdae458

Keshavamurthy, Anil S authored May 02, 2007

Clean up flush_tlb_others(), no functional change.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] i386: replace spin_lock_irqsave with spin_lock · 62dbc210

Hisashi Hifumi authored May 02, 2007

IRQs are already disabled through local_irq_disable(), so
spin_lock_irqsave() can be replaced with spin_lock().
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] i386: avoid checking for cpu gone when CONFIG_HOTPLUG_CPU not defined · e8a72ffa

Keshavamurthy, Anil S authored May 02, 2007

Avoid checking for cpu gone in mm hot path when CONFIG_HOTPLUG_CPU is not
defined.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] x86-64: move __vgetcpu_mode & __jiffies to the vsyscall_2 zone · 141a892f

Eric Dumazet authored May 02, 2007

We apparently hit the 1024 limit of vsyscall_0 zone when some debugging
options are set, or if __vsyscall_gtod_data is 64 bytes larger.

In order to save 128 bytes from the vsyscall_0 zone, we move __vgetcpu_mode
& __jiffies to vsyscall_2 zone where they really belong, since they are
used only from vgetcpu() (which is in this vsyscall_2 area).

After the patch is applied, the new layout is:

ffffffffff600000 T vgettimeofday
ffffffffff60004e t vsysc2
ffffffffff600140 t vread_hpet
ffffffffff600150 t vread_tsc
ffffffffff600180 D __vsyscall_gtod_data
ffffffffff600400 T vtime
ffffffffff600413 t vsysc1
ffffffffff600800 T vgetcpu
ffffffffff600870 D __vgetcpu_mode
ffffffffff600880 D __jiffies
ffffffffff600c00 T venosys_1
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] i386: PARAVIRT: fix startup_ipi_hook config dependency · 0260c196

Jeremy Fitzhardinge authored May 02, 2007

startup_ipi_hook depends on CONFIG_X86_LOCAL_APIC, so move it to the
right part of the paravirt_ops initialization.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: fix mtrr sections · 25c16b99

Randy Dunlap authored May 02, 2007

Fix section mismatch warnings in mtrr code.
Fix line length on one source line.

WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x103)
WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x180)
WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x199)
WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text: from .text.get_mtrr_state after 'get_mtrr_state' (at offset 0x1c1)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


[PATCH] x86-64: Use safe_apic_wait_icr_idle in __send_IPI_dest_field - x86_64 · 70ae77f4

Fernando Luis Vazquez Cao authored May 02, 2007

Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is
NMI_VECTOR to avoid potential hangups in the event of crash when kdump
tries to stop the other CPUs.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: Use safe_apic_wait_icr_idle in safe_apic_wait_icr_idle - i386 · f5efb41e

Fernando Luis Vazquez Cao authored May 02, 2007

Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is
NMI_VECTOR to avoid potential hangups in the event of crash when kdump
tries to stop the other CPUs.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: __send_IPI_dest_field - x86_64 · 9062d888

Fernando Luis Vazquez Cao authored May 02, 2007

Implement __send_IPI_dest_field which can be used to send IPIs when the
"destination shorthand" field of the ICR is set to 00 (destination
field). Use it whenever possible.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: __send_IPI_dest_field - i386 · 45ae5e96

Fernando Luis Vazquez Cao authored May 02, 2007

Implement __send_IPI_dest_field which can be used to send IPIs when the
"destination shorthand" field of the ICR is set to 00 (destination
field). Use it whenever possible.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 · 3144c332

Fernando Luis Vazquez Cao authored May 02, 2007

inquire_remote_apic is used for APIC debugging, so use
safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible
lockups when APIC delivery fails.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: use safe_apic_wait_icr_idle in smpboot.c · 4312fa81

Fernando Luis Vazquez Cao authored May 02, 2007

__inquire_remote_apic is used for APIC debugging, so use
safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible
lockups when APIC delivery fails.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 · ea8c733b

Fernando Luis Vazquez Cao authored May 02, 2007

The functionality provided by the new safe_apic_wait_icr_idle is being
open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle
instead to consolidate code and ease maintenance.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] i386: use safe_apic_wait_icr_idle - i386 · ae08e43e

Fernando Luis Vazquez Cao authored May 02, 2007

The functionality provided by the new safe_apic_wait_icr_idle is being
open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle
instead to consolidate code and ease maintenance.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>


[PATCH] x86-64: safe_apic_wait_icr_idle - x86_64 · 8339e9fb

Fernando Luis Vazquez Cao authored May 02, 2007

apic_wait_icr_idle looks like this:

static __inline__ void apic_wait_icr_idle(void)
{
  while (apic_read(APIC_ICR) & APIC_ICR_BUSY)
    cpu_relax();
}

The busy loop in this function would not be problematic if the
corresponding status bit in the ICR were always updated, but that does
not seem to be the case under certain crash scenarios. Kdump uses an IPI
to stop the other CPUs in the event of a crash, but when any of the
other CPUs are locked-up inside the NMI handler the CPU that sends the
IPI will end up looping forever in the ICR check, effectively
hard-locking the whole system.

Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3:

"A local APIC unit indicates successful dispatch of an IPI by
resetting the Delivery Status bit in the Interrupt Command
Register (ICR). The operating system polls the delivery status
bit after sending an INIT or STARTUP IPI until the command has
been dispatched.

A period of 20 microseconds should be sufficient for IPI dispatch
to complete under normal operating conditions. If the IPI is not
successfully dispatched, the operating system can abort the
command. Alternatively, the operating system can retry the IPI by
writing the lower 32-bit double word of the ICR. This “time-out”
mechanism can be implemented through an external interrupt, if
interrupts are enabled on the processor, or through execution of
an instruction or time-stamp counter spin loop."

Intel's documentation suggests the implementation of a time-out
mechanism, which, by the way, is already being open-coded in some parts
of the kernel that tinker with ICR.

Create an apic_wait_icr_idle replacement that implements the time-out
mechanism and that can be used to solve the aforementioned problem.

AK: moved both functions out of line
AK: Added improved loop from Keith Owens
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
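
A minimal sketch of the time-out idea, runnable in user space. The APIC is replaced by a mock (`mock_apic_read_icr` and `busy_reads_left` are hypothetical), the function name and the `max_polls` parameter are assumptions, and a real implementation would delay between polls; the busy-bit value is also assumed for the sketch. Unlike the unbounded loop quoted above, this version gives up after a fixed number of polls and reports whether the ICR was still busy.

```c
#include <assert.h>
#include <stdint.h>

#define ICR_BUSY 0x1000u   /* delivery status bit; value assumed here */

/* Mock APIC: the ICR reads as busy for the next `busy_reads_left` reads. */
static unsigned busy_reads_left;

static uint32_t mock_apic_read_icr(void)
{
    if (busy_reads_left > 0) {
        busy_reads_left--;
        return ICR_BUSY;
    }
    return 0;
}

/*
 * Poll the busy bit at most `max_polls` times.
 * Returns 0 if the ICR went idle, or ICR_BUSY if the wait timed out.
 */
static uint32_t safe_wait_icr_idle(unsigned max_polls)
{
    uint32_t status;
    unsigned polls = 0;

    assert(max_polls > 0);
    do {
        status = mock_apic_read_icr() & ICR_BUSY;
        if (!status)
            break;
        /* A real implementation would pause briefly here. */
    } while (++polls < max_polls);

    return status;
}
```

The caller can then decide what to do on a nonzero return, for instance abort or retry the IPI as the quoted specification suggests, instead of hanging forever.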


[PATCH] i386: safe_apic_wait_icr_idle - i386 · f2b218dd

Fernando Luis Vazquez Cao authored May 02, 2007

apic_wait_icr_idle looks like this:

static __inline__ void apic_wait_icr_idle(void)
{
  while (apic_read(APIC_ICR) & APIC_ICR_BUSY)
    cpu_relax();
}

The busy loop in this function would not be problematic if the
corresponding status bit in the ICR were always updated, but that does
not seem to be the case under certain crash scenarios. Kdump uses an IPI
to stop the other CPUs in the event of a crash, but when any of the
other CPUs are locked-up inside the NMI handler the CPU that sends the
IPI will end up looping forever in the ICR check, effectively
hard-locking the whole system.

Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3:

"A local APIC unit indicates successful dispatch of an IPI by
resetting the Delivery Status bit in the Interrupt Command
Register (ICR). The operating system polls the delivery status
bit after sending an INIT or STARTUP IPI until the command has
been dispatched.

A period of 20 microseconds should be sufficient for IPI dispatch
to complete under normal operating conditions. If the IPI is not
successfully dispatched, the operating system can abort the
command. Alternatively, the operating system can retry the IPI by
writing the lower 32-bit double word of the ICR. This “time-out”
mechanism can be implemented through an external interrupt, if
interrupts are enabled on the processor, or through execution of
an instruction or time-stamp counter spin loop."

Intel's documentation suggests the implementation of a time-out
mechanism, which, by the way, is already being open-coded in some parts
of the kernel that tinker with ICR.

Create an apic_wait_icr_idle replacement that implements the time-out
mechanism and that can be used to solve the aforementioned problem.

AK: moved both functions out of line
AK: added improved loop from Keith Owens
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Andi Kleen <ak@suse.de>
