Commits · 1d0691674764098304ae4c63c715f5883b4d3784 · nexedi / linux

20 Dec, 2007 9 commits

[IPV4] ip_gre: set mac_header correctly in receive path · 1d069167

Timo Teras authored Dec 20, 2007

mac_header update in ipgre_recv() was incorrectly changed to
skb_reset_mac_header() when it was introduced.
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d069167

[XFRM]: Audit function arguments misordered · 5951cab1

Paul Moore authored Dec 20, 2007

In several places the arguments to the xfrm_audit_start() function are
in the wrong order resulting in incorrect user information being
reported.  This patch corrects this by pacing the arguments in the
correct order.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5951cab1

[IPSEC]: Avoid undefined shift operation when testing algorithm ID · f398035f

Herbert Xu authored Dec 19, 2007

The aalgos/ealgos fields are only 32 bits wide. However, af_key tries
to test them with the expression 1 << id where id can be as large as
253. This produces different behaviour on different architectures.

The following patch explicitly checks whether ID is greater than 31
and fails the check if that's the case.

We cannot easily extend the mask to be longer than 32 bits due to
exposure to user-space. Besides, this whole interface is obsolete
anyway in favour of the xfrm_user interface which doesn't use this
bit mask in templates (well not within the kernel anyway).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

f398035f

[IPV4] ARP: Remove not used code · e0260fed

Mark Ryden authored Dec 19, 2007

In arp_process() (net/ipv4/arp.c), there is unused code: definition
and assignment of tha (target hw address ).
Signed-off-by: Mark Ryden <markryde@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0260fed

[TG3]: Endianness bugfix. · 286e310f

Al Viro authored Dec 17, 2007

tg3_nvram_write_block_unbuffered() is reading data from nvram into
allocated buffer before overwriting a part of it with user-supplied
data. Then it feeds the entire page back to nvram. It should be
storing the words it had read as little-endian, not as host-endian.
Note that tg3_set_eeprom() does exactly that for padding the same
data to full words before it gets passed down to tg3_nvram_write_block()
and then to tg3_nvram_write_block_unbuffered().

Moreover, when we get to sending the entire thing back to nvram, we
go through it word-by-word, doing essentially
writel(swab32(le32_to_cpu(word)), ...)
so if we want them to reach the card in host-independent endianness,
we'd better really have all that buffer filled with fixed-endian.
For user-supplied part we obviously do have that (it's an array of
octets memcpy'd in), ditto for padding of user-supplied part to word
boundaries (taken care of in tg3_set_eeprom()). The rest of the
buffer gets filled by tg3_nvram_write_block_unbuffered() and it would
damn better be consistent with that (and with tg3_get_eeprom(), while
we are at it - there we also convert the words read from nvram to
little-endian before returning the buffer to user).

The bug should get triggered on big-endian boxen when set_eeprom is done
for less than entire page. Then the words that should've been unaffected
at all will actually get byteswapped in place in nvram.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

286e310f

[TG3]: Endianness annotations. · b9fc7dc5

Al Viro authored Dec 17, 2007

Fixed misannotations, introduced a new helper - tg3_nvram_read_le().
It gets __le32 * instead of u32 * and puts there the value converted
to little-endian.  A lot of callers of tg3_nvram_read() were doing
that; converted them to tg3_nvram_read_le().

At that point the driver is practically endian-clean; the only remaining
place is an actual bug, AFAICS; will be dealt with in the next patch.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9fc7dc5

NET: mac80211: fix inappropriate memory freeing · 20880e89

Cyrill Gorcunov authored Dec 13, 2007

Fix inappropriate memory freeing in case of requested rate_control_ops was
not found.  In this case the list head entity is going to be accidentally
wasted.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

20880e89

mac80211: fix header ops · 3333590e

Johannes Berg authored Dec 12, 2007

When using recvfrom() on a SOCK_DGRAM packet socket, I noticed that the MAC
address passed back for wireless frames was always completely wrong. The
reason for this is that the header parse function assigned to our virtual
interfaces is a function parsing an 802.11 rather than 802.3 header. This
patch fixes it by keeping the default ethernet header operations assigned.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

3333590e

mac80211: Drop out of associated state if link is lost · 2d192d95

Michael Wu authored Nov 10, 2007

There is no point in staying in IEEE80211_ASSOCIATED if there is no
sta_info entry to receive frames with.
Signed-off-by: Michael Wu <flamingice@sourmilk.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>

2d192d95

19 Dec, 2007 19 commits

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 · 4486c5f5

Linus Torvalds authored Dec 19, 2007

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Adjust CMCI mask on CPU hotplug
  [IA64] make flush_tlb_kernel_range() an inline function
  [IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS
  [IA64] Fix Altix BTE error return status
  [IA64] Remove assembler warnings on head.S
  [IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c
  [IA64] set_thread_area fails in IA32 chroot
  [IA64] print kernel release in OOPS to make kerneloops.org happy
  [IA64] Two trivial spelling fixes
  [IA64] Avoid unnecessary TLB flushes when allocating memory
  [IA64] ia32 nopage
  [IA64] signal: remove redundant code in setup_sigcontext()
  IA64: Slim down __clear_bit_unlock

4486c5f5

pata_hpt37x: Fix HPT374 detection · f941b168

Alan Cox authored Dec 19, 2007

Bug #9261
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f941b168

ps3fb: Fix ps3fb free_irq() dev_id · fcbe6e97

Geoff Levand authored Dec 19, 2007

The dev_id arg passed to free_irq() must match that passed to
request_irq().

Fixes this PS3 error message:

  Trying to free already-free IRQ 44
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fcbe6e97

ps3fb: Update for firmware 2.10 · 9ac67a35

Geert Uytterhoeven authored Dec 19, 2007

ps3fb: Update for firmware 2.10

As of PS3 firmware version 2.10, the GPU command buffer size must be at least 2
MiB large. Since we use only a small part of the GPU command buffer and don't
want to waste precious XDR memory, move the GPU command buffer back to the
start of the XDR memory reserved for ps3fb and let the unused part overlap with
the actual frame buffer.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9ac67a35

Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 · c7eeae73

Linus Torvalds authored Dec 19, 2007

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] initio: bugfix for accessors patch
  [SCSI] st: fix kernel BUG at include/linux/scatterlist.h:59!
  [SCSI] initio: fix conflict when loading driver
  [SCSI] sym53c8xx: fix "irq X: nobody cared" regression
  [SCSI] dpt_i2o: driver is only 32 bit so don't set 64 bit DMA mask
  [SCSI] sym53c8xx: fix free_irq() regression

c7eeae73

Do dirty page accounting when removing a page from the page cache · 3a692790

Linus Torvalds authored Dec 19, 2007

Krzysztof Oledzki noticed a dirty page accounting leak on some of his
machines, causing the machine to eventually lock up when the kernel
decided that there was too much dirty data, but nobody could actually
write anything out to fix it.

The culprit turns out to be filesystems (cough ext3 with data=journal
cough) that re-dirty the page when the "->invalidatepage()" callback is
called.

Fix it up by doing a final dirty page accounting check when we actually
remove the page from the page cache.

This fixes bugzilla entry 9182:

	http://bugzilla.kernel.org/show_bug.cgi?id=9182Tested-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Krzysztof Oledzki <olel@ans.pl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

3a692790

[IA64] Adjust CMCI mask on CPU hotplug · ed5d4026

Hidetoshi Seto authored Dec 19, 2007

Currently CMCI mask of hot-added CPU is always disabled after CPU hotplug.
We should adjust this mask depending on CMC polling state.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

ed5d4026

[IA64] make flush_tlb_kernel_range() an inline function · 285fbd66

Jan Beulich authored Dec 19, 2007

This fixes an unused variable warning in mm/vmalloc.c.

Tony: also fix resulting fallout in uncached.c with a
typo in args to flush_tlb_kernel_range().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

285fbd66

[IA64] Guard elfcorehdr_addr with #if CONFIG_PROC_FS · 17fbe004

Simon Horman authored Nov 12, 2007

Access to elfcorehdr_addr needs to be guarded by #if CONFIG_PROC_FS
as well as the existing #if guards.

Fixes the following build problem:

arch/ia64/hp/common/built-in.o: In function
`sba_init':arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
:arch/ia64/hp/common/sba_iommu.c:2043: undefined reference to `elfcorehdr_addr'
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>

17fbe004

[IA64] Fix Altix BTE error return status · 64135fa9

Russ Anderson authored Aug 21, 2007

The Altix shub2 BTE error detail bits are in a different location
than on shub1.  The current code does not take this into account
resulting in all shub2 BTE failures mapping to "unknown".

This patch reads the error detail bits from the proper location,
so the correct BTE failure reason is returned for both shub1
and shub2.
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

64135fa9

[IA64] Remove assembler warnings on head.S · 09106228

Hidetoshi Seto authored Dec 12, 2007

This patch removes the following assembler warning messages.

  AS      arch/ia64/kernel/head.o
arch/ia64/kernel/head.S: Assembler messages:
arch/ia64/kernel/head.S:1179: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1179: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
arch/ia64/kernel/head.S:1180: Warning: Use of 'ld8' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1180: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
 :
arch/ia64/kernel/head.S:1213: Warning: Use of 'ldf.fill.nta' violates RAW dependency 'CR[PTA]' (data)
arch/ia64/kernel/head.S:1213: Warning: Only the first path encountering the conflict is reported
arch/ia64/kernel/head.S:1178: Warning: This is the location of the conflicting usage
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

09106228

[IA64] Remove compiler warinings about uninitialized variable in irq_ia64.c · 373167e8

Kenji Kaneshige authored Aug 22, 2007

This patch removes the following compiler warning messages.

CC arch/ia64/kernel/irq_ia64.o
arch/ia64/kernel/irq_ia64.c: In function 'create_irq':
arch/ia64/kernel/irq_ia64.c:343: warning: 'domain.bits[0u]' may be used uninitialized in this function
arch/ia64/kernel/irq_ia64.c: In function 'assign_irq_vector':
arch/ia64/kernel/irq_ia64.c:203: warning: 'domain.bits[0u]' may be used uninitialized in this function
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

373167e8

[IA64] set_thread_area fails in IA32 chroot · e384f414

Ian Wienand authored Nov 20, 2007

I tried to upgrade an IA32 chroot on my IA64 to a new glibc with TLS.
It kept dying because set_thread_area was returning -ESRCH
(bugs.debian.org/451939).

I instrumented arch/ia64/ia32/sys_ia32.c:get_free_idx() and ended up
seeing output like

[pid] idx   desc->a  desc->b
-----------------------------
[2710] 0 -> c6b0ffff 40dff31b
[2710] 1 -> 0 0
[2710] 2 -> 0 0

[2710] 0 -> c6b0ffff 40dff31b
[2710] 1 -> c6b0ffff 40dff31b
[2710] 2 -> 0 0

[2711] 0 -> c6b0ffff 40dff31b
[2711] 1 -> c6b0ffff 40dff31b
[2711] 2 -> 48c0ffff 40dff317

which suggested to me that TLS pointers were surviving exec() calls,
leading to GDT pointers filling up and the eventual failure of
get_free_idx().

I think the solution is flushing the tls array on exec.
Signed-Off-By: Ian Wienand <ianw@gelato.unsw.edu.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>

e384f414

[IA64] print kernel release in OOPS to make kerneloops.org happy · ee211b37

Luck, Tony authored Dec 18, 2007

The ia64 oops message doesn't include the kernel version, which
makes it hard to automatically categorize oops messages scraped
from mailing lists and bug databases.
Signed-off-by: Tony Luck <tony.luck@intel.com>

ee211b37

[IA64] Two trivial spelling fixes · 313d8e57

Joe Perches authored Dec 18, 2007

s/addres/address/
s/performanc/performance/
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

313d8e57

[IA64] Avoid unnecessary TLB flushes when allocating memory · aec103bf

de Dinechin, Christophe (Integrity VM) authored Dec 13, 2007

Improve performance of memory allocations on ia64 by avoiding a global TLB
purge to purge a single page from the file cache. This happens whenever we
evict a page from the buffer cache to make room for some other allocation.

Test case: Run 'find /usr -type f | xargs cat > /dev/null' in the
background to fill the buffer cache, then run something that uses memory,
e.g. 'gmake -j50 install'. Instrumentation showed that the number of
global TLB purges went from a few millions down to about 170 over a 12
hours run of the above.

The performance impact is particularly noticeable under virtualization,
because a virtual TLB is generally both larger and slower to purge than
a physical one.
Signed-off-by: Christophe de Dinechin <ddd@hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

aec103bf

[IA64] ia32 nopage · 3cdc7fc7

Nick Piggin authored Dec 13, 2007

Convert ia64's ia32 support from nopage to fault.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

3cdc7fc7

[IA64] signal: remove redundant code in setup_sigcontext() · 2018df76

Shi Weihua authored Dec 13, 2007

This patch removes some redundant code in the function setup_sigcontext().

The registers ar.ccv,b7,r14,ar.csd,ar.ssd,r2-r3 and r16-r31 are not
restored in restore_sigcontext() when (flags & IA64_SC_FLAG_IN_SYSCALL) is
true. So we don't need to zero those variables in setup_sigcontext().
Signed-off-by: Shi Weihua <shiwh@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

2018df76

IA64: Slim down __clear_bit_unlock · a3ebdb6c

Christoph Lameter authored Dec 18, 2007

__clear_bit_unlock does not need to perform atomic operations on the
variable.  Avoid a cmpxchg and simply do a store with release semantics.
Add a barrier to be safe that the compiler does not do funky things.

Tony: Use intrinsic rather than inline assembler
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

a3ebdb6c

18 Dec, 2007 12 commits

[SCSI] initio: bugfix for accessors patch · a169e637

Boaz Harrosh authored Dec 17, 2007

patch: [SCSI] initio: convert to use the data buffer accessors had a
small but fatal bug in that it didn't increment the pointer into the
initio scatterlist descriptors as it looped over the block generated
ones. Fixed here.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

a169e637

[SCSI] st: fix kernel BUG at include/linux/scatterlist.h:59! · cd81621c

FUJITA Tomonori authored Dec 15, 2007

This is caused by a missing scatterlist initialisation (it only shows
up when sg list handling debugging is turned on).
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Kai Makisara <Kai.Makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

cd81621c

[SCSI] initio: fix conflict when loading driver · 99f1f534

Alan Cox authored Dec 13, 2007

> I have a scanner connected to a Initio INI-950 SCSI card and I recently
> upgraded from SuSE 10.2 to 10.3.  The new kernel doesn't see any of my
> devices.  I get the following in /var/log/messages:
>
> ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 16
> initio: I/O port range 0x0 is busy.
> ACPI: PCI interrupt for device 0000:00:0a.0 disabled

Humm not a collision - thats a bug in the driver updating.  Looks like the
changes I made and combined with Christoph's lost a line somewhere when I
was merging it all.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

99f1f534

[SCSI] sym53c8xx: fix "irq X: nobody cared" regression · cedefa13

Tony Battersby authored Dec 14, 2007

The patch described by the following excerpt from ChangeLog-2.6.24-rc1
eventually causes a "irq X: nobody cared" error after a while:

commit 99c9e0a1
Author: Matthew Wilcox <matthew@wil.cx>
Date:   Fri Oct 5 15:55:12 2007 -0400

    [SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE

After this happens, the kernel disables the IRQ, causing the SCSI card
to stop working until the next reboot.  The problem is caused by the
interrupt handler returning IRQ_NONE instead of IRQ_HANDLED after
handling an interrupt-on-the-fly (INTF) condition.  The following patch
fixes the problem.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

cedefa13

[SCSI] dpt_i2o: driver is only 32 bit so don't set 64 bit DMA mask · c80ddf00

James Bottomley authored Dec 12, 2007

This fixes a potential corruption bug where the truncation would cause
reading or writing to the wrong memory area on machines with >4GB of
main memory.

Cc: Stable Kernel Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

c80ddf00

[SCSI] sym53c8xx: fix free_irq() regression · 7ee2413c

Tony Battersby authored Nov 06, 2007

The following commit changed the pointer passed to request_irq(), but
failed to change the pointer passed to free_irq():

commit 99c9e0a1
Author: Matthew Wilcox <matthew@wil.cx>
Date:   Fri Oct 5 15:55:12 2007 -0400

    [SCSI] sym53c8xx: Make interrupt handler capable of returning IRQ_NONE

    ...

The result is that free_irq() doesn't actually take any action.  This
patch fixes it.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

7ee2413c

Merge git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86 · 3e3b3916

Linus Torvalds authored Dec 18, 2007

* git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86:
  x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!"
  genirq: revert lazy irq disable for simple irqs
  x86: also define AT_VECTOR_SIZE_ARCH
  x86: kprobes bugfix
  x86: jprobe bugfix
  timer: kernel/timer.c section fixes
  genirq: add unlocked version of set_irq_handler()
  clockevents: fix reprogramming decision in oneshot broadcast
  oprofile: op_model_athlon.c support for AMD family 10h barcelona performance counters

3e3b3916

x86: fix "Kernel panic - not syncing: IO-APIC + timer doesn't work!" · 4aae0702

Ingo Molnar authored Dec 18, 2007

this is the tale of a full day spent debugging an ancient but elusive bug.

after booting up thousands of random .config kernels, i finally happened
to generate a .config that produced the following rare bootup failure
on 32-bit x86:

| ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
| ..MP-BIOS bug: 8254 timer not connected to IO-APIC
| ...trying to set up timer (IRQ0) through the 8259A ...  failed.
| ...trying to set up timer as Virtual Wire IRQ... failed.
| ...trying to set up timer as ExtINT IRQ... failed :(.
| Kernel panic - not syncing: IO-APIC + timer doesn't work!  Boot with apic=debug
| and send a report.  Then try booting with the 'noapic' option

this bug has been reported many times during the years, but it was never
reproduced nor fixed.

the bug that i hit was extremely sensitive to .config details.

First i did a .config-bisection - suspecting some .config detail.
That led to CONFIG_X86_MCE: enabling X86_MCE magically made the bug disappear
and the system would boot up just fine.

Debugging my way through the MCE code ended up identifying two unlikely
candidates: the thing that made a real difference to the hang was that
X86_MCE did two printks:

 Intel machine check architecture supported.
 Intel machine check reporting enabled on CPU#1.

Adding the same printks to a !CONFIG_X86_MCE kernel made the bug go away!

this left timing as the main suspect: i experimented with adding various
udelay()s to the arch/x86/kernel/io_apic_32.c:check_timer() function, and
the race window turned out to be narrower than 30 microseconds (!).

That made debugging especially funny, debugging without having printk
ability before the bug hits is ... interesting ;-)

eventually i started suspecting IRQ activities - those are pretty much the
only thing that happen this early during bootup and have the timescale of
a few dozen microseconds. Also, check_timer() changes the IRQ hardware
in various creative ways, so the main candidate became IRQ0 interaction.

i've added a counter to track timer irqs (on which core they arrived, at
what exact time, etc.) and found that no timer IRQ would arrive after the
bug condition hits - even if we re-enable IRQ0 and re-initialize the i8259A,
but that we'd get a small number of timer irqs right around the time when we
call the check_timer() function.

Eventually i got the following backtrace triggered from debug code in the
timer interrupt:

...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ...
Pid: 1, comm: swapper Not tainted (2.6.24-rc5 #57)
EIP: 0060:[<c044d57e>] EFLAGS: 00000246 CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x1c
EAX: c0634178 EBX: 00000000 ECX: c4947d63 EDX: 00000246
ESI: 00000002 EDI: 00010031 EBP: c04e0f2e ESP: f7c41df4
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: ffe04000 CR3: 00630000 CR4: 000006d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
  [<c05f5784>] setup_IO_APIC+0x9c3/0xc5c

the spin_unlock() was called from init_8259A(). Wait ... we have an IRQ0
entry while we are in the middle of setting up the local APIC, the i8259A
and the PIT??

That is certainly not how it's supposed to work! check_timer() was supposed
to be called with irqs turned off - but this eroded away sometime in the
past. This code would still work most of the time because this code runs
very quickly, but just the right timing conditions are present and IRQ0
hits in this small, ~30 usecs window, timer irqs stop and the system does
not boot up. Also, given how early this is during bootup, the hang is
very deterministic - but it would only occur on certain machines (and
certain configs).

The fix was quite simple: disable/restore interrupts properly in this
function. With that in place the test-system now boots up just fine.

(64-bit x86 io_apic_64.c had the same bug.)

Phew! One down, only 1500 other kernel bugs are left ;-)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

4aae0702

genirq: revert lazy irq disable for simple irqs · 971e5b35

Steven Rostedt authored Dec 18, 2007

In commit 76d21601 lazy irq disabling
was implemented, and the simple irq handler had a masking set to it.

Remy Bohmer discovered that some devices in the ARM architecture
would trigger the mask, but never unmask it. His patch to do the
unmasking was questioned by Russell King about masking simple irqs
to begin with. Looking further, it was discovered that the problems
Remy was seeing was due to improper use of the simple handler by
devices, and he later submitted patches to fix those. But the issue
that was uncovered was that the simple handler should never mask.

This patch reverts the masking in the simple handler.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>

971e5b35

x86: also define AT_VECTOR_SIZE_ARCH · 213fde71

Jan Beulich authored Dec 18, 2007

The patch introducing this left out 64-bit x86 despite it also having
extra entries.

this solves Xen guest troubles.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

213fde71

x86: kprobes bugfix · 0b0122fa

Masami Hiramatsu authored Dec 18, 2007

Kprobes for x86-64 may cause a kernel crash if it inserted on "iret"
instruction. "call absolute" is invalid on x86-64, so we don't need
treat it.

 - Change the processing order as same as x86-32.
 - Add "iret"(0xcf) case.
 - Remove next_rip local variable.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

0b0122fa

x86: jprobe bugfix · 29b6cd79

Masami Hiramatsu authored Dec 18, 2007

jprobe for x86-64 may cause kernel page fault when the jprobe_return()
is called from incorrect function.

- Use jprobe_saved_regs instead getting it from stack.
  (Especially on x86-64, it may get incorrect data, because
   pt_regs can not be get by using container_of(rsp))
- Change the type of stack pointer to unsigned long *.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

29b6cd79