Commits · 9ba9b18addcee8b7be8877727738f59a9fd37b29 · Kirill Smelkov / linux

10 Jan, 2007 19 commits

[PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle code · 9ba9b18a

Ingo Molnar authored Dec 21, 2006

Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
xruns starting at the 2.6.18-rt kernel, and those problems persisted all
until current -rt kernels. The latencies were serious and unjustified by
system load, often in the milliseconds range.

After a patient and heroic multi-month effort of Fernando, where he
tested dozens of kernels, tried various configs, boot options,
test-patches of mine and provided latency traces of those incidents, the
following 'smoking gun' trace was captured by him:

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
  IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
  IRQ_19-1479  1D..1    0us : resched_task (try_to_wake_up)
  IRQ_19-1479  1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
  ...
  <idle>-0     1...1   11us!: default_idle (cpu_idle)
  ...
  <idle>-0     0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
  ...
   <...>-5856  0D..2  618us : __switch_to (__schedule)
   <...>-5856  0D..2  618us : __schedule <<idle>-0> (20 162)
   <...>-5856  0D..2  619us : __spin_unlock_irq (__schedule)
   <...>-5856  0...1  619us : trace_stop_sched_switched (__schedule)
   <...>-5856  0D..1  619us : trace_stop_sched_switched <<...>-5856> (37 0)

what is visible in this trace is that CPU#1 ran try_to_wake_up() for
PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
for CPU#0. But it decided to not send an IPI that no CPU - due to
TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
result was a 600+ usecs latency and a missed wakeup!

the bug turned out to be an idle-wakeup bug introduced into the mainline
kernel this summer via an optimization in the x86_64 tree:

    commit 495ab9c0
    Author: Andi Kleen <ak@suse.de>
    Date:   Mon Jun 26 13:59:11 2006 +0200

    [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status

    During some profiling I noticed that default_idle causes a lot of
    memory traffic. I think that is caused by the atomic operations
    to clear/set the polling flag in thread_info. There is actually
    no reason to make this atomic - only the idle thread does it
    to itself, other CPUs only read it. So I moved it into ti->status.

the problem is this type of change:

        if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
-               clear_thread_flag(TIF_POLLING_NRFLAG);
+               current_thread_info()->status &= ~TS_POLLING;
                smp_mb__after_clear_bit();
                while (!need_resched()) {
                        local_irq_disable();

this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
clear_thread_flag() is defined as:

        clear_bit(flag, &ti->flags);

and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:

  static inline void clear_bit(int nr, volatile unsigned long * addr)
  {
          __asm__ __volatile__( LOCK_PREFIX
                  "btrl %1,%0"

hence smp_mb__after_clear_bit() is defined as a simple compile barrier:

  #define smp_mb__after_clear_bit()       barrier()

but the explicit TS_POLLING clearing introduced by the patch:

+               current_thread_info()->status &= ~TS_POLLING;

is not an atomic op! So the clearing of the TS_POLLING bit is freely
reorderable with the reading of the NEED_RESCHED bit - and both now
reside in different memory addresses.

CPU idle wakeup very much depends on ordered memory ops, the clearing of
the TS_POLLING flag must always be done before we test need_resched()
and hit the idle instruction(s). [Symmetrically, the wakeup code needs
to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
ordering is paramount.]

Fernando's dual-core Athlon64 system has a sufficiently advanced memory
ordering model so that it triggered this scenario very often.

( And it also turned out that the reason why these latencies never
  triggered on my testsystems is that i routinely use idle=poll, which
  was the only idle variant not affected by this bug. )

The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
act as an absolute barrier between the TS_POLLING write and the
NEED_RESCHED read. This affects almost all idling methods (default,
ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Fernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
[chrisw: backport to 2.6.19.1]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

9ba9b18a

[PATCH] i2c: fix broken ds1337 initialization · 2be250f7

Dirk Eibach authored Dec 20, 2006

On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to
2.6.18 broke RTC support.

The main problem are changes to ds1337_init_client().
When a ds1337 recognizes a problem (e.g. power or clock failure) bit 7
in status register is set. This has to be reset by writing 0 to status
register. But since there are only 16 byte written to the chip and the
first byte is interpreted as an address, the status register (which is
the 16th) is never written.
The other problem is, that initializing all registers to zero is not
valid for day, date and month register. Funny enough this is checked by
ds1337_detect(), which depends on this values not being zero. So then
treated by ds1337_init_client() the ds1337 is not detected anymore,
whereas the failure bit in the status register is still set.

Broken by commit f9e89579 (2.6.16-rc1,
2006-01-06). This fix is in Linus' tree since 2.6.20-rc1 (commit
763d9c04).
Signed-off-by: Dirk Stieler <stieler@gdsys.de>
Signed-off-by: Dirk Eibach <eibach@gdsys.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

2be250f7

[PATCH] Bluetooth: Add packet size checks for CAPI messages (CVE-2006-6106) · d4ea7f9f

Marcel Holtmann authored Dec 11, 2006

With malformed packets it might be possible to overwrite internal
CMTP and CAPI data structures. This patch adds additional length
checks to prevent these kinds of remote attacks.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

d4ea7f9f

[PATCH] SCSI: add missing cdb clearing in scsi_execute() · f7323792

Tejun Heo authored Dec 16, 2006

Clear-garbage-after-CDB patch missed scsi_execute() and it causes some
ODDs (HL-DT-ST DVD-RAM GSA-H30N) choke during SCSI scan.  Note that
this patch is only for -stable.  There is another more reliable fix
for this problem proposed for devel tree.

http://thread.gmane.org/gmane.linux.ide/14605/focus=14605Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Douglas Gilbert <dougg@torque.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

f7323792

[PATCH] IB/srp: Fix FMR mapping for 32-bit kernels and addresses above 4G · 5033031c

Roland Dreier authored Dec 15, 2006

struct srp_device.fmr_page_mask was unsigned long, which means that
the top part of addresses above 4G was being chopped off on 32-bit
architectures.  Of course nothing good happens when data from SRP
targets is DMAed to the wrong place.

Fix this by changing fmr_page_mask to u64, to match the addresses
actually used by IB devices.

Thanks to Brian Cain <Brian.Cain@ge.com> and David McMillen
<davem@systemfabricworks.com> for help diagnosing the bug and testing
the fix.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

5033031c

[PATCH] sched: remove __cpuinitdata anotation to cpu_isolated_map · fb0ddf36

Tim Chen authored Dec 13, 2006

The structure cpu_isolated_map is used not only during initialization.
Multi-core scheduler configuration changes and exclusive cpusets
use this during run time.  During setting of sched_mc_power_savings
 policy, this structure is accessed to update sched_domains.
Signed-off-by: Tim Chen <tim.c.chen@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

fb0ddf36

[PATCH] ARM: Add sys_*at syscalls · 4a40b99a

Russell King authored Dec 13, 2006

Later glibc requires the *at syscalls.  Add them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

4a40b99a

[PATCH] kbuild: don't put temp files in source · 57696190

Roman Zippel authored Dec 12, 2006

The as-instr/ld-option need to create temporary files, but create them in the
output directory, when compiling external modules.  Reformat them a bit and
use $(CC) instead of $(AS) as the former is used by kbuild to assemble files.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Jan Beulich <jbeulich@novell.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: <jpdenheijer@gmail.com>
Cc: Horst Schirmeier <horst@schirmeier.com>
Cc: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

57696190

[PATCH] ieee1394: ohci1394: add PPC_PMAC platform code to driver probe · 459593b9

Stefan Richter authored Dec 12, 2006

Fixes http://bugzilla.kernel.org/show_bug.cgi?id=7431
iBook G3 threw a machine check exception and put the display backlight
to full brightness after ohci1394 was unloaded and reloaded.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
[dsd@gentoo.org: also added missing if condition, commit
 63cca59e]
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

459593b9

[PATCH] libata: handle 0xff status properly · a1803540

Tejun Heo authored Dec 12, 2006

libata waits for !BSY even when the status register reports 0xff.
This causes long boot delays when D8 isn't pulled down properly.  This
patch does the followings.

* don't wait if status register is 0xff in all wait functions

* make ata_busy_sleep() return 0 on success and -errno on failure.
  -ENODEV is returned on 0xff status and -EBUSY on other failures.

* make ata_bus_softreset() succeed on 0xff status.  0xff status is not
  reset failure.  It indicates no device.  This removes unnecessary
  retries on such ports.  Note that the code change assumes unoccupied
  port reporting 0xff status does not produce valid device signature.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Joe Jin <lkmaillist@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a1803540

[PATCH] Revert "[PATCH] zd1211rw: Removed unneeded packed attributes" · a151f584

John W. Linville authored Dec 12, 2006

This reverts commit 4e1bbd84.

Quoth Daniel Drake <dsd@gentoo.org>:

"A user reported that commit 4e1bbd84
(Remove unneeded packed attributes) breaks the zd1211rw driver on ARM."
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a151f584

[PATCH] V4L: Fix broken TUNER_LG_NTSC_TAPE radio support · f37a67a1

Hans Verkuil authored Dec 12, 2006

The TUNER_LG_NTSC_TAPE is identical in all respects to the
TUNER_PHILIPS_FM1236_MK3. So use the params struct for the Philips tuner.
Also add this LG_NTSC_TAPE tuner to the switches where radio specific
parameters are set so it behaves like a TUNER_PHILIPS_FM1236_MK3. This
change fixes the radio support for this tuner (the wrong bandswitch byte
was used).

Thanks to Andy Walls <cwalls@radix.net> for finding this bug.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

f37a67a1

[PATCH] DVB: lgdt330x: fix signal / lock status detection bug · 65bb9cf4

Michael Krufky authored Dec 12, 2006

In some cases when using VSB, the AGC status register has been known to
falsely report "no signal" when in fact there is a carrier lock. The
datasheet labels these status flags as QAM only, yet the lgdt330x
module is using these flags for both QAM and VSB.

This patch allows for the carrier recovery lock status register to be
tested, even if the agc signal status register falsely reports no signal.

Thanks to jcrews from #linuxtv in irc, for initially reporting this bug.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

65bb9cf4

[PATCH] bonding: incorrect bonding state reported via ioctl · fae0ef93

Andy Gospodarek authored Nov 21, 2006

This is a small fix-up to finish out the work done by Jay Vosburgh to
add carrier-state support for bonding devices.  The output in
/proc/net/bonding/bondX was correct, but when collecting the same info
via an iotcl it could still be incorrect.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

fae0ef93

[PATCH] x86-64: Mark rdtsc as sync only for netburst, not for core2 · 33e57a8e

Arjan van de Ven authored Dec 11, 2006

On the Core2 cpus, the rdtsc instruction is not serializing (as defined
in the architecture reference since rdtsc exists) and due to the deep
speculation of these cores, it's possible that you can observe time go
backwards between cores due to this speculation. Since the kernel
already deals with this with the SYNC_RDTSC flag, the solution is
simple, only assume that the instruction is serializing on family 15...

The price one pays for this is a slightly slower gettimeofday (by a
dozen or two cycles), but that increase is quite small to pay for a
really-going-forward tsc counter.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

33e57a8e

[PATCH] ieee80211softmac: Fix mutex_lock at exit of ieee80211_softmac_get_genie · 4ad328ff

Ulrich Kunitz authored Dec 10, 2006

ieee80211softmac_wx_get_genie locks the associnfo mutex at
function exit. This patch fixes it. The patch is against Linus'
tree (commit af1713e0).
Signed-off-by: Ulrich Kunitz <kune@deine-taler.de>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

4ad328ff

[PATCH] read_zero_pagealigned() locking fix · 18576724

Hugh Dickins authored Dec 10, 2006

Ramiro Voicu hits the BUG_ON(!pte_none(*pte)) in zeromap_pte_range: kernel
bugzilla 7645.  Right: read_zero_pagealigned uses down_read of mmap_sem,
but another thread's racing read of /dev/zero, or a normal fault, can
easily set that pte again, in between zap_page_range and zeromap_page_range
getting there.  It's been wrong ever since 2.4.3.

The simple fix is to use down_write instead, but that would serialize reads
of /dev/zero more than at present: perhaps some app would be badly
affected.  So instead let zeromap_page_range return the error instead of
BUG_ON, and read_zero_pagealigned break to the slower clear_user loop in
that case - there's no need to optimize for it.

Use -EEXIST for when a pte is found: BUG_ON in mmap_zero (the other user of
zeromap_page_range), though it really isn't interesting there.  And since
mmap_zero wants -EAGAIN for out-of-memory, the zeromaps better return that
than -ENOMEM.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Ramiro Voicu: <Ramiro.Voicu@cern.ch>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

18576724

[PATCH] sha512: Fix sha384 block size · 80355a9d

Herbert Xu authored Dec 10, 2006

The SHA384 block size should be 128 bytes, not 96 bytes.  This was
spotted by Andrew Donofrio.

This breaks HMAC which uses the block size during setup and the final
calculation.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

80355a9d

[PATCH] dm-crypt: Select CRYPTO_CBC · 43cb0cab

Herbert Xu authored Dec 10, 2006

As CBC is the default chaining method for cryptoloop, we should select
it from cryptoloop to ease the transition.  Spotted by Rene Herman.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

43cb0cab

11 Dec, 2006 21 commits

Linux 2.6.19.1 · 1edb5a2d
Chris Wright authored Dec 11, 2006

1edb5a2d

[PATCH] NETLINK: Put {IFA,IFLA}_{RTA,PAYLOAD} macros back for userspace. · f558fdfa

David Miller authored Dec 08, 2006

GLIBC uses them etc.

They are guarded by ifndef __KERNEL__ so nobody will start
accidently using them in the kernel again, it's just for
userspace.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

f558fdfa

[PATCH] forcedeth: Disable INTx when enabling MSI in forcedeth · 39a17363

Daniel Barkalow authored Dec 08, 2006

At least some nforce cards continue to send legacy interrupts when MSI
is enabled, and these interrupts are treated as unhandled by the
kernel. This patch disables legacy interrupts explicitly when enabling
MSI mode.

The correct fix is to change the MSI infrastructure to disable legacy
interrupts when enabling MSI, but this is potentially risky if the
device isn't PCI-2.3 or is quirky, so the correct fix is going into
mainline, while patches like this one go into -stable.

Legend has it that it is most correct to disable legacy interrupts
before enabling MSI, but the mainline patch does it in the other
order, and this patch is "obviously" the same as mainline.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

39a17363

[PATCH] x86: Fix boot hang due to nmi watchdog init code · 3667bf6d

Ravikiran G Thirumalai authored Dec 09, 2006

2.6.19 stopped booting (or booted based on build/config) on our x86_64
systems due to a bug introduced in 2.6.19. check_nmi_watchdog schedules an
IPI on all cpus to busy wait on a flag, but fails to set the busywait
flag if NMI functionality is disabled. This causes the secondary cpus
to spin in an endless loop, causing the kernel bootup to hang.
Depending upon the build, the busywait flag got overwritten (stack variable)
and caused the kernel to bootup on certain builds. Following patch fixes
the bug by setting the busywait flag before returning from check_nmi_watchdog.
I guess using a stack variable is not good here as the calling function could
potentially return while the busy wait loop is still spinning on the flag.

AK: I redid the patch significantly to be cleaner
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

3667bf6d

[PATCH] m32r: make userspace headers platform-independent · a10457cc

Hirokazu Takata authored Dec 08, 2006

The m32r kernel 2.6.18-rc1 or after cause build errors of "unknown isa
configuration" for userspace application programs, such as glibc, gdb, etc.

This is because the recent kernel do not include linux/config.h not to expose
kernel headers for userspace.

To fix the above compile errors, this patch fixes two headers ptrace.h and
sigcontext.h for m32r and makes them platform-independent.
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a10457cc

[PATCH] softirq: remove BUG_ONs which can incorrectly trigger · a3956ef7

Zachary Amsden authored Dec 06, 2006

It is possible to have tasklets get scheduled before softirqd has had a chance
to spawn on all CPUs.  This is totally harmless; after success during action
CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
softirqd on the appropriate CPU to process the already pending tasklets.  So
there is no danger of having a missed wakeup for any tasklets that were
already pending.

In particular, i386 is affected by this during startup, and is visible when
using a very large initrd; during the time it takes for the initrd to be
decompressed, a timer IRQ can come in and schedule RCU callbacks.  It is also
possible that resending of a hardware IRQ via a softirq triggers the same bug.

Because of different timing conditions, this shows up in all emulators and
virtual machines tested, including Xen, VMware, Virtual PC, and Qemu.  It is
also possible to trigger on native hardware with a large enough initrd,
although I don't have a reliable case demonstrating that.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Cc: <caglar@pardus.org.tr>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a3956ef7

[PATCH] autofs: fix error code path in autofs_fill_sb() · 7f803f51

Jiri Kosina authored Dec 06, 2006

When kernel is compiled with old version of autofs (CONFIG_AUTOFS_FS), and
new (observed at least with 5.x.x) automount deamon is started, kernel
correctly reports incompatible version of kernel and userland daemon, but
then screws things up instead of correct handling of the error:

 autofs: kernel does not match daemon version
 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 automount/4199 is trying to release lock (&type->s_umount_key) at:
 [<c0163b9e>] get_sb_nodev+0x76/0xa4
 but there are no more locks to release!

 other info that might help us debug this:
 no locks held by automount/4199.

 stack backtrace:
  [<c0103b15>] dump_trace+0x68/0x1b2
  [<c0103c77>] show_trace_log_lvl+0x18/0x2c
  [<c01041db>] show_trace+0xf/0x11
  [<c010424d>] dump_stack+0x12/0x14
  [<c012e02c>] print_unlock_inbalance_bug+0xe7/0xf3
  [<c012fd4f>] lock_release+0x8d/0x164
  [<c012b452>] up_write+0x14/0x27
  [<c0163b9e>] get_sb_nodev+0x76/0xa4
  [<c0163689>] vfs_kern_mount+0x83/0xf6
  [<c016373e>] do_kern_mount+0x2d/0x3e
  [<c017513f>] do_mount+0x607/0x67a
  [<c0175224>] sys_mount+0x72/0xa4
  [<c0102b96>] sysenter_past_esp+0x5f/0x99
 DWARF2 unwinder stuck at sysenter_past_esp+0x5f/0x99
 Leftover inexact backtrace:
  =======================

and then deadlock comes.

The problem: autofs_fill_super() returns EINVAL to get_sb_nodev(), but
before that, it calls kill_anon_super() to destroy the superblock which
won't be needed.  This is however way too soon to call kill_anon_super(),
because get_sb_nodev() has to perform its own cleanup of the superblock
first (deactivate_super(), etc.).  The correct time to call
kill_anon_super() is in the autofs_kill_sb() callback, which is called by
deactivate_super() at proper time, when the superblock is ready to be
killed.

I can see the same faulty codepath also in autofs4.  This patch solves
issues in both filesystems in a same way - it postpones the
kill_anon_super() until the proper time is signalized by deactivate_super()
calling the kill_sb() callback.

[raven@themaw.net: update comment]
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Ian Kent <raven@themaw.net>
Cc: <stable@kernel.org>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

7f803f51

[PATCH] PM: Fix swsusp debug mode testproc · 1f583f62

Rafael J Wysocki authored Dec 06, 2006

The 'testproc' swsusp debug mode thaws tasks twice in a row, which is _very_
confusing.  Fix that.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

1f583f62

[PATCH] compat: skip data conversion in compat_sys_mount when data_page is NULL · 1157f828

Andrey Mirkin authored Dec 06, 2006

OpenVZ Linux kernel team has found a problem with mounting in compat mode.

Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
leads to oops:

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
PGD 34d48067 PUD 34d03067 PMD 0
Oops: 0000 [1] SMP
CPU: 0
Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota
parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev
iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS
iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal
processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801
i2c_core e100 mii floppy ide_cd cdrom
Pid: 14656, comm: mount
RIP: 0060:[<ffffffff802bc7c6>]  [<ffffffff802bc7c6>]
compat_sys_mount+0xd6/0x290
RSP: 0000:ffff810034d31f38  EFLAGS: 00010292
RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90
RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d
R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000
R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000
FS:  0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0
CS:  0060 DS: 007b ES: 007b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0
Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task
ffff810034c86bc0)
Stack:  0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff8021e64e>] ia32_sysret+0x0/0xa

Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a
RIP  [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
 RSP <ffff810034d31f38>
CR2: 0000000000000000

The problem is that data_page pointer can be NULL, so we should skip data
conversion in this case.
Signed-off-by: Andrey Mirkin <amirkin@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

1157f828

[PATCH] drm-sis linkage fix · ce9507af

Andrew Morton authored Dec 06, 2006

Fix http://bugzilla.kernel.org/show_bug.cgi?id=7606

WARNING: "drm_sman_set_manager" [drivers/char/drm/sis.ko] undefined!

Cc: <daniel-silveira@gee.inatel.br>
Cc: Dave Airlie <airlied@linux.ie>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ce9507af

[PATCH] add bottom_half.h · a030daed

Andrew Morton authored Dec 06, 2006

With CONFIG_SMP=n:

drivers/input/ff-memless.c:384: warning: implicit declaration of function 'local_bh_disable'
drivers/input/ff-memless.c:393: warning: implicit declaration of function 'local_bh_enable'

Really linux/spinlock.h should include linux/interrupt.h.  But interrupt.h
includes sched.h which will need spinlock.h.

So the patch breaks the _bh declarations out into a separate header and
includes it in bothj interrupt.h and spinlock.h.

Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a030daed

[PATCH] NETLINK: Restore API compatibility of address and neighbour bits · 04ff1391

Thomas Graf authored Dec 07, 2006

Restore API compatibility due to bits moved from rtnetlink.h to
separate headers.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

04ff1391

[PATCH] IrDA: Incorrect TTP header reservation · d58808bc

Jeet Chaudhuri authored Dec 08, 2006

We must reserve SAR + MAX_HEADER bytes for IrLMP to fit in.
This fixes an oops reported (and fixed) by Jeet Chaudhuri, when max_sdu_size
is greater than 0.
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

d58808bc

[PATCH] IPSEC: Fix inetpeer leak in ipv4 xfrm dst entries. · 5bcd4af5

David Miller authored Dec 07, 2006

We grab a reference to the route's inetpeer entry but
forget to release it in xfrm4_dst_destroy().

Bug discovered by Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

5bcd4af5

[PATCH] USB: Fix oops in PhidgetServo · 53f95659

Sean Young authored Dec 06, 2006

The PhidgetServo causes an Oops when any of its sysfs attributes are read
or written too, making the driver useless.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

53f95659

[PATCH] XFRM: Use output device disable_xfrm for forwarded packets · 4bcae319

Patrick McHardy authored Dec 04, 2006

Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.

Always use the output device when checking disable_xfrm.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

4bcae319

[PATCH] TOKENRING: Remote memory corruptor in ibmtr.c · ad8ca99c

David Miller authored Dec 04, 2006

ip_summed changes last summer had missed that one.  As the result,
we have ip_summed interpreted as CHECKSUM_PARTIAL now.  IOW,
->csum is interpreted as offset of checksum in the packet.  net/core/*
will both read and modify the value as that offset, with obvious
reasons.  At the very least it's a remote memory corruptor.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

ad8ca99c

[PATCH] do_coredump() and not stopping rewrite attacks? (CVE-2006-6304) · a526d58e

Alexey Dobriyan authored Dec 02, 2006

On Sat, Dec 02, 2006 at 11:47:44PM +0300, Alexey Dobriyan wrote:
> David Binderman compiled 2.6.19 with icc and grepped for "was set but never
> used". Many warnings are on
> 	http://coderock.org/kj/unused-2.6.19-fs

Heh, the very first line:
fs/exec.c(1465): remark #593: variable "flag" was set but never used

fs/exec.c:
  1477		/*
  1478		 *	We cannot trust fsuid as being the "true" uid of the
  1479		 *	process nor do we know its entire history. We only know it
  1480		 *	was tainted so we dump it as root in mode 2.
  1481		 */
  1482		if (mm->dumpable == 2) {	/* Setuid core dump mode */
  1483			flag = O_EXCL;		/* Stop rewrite attacks */
  1484			current->fsuid = 0;	/* Dump root private */
  1485		}

And then filp_open follows with "flag" totally ignored.
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

a526d58e

[PATCH] IB/ucm: Fix deadlock in cleanup · 68057dcd

Michael S Tsirkin authored Dec 04, 2006

ib_ucm_cleanup_events() holds file_mutex while calling ib_destroy_cm_id().
This can deadlock since ib_destroy_cm_id() flushes event handlers, and
ib_ucm_event_handler() needs file_mutex, too.  Therefore, drop the
file_mutex during the call to ib_destroy_cm_id().
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

68057dcd

[PATCH] softmac: fix unbalanced mutex_lock/unlock in ieee80211softmac_wx_set_mlme · bed569c7

Maxime Austruy authored Dec 03, 2006

Routine ieee80211softmac_wx_set_mlme has one return that fails
to release a mutex acquired at entry.
Signed-off-by: Maxime Austruy <maxime@tralhalla.org>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

bed569c7

[PATCH] NETFILTER: bridge netfilter: deal with martians correctly · 721aed81

Bart De Schuymer authored Dec 04, 2006

The attached patch resolves an issue where a IP DNATed packet with a
martian source is forwarded while it's better to drop it. It also
resolves messages complaining about ip forwarding being disabled while
it's actually enabled. Thanks to lepton <ytht.net@gmail.com> for
reporting this problem.

This is probably a candidate for the -stable release.
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>

721aed81