1. 24 Jun, 2004 5 commits
    • [PATCH] rcu lock update: Use a sequence lock for starting batches · 720e8a63
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Step two for reducing cacheline thrashing within rcupdate.c:
      
      rcu_process_callbacks always acquires rcu_ctrlblk.state.mutex and calls
      rcu_start_batch, even if the batch is already running or already scheduled to
      run.
      
      This can be avoided with a sequence lock: a sequence lock allows the
      current batch number and next_pending to be read atomically.  If
      next_pending is already set, then there is no need to acquire the
      global mutex.
      
      This means that for each grace period, there will be
      
      - one write access to the rcu_ctrlblk.batch cacheline
      
      - lots of read accesses to rcu_ctrlblk.batch (3-10*cpus_online()).  The
        behavior is similar to the jiffies cacheline, so this shouldn't be a
        problem.
      
      - cpus_online()+1 write accesses to rcu_ctrlblk.state, all of them starting
        with spin_lock(&rcu_ctrlblk.state.mutex).
      
        For large enough cpus_online() this will be a problem, but all except
        two of the spin_lock calls only protect the rcu_cpu_mask bitmap, so a
        hierarchical bitmap would allow the write accesses to be split across
        multiple cachelines.
      
      Tested on an 8-way with reaim.  Unfortunately it probably won't help with Jack
      Steiner's 'ls' test since in this test only one cpu generates rcu entries.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu lock update: Add per-cpu batch counter · 5c60169a
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Below is one of the patches from my rcu lock update.  Jack Steiner
      tested the first one on a 512p and it resolved the rcu cacheline
      thrashing.  All were tested on OSDL with STP.
      
      Step one for reducing cacheline thrashing within rcupdate.c:
      
      The current code uses the rcu_cpu_mask bitmap both for keeping track of
      the cpus that haven't gone through a quiescent state and for checking
      if a cpu should look for quiescent states.  The bitmap is frequently
      changed and the check is done by polling; together these cause
      cacheline thrashing.
      
      If it's cheaper to access a (mostly) read-only cacheline than a
      cacheline that is frequently dirtied, then it's possible to reduce the
      thrashing by splitting the rcu_cpu_mask bitmap into two cachelines:
      
      The patch adds a generation counter and moves it into a separate
      cacheline.  This makes it possible to remove all accesses to
      rcu_cpumask (in the read-write cacheline) from rcu_pending and at least
      50% of the accesses from rcu_check_quiescent_state.  rcu_pending and
      all but one call per cpu to rcu_check_quiescent_state access the
      read-only cacheline.  Probably not enough for 512p, but it's a start,
      at a cost of just 128 bytes more memory and without slowing down rcu
      grace periods.  Obviously the read-only cacheline is not really
      read-only: it's written once per grace period to indicate that a new
      grace period is running.
      
      Tests on an 8-way Pentium III with reaim showed some improvement:
      
      oprofile hits:
      Reference: http://khack.osdl.org/stp/293075/
      Hits	   %
      23741     0.0994  rcu_pending
      19057     0.0798  rcu_check_quiescent_state
      6530      0.0273  rcu_check_callbacks
      
      Patched: http://khack.osdl.org/stp/293076/
      8291      0.0579  rcu_pending
      5475      0.0382  rcu_check_quiescent_state
      3604      0.0252  rcu_check_callbacks
      
      The total runtime differs between the two runs, so the percentages must
      be compared: around 50% faster.  I uninlined rcu_pending for the test.
      
      Tested with reaim and kernbench.
      
      Description:
      
      - per-cpu quiescbatch and qs_pending fields introduced: quiescbatch contains
        the number of the last quiescent period that the cpu has seen and qs_pending
        is set if the cpu has not yet reported the quiescent state for the current
        period.  With these two fields a cpu can test if it should report a
        quiescent state without having to look at the frequently written
        rcu_cpu_mask bitmap.
      
      - curbatch split into two fields: rcu_ctrlblk.batch.completed and
        rcu_ctrlblk.batch.cur.  This makes it possible to figure out if a grace
        period is running (completed != cur) without accessing the rcu_cpu_mask
        bitmap.
      
      - rcu_ctrlblk.maxbatch removed and replaced with a true/false
        next_pending flag: next_pending=1 means that another grace period
        should be started immediately after the end of the current period.
        Previously, this was achieved by maxbatch: curbatch==maxbatch means
        don't start, curbatch!=maxbatch means start.  A flag improves the
        readability: the only possible values for maxbatch were curbatch and
        curbatch+1.
      
      - rcu_ctrlblk split into two cachelines for better performance.
      
      - common code from rcu_offline_cpu and rcu_check_quiescent_state merged into
        cpu_quiet.
      
      - rcu_offline_cpu: replace spin_lock_irq with spin_lock_bh; there are
        no accesses from irq context (and there are accesses to the spinlock
        with interrupts enabled from tasklet context).
      
      - rcu_restart_cpu introduced, s390 should call it after changing nohz:
        Theoretically the global batch counter could wrap around and end up at
        RCU_quiescbatch(cpu).  Then the cpu would not look for a quiescent state and
        rcu would lock up.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Move saved_command_line to init/main.c · b884e838
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      Currently every arch declares its own char saved_command_line[].  Make sure
      every arch defines COMMAND_LINE_SIZE in asm/setup.h, and declare
      saved_command_line in linux/init.h (init/main.c contains the definition).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] jbd needs to wait for locked buffers · 4d4f4cc4
      Andrew Morton authored
      From: Chris Mason <mason@suse.com>
      
      jbd needs to wait for any io to complete on the buffer before changing the
      end_io function.  Using set_buffer_locked means that it can change the
      end_io function while the page is in the middle of writeback, and the
      writeback bit on the page will never get cleared.
      
      Since we set the buffer dirty earlier on, if the page was previously dirty,
      pdflush or memory pressure might trigger a writepage call, which will race
      with jbd's set_buffer_locked.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Allow i386 to reenable interrupts on lock contention · 36f9f209
      Andrew Morton authored
      From: Zwane Mwaikambo <zwane@linuxpower.ca>
      
      Following up on Keith's code, I adapted the i386 code to allow enabling
      interrupts during contested locks depending on the previous interrupt
      enable status.  Obviously there will be a text increase (only for the
      non-CONFIG_SPINLINE case), although it doesn't seem so bad.  There will
      also be an increased exit latency when we attempt a lock acquisition
      after spinning, due to the extra instructions.  How much this will
      affect performance I'm not sure yet, as I haven't had time to
      microbenchmark.
      
         text    data     bss     dec     hex filename
      2628024  921731       0 3549755  362a3b vmlinux-after
      2621369  921731       0 3543100  36103c vmlinux-before
      2618313  919222       0 3537535  35fa7f vmlinux-spinline
      
      The code has been stress tested on a 16x NUMAQ (courtesy OSDL).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 23 Jun, 2004 4 commits
  3. 22 Jun, 2004 9 commits
    • [PATCH] ppc32: Support for new Apple laptop models · f4897eb3
      Jesse Barnes authored
      This adds sound support for some of the newer PowerBooks.  It appears
      that this chip supports the AWACS sample rates, but has a snapper-style
      mixer.  Tested and works on my PowerBook5,4. 
      Signed-off-by: Jesse Barnes <jbarnes@sgi.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Handle altivec assist exception properly · 7a08473b
      Paul Mackerras authored
      This is the PPC64 counterpart of the PPC32 Altivec assist exception
      handler that went in recently.
      
      On PPC64 machines with Altivec (i.e.  machines that use the PPC970 chip,
      such as the G5 powermac), the altivec floating-point instructions can
      operate in two modes: one where denormalized inputs or outputs are
      truncated to zero, and one where they aren't.  In the latter mode the
      processor can take an exception when it encounters denormalized
      floating-point inputs or outputs rather than dealing with them in
      hardware.
      
      This patch adds code to deal properly with the exception, by emulating
      the instruction that caused the exception.  Previously the kernel just
      switched the altivec unit into the truncate-to-zero mode, which works
      but is a bit gross.  Fortunately there are only a limited set of altivec
      instructions which can generate the assist exception, so we don't have
      to emulate the whole altivec instruction set.
      
      Note that Altivec is Motorola's name for the PowerPC vector/SIMD
      instructions; IBM calls the same thing VMX, and currently only IBM makes
      64-bit PowerPC CPU chips.  Nevertheless, I have used the term Altivec in
      the PPC64 code for consistency with the PPC32 code.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] radeonfb: Fix panel detection on some laptops · 6340e7ba
      Benjamin Herrenschmidt authored
      The code in radeonfb looking for the BIOS image currently uses the BIOS
      ROM if any, and falls back to the RAM image if not found.  This is
      unfortunately not correct for a bunch of laptops where the real panel
      data are only present in the RAM image.
      
      This works around this problem by preferring the RAM image on mobility
      chipsets.  This is definitely not the best workaround, we need some arch
      support for linking the RAM image to the PCI ID (preferably by having
      the arch snapshot it during boot, isolating us completely from the
      details of where this image is in memory).  I'll see how we can get such
      an improvement later.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: Support for new Apple laptop models · ca216b8a
      Benjamin Herrenschmidt authored
      This adds support for newer Apple laptop models.  It adds the basic
      identification for the new motherboards and the cpufreq support for
      models using the new 7447A CPU from Motorola.
      
      This is mostly the work of John Steele Scott <toojays@toojays.net> with
      some bits from Sebastian Henschel <linux@kodeaffe.de> and some rework by
      myself.  Please apply,
      Signed-off-by: John Steele Scott <toojays@toojays.net>
      Signed-off-by: Sebastian Henschel <linux@kodeaffe.de>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: oprofile support · e5603f99
      Benjamin Herrenschmidt authored
      This adds basic oprofile support to ppc32.  Originally from Anton
      Blanchard, I just re-diffed it against current kernels.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ppc32: Cleanups & warning fixes of traps.c · b62102f6
      Benjamin Herrenschmidt authored
      This cleans up arch/ppc/kernel/traps.c and vecemu.c to use the same
      formatting style for all functions, and fixes 2 warnings in the altivec
      floating point emulation code.  No functional change. 
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • Merge bk://gkernel.bkbits.net/libata-2.6 · bd67d886
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux
    • [libata sata_sil] Re-fix mod15write bug · 48c1a573
      Jeff Garzik authored
      Certain early SATA drives have problems with write requests whose
      length satisfy the equation "sectors % 15 == 1", on the SiI 3112.
      Other drives, and other SiI controllers, are not affected.
      
      The fix for this problem is to avoid such requests, in one of three
      ways, for the affected drive+controller combos:
      1) Limit all writes to 15 sectors
      2) Use block layer features to avoid creating requests whose
         length satisfies the above equation.
      3) When a request satisfies the above equation, split the request
         into two writes, neither of which satisfies the equation.
      
      I chose fix #1, the simplest to implement.  After discussion with
      Silicon Image and others regarding the impact of this fix, I have
      decided to remain with fix #1, and will not be implementing a
      "better fix".  This means that the affected SATA drives will see
      decreased performance, but the set of affected drives is small and
      will never grow larger.
      
      Further, the complexity of implementing solution #2 or
      solution #3 is rather large.
      
      When implementing lba48 'large request' support, I unintentionally
      broke the fix for these affected drives.  Kudos to Ricky Beam for
      noticing this.
      
      This change restores the fix, by adding a flag ATA_DFLAG_LOCK_SECTORS
      to indicate that the max_sectors value set by the low-level driver
      should never be changed.
    • Merge bk://bk.arm.linux.org.uk/linux-2.6-rmk · 30c0d5b0
      Linus Torvalds authored
      into ppc970.osdl.org:/home/torvalds/v2.6/linux