Commits · d2279c446de39fbd1ea343a78db7e356be002784 · Kirill Smelkov / linux

30 Aug, 2002 40 commits

[PATCH] sock_writeable not appropriate for TCP sockets, for 2.5.32 · d2279c44

Chuck Lever authored Aug 30, 2002

sock_writeable determines whether there is space in a socket's output
buffer. socket write_space callbacks use it to determine whether to wake
up those that are waiting for more output buffer space.

however, sock_writeable is not appropriate for TCP sockets. because the
RPC layer's write_space callback uses it for TCP sockets, the RPC layer
hammers on sock_sendmsg with dozens of write requests that are only a few
hundred bytes long when it is trying to send a large write RPC request.
this patch adds logic to the RPC layer's write_space callback that
properly handles TCP sockets.
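
A minimal user-space sketch of the idea, not the patch itself. The struct, its field names, and both thresholds are assumptions made for illustration: the generic test looks at how much of the send buffer is allocated, while the stream test asks whether the free space is large relative to what is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a socket's send-side accounting (not kernel code). */
struct model_sock {
    bool is_tcp;       /* stream transport (TCP) or datagram transport */
    int  sndbuf;       /* total send buffer size, in bytes */
    int  wmem_alloc;   /* bytes charged to packets currently in flight */
    int  wmem_queued;  /* bytes sitting in the stream's write queue */
};

/* Generic test (assumed threshold): less than half the buffer is allocated. */
static bool model_sock_writeable(const struct model_sock *sk)
{
    return sk->wmem_alloc < sk->sndbuf / 2;
}

/* Stream test (assumed threshold): free space is at least half of the
 * amount already queued, so the next write can be reasonably large. */
static bool model_stream_has_space(const struct model_sock *sk)
{
    int free_space = sk->sndbuf - sk->wmem_queued;

    return free_space >= sk->wmem_queued / 2;
}

/* write_space decision: pick the test that matches the transport type. */
static bool model_should_wake(const struct model_sock *sk)
{
    assert(sk->sndbuf > 0);
    return sk->is_tcp ? model_stream_has_space(sk)
                      : model_sock_writeable(sk);
}
```

With a 16384-byte buffer, 1000 bytes in flight and 16000 bytes queued, the generic test reports the socket as writeable although only 384 bytes are free; the stream test does not, which avoids waking the sender for a write of a few hundred bytes.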

patch reviewed by Trond and Alexey.

d2279c44

[PATCH] prevent oops in xprt_lock_write, against 2.5.32 · 1758bdf3

Chuck Lever authored Aug 30, 2002

when several RPC requests want to reconnect a TCP transport socket at
once, xprt_lock_write serializes the tasks to prevent multiple socket
connects. however, TCP connects are always done by an RPC child task that
has no request slot. xprt_lock_write can oops if there is no request slot
allocated to the invoking RPC task. reviewed and accepted by Trond.

the xprt_lock_write changes are not yet in 2.4, so this patch does not
apply to 2.4.

1758bdf3

[PATCH] TLS boot-initialization bugfix on SMP, 2.5.32-BK · 44a05b3e
Ingo Molnar authored Aug 30, 2002
```
This fixes a bad TLS initialization bug found by Andi Kleen.  x86/SMP
only worked due to luck.
```
44a05b3e

[PATCH] scheduler fixes, 2.5.32-BK · 2c638ab0

Ingo Molnar authored Aug 30, 2002

This adds two scheduler related fixes:

 - changes the migration code to use struct completion. Andrew pointed out
   that there might be a small window where the up() touches the
   semaphore while the waiting task goes on and frees its stack. And
   completion is more suited for this kind of stuff anyway.

 - removes two unneeded exports, pointed out by Andrew.

2c638ab0

[PATCH] clone-cleanup 2.5.32-BK · 1f9d6582

Ingo Molnar authored Aug 30, 2002

This moves CLONE_SETTID and CLONE_CLEARTID handling into kernel/fork.c,
where it belongs. [the CLONE_SETTLS is x86-specific and thus remains in
the per-arch process.c] This makes support for these two new flags much
easier: architectures only have to pass in the user_tid pointer.

1f9d6582

[PATCH] include/asm-i386/msr.h · a27b8fe9
Dominik Brodowski authored Aug 30, 2002
```
It would be helpful if these msr.h #defines could get in.
```
a27b8fe9

[PATCH] efi.h move · af05fc03

David Mosberger authored Aug 30, 2002

It makes no sense to keep efi.h as an ia64-specific header (there really
are x86 machines coming out with optional EFI BIOS support).

af05fc03

Merge http://lia64.bkbits.net/to-linus-2.5 · 9caf366e
Linus Torvalds authored Aug 30, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
9caf366e
[PATCH] ext3 __FUNCTION__ pasting fix · 7c0700ff
Andrew Morton authored Aug 30, 2002
```
Fix a __FUNCTION__ paste in revoke.c
```
7c0700ff

[PATCH] O_DIRECT for ext3 · a3b71057

Andrew Morton authored Aug 30, 2002

O_DIRECT support for ext3.

It works OK in all journalling modes.

Updates to the file metadata and inode are journalled as usual.

If the system crashes during an appending O_DIRECT write then journal
recovery will truncate the written-to file back to the length which it
had on entry to that write.

If the system crashes during a file overwrite to existing blocks then
the file contents will be an unknown mixture of old and new.

If the system crashes during a file overwrite which instantiates new
blocks in the middle of the file then there is a possibility of
uninitialised disk blocks being present in the file post-recovery.

a3b71057

[PATCH] fix an ext3 deadlock · bebff73c

Andrew Morton authored Aug 30, 2002

mpage_writepages() does a lock_page() on pages to be written back, even
when it is being used for page reclaim writeback.

This is normally OK, because the page is unlocked quickly - pages are
unlocked during writeback and nobody should be performing __GFP_FS
allocations inside lock_page().

But it has introduced a ranking problem in ext3:

generic_file_write
->lock_page
  ->ext3_prepare_write
    ->journal_start	(waits for a commit)

versus

ext3_create()
->journal_start()
  ->ext3_new_inode(GFP_KERNEL)
    ->page reclaim
      ->mpage_writepages
        ->lock_page	(locks up, transaction is held open)

Maybe sometime, I'll have to turn mpage_writepages' lock_page into a
trylock if the caller is PF_MEMALLOC.  But for now, let's make ext3's
inside-transaction allocations use GFP_NOFS.  There is only one of them.

bebff73c

[PATCH] writeback correctness and efficiency changes · ec12ac49

Andrew Morton authored Aug 30, 2002

This is a performance and correctness fix against the writeback paths.

The writeback code has competing requirements.  Sometimes it is used
for "memory cleansing": kupdate, bdflush, writer throttling, page
allocator writeback, etc.  And sometimes this same code is used for
data integrity purposes: fsync, msync, fdatasync, sync, umount, various
other kernel-internal uses.

The problem is: how to handle a dirty buffer or page which is currently
under writeback.

For memory cleansing, we just want to skip that buffer/page and go onto
the next one.  But for sync, we must wait on the old writeback and then
start new writeback.

mpage_writepages() is currently correct for cleansing, but incorrect for
sync.  block_write_full_page() is currently correct for sync, but
inefficient for cleansing.

The fix is fairly simple.

- In mpage_writepages(), don't skip the page if it's a sync
operation.

- In block_write_full_page(), skip the buffer if it is not a sync
operation.  And return -EAGAIN to tell the caller that the writeout
didn't work out.  The caller must then set the page dirty again and
move it onto mapping->dirty_pages.

This is an extension of the writepage API: writepage can now return
-EAGAIN.  There are only three callers, and they have been updated.

fail_writepage() and ext3_writepage() were actually doing this by
hand.  They have been changed to return -EAGAIN.  NTFS will want to
be able to return -EAGAIN from its writepage as well.

- A sticky question is: how to tell the writeout code which mode it
is operating in?  Cleansing or sync?

It's such a tiny code change that I didn't have the heart to go and
propagate a `mode' argument down every instance of writepages() and
writepage() in the kernel.  So I passed it in via current->flags.

Incidentally, the occurrence of a locked-and-dirty buffer in
block_write_full_page() is fairly rare: normally the collision avoidance
happens at the address_space level, via PageWriteback.  But some
mappings (blockdevs, ext3 files, etc) have their dirty buffers written
out via submit_bh().  It is these buffers which can stall
block_write_full_page().

This wart will be pretty intrusive to fix.  ext3 needs to become fully
page-based (ugh.  It's a block-based journalling filesystem, and pages
are unnatural).  blockdev mappings are still written out by buffers
because that's how filesystems use them.  Putting _all_ metadata
(indirects, inodes, superblocks, etc) into standalone address_spaces
would fix that up.

- filemap_fdatawrite() sets PF_SYNC.  So filemap_fdatawrite() is the
kernel function which will start writeback against a mapping for
"data integrity" purposes, whereas the unexported, internal-only
do_writepages() is the writeback function which is used for memory
cleansing.  This difference is the reason why I didn't consolidate
those functions ages ago...

- Lots of code paths had a bogus extra call to filemap_fdatawait(),
which I previously added in a moment of weak-headedness.  They have
all been removed.
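
A rough user-space model of the two modes, following the stated goal that cleansing skips a page whose buffer is still under writeback while sync waits for it. Only the names PF_SYNC and -EAGAIN come from the text above; the flag's bit value, the structs and the helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_PF_SYNC 0x1u   /* assumed bit value, for illustration only */

struct model_task { unsigned int flags; };   /* stand-in for current */

struct model_page {
    bool dirty;
    bool buffer_locked;      /* an older write of this buffer is in flight */
    int  writes_started;
};

/* writepage: the mode is read from the task flags, not from an argument. */
static int model_writepage(const struct model_task *cur,
                           struct model_page *page)
{
    if (page->buffer_locked) {
        if (!(cur->flags & MODEL_PF_SYNC))
            return -EAGAIN;              /* cleansing: skip, don't stall */
        page->buffer_locked = false;     /* sync: models waiting on old I/O */
    }
    page->writes_started++;
    return 0;
}

/* Caller: clears the dirty bit, and redirties the page on -EAGAIN. */
static int model_write_one(const struct model_task *cur,
                           struct model_page *page)
{
    int ret;

    assert(page->dirty);
    page->dirty = false;
    ret = model_writepage(cur, page);
    if (ret == -EAGAIN)
        page->dirty = true;              /* stays eligible for later writeout */
    return ret;
}
```

Passing the mode through a per-task flag keeps every writepage() signature unchanged, at the cost of a hidden input that callers have to set and clear correctly.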

ec12ac49

[PATCH] batched freeing of anon pages · 8fd3d458

Andrew Morton authored Aug 30, 2002

A reworked version of the batched page freeing and lock amortisation
for VMA teardown.

It walks the existing 507-page list in the mmu_gather_t in 16-page
chunks, drops their refcounts in 16-page chunks, and de-LRUs and
frees any resulting zero-count pages in up-to-16 page chunks.
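
The chunked walk can be sketched in user space as follows. This is an illustrative model, not the kernel code: the page struct, the function name and the "lock round" counter (standing in for one acquisition of the LRU lock) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_BATCH 16   /* chunk size taken from the description above */

/* Hypothetical stand-in for a page: refcount plus LRU and freed markers. */
struct model_page { int refcount; int on_lru; int freed; };

/* Drops one reference on each of nr pages, MODEL_BATCH pages at a time.
 * Returns how many lock rounds were needed; *nr_freed counts freed pages. */
static size_t model_release_pages(struct model_page **pages, size_t nr,
                                  size_t *nr_freed)
{
    size_t lock_rounds = 0;

    *nr_freed = 0;
    for (size_t start = 0; start < nr; start += MODEL_BATCH) {
        size_t end = (nr - start < MODEL_BATCH) ? nr : start + MODEL_BATCH;
        struct model_page *zero[MODEL_BATCH];
        size_t nzero = 0;

        /* Pass 1: drop the refcounts of this chunk, remember zero-count pages. */
        for (size_t i = start; i < end; i++) {
            assert(pages[i]->refcount > 0);
            if (--pages[i]->refcount == 0)
                zero[nzero++] = pages[i];
        }
        if (nzero == 0)
            continue;

        /* Pass 2: one lock round covers every zero-count page of the chunk. */
        lock_rounds++;
        for (size_t i = 0; i < nzero; i++) {
            zero[i]->on_lru = 0;
            zero[i]->freed = 1;
            (*nr_freed)++;
        }
    }
    return lock_rounds;
}
```

For a full 507-page list this takes at most 32 lock rounds instead of one per freed page, which is the amortisation the patch is after.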

8fd3d458

[PATCH] put_page() consolidation · 2b341443

Andrew Morton authored Aug 30, 2002

Clean up put_page() and page_cache_release().  It's pretty simple now:

#define page_cache_get(page)           get_page(page)
#define page_cache_release(page)       put_page(page)

2b341443

[PATCH] remove pagevec_lru_del() · e035a047

Andrew Morton authored Aug 30, 2002

it was only being used in invalidate_inode_pages(), and from there,
pagevec_release() does the same thing.

e035a047

[PATCH] debug check in put_page_testzero() · c99b0372
Andrew Morton authored Aug 30, 2002
```
As suggested by Daniel - it's a bug to run put_page_testzero
against a zero-ref page.
```
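
A small single-threaded sketch of such a check. The struct and function names are hypothetical, and a plain int with assert() stands in for the kernel's atomic counter and its own debug macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page with a reference count. */
struct model_page { int count; };

/* Drops one reference; returns true when that was the last one.
 * Calling this on a page that already has zero references is a bug,
 * so the debug check trips before the count can go negative. */
static bool model_put_page_testzero(struct model_page *page)
{
    assert(page->count != 0);   /* the debug check */
    return --page->count == 0;
}
```
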
c99b0372

[PATCH] MAINTAINERS patch · cdf2f98b

Ingo Molnar authored Aug 30, 2002

please apply this patch (Robert ACK-ed it). While there is a preemptible
kernel entry already, i think listing this at the scheduler entry is
justified, preemption has a number of scheduler interactions.

cdf2f98b

[PATCH] ldt-fix-2.5.32-A3 · 89d637a8

Ingo Molnar authored Aug 30, 2002

this is an updated version of the LDT fixes. It fixes the following kinds
of problems:

 - fix a possible gcc optimization causing a race causing the loading of a
   corrupt LDT descriptor upon context switch. [this fix got simplified
   over previous versions.]

 - remove an unconditional OOM printk, and there's no need to set ->size
   in the OOM path.

 - fix preemption bugs, load_LDT()/clear_LDT() was not preemption-safe,
   when it was used outside of spinlocks.

the context-switch race is the following. 'LDT modification' is the
following operation: the seg->ldt pointer is modified, then seg->size is
modified. In theory gcc is free to reschedule the two modifications, and
first modify ->size, then ->ldt. Thus if this modification is not
synchronized with context-switches, another thread might see a temporary
state of the new ->size [which was increased], but still the old pointer.
Ie.:

	CPU0				CPU1

	pc->size = newsize;
					load_LDT(); // (oldptr, newsize)
	pc->ldt = newptr;

the corrupt LDT is loaded until the SMP cross-call is sent, leaving the
window open for many usecs.

the fix is to put a wmb() after ->ldt modifications. [this is also in
preparation of not-write-ordered SMP x86 designs.]
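
The ordering requirement can be modelled in portable C11, with a release fence playing the role of wmb(). This is an analogy under stated assumptions, not the kernel code: the struct, the function names and the explicit acquire fence on the reader side are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of the per-context LDT state: a table and its size. */
struct model_ctx {
    _Atomic(int *) ldt;
    atomic_int     size;
};

/* Writer: publish the new pointer first, then the larger size. The release
 * fence keeps the compiler and the CPU from making the size store visible
 * before the pointer store. */
static void model_grow_ldt(struct model_ctx *pc, int *newptr, int newsize)
{
    assert(newptr != NULL);
    atomic_store_explicit(&pc->ldt, newptr, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);     /* stands in for wmb() */
    atomic_store_explicit(&pc->size, newsize, memory_order_relaxed);
}

/* Reader: load the size, then the pointer. A reader that sees the new size
 * is guaranteed to see the new pointer; it may see (newptr, oldsize), which
 * is harmless because the new table is at least as large as the old one. */
static void model_snapshot(struct model_ctx *pc, int **ptr, int *size)
{
    *size = atomic_load_explicit(&pc->size, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    *ptr = atomic_load_explicit(&pc->ldt, memory_order_relaxed);
}
```

The dangerous combination from the diagram above, (oldptr, newsize), cannot be observed in this model.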

89d637a8

Merge bk://linux-input.bkbits.net/linux-input · e5d588fe
Linus Torvalds authored Aug 30, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
e5d588fe

Ignore error 0xff - 'general error' in AUX wire test in i8042.c, · ed0a0a9c

Vojtech Pavlik authored Aug 30, 2002

some mainboards (Andrew Morton's Dell) report that even when everything
is okay with AUX. Also remove a check for very old AMI i8042's, which
could generate false positives on modern buggy mainboards.

ed0a0a9c

Merge bk://jfs.bkbits.net/linux-2.5 · c71a4337
Linus Torvalds authored Aug 30, 2002
```
into home.transmeta.com:/home/torvalds/v2.5/linux
```
c71a4337
[PATCH] oss/gus_card.c - convert cli to spinlocks · 652cbb16
Peter Wächtler authored Aug 30, 2002

652cbb16
[PATCH] oss/nm256.h - convert cli to spinlocks · f7dc2012
Peter Wächtler authored Aug 30, 2002

f7dc2012
[PATCH] oss/pas2_card.c - convert cli to spinlocks · 81b1edf0
Peter Wächtler authored Aug 30, 2002

81b1edf0
[PATCH] oss/vwsnd.c - convert cli to spinlocks · 2fcfdf56
Peter Wächtler authored Aug 30, 2002

2fcfdf56
[PATCH] oss/trident.c - convert cli to spinlocks · 7a6316fd
Peter Wächtler authored Aug 30, 2002

7a6316fd
[PATCH] oss/midi_synth.c - convert cli to spinlocks · e8342e87
Peter Wächtler authored Aug 30, 2002

e8342e87
[PATCH] oss/sonicvibes.c - convert cli to spinlocks · 4cbc061a
Peter Wächtler authored Aug 30, 2002

4cbc061a
[PATCH] oss/esssolo1.c - convert cli to spinlocks · db0abdb5
Peter Wächtler authored Aug 30, 2002

db0abdb5
[PATCH] oss/rme96xx.c - convert cli to spinlocks · 256de87c
Peter Wächtler authored Aug 30, 2002

256de87c
[PATCH] oss/cmpci.c - convert cli to spinlocks · 8a8ce17b
Peter Wächtler authored Aug 30, 2002

8a8ce17b
[PATCH] oss/waveartist.c - convert cli to spinlocks · 69f5f47a
Peter Wächtler authored Aug 30, 2002

69f5f47a
[PATCH] oss/soundcard.c - convert cli to spinlocks · a5154ee9
Peter Wächtler authored Aug 30, 2002

a5154ee9
[PATCH] oss/wavfront.c - convert cli to spinlocks · a4826ccd
Peter Wächtler authored Aug 30, 2002

a4826ccd
[PATCH] oss/opl3sa2.c - convert cli to spinlocks · 4c6c6e5c
Peter Wächtler authored Aug 30, 2002

4c6c6e5c
[PATCH] oss/opl3sa.c - convert cli to spinlocks · 0b5bb847
Peter Wächtler authored Aug 30, 2002

0b5bb847
[PATCH] oss/dev_table.h - convert cli to spinlocks · 386c8a8e
Peter Wächtler authored Aug 30, 2002

386c8a8e
[PATCH] oss/sys_timer.c - convert cli to spinlocks · 0a4d98b4
Peter Wächtler authored Aug 30, 2002

0a4d98b4
[PATCH] oss/mad16.c - convert cli to spinlocks · e1f63f69
Peter Wächtler authored Aug 30, 2002

e1f63f69
[PATCH] oss/nec_vrc5477.c - convert cli to spinlocks · a74ebe7f
Peter Wächtler authored Aug 30, 2002

a74ebe7f