Commits · dd76695778b23a2b199ce518c5e17b19f1cff008 · nexedi / linux

26 May, 2004 1 commit
- JFS: check default acl for correctness before setting it · dd766957
  Dave Kleikamp authored May 26, 2004
```
This patch was originally submitted by Andreas Gruenbacher and modified
by me.
```
24 May, 2004 3 commits
- Merge http://linux-sound.bkbits.net/linux-sound · 91d7f4e3
  Linus Torvalds authored May 23, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
- [PATCH] pa-risc: kernel/fork.c broken by the new rmap · be4284e3
  James Bottomley authored May 23, 2004
```
Any architecture (like pa-risc) that makes use of the helper function
flush_dcache_mmap_lock() won't compile with the new rmap due to use of
the wrong "mapping". 

Trivial fix.
```
- Merge bk://linux-scsi.bkbits.net/scsi-for-linus-2.6 · dcde1f6f
  Linus Torvalds authored May 23, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```
23 May, 2004 16 commits

Merge bk://bk.arm.linux.org.uk/linux-2.6-pcmcia · 68da1c2a
Linus Torvalds authored May 23, 2004
```
into ppc970.osdl.org:/home/torvalds/v2.6/linux
```

ALSA CVS update - Jaroslav Kysela <perex@suse.cz> · 3c65bcbb

Jaroslav Kysela authored May 23, 2004

PCM Midlevel,ALSA Core
Added SNDRV_PCM_SYNC_PTR_APPL and SNDRV_PCM_SYNC_PTR_AVAIL_MIN extensions
to SYNC_PTR ioctl for PCM API.


ALSA CVS update - Takashi Iwai <tiwai@suse.de> · 2b38ae75

Jaroslav Kysela authored May 23, 2004

VIA82xx driver
- added the DXS entry for ECS K7VTA3 v8.0
- fixed the DXS entry for ASUS A7V8X to NO_VRA.


ALSA CVS update - Takashi Iwai <tiwai@suse.de> · c71e1496
Jaroslav Kysela authored May 23, 2004
```
ALSA Core
added reverse selections of components to CONFIG_SND_BIT32_EMUL.
```

ALSA CVS update - Takashi Iwai <tiwai@suse.de> · 0348756a

Jaroslav Kysela authored May 23, 2004

PCI drivers,ICE1712 driver,ICE1724 driver
- improved the description of ice1724 driver on Kconfig.
- better support of VT1720 with snd-ice1724 driver.
- check PCI subsystem IDs when no EEPROM is available (ice1724 only)
- change the driver name string if given in the board list.
- merged prodigy 7.1 support into aureon.c.  they are almost identical.
- allow using PDMA4 and RDMA1 for non-SPDIF purposes if specified (ice1724 only).


[NET]: Save some space with sysfs strings. · 9c2d00e7
Vadim Lobanov authored May 23, 2004

Merge http://linux-mh.bkbits.net/bluetooth-2.6 · 8fc64470
David S. Miller authored May 23, 2004
```
into nuts.davemloft.net:/disk1/BK/net-2.6
```

[Bluetooth] Define .kobj.k_name for the fake device · ca09bcc9

Marcel Holtmann authored May 23, 2004

The PCMCIA devices are not devices for the kernel and the bt3c_cs
driver uses a fake device for calling request_firmware(). The fake
device initialization must also set .kobj.k_name to prevent an oops
until PCMCIA devices are fully integrated into the driver model.


[Bluetooth] Use try_module_get() for RFCOMM sessions · f1b6caac

Marcel Holtmann authored May 23, 2004

It is not possible to use __module_get() when adding a new RFCOMM
session, because there is a case where no reference count is held.
This happens when the module is not in use right now and an incoming
connection occurs.
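
The difference between the two helpers can be illustrated with a toy user-space model. This is only a sketch: `toy_module`, `toy_module_get_unchecked` and `toy_try_module_get` are hypothetical stand-ins, and the assumption (taken from the message above) is that `__module_get()` is only valid when a reference is already held.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a module's lifetime state. */
struct toy_module {
    int refcount;   /* references currently held */
    bool live;      /* false once unloading has started */
};

/*
 * Models __module_get(): assumes the caller already holds a reference.
 * Returns false where that precondition would be violated.
 */
static bool toy_module_get_unchecked(struct toy_module *m)
{
    if (m->refcount == 0)
        return false;   /* nobody pins the module, so this is unsafe */
    m->refcount++;
    return true;
}

/*
 * Models try_module_get(): usable with no prior reference, because it
 * first checks that the module is still live.
 */
static bool toy_try_module_get(struct toy_module *m)
{
    if (!m->live)
        return false;
    m->refcount++;
    return true;
}
```

In this model an incoming connection on an idle module (refcount 0) can only take its first reference through `toy_try_module_get`.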


[PATCH] ipr driver version 2.0.7 · 61561538
Brian King authored May 23, 2004
```
Bump driver version
```

[PATCH] ipr remove anonymous unions for gcc 2.95 · 48947380

Brian King authored May 23, 2004

This patch removes all usage of anonymous unions from the ipr
driver since gcc 2.95 does not support anonymous unions.
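
As a rough illustration of the kind of change described, the sketch below contrasts an anonymous union with a named one. The structures and the member name `u` are hypothetical and are not taken from the ipr driver.

```c
#include <stdint.h>

/* Anonymous union: members are reached directly as cmd->raw.
 * Per the message above, gcc 2.95 does not accept this form. */
struct toy_cmd_anon {
    uint8_t type;
    union {
        uint32_t raw;
        uint8_t bytes[4];
    };
};

/* Named union: one extra path component (cmd->u.raw). */
struct toy_cmd_named {
    uint8_t type;
    union {
        uint32_t raw;
        uint8_t bytes[4];
    } u;
};

static uint32_t toy_anon_raw(const struct toy_cmd_anon *c)
{
    return c->raw;
}

static uint32_t toy_named_raw(const struct toy_cmd_named *c)
{
    return c->u.raw;
}
```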


[PATCH] ipr fix for ioa reset timeout oops · 943d7b0a

Brian King authored May 23, 2004

This patch fixes an oops discovered in testing, which can occur
on bad hardware if the ipr adapter times out while becoming operational.


[PATCH] ipr add error logs to abort and reset paths · 8b6ab6a9

Brian King authored May 23, 2004

This patch adds additional error logging to abort, device reset,
and bus reset paths to help in diagnosing scsi problems on ipr.


[PATCH] ipr gcc attributes fixes · f8e85c63

Brian King authored May 23, 2004

This patch fixes an issue where ipr was including a kernel
data structure, list_head, in a packed structure, which causes
compile issues on some architectures, and is just a bad thing to do.
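
The alignment problem can be seen in a small user-space sketch. All names here are hypothetical, `toy_list_head` only mimics the shape of the kernel's `list_head`, and the restructuring shown is one plausible approach rather than the actual ipr change.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

/* Packed: no padding, so the embedded pointers start at offset 1. */
struct toy_packed {
    unsigned char tag;
    struct toy_list_head list;
} __attribute__((packed));

/* Only the hardware-defined fields stay packed. */
struct toy_hw_fields {
    unsigned char tag;
    unsigned char len[3];
} __attribute__((packed));

/* The list node lives outside the packed part and is naturally aligned. */
struct toy_fixed {
    struct toy_hw_fields hw;
    struct toy_list_head list;
};
```

With GCC or Clang, `offsetof(struct toy_packed, list)` is 1, which leaves the pointers misaligned on architectures that require natural alignment.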


initial 2.6 fixup for ATP870U scsi · f59da631

James Bottomley authored May 23, 2004

From: Alan Cox <alan@redhat.com>

Pretty minimal. queue_command is now called locked; this requires propagating
some small locking changes for send_s870.


[PATCH] ncpfs compat ioctls · c72113b5

Alexander Viro authored May 22, 2004

This takes ncpfs ioctl handling into fs/compat_ioctl.c, removing it from
ppc64 and sparc64 code.

Code sanitized, switched to compat_alloc_user_space(), bunch of
{k,v}malloc() killed.


22 May, 2004 20 commits

Linux 2.6.7-rc1 · 86042707
Linus Torvalds authored May 22, 2004


[PATCH] bogus sigaltstack calls by rt_sigreturn · ce34221e

Roland McGrath authored May 22, 2004

There is a longstanding bug in the rt_sigreturn system call.
This exists in both 2.4 and 2.6, and for almost every platform.

I am referring to this code in sys_rt_sigreturn (arch/i386/kernel/signal.c):

	if (__copy_from_user(&st, &frame->uc.uc_stack, sizeof(st)))
		goto badframe;
	/* It is more difficult to avoid calling this function than to
	   call it and ignore errors.  */
	/*
	 * THIS CANNOT WORK! "&st" is a kernel address, and "do_sigaltstack()"
	 * takes a user address (and verifies that it is a user address). End
	 * result: it does exactly _nothing_.
	 */
	do_sigaltstack(&st, NULL, regs->esp);

As the comment says, this is bogus.  On vanilla i386 kernels, this is just
harmlessly stupid--do_sigaltstack always does nothing and returns -EFAULT.

However this code actually bites users on kernels using Ingo Molnar's 4G/4G
address space layout changes.  There, some kernel stack address might very
well be a lovely and readable user address as well.  When that happens, we
make a sigaltstack call with some random buffer, and then the fun begins.
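
The failure mode described above can be modelled in user space. This is a toy sketch, not kernel code: `toy_access_ok`, `toy_do_sigaltstack` and the address limits used with them are illustrative assumptions.

```c
#include <errno.h>
#include <stdint.h>

/* Toy model: an address counts as "user" if it lies below the limit. */
static int toy_access_ok(uint64_t addr, uint64_t user_limit)
{
    return addr < user_limit;
}

/*
 * Stand-in for do_sigaltstack(): it expects a user pointer and rejects
 * anything else with -EFAULT before touching the data.
 */
static int toy_do_sigaltstack(uint64_t uss_addr, uint64_t user_limit)
{
    if (!toy_access_ok(uss_addr, user_limit))
        return -EFAULT;
    return 0;   /* the real call would now read a stack_t from uss_addr */
}
```

With an assumed 3G/1G split (limit `0xC0000000`), a kernel stack address such as `0xC1234000` is rejected and the call does nothing. With a much higher assumed limit, as under a 4G/4G layout, the same numeric address passes the check and the call proceeds on whatever user memory happens to live there.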

To my knowledge, this has produced trouble in the real world only for 4G
i386 kernels (RHEL and Fedora "hugemem" kernels) on machines that actually
have several GB of physical memory (and in programs that are actually using
sigaltstack and handling a lot of signals).  However, the same clearly
broken code has been blindly copied to most other architecture ports, and
offhand I don't know the address space details of any other well enough to
know if real kernel stack addresses and real user addresses are in fact
disjoint as they are on i386 when not using the nonstandard 4GB address
space layout.

The obvious intent of the call being there in the first place is to permit
a signal handler to diddle its ucontext_t.uc_stack before returning, and
have this effect a sigaltstack call on the signal handler return.  This is
not only an optimization vs doing the extra system call, but makes it
possible to make a sigaltstack change when that handler itself was running
on the signal stack.  AFAICT this has never actually worked before, so
certainly no one depends on it.  But the code certainly suggests that
someone intended at one time for that to be the behavior.  Thus I am
inclined to fix it so it works in that way, though it has not done so before.
It would also be reasonable enough to simply rip out the bogus call and not
have this functionality.

From the current state of code in both 2.4 and 2.6, there is no fathoming
how this broken code came about.  It's actually much simpler to just make
it work!  I can only presume that at some point in the past the sigaltstack
implementation functions were different such that this made sense.  Of the
few ports I've looked at briefly, only the ppc/ppc64 porters (go paulus!)
actually tried to understand what the i386 code was doing and implemented
it correctly rather than just carefully transliterating the bug.

The patch below fixes only the i386 and x86_64 versions.  The x86_64
patches I have not actually tested.  I think each and every arch (except
ppc and ppc64) needs to make the corresponding fixes as well.  Note that
there is a function to fix for each native arch, and then one for each
emulation flavor.  The details differ minutely for getting the calls right
in each emulation flavor, but I think that most or all of the arches with
biarch/emulation support have similar enough code that each emulation
flavor's fix will look very much like the arch/x86_64/ia32/ia32_signal.c
patch here.


[PATCH] partial prefetch for vma_prio_tree_next · ad9beb31

Andrew Morton authored May 22, 2004

From: Rajesh Venkatasubramanian <vrajesh@umich.edu>

This patch adds prefetches for walking a vm_set.list.  Adding prefetches
for prio tree traversals is tricky and may lead to cache thrashing.  So this
patch adds prefetches only when walking a vm_set.list.

I haven't done any benchmarks to show that this patch improves performance.
However, this patch should help to improve performance when vm_set.lists
are long, e.g., libc.  Since we only prefetch vmas that are guaranteed to
be used in the near future, this patch should not result in cache thrashing,
theoretically.

I didn't add any NULL checks before prefetching because prefetch.h clearly
says prefetch(0) is okay.
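
The pattern can be sketched in user space as follows. `toy_node` and `toy_sum_with_prefetch` are hypothetical, and the GCC/Clang builtin `__builtin_prefetch` is assumed here as a stand-in for the kernel's `prefetch()`.

```c
#include <stddef.h>

struct toy_node {
    struct toy_node *next;
    long value;
};

/* Walk a singly linked list, hinting the next node before using this one. */
static long toy_sum_with_prefetch(const struct toy_node *head)
{
    long sum = 0;

    for (const struct toy_node *n = head; n != NULL; n = n->next) {
        /* No NULL check: a data prefetch of an invalid address does not fault. */
        __builtin_prefetch(n->next);
        sum += n->value;
    }
    return sum;
}
```

Only the node that is certain to be visited next is prefetched, which matches the reasoning given above for avoiding wasted cache lines.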


[PATCH] rmap 40 better anon_vma sharing · 17e8935f

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

anon_vma rmap will always necessarily be more restrictive about vma merging
than before: according to the history of the vmas in an mm, they are liable to
be allocated different anon_vma heads, and from that point on be unmergeable.

Most of the time this doesn't matter at all; but in two cases it may matter.
One case is that mremap refuses (-EFAULT) to span more than a single vma: so
it is conceivable that some app has relied on vma merging prior to mremap in
the past, and will now fail with anon_vma. Conceivable but unlikely, let's
cross that bridge if we come to it: and the right answer would be to extend
mremap, which should not be exporting the kernel's implementation detail of
vma to user interface.

The other case that matters is when a reasonable repetitive sequence of
syscalls and faults ends up with a large number of separate unmergeable vmas,
instead of the single merged vma it could have.

Andrea's mprotect-vma-merging patch fixed some such instances, but left other
plausible cases unmerged. There is no perfect solution, and the harder you
try to allow vmas to be merged, the less efficient anon_vma becomes, in the
extreme there being one to span the whole address space, from which hangs
every private vma; but anonmm rmap is clearly superior to that extreme.

Andrea's principle was that neighbouring vmas which could be mprotected into
mergeable vmas should be allowed to share anon_vma: good insight. His
implementation was to arrange this sharing when trying vma merge, but that
seems to be too early. This patch sticks to the principle, but implements it
in anon_vma_prepare, when handling the first write fault on a private vma:
with better results. The drawback is that this first write fault needs an
extra find_vma_prev (whereas prev was already to hand when implementing
anon_vma sharing at try-to-merge time).
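
The idea of choosing an anon_vma at first write fault can be sketched with a toy model. Everything here is hypothetical (`toy_vma`, `TOY_PROT_MASK`, `toy_anon_vma_prepare`); the real compatibility test involves more than flags and adjacency.

```c
#include <stddef.h>

struct toy_anon_vma { int id; };

struct toy_vma {
    unsigned long start, end;        /* [start, end) */
    unsigned long flags;             /* includes protection bits */
    struct toy_anon_vma *anon_vma;   /* NULL until the first write fault */
};

#define TOY_PROT_MASK 0x7UL          /* assumed: read/write/exec bits */

/* Neighbours qualify if adjacent and equal apart from protection bits. */
static int toy_could_merge_after_mprotect(const struct toy_vma *a,
                                          const struct toy_vma *b)
{
    return a->end == b->start &&
           ((a->flags ^ b->flags) & ~TOY_PROT_MASK) == 0;
}

/* Reuse a qualifying neighbour's anon_vma, otherwise take the fresh one. */
static void toy_anon_vma_prepare(struct toy_vma *prev, struct toy_vma *vma,
                                 struct toy_vma *next,
                                 struct toy_anon_vma *fresh)
{
    if (vma->anon_vma)
        return;
    if (next && next->anon_vma && toy_could_merge_after_mprotect(vma, next))
        vma->anon_vma = next->anon_vma;
    else if (prev && prev->anon_vma && toy_could_merge_after_mprotect(prev, vma))
        vma->anon_vma = prev->anon_vma;
    else
        vma->anon_vma = fresh;
}
```

Two neighbours that share an anon_vma this way remain mergeable later if an mprotect call brings their protections into line.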


[PATCH] rmap 39 add anon_vma rmap · 8aa3448c

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Andrea Arcangeli's anon_vma object-based reverse mapping scheme for anonymous
pages.  Instead of tracking anonymous pages by pte_chains or by mm, this
tracks them by vma.  But because vmas are frequently split and merged
(particularly by mprotect), a page cannot point directly to its vma(s), but
instead to an anon_vma list of those vmas likely to contain the page - a list
on which vmas can easily be linked and unlinked as they come and go.  The vmas
on one list are all related, either by forking or by splitting.

This has three particular advantages over anonmm: that it can cope
effortlessly with mremap moves; and no longer needs page_table_lock to protect
an mm's vma tree, since try_to_unmap finds vmas via page -> anon_vma -> vma
instead of using find_vma; and should use less cpu for swapout since it can
locate its anonymous vmas more quickly.
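
The page -> anon_vma -> vma lookup can be sketched with toy structures. All names below are hypothetical and the model ignores locking; it only shows how a list of related vmas is filtered down to those that could map a given page.

```c
#include <stddef.h>

#define TOY_PAGE_SHIFT 12

struct toy_vma {
    unsigned long start, end;     /* [start, end) virtual range */
    unsigned long pgoff;          /* page offset corresponding to start */
    struct toy_vma *anon_next;    /* next vma on the same anon_vma list */
};

struct toy_anon_vma { struct toy_vma *head; };

struct toy_page {
    struct toy_anon_vma *anon_vma;
    unsigned long index;          /* page offset of this page */
};

/* Returns 1 and stores the address if vma could map the page, else 0. */
static int toy_vma_address(const struct toy_page *page,
                           const struct toy_vma *vma, unsigned long *addr)
{
    unsigned long pages = (vma->end - vma->start) >> TOY_PAGE_SHIFT;

    if (page->index < vma->pgoff || page->index - vma->pgoff >= pages)
        return 0;
    *addr = vma->start + ((page->index - vma->pgoff) << TOY_PAGE_SHIFT);
    return 1;
}

/* Walk the anon_vma list and count the vmas that could contain the page. */
static size_t toy_count_candidate_vmas(const struct toy_page *page)
{
    size_t n = 0;
    unsigned long addr;

    for (const struct toy_vma *v = page->anon_vma->head; v; v = v->anon_next)
        if (toy_vma_address(page, v, &addr))
            n++;
    return n;
}
```

No tree lookup by address is needed in this model: the page leads straight to the short list of related vmas.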

It does have disadvantages too: a lot more change in mmap.c to deal with
anon_vmas, though small straightforward additions now that the vma merging has
been refactored there; more lowmem needed for each anon_vma and vma structure;
an additional restriction on the merging of vmas (cannot be merged if already
assigned different anon_vmas, since then their pages will be pointing to
different heads).

(There would be no need to enlarge the vma structure if anonymous pages
belonged only to anonymous vmas; but private file mappings accumulate
anonymous pages by copy-on-write, so need to be listed in both anon_vma and
prio_tree at the same time.  A different implementation could avoid that by
using anon_vmas only for purely anonymous vmas, and use the existing prio_tree
to locate cow pages - but that would involve a long search for each single
private copy, probably not a good idea.)

Where before the vm_pgoff of a purely anonymous (not file-backed) vma was
meaningless, now it represents the virtual start address at which that vma is
mapped - which the standard file pgoff manipulations treat linearly as vmas
are split and merged.  But if mremap moves the vma, then it generally carries
its original vm_pgoff to the new location, so pages shared with the old
location can still be found.  Magic.
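
The vm_pgoff arithmetic can be worked through with a few helper functions. These are hypothetical illustrations assuming 4 KiB pages, not the kernel's own helpers.

```c
#define TOY_PAGE_SHIFT 12   /* assumed 4 KiB pages */

/* For a fresh anonymous vma, pgoff is just its start address in pages. */
static unsigned long toy_anon_pgoff(unsigned long vm_start)
{
    return vm_start >> TOY_PAGE_SHIFT;
}

/* Index recorded for a page mapped at addr inside the vma. */
static unsigned long toy_page_index(unsigned long vm_start,
                                    unsigned long vm_pgoff,
                                    unsigned long addr)
{
    return vm_pgoff + ((addr - vm_start) >> TOY_PAGE_SHIFT);
}

/* Address at which a page with the given index appears in the vma. */
static unsigned long toy_page_address(unsigned long vm_start,
                                      unsigned long vm_pgoff,
                                      unsigned long index)
{
    return vm_start + ((index - vm_pgoff) << TOY_PAGE_SHIFT);
}
```

For example, a vma at `0x40000000` gets pgoff `0x40000`, and its page at `0x40003000` has index `0x40003`. If the vma is moved to `0x70000000` and keeps pgoff `0x40000`, the same index resolves to `0x70003000`, so the page is still found.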

Hugh has massaged it somewhat: building on the earlier rmap patches, this
patch is a fifth of the size of Andrea's original anon_vma patch.  Please note
that this posting will be his first sight of this patch, which he may or may
not approve.


[PATCH] rmap 38 remove anonmm rmap · a89cd0f0

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Before moving on to anon_vma rmap, remove now what's peculiar to anonmm rmap:
the anonmm handling and the mremap move cows.  Temporarily reduce
page_referenced_anon and try_to_unmap_anon to stubs, so a kernel built with
this patch will not swap anonymous pages at all.


[PATCH] rmap 37 page_add_anon_rmap vma · e1fd9cc9

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Silly final patch for anonmm rmap: change page_add_anon_rmap's mm arg to vma
arg like anon_vma rmap, to smooth the transition between them.


[PATCH] rmap 36 mprotect use vma_merge · 2b2e2a36

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Earlier on, in 2.6.6, we took the vma merging code out of mremap.c and let it
rely on vma_merge instead (via copy_vma). Now take the vma merging code out
of mprotect.c and let it rely on vma_merge too: so vma_merge becomes the sole
vma merging engine. The fruit of this consolidation is that mprotect now
merges file-backed vmas naturally. Make this change now because anon_vma will
complicate the vma merging rules, let's keep them all in one place.

vma_merge remains where the decisions are made, whether to merge with prev
and/or next; but now [addr,end) may be the latter part of prev, or first part
or whole of next, whereas before it was always a new area.

vma_adjust carries out vma_merge's decision, but when sliding the boundary
between vma and next, must temporarily remove next from the prio_tree too.
And it turned out (by oops) to have a surer idea of whether next needs to be
removed than vma_merge, so the fput and freeing moves into vma_adjust.

Too much decipherment of what's going on at the start of vma_adjust? Yes, and
there's a delicate assumption that you may use vma_adjust in sliding a
boundary, or splitting in two, or growing a vma (mremap uses it in that way),
but not for simply shrinking a vma. Which is so, and must be so (how could
pages mapped in the part to go, be zapped without first splitting?), but would
feel better with some protection.

__vma_unlink can then be moved from mm.h to mmap.c, and mm.h's more misleading
than helpful can_vma_merge is deleted.


[PATCH] rmap 35 mmap.c cleanups · 06ecc0db

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Before some real vma_merge work in mmap.c in the next patch, a patch of
miscellaneous cleanups to cut down the noise:

- remove rb_parent arg from vma_merge: mm->mmap can do that case
- scatter pgoff_t around to ingratiate myself with the boss
- reorder is_mergeable_vma tests, vm_ops->close is least likely
- can_vma_merge_before take combined pgoff+pglen arg (from Andrea)
- rearrange do_mmap_pgoff's ever-confusing anonymous flags switch
- comment do_mmap_pgoff's mysterious (vm_flags & VM_SHARED) test
- fix ISO C90 warning on browse_rb if building with DEBUG_MM_RB
- stop that long MNT_NOEXEC line wrapping

Yes, buried in amidst these is indeed one pgoff replaced by "next->vm_pgoff -
pglen" (reverting a mod of mine which took pgoff supplied by user too
seriously in the anon case), and another pgoff replaced by 0 (reverting
anon_vma mod which crept in with NUMA API): neither of them really matters,
except perhaps in /proc/pid/maps.


[PATCH] rmap 34 vm_flags page_table_lock · 4877b14f

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

First of a batch of seven rmap patches, based on 2.6.6-mm3. Probably the
final batch: remaining issues outstanding can have isolated patches. The
first half of the batch is good for anonmm or anon_vma, the second half of the
batch replaces my anonmm rmap by Andrea's anon_vma rmap.

Judge for yourselves which you prefer. I do think I was wrong to call
anon_vma more complex than anonmm (its lists are easier to understand than my
refcounting), and I'm happy with its vma merging after the last patch. It
just comes down to whether we can spare the extra 24 bytes (maximum, on
32-bit) per vma for its advantages in swapout and mremap.

rmap 34 vm_flags page_table_lock

Why do we guard vm_flags mods with page_table_lock when it's already
down_write guarded by mmap_sem? There's probably a historical reason, but no
sign of any need for it now. Andrea added a comment and removed the instance
from mprotect.c, Hugh plagiarized his comment and removed the instances from
madvise.c and mlock.c. Huge leap in scalability... not expected; but this
should stop people asking why those spinlocks.


[PATCH] rmap 33 install_arg_page vma · 114c71ee

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

anon_vma will need to pass vma to put_dirty_page, so change it and its
various callers (setup_arg_pages and its 32-on-64-bit arch variants); and
please, let's rename it to install_arg_page.

Earlier attempt to do this (rmap 26 __setup_arg_pages) tried to clean up
those callers instead, but failed to boot: so now apply rmap 27's memset
initialization of vmas to these callers too; which relieves them from
needing the recently included linux/mempolicy.h.

While there, moved install_arg_page's flush_dcache_page up before
page_table_lock - doesn't in fact matter at all, just saves one worry when
researching flush_dcache_page locking constraints.


[PATCH] rmap 32 zap_pmd_range wrap · 5911438d

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

From: Andrea Arcangeli <andrea@suse.de>

zap_pmd_range, alone of all those page_range loops, lacks the check for
whether address wrapped.  Hugh is in doubt as to whether this makes any
difference to any config on any arch, but eager to fix the odd one out.
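
The wrap check in question follows a common loop shape, sketched here in user space with 32-bit addresses. `toy_walk` and `TOY_STEP` are hypothetical; the step size stands in for the span covered by one table entry.

```c
#include <stdint.h>

#define TOY_STEP 0x1000u   /* assumed span covered by one entry */

/* Count how many entries are visited for the range [address, end). */
static unsigned toy_walk(uint32_t address, uint32_t end)
{
    unsigned steps = 0;

    do {
        steps++;
        /* Advance to the next boundary; this wraps to 0 at the top. */
        address = (address + TOY_STEP) & ~(TOY_STEP - 1);
    } while (address && (address < end));   /* "address &&" stops on wrap */

    return steps;
}
```

Without the `address &&` test, a range ending near the top of the address space would see `address` wrap to 0, compare as less than `end`, and keep walking from the bottom.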


[PATCH] rmap 31 unlikely bad memory · 68c45e43

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

From: Andrea Arcangeli <andrea@suse.de>

Sprinkle unlikelys throughout mm/memory.c, wherever we see a pgd_bad or a
pmd_bad; likely or unlikely on pte_same or !pte_same.  Put the jump in the
error return from do_no_page, not in the fast path.
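
A minimal sketch of the annotation pattern, assuming GCC or Clang. The `toy_` macros mirror the usual `__builtin_expect` idiom, and `TOY_BAD_BIT` and `toy_check_entry` are hypothetical.

```c
/* Branch hints built on the compiler's __builtin_expect. */
#define toy_likely(x)   __builtin_expect(!!(x), 1)
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

#define TOY_BAD_BIT 0x1UL   /* assumed marker for a corrupt entry */

/* Returns 0 for a sane entry, -1 for a bad one. */
static int toy_check_entry(unsigned long entry)
{
    /* Corruption is rare, so hint the compiler to keep it off the fast path. */
    if (toy_unlikely(entry & TOY_BAD_BIT))
        return -1;
    return 0;
}
```

The hint does not change the result, only how the compiler lays out the branches.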


[PATCH] rmap 30 fix bad mapcount · d321a42d

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

From: Andrea Arcangeli <andrea@suse.de>

page_alloc.c's bad_page routine should reset a bad mapcount; and it's more
revealing to show the bad mapcount than just the boolean mapped.


[PATCH] rmap 29 VM_RESERVED safety · c3a17613

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

From: Andrea Arcangeli <andrea@suse.de>

Set VM_RESERVED in videobuf_mmap_mapper, to warn do_no_page and swapout not to
worry about its pages.  Set VM_RESERVED in ia64_elf32_init, it too provides an
unusual nopage which might surprise higher level checks.  Future safety: they
don't actually pose a problem in this current tree.


[PATCH] rmap 28 remove_vm_struct · bbdaef5f

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

The callers of remove_shared_vm_struct then proceed to do several more
identical things: gather them together in remove_vm_struct.


[PATCH] rmap 27 memset 0 vma · c8ba2065

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

We're NULLifying more and more fields when initializing a vma
(mpol_set_vma_default does that too, if configured to do anything).  Now use
memset to avoid specifying fields, and save a little code too.
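
The initialization style described can be sketched as follows. `toy_vma` and its fields are hypothetical, and the sketch assumes a platform (such as Linux) where all-zero bytes represent a NULL pointer.

```c
#include <string.h>

struct toy_vma {
    unsigned long start, end, flags, pgoff;
    void *file;
    void *ops;
    void *private_data;
};

/* Zero everything first, then set only the fields that are non-zero. */
static void toy_init_vma(struct toy_vma *vma, unsigned long start,
                         unsigned long end, unsigned long flags)
{
    memset(vma, 0, sizeof(*vma));
    vma->start = start;
    vma->end = end;
    vma->flags = flags;
}
```

A field added to the structure later is cleared automatically, so callers do not need to be revisited.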

(Yes, I realize anon_vma will want to set vm_pgoff non-0, but I think that
will be better handled at the core, since anon vm_pgoff is negotiable up until
an anon_vma is actually assigned.)


[PATCH] rmap 24 no rmap fastcalls · ee7baa35

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

I like CONFIG_REGPARM, even when it's forced on: because it's easy to force
off for debugging - easier than editing out scattered fastcalls. Plus I've
never understood why we make function foo a fastcall, but function bar not.
Remove fastcall directives from rmap. And fix comment about mremap_moved
race: it only applies to anon pages.


[PATCH] rmap 23 empty flush_dcache_mmap_lock · 4a72e942

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

Most architectures (like i386) do nothing in flush_dcache_page, or don't scan
i_mmap in flush_dcache_page, so don't need flush_dcache_mmap_lock to do
anything: define it and flush_dcache_mmap_unlock away. Noticed arm26, cris,
h8300 still defining flush_page_to_ram: delete it again.


[PATCH] rmap 22 flush_dcache_mmap_lock · 16ceff2d

Andrew Morton authored May 22, 2004

From: Hugh Dickins <hugh@veritas.com>

arm and parisc __flush_dcache_page have been scanning the i_mmap(_shared) list
without locking or disabling preemption.  That may be even more unsafe now
it's a prio tree instead of a list.

It looks like we cannot use i_shared_lock for this protection: most uses of
flush_dcache_page are okay, and only one would need lock ordering fixed
(get_user_pages holds page_table_lock across flush_dcache_page); but there's a
few (e.g.  in net and ntfs) which look as if they're using it in I/O
completion - and it would be restrictive to disallow it there.

So, on arm and parisc only, define flush_dcache_mmap_lock(mapping) as
spin_lock_irq(&(mapping)->tree_lock); on i386 (and other arches left to the
next patch) define it away to nothing; and use where needed.
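
The per-architecture definition can be sketched with a toy lock. `TOY_ARCH_SCANS_I_MMAP`, `toy_lock` and the other `toy_` names are hypothetical; the real macros take `spin_lock_irq` on the mapping's `tree_lock`, as stated above.

```c
struct toy_lock { int held; int acquisitions; };
struct toy_mapping { struct toy_lock tree_lock; };

static inline void toy_lock(struct toy_lock *l)
{
    l->held = 1;
    l->acquisitions++;
}

static inline void toy_unlock(struct toy_lock *l)
{
    l->held = 0;
}

#ifdef TOY_ARCH_SCANS_I_MMAP
/* Architectures that scan the mappings take the tree lock. */
#define toy_flush_dcache_mmap_lock(mapping)   toy_lock(&(mapping)->tree_lock)
#define toy_flush_dcache_mmap_unlock(mapping) toy_unlock(&(mapping)->tree_lock)
#else
/* Everyone else compiles the calls away to nothing. */
#define toy_flush_dcache_mmap_lock(mapping)   do { } while (0)
#define toy_flush_dcache_mmap_unlock(mapping) do { } while (0)
#endif

/* Generic caller: written once, correct for both kinds of architecture. */
static int toy_scan_mappings(struct toy_mapping *mapping)
{
    toy_flush_dcache_mmap_lock(mapping);
    /* an architecture that scans i_mmap would walk the tree here */
    toy_flush_dcache_mmap_unlock(mapping);
    return mapping->tree_lock.acquisitions;
}
```

Built without `TOY_ARCH_SCANS_I_MMAP`, the function never touches the lock; built with it, each call acquires and releases the lock once.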

While updating locking hierarchy in filemap.c, remove two layers of the fossil
record from add_to_page_cache comment: no longer used for swap.

I believe all the #includes will work out, but have only built i386.  I can
see several things about this patch which might cause revulsion: the name
flush_dcache_mmap_lock?  the reuse of the page radix_tree's tree_lock for this
different purpose?  spin_lock_irqsave instead?  can't we somehow get
i_shared_lock to handle the problem?
