Commits · 227d8064a796c3cc28f63401b7b9a1d3c25e9b95 · Kirill Smelkov / linux

19 May, 2004 40 commits

[PATCH] VFS cache sizing fix for small machines · 227d8064

Andrew Morton authored May 19, 2004

From: Matt Mackall <mpm@selenic.com>

Doing the algebra:

c = (a - b) * 3/2
a' = a - c = a - 3/2(a - b) = (2a - 3a + 3b)/2 = (3b - a)/2
a' >= 0
3b - a >= 0
3b >= a
b >= a/3
nr_free_pages() >= mempages/3

We can indeed get into trouble if we try to load a large kernel on a very
small box (i.e. the kernel reserves more than 2/3 of usable memory).
Surprisingly I haven't hit this, but here's a fix.
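
The algebra above can be checked with a small user-space sketch. The helper name `sized_mempages` and the clamp to `total - 1` are assumptions made for illustration; the message does not show the patch's actual code. Here `total` plays the role of `a` (mempages) and `free_pages` the role of `b` (nr_free_pages()).

```c
/*
 * Hypothetical helper: pages left for sizing the VFS caches after
 * reserving 150% of what the kernel already uses, i.e. c = (a - b) * 3/2.
 * Assumes total >= 1 and free_pages <= total.
 * The clamp is an assumed way to keep a' = a - c from going negative.
 */
static unsigned long sized_mempages(unsigned long total,
				    unsigned long free_pages)
{
	unsigned long reserve = (total - free_pages) * 3 / 2;

	if (reserve > total - 1)
		reserve = total - 1;	/* never reserve every page */
	return total - reserve;
}
```

With 900 pages in total and 600 free, the result is (3b - a)/2 = 450. With only 100 free, b < a/3, so the unclamped subtraction would wrap around; the clamp leaves one page instead.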


[PATCH] svc_recv() fix · c0b5943f

Andrew Morton authored May 19, 2004

From: "J. Bruce Fields" <bfields@fieldses.org>

svc_recv may call svc_sock_release before rqstp->rq_res is initialized.


[PATCH] kNFSd: Add a warning when upcalls fail, · eeb6f17b

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: "J. Bruce Fields" <bfields@fieldses.org>

To help the user diagnose problems caused by user-level daemons not running.


[PATCH] kNFSd: Remove check on number of threads waiting on user-space. · 95321936

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: "J. Bruce Fields" <bfields@fieldses.org>

Currently we are counting the number of threads already asleep and returning
an immediate NFS4ERR_DELAY (==JUKEBOX) error if more than half are already
asleep.

This patch removes that logic, so instead we only return NFS4ERR_DELAY if an
upcall times out (if it takes more than a second to return).

With the thread counting there is the risk that even when all the relevant
subsystems are responsive, the client may still see occasional NFS4ERR_DELAY
returns just because, by coincidence, several upcalls were initiated at the
same time. I expect clients will delay several seconds before retrying after
NFS4ERR_DELAY, so this will be quite noticeable to users. Sporadic long
delays like this are likely to lead users to suspect a problem somewhere, when
in fact there is none.

The current scheme ensures that we can still process requests not depending on
upcalls, even when all threads would otherwise be tied up waiting on upcalls.
However, this is not something that should happen under normal circumstances;
if a server spends a significant portion of its time with all threads waiting
for upcalls, this is a sign that something is seriously wrong.

In such a circumstance (e.g., an ldap server dies), we can, at least, bound
the waiting time to a second without the need for counting threads.

In short, removing the thread-counting will allow us to behave predictably
when things are working, while still allowing some progress when they aren't.

It would be a worthwhile project to measure the amount of time threads spend
waiting for upcalls (or for reads, for that matter); if a significant portion
of the time they spend handling requests is spent sleeping, then there's an
opportunity to improve nfsd performance: if we can break the one-to-one
mapping between requests and threads, then we can lower the number of threads
required to keep the nfs server busy.

However, both the currently available options for doing this are problematic:
returning JUKEBOX/DELAY errors at random times will lead to unpredictable
performance, and saving a copy of the request to be processed from scratch
again later is wasteful and makes it difficult to provide correct semantics,
especially in the NFSv4 case.

So for now I believe waits with short timeouts are the best option.


[PATCH] kNFSd: Reduce timeout when waiting for idmapper userspace daemon. · 5eb098e2

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: "J. Bruce Fields" <bfields@fieldses.org>

1 second should be plenty of time; if we're going to take longer than that
it's probably better just to return NFS4ERR_DELAY and let the client retry
anyway.


[PATCH] kNFSd: Improve idmapper behaviour on failure. · 74d88830

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: "J. Bruce Fields" <bfields@fieldses.org>

Slightly better behavior on failed mapping (which may happen either because
idmapd is not running, or because it has told us it doesn't know the
mapping):

	on name->id (setattr), return BADNAME.  (I used ESRCH to
		communicate BADNAME, just because it was the first error in
		include/asm-generic/errno-base.h that had something to
		do with nonexistence of something, and that we weren't
		already using.)

	id->name (getattr), return a string representation of the numerical
		id.  This is probably useless to the client, especially
		since we're unlikely to accept such a string on a setattr,
		but perhaps some client will find it mildly helpful.
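
The two fallbacks described above can be sketched in user-space C. All names here (`sketch_status`, `map_lookup_error`, `id_to_name_fallback`) are hypothetical, and the status values are placeholders rather than protocol numbers; only the use of ESRCH and the numeric-string fallback come from the message.

```c
#include <errno.h>
#include <stdio.h>

/* Illustrative status codes; placeholder values, not NFS protocol numbers. */
enum sketch_status { SKETCH_OK = 0, SKETCH_BADNAME = 1, SKETCH_IO = 2 };

/* name->id (setattr): a failed lookup reported as -ESRCH becomes BADNAME. */
static enum sketch_status map_lookup_error(int err)
{
	if (err == 0)
		return SKETCH_OK;
	if (err == -ESRCH)
		return SKETCH_BADNAME;
	return SKETCH_IO;
}

/*
 * id->name (getattr): fall back to the decimal form of the numeric id.
 * Returns the string length, or -1 if buf is too small.
 */
static int id_to_name_fallback(unsigned int id, char *buf, size_t len)
{
	int n = snprintf(buf, len, "%u", id);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For id 1000 and a 16-byte buffer, the fallback writes the string "1000" and returns 4.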


[PATCH] kNFSd: Fix race conditions in idmapper · 63b60615

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

From: "J. Bruce Fields" <bfields@fieldses.org>

Also fix leaks on error; split up code a bit to make it easier to verify
correctness.


[PATCH] kNFSd: Protect reference to exp across calls to nfsd_cross_mnt · 8dfec5a1

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

nfsd_cross_mnt can release the reference to the passed svc_export structure
when it returns a different svc_export structure. So we need to make sure we
have a counted reference before, and drop the reference afterwards.
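
The calling pattern can be illustrated with a toy reference count. This is only a sketch: `toy_export`, `toy_cross_mnt` and `toy_lookup` are hypothetical stand-ins, not the nfsd structures or functions.

```c
/* Toy export object with a reference count (not svc_export). */
struct toy_export { int refs; };

static void toy_get(struct toy_export *e) { e->refs++; }
static void toy_put(struct toy_export *e) { e->refs--; }

/*
 * Stand-in for a call that may swap *expp: when it crosses to another
 * export it drops the reference on the old one and hands back a counted
 * reference to the new one.
 */
static void toy_cross_mnt(struct toy_export **expp, struct toy_export *other)
{
	if (other) {
		toy_get(other);
		toy_put(*expp);
		*expp = other;
	}
}

/* Caller: take our own reference first, drop whichever one we hold after. */
static void toy_lookup(struct toy_export *exp, struct toy_export *other)
{
	toy_get(exp);			/* counted reference before the call */
	toy_cross_mnt(&exp, other);	/* may replace exp */
	toy_put(exp);			/* drop the reference we now hold */
}
```

Without the initial `toy_get()`, the put inside `toy_cross_mnt()` would drop a reference the caller never owned, which is the kind of imbalance the patch guards against.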


[PATCH] kNFSd: Change fh_compose to NOT consume a reference to the dentry. · a1de7361

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

fh_compose currently consumes a reference to the dentry but not the export
point.  This is both inconsistent and confusing.

It is better if a routine like this doesn't consume references, so with
this patch, it doesn't.  This fixes a couple of very subtle and unusual
reference counting errors.


[PATCH] kNFSd: Allow larger writes to sunrpc/svc caches. · 4fc69688

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

We currently serialize all writes to these caches with queue_io_sem, so we
only need one buffer.

There is some need for larger-than-one-page writes, so we can just statically
allocate a buffer.


[PATCH] kNFSd: Make sure CACHE_NEGATIVE is cleared when a cache entry is updated. · 00979a9f

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

This is important for update-in-place caches which may change from being
negative to positive.

Thanks to "J. Bruce Fields" <bfields@fieldses.org> and Olaf Kirch
<okir@suse.de>
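
A minimal sketch of the idea, assuming a flags word with separate "valid" and "negative" bits. The `SK_` names, bit values and `sk_entry` layout are invented for illustration and are not the sunrpc cache definitions.

```c
/* Illustrative flag bits; the numeric values are assumptions. */
#define SK_CACHE_VALID    (1u << 0)
#define SK_CACHE_NEGATIVE (1u << 1)

struct sk_entry {
	unsigned int flags;
	int value;
};

/*
 * Update an entry in place with real content: mark it valid and clear
 * any negative marker left over from an earlier failed lookup.
 */
static void sk_update(struct sk_entry *e, int value)
{
	e->value = value;
	e->flags |= SK_CACHE_VALID;
	e->flags &= ~SK_CACHE_NEGATIVE;
}
```

If the negative bit were left set, a reader could still treat the freshly updated entry as "does not exist".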


[PATCH] kNFSd: Use correct _bh locking on sv_lock. · fb4bb3a0

Andrew Morton authored May 19, 2004

From: NeilBrown <neilb@cse.unsw.edu.au>

Without the _bh, we can deadlock.

[PATCH] initialise mca_bus_type even if !MCA_bus · 64e67fb0

Andrew Morton authored May 19, 2004

From: "Randy.Dunlap" <rddunlap@osdl.org>

We need to call mca_system_init() to register MCA bus struct, otherwise
find_mca_adapter() oopses with a NULL ptr dereference.

Fixes this oops reported last week:
	http://marc.theaimsgroup.com/?l=linux-kernel&m=108455738606747&w=2

Thanks to James Bottomley for pointing this out.


[PATCH] drivers/cdrom/aztcd.c warning fix. · a856abc9

Andrew Morton authored May 19, 2004

From: "Luiz Fernando N. Capitulino" <lcapitulino@prefeitura.sp.gov.br>

drivers/cdrom/aztcd.c:379: warning: `pa_ok' defined but not used


[PATCH] do_generic_mapping_read() cleanup · 6eb8058f

Andrew Morton authored May 19, 2004

We just tested the page's uptodateness, no point in doing it again.

[PATCH] efivars: add MODULE_VERSION, remove unnecessary check in exit · 54a50867

Andrew Morton authored May 19, 2004

From: Matt Domsch <Matt_Domsch@dell.com>

* Add MODULE_VERSION

* Remove check for efi_enabled in efivars_exit() - we aborted module load at
  init based on this already.


[PATCH] EDD: remove unused SCSI header files · 17b62a72

Andrew Morton authored May 19, 2004

From: Matt Domsch <Matt_Domsch@dell.com>

EDD: Remove no longer needed SCSI header file inclusion.

Thanks to ArjanV for reminding me.


[PATCH] SubmittingDrivers completeness · dc4ed52a

Andrew Morton authored May 19, 2004

From: Jonathan Corbet <corbet@lwn.net>

I noticed a patch went in to Documentation/SubmittingDrivers which tweaked
the URL for KernelTraffic.  Here's a self-serving patch which makes that
section more complete; to be fair, I added two other sites too.  Just in
case it's useful.


[PATCH] Quota fix 3 - quota file corruption · e69c0c55

Andrew Morton authored May 19, 2004

From: Jan Kara <jack@ucw.cz>

This patch fixes possible quota file corruption which could happen when root
did not have any inodes or space allocated.

Originally this could not happen, as the structure would not be written to
disk in that case, but with journalled quota we need to write even an
all-zero structure.  The fix is not very nice, but changing the on-disk
format is probably worse (I made a mistake in not including the usage
bitmaps in the format :().


[PATCH] SELinux: fix error handling in selinuxfs · cfeff004

Andrew Morton authored May 19, 2004

From: Stephen Smalley <sds@epoch.ncsc.mil>

This patch against 2.6.6 fixes error handling for two out-of-memory conditions
in selinuxfs, avoiding potential deadlock due to returning without releasing a
semaphore. The patch was submitted by Karl MacMillan of Tresys.


[PATCH] correct ps2esdi module parm name · 5187778f

Andrew Morton authored May 19, 2004

From: "Randy.Dunlap" <rddunlap@osdl.org>

The module parameter name is incorrect (looks like a thinko).


[PATCH] kbuild SUBDIRS="more/ than/ one/" · 4da35e71

Andrew Morton authored May 19, 2004

From: Andreas Gruenbacher <agruen@suse.de>

Here is a patch that re-adds support for more than one directory in SUBDIRS.
We have a number of packages that use this.

The FORCE dependency of crmodverdir seems unnecessary; removing.

(acked by Sam)


[PATCH] mark the `planb' video driver broken · e6d0a0a2

Andrew Morton authored May 19, 2004

From: Christoph Hellwig <hch@lst.de>

This one is missing updates from the v4l1 interfaces in 2.4 to the 2.6ish
v4l2 and thus doesn't compile.  While we're at it also remove the
MOD_{INC,DEC}_USE_COUNT calls in it that were bogus even in 2.4 to avoid
false positives in grep.


[PATCH] don't mention MOD_INC_USE_COUNT/MOD_DEC_USE_COUNT in docs · 96595459

Andrew Morton authored May 19, 2004

From: Christoph Hellwig <hch@lst.de>

If we want new drivers to not use obsolete interfaces we're better off not
mentioning them in the documentation.


[PATCH] replace MOD_INC_USE_COUNT in cyber2000fb · 00541475

Andrew Morton authored May 19, 2004

From: Christoph Hellwig <hch@lst.de>

This driver is unloadable for the pci case, but not if vlb cards are found so
we can't use the module_exit removal to lock it into memory.

Replace the MOD_INC_USE_COUNT with __module_get in its module_init routine.


[PATCH] Remove blk_run_queues() remnants · e229be85

Andrew Morton authored May 19, 2004

It no longer exists.

[PATCH] fore200e.c warning fix · ccee748e

Andrew Morton authored May 19, 2004

drivers/atm/fore200e.c: In function `fore200e_close':
drivers/atm/fore200e.c:1659: warning: use of cast expressions as lvalues is deprecated


[PATCH] reserve syscall slots for kexec · c450028f

Andrew Morton authored May 19, 2004

From: "Randy.Dunlap" <rddunlap@osdl.org>

kexec is a fairly major and popular feature.  People are shipping it in
products, although it is not known if Linux distributors plan to ship it.

The patch reserves the kexec syscall slots to pin the ABI down for
everyone.


- add kexec_load prototype to syscalls.h

- add LINUX_REBOOT_CMD_KEXEC to reboot.h

- add kexec_load syscall for ia32, ia64, x86_64, ppc32, ppc64


[PATCH] Fix for Makefiles to get KBUILD_OUTPUT working · d06d15d2

Andrew Morton authored May 19, 2004

From: Mathieu Chouquet-Stringer <mchouque@online.fr>

If you use O=/someotherdir or KBUILD_OUTPUT=/someotherdir on the following
architectures: alpha, mips, sh and cris, the build process is probably
going to fail at one point or another, depending on the target you used,
because make can't find scripts/Makefile.build or scripts/Makefile.clean.

The following patch fixes this; I grepped the whole tree and these four were
the only "offenders" I found.


[PATCH] BeFS MAINTAINERS update · 64fa9f96

Andrew Morton authored May 19, 2004

From: "Sergey S. Kostyliov" <rathamahata@php4.ru>

[PATCH] Work around gcc 3.3.3-hammer sched miscompilation on x86-64 · 05d2b90d

Andrew Morton authored May 19, 2004

From: Andi Kleen <ak@muc.de>

The new domain scheduler got miscompiled on x86-64 with gcc 3.3.3-hammer,
which is shipping with some distributions.  The kernel deadlocks eventually
under light stress on SMP systems with the right options.

After some experiments it seems this simple change avoids the
miscompilation.  It also doesn't pessimize the code unduly for other
architectures.


[PATCH] slab: add kmem_cache_alloc_node · 68978ee7

Andrew Morton authored May 19, 2004

From: Manfred Spraul <manfred@colorfullife.com>

The attached patch adds a simple kmem_cache_alloc_node function: allocate
memory on a given node.  The function is intended for cpu bound structures.
It's used for alloc_percpu and for the slab-internal per-cpu structures.
Jack Steiner reported a ~3% performance increase for AIM7 on a 64-way
Itanium 2.

Port maintainers: The patch could cause problems if CPU_UP_PREPARE is
called for a cpu on a node before the corresponding memory is attached
and/or if alloc_pages_node doesn't fall back to memory from another node if
there is no memory in the requested node.  I think no one does that, but I'm
not sure.


[PATCH] slab: allow arch override for kmem_bufctl_t · a3e754c2

Andrew Morton authored May 19, 2004

From: Manfred Spraul <manfred@colorfullife.com>

The slab allocator keeps track of the free objects in a slab with a linked
list of integers (typedef'ed to kmem_bufctl_t). Right now unsigned int is
used for kmem_bufctl_t, i.e. 4 bytes per-object overhead.

The attached patch implements a per-arch definition for this type:
Theoretically, unsigned short is sufficient for kmem_bufctl_t and this would
reduce the per-object overhead to 2 bytes. But some archs cannot operate on
16-bit values efficiently, thus it's not possible to switch everyone to
ushort.

The chosen types are a result of discussions with the various arch maintainers.
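
The per-object saving can be shown with a user-space sketch. `ARCH_BUFCTL_T`, `sk_bufctl_t` and `sk_bufctl_overhead` are hypothetical names for an override hook; the patch's real macro names are not given in the message. Fixed-width types stand in for unsigned int and unsigned short so the sizes are unambiguous.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-arch hook: an arch may define ARCH_BUFCTL_T itself. */
#ifdef ARCH_BUFCTL_T
typedef ARCH_BUFCTL_T sk_bufctl_t;
#else
typedef uint32_t sk_bufctl_t;	/* default: 4 bytes of overhead per object */
#endif

/* Bytes of free-list bookkeeping for a slab holding `objs` objects. */
static size_t sk_bufctl_overhead(size_t objs, size_t entry_size)
{
	return objs * entry_size;
}
```

For a slab of 26 objects this gives 104 bytes with a 4-byte entry and 52 bytes with a 2-byte entry, which is the saving available to architectures that handle 16-bit values efficiently.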


[PATCH] slab: enable runtime cache line size on i386 · 897d49be

Andrew Morton authored May 19, 2004

From: Manfred Spraul <manfred@colorfullife.com>

The attached patch switches the SLAB_HWCACHE_ALIGN alignment from the
compile time L1 cache line size to the runtime detected value for i386.
x86-64 already uses the runtime detection.


[PATCH] Fix arithmetic in shrink_zone() · 6fa1d901

Andrew Morton authored May 19, 2004

From: Nick Piggin <nickpiggin@yahoo.com.au>

If the zone has a very small number of inactive pages, local variable
`ratio' can be huge and we do way too much scanning.  So much so that Ingo
hit an NMI watchdog expiry, although that was because the zone would have
had a single refcount-zero page in it, and that logic recently got fixed up
via get_page_testone().

Nick's patch simply puts a sane-looking upper bound on the number of pages
which we'll scan in this round.


It fixes another failure case: if the inactive list becomes very small
compared to the size of the active list, active list scanning (and therefore
inactive list refilling) also becomes small.

This patch causes inactive list scanning to be keyed off the size of the
active+inactive lists.  It has the plus of hiding active and inactive
balancing implementation from the higher level scanning code.  It will
slightly change other aspects of scanning behaviour, but probably not
significantly.
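
To see why a tiny inactive list is a problem, here is a sketch of the arithmetic. Both the formula in `sk_ratio` and the cap in `sk_bounded` are assumptions chosen to illustrate the description; the message does not quote the kernel's actual expressions or limit.

```c
/*
 * Assumed shape of the old calculation: the scan count scales with
 * active/inactive, so it grows without limit as nr_inactive shrinks.
 */
static unsigned long sk_ratio(unsigned long nr_pages,
			      unsigned long nr_active,
			      unsigned long nr_inactive)
{
	return nr_pages * nr_active / ((nr_inactive | 1) * 2);
}

/* Hypothetical upper bound on the pages scanned in one round. */
static unsigned long sk_bounded(unsigned long want, unsigned long cap)
{
	return want > cap ? cap : want;
}
```

With 32 pages requested and 100000 active pages, a balanced zone (100000 inactive) yields 15, while a zone with a single inactive page yields 1600000. Clamping that value restores a sane amount of work per round.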


[PATCH] dentry size tuning · 8a98d6d1

Andrew Morton authored May 19, 2004

Experimenting with various values of DENTRY_STORAGE

dentry size  objs/slab  dentry size * objs/slab  inline string

    148         26               3848                 32
    152         26               3952                 36
    156         25               3900                 40
    160         24               3840                 44

We're currently at 160. The patch fairly arbitrarily takes it down to 152, so
we can fit a 35-char name into the inline part of the dentry.
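
The relationships in the table can be written out as a short sketch. The 116-byte fixed part is inferred from the rows (dentry size minus inline string is 116 in every row), and the one byte reserved for the terminating NUL is inferred from the "35-char name" remark; the `sk_` names are hypothetical.

```c
/* Fixed part of the dentry implied by the table: size minus inline string. */
#define SK_DENTRY_FIXED 116u

/* Bytes available for the inline name at a given dentry size. */
static unsigned int sk_inline_bytes(unsigned int dentry_size)
{
	return dentry_size - SK_DENTRY_FIXED;
}

/* Longest name stored inline, leaving one byte for the terminating NUL. */
static unsigned int sk_max_inline_name(unsigned int dentry_size)
{
	return sk_inline_bytes(dentry_size) - 1;
}

/* Bytes of a slab occupied by dentries. */
static unsigned int sk_slab_bytes_used(unsigned int dentry_size,
				       unsigned int objs)
{
	return dentry_size * objs;
}
```

At 152 bytes the inline area is 36 bytes, so a 35-character name fits, and 26 objects occupy 3952 bytes of the slab.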

Also, go back to the old way of sizing d_iname so that any arch-specific
compiler-forced alignments are honoured.


[PATCH] Fix madvise length checking · f7efcc03

Andrew Morton authored May 19, 2004

Fix http://bugme.osdl.org/show_bug.cgi?id=2710.

When the user passes madvise a length of -1 through -4095, madvise blindly
rounds this up to 0 and then "succeeds".
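
The wraparound is easy to reproduce in user space. This is a sketch under assumptions: a 4096-byte page, hypothetical `sk_` names, and -EINVAL as the error code (the message does not say which error the fix returns).

```c
#include <errno.h>
#include <stddef.h>

#define SK_PAGE_SIZE ((size_t)4096)		/* assumed page size */
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/* Round a length up to whole pages; wraps to 0 for the top 4095 values. */
static size_t sk_round_len(size_t len_in)
{
	return (len_in + ~SK_PAGE_MASK) & SK_PAGE_MASK;
}

/* Reject a non-zero request whose rounded length wrapped to zero. */
static int sk_check_len(size_t len_in)
{
	size_t len = sk_round_len(len_in);

	if (len_in && !len)
		return -EINVAL;
	return 0;
}
```

A length of `(size_t)-1` rounds to 0 and is rejected, while a genuine zero-length request and ordinary lengths still pass.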


[PATCH] Remove hardcoded offsets from i386 asm · c2273c87

Andrew Morton authored May 19, 2004

From: Brian Gerst <bgerst@didntduck.org>

Generate offsets for thread_info, cpuinfo_x86, and a few others instead of
hardcoding them.
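
The general technique of generating offsets rather than hardcoding them can be sketched with `offsetof`. The structure layout, the `TI_FLAGS` symbol and `sk_emit_offset` are hypothetical; this is not the kernel's generator, only an illustration of deriving the constants from the C definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical structure standing in for thread_info. */
struct sk_thread_info {
	uint32_t task;
	uint32_t exec_domain;
	uint32_t flags;
	uint32_t status;
};

/* Format one "#define NAME offset" line, as a generated header might hold. */
static int sk_emit_offset(char *buf, size_t len, const char *name, size_t off)
{
	return snprintf(buf, len, "#define %s %zu", name, off);
}
```

Calling `sk_emit_offset(buf, sizeof(buf), "TI_FLAGS", offsetof(struct sk_thread_info, flags))` produces a definition that stays correct if fields are added or reordered, which hand-written numeric offsets in assembly do not.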


[PATCH] fix radio-cadet `readq' namespace clash · dba396ff

Andrew Morton authored May 19, 2004

It conflicts with the readq() I/O function.

[PATCH] security: add disable param to capabilities module · 01db63f2

Andrew Morton authored May 19, 2004

From: Chris Wright <chrisw@osdl.org>

Add disable param to capabilities module.  Similar to the SELinux param for
disabling at boot time.  This allows vendors to ship a single binary image with
capabilities compiled statically, and disable it if they provide another
security model compiled as a module.
