- 05 Jun, 2004 37 commits
-
-
Nick Piggin authored
David Mosberger noticed bw_pipe was way down on sched-domains kernels on SMP systems.

That is due to two things: first, the previous wake-affine logic would *always* move a pipe wakee onto the waker's CPU. With the scheduler rework, this was toned down a lot (but extended to all types of wakeups).

One of the ways this was damped was with the logic: don't move the wakee if its CPU is relatively idle compared to the waker's CPU. Without this, some workloads would pile everything up onto a few CPUs and get lots of idle time.

However, the fix was a bit of a blunt hack: if the wakee runqueue was below 50% busy, and the waker's was above 50% busy, we wouldn't do the move. I think a better way to capture it is what this patch does: if the wakee runqueue is below 100% busy, and the sum of the two runqueues' loads is above 100% busy, and the wakee runqueue is less busy than the waker runqueue (ie. CPU utilisation would drop if we do the move), then we don't do the move.

After I fixed this, I found things were still getting bounced around quite a bit. The reason is that we were attempting very aggressive idle balancing in order to cut down idle time in a dbt2-pgsql workload, which is particularly sensitive to idle.

After having Mark Wong (markw@osdl.org) retest this load with this patch, it looks like we don't need to be so aggressive. I'm glad to be rid of this because it never sat too well with me. We should see slightly lower cost of schedule and slightly improved cache impact with this change too.

Mark said:
---
This looks pretty good:

metric  kernel
2334    2.6.7-rc2
2298    2.6.7-rc2-mm2
2329    2.6.7-rc2-mm2-sched-more-wakeaffine
---

ie. within the noise.

David said:
---
Oooh, me likeee!

Host      OS            Pipe AF UNIX
--------- ------------- ---- -------
caldera.h Linux 2.6.6   3424 2057    (plain 2.6.6)
caldera.h Linux 2.6.7-r 333. 1402    (original 2.6.7-rc1)
caldera.h Linux 2.6.7-r 3086 4301    (2.6.7-rc1 with your patch)

Pipe-bandwidth is still down about 10% but that may be due to unrelated changes (or perhaps warmup effects?). The AF UNIX bandwidth is just mindboggling. Moreover, with your patch 2.6.7-rc1 shows better context-switch times and lower communication latencies (more like the numbers you're getting on UP). So it seems like the overall balance of keeping things on the same CPU vs. distributing them across CPUs is improved.
---

I also ran some tests on the NUMAQ. kernbench, dbench, hackbench, reaim were much the same. tbench was improved, very much so when clients < NR_CPU.

Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
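The old and new wake-affine damping rules from the message above can be compared as two small predicates. This is only a sketch: the function names are hypothetical, loads are assumed to be expressed as a percentage of one CPU's capacity, and none of it is taken from the scheduler source.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers, not kernel code. Loads are assumed to be given
 * in percent of one CPU's capacity (100 == fully busy).
 */

/* Older rule: leave the wakee alone if its runqueue is below 50% busy
 * and the waker's runqueue is above 50% busy. */
bool sketch_old_rule_keeps_wakee(unsigned int wakee_load, unsigned int waker_load)
{
    return wakee_load < 50 && waker_load > 50;
}

/* Rule described in the commit message: leave the wakee alone if its
 * runqueue is below 100% busy, the two loads together exceed 100%, and
 * the wakee's runqueue is the less busy of the two. */
bool sketch_new_rule_keeps_wakee(unsigned int wakee_load, unsigned int waker_load)
{
    return wakee_load < 100 &&
           wakee_load + waker_load > 100 &&
           wakee_load < waker_load;
}
```

With a wakee runqueue at 60% and a waker runqueue at 90%, the older rule would allow the move while the newer one would not, since the combined load no longer fits on one CPU.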
-
Neil Brown authored
The condition on this loop is primarily to avoid the loop if it doesn't appear to be needed. However it optimises a little too much and there is a case where it skips the loop when it is really needed. This patch fixes it. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jens Axboe authored
There's a bad length check in cdrom_get_random_writable(), it's off-by-4 since fh->data_len is the length of data _after_ that field (which is offset 4 bytes in the header). Check is pretty bogus anyways, so just kill it. Signed-Off-By: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Fulghum authored
* Fix cleanup on driver init failure (call pci_unregister_driver if necessary) * Keep driver loaded if no hardware found (for dynid support) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Fulghum authored
* Fix cleanup on driver init failure Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Paul Fulghum authored
* Fix cleanup on driver init failure (call pci_unregister_driver if necessary) * Keep driver loaded if no hardware found (for dynid support) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Kevin Corry authored
dm-ioctl.c: Use a size_t* instead of an int* in list_version_get_needed(). size_t and int are not the same size on all architectures. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
fs/nfs/direct.c: In function `nfs_file_direct_write': fs/nfs/direct.c:549: warning: initialization discards qualifiers from pointer target type Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Rusty Russell authored
As pointed out by Paul Jackson <pj@sgi.com>, sometimes 99 chars is not enough. We currently get a page from sysfs: that code should check we haven't overrun it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
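A userspace analogue of the overrun check mentioned above, assuming a caller-supplied buffer standing in for the sysfs page. The function name and the choice to report truncation with -1 are illustrative assumptions, not the kernel's interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a parameter value plus a newline into a
 * fixed-size page. Returns the number of characters written, or -1 if
 * the value would not fit (snprintf reports the length it wanted).
 */
int sketch_format_param(char *page, size_t page_size, const char *value)
{
    assert(page != NULL && page_size > 0);

    int n = snprintf(page, page_size, "%s\n", value);
    if (n < 0 || (size_t)n >= page_size)
        return -1;  /* output was truncated or an error occurred */
    return n;
}
```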
-
Andi Kleen authored
Often users only report what syslogd reports with KERN_ALERT when a kernel crash occurs. Make an oops print more information with that (in particular the RIP). Patch for i386 and x86-64. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
This allows the number of "raid_disks" in a raid1 to be changed. This requires allocating a new pool of "r1bio" structures with a different number of bios, suspending IO, and swapping the new pool in place of the old (and a few other related changes). Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
It is possible to have raid1/4/5/6 arrays that do not use all the space on the drive. This can be done explicitly, or can happen if you, one by one, replace all the drives with larger devices. This patch extends the "SET_ARRAY_INFO" ioctl (which was previously invalid on active arrays) to allow some attributes of the array to be changed, and implements changing of the "size" attribute. "size" is the amount of each device that is actually used. If "size" is increased, the new space will immediately be "resynced". Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
If raid1 decides it needs to resync it will do so even if there is only one working device. This is pointless. With this patch we abort resync if there is nowhere to write to. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
If the superblock isn't persistent, we shouldn't allow room for it. From: Paul Clements <Paul.Clements@SteelEye.com> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Normally the size is chosen as a multiple of the chunk size, but if the size is explicitly chosen, it might not be. So we force it. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
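Forcing a size to a chunk multiple could look like the sketch below. The message does not say in which direction the size is forced; rounding down is assumed here, and the helper name is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: force "size" to a multiple of "chunk" by
 * rounding down (an assumption; the commit message only says the
 * size is forced). Both values are in the same unit.
 */
unsigned long long sketch_round_to_chunk(unsigned long long size,
                                         unsigned long long chunk)
{
    assert(chunk > 0);
    return size - (size % chunk);
}
```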
-
Neil Brown authored
This isn't really needed at the moment, but it is more consistent with the interface and may be needed later. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
md_check_recovery only locks a device and does stuff when it thinks there is a real likelihood that something needs doing. So the test at the top must cover all possibilities. But it didn't cover the possibility that the last outstanding request on a failed device had finished and so the device needed to be removed. As a result, a failed drive might not get removed from the personality's perspective on the array, and so it could never be removed from the array as a whole. With this patch, whenever ->nr_pending hits zero on a faulty device, MD_RECOVERY_NEEDED is set so that md_check_recovery will do stuff. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
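The rule in the last sentence above can be sketched with simplified stand-in types. The struct and function names are hypothetical, a plain boolean stands in for the MD_RECOVERY_NEEDED flag, and the locking and atomics a real implementation would need are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the md structures; not the kernel's types. */
struct sketch_rdev {
    int nr_pending;   /* outstanding requests on this device */
    bool faulty;      /* device has been marked failed */
};

struct sketch_array {
    bool recovery_needed;   /* stands in for MD_RECOVERY_NEEDED */
};

/* Drop one pending request; if that was the last one on a faulty
 * device, flag the array so the recovery check will look at it. */
void sketch_rdev_dec_pending(struct sketch_rdev *rdev, struct sketch_array *array)
{
    assert(rdev->nr_pending > 0);

    rdev->nr_pending--;
    if (rdev->nr_pending == 0 && rdev->faulty)
        array->recovery_needed = true;
}
```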
-
Neil Brown authored
md/multipath has two separate pieces of code for choosing a device to use, one when a request is first made and the other when a request is being re-tried after failure. This patch discards multipath_read_balance and uses multipath_map in both situations. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Fix minor typos. From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Fix error return in create. (See comment in xdr for createtype4 at end of rfc3530.) From: Andy Adamson <andros@citi.umich.edu> From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Fix a somewhat bizarre corner case in clid processing: a clientid match isn't required for case 3. From: Andy Adamson <andros@citi.umich.edu> From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Oops: we were claiming to support the TIME_CREATE attribute, when we don't really. From: Andy Adamson <andros@citi.umich.edu> From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Fix oops in release_lockowner. We need to break out of two loops, not just one, and if the loop finds nothing, 'local' won't be NULL. So just put the body of the 'if' inside the loop. From: Andy Adamson <andros@citi.umich.edu> From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
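The pattern described above, acting on the match inside the inner loop instead of testing a cursor after the loops, can be sketched generically. The types and names below are hypothetical and use arrays in place of the kernel's hash lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the lock-owner tables. */
struct sketch_owner {
    int id;
    bool released;
};

struct sketch_bucket {
    struct sketch_owner *owners;
    size_t count;
};

/* Search every bucket for the owner and release it where it is found.
 * Doing the work inside the inner loop and returning leaves both loops
 * at once, and never relies on the cursor being NULL after a failed
 * search. */
bool sketch_release_owner(struct sketch_bucket *buckets, size_t nbuckets, int id)
{
    assert(buckets != NULL);

    for (size_t b = 0; b < nbuckets; b++) {
        for (size_t i = 0; i < buckets[b].count; i++) {
            struct sketch_owner *local = &buckets[b].owners[i];

            if (local->id != id)
                continue;
            local->released = true;
            return true;
        }
    }
    return false;   /* nothing matched */
}
```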
-
Neil Brown authored
rsc_lookup is a bit complicated: it either takes responsibility for the memory pointed to by handle.data and sets handle.data to NULL, or it leaves handle.data unchanged, in which case the caller is responsible for freeing handle.data. I forgot that the possibility of inserting a negative cache entry into the cache meant that this could happen even when rsc_lookup is called with set == 0. Note that the ip_map code has the same bug, not that it seems to matter much, since the memory in question in that case is always just a statically allocated string. From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
The server sunrpc code should take a reference on the relevant module before calling any authentication code. Also, it looks to me like the table of authops needs some locking. Finally, gss_svc_init wasn't checking the status of svc_auth_register, and gss_svc_shutdown wasn't calling svc_auth_unregister. From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
Encode names directly into xdr buffer; this optimizes out a data copy, reduces stack usage, and will make life simpler when doing acls. From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
There's a small typo in nfsd_acceptable. It calls err = permission(parent->d_inode, S_IXOTH, NULL); It really wants to use MAY_EXEC instead of S_IXOTH. Those happen to be the same at the moment, but may not do so forever. From: Olaf Kirch <okir@suse.de> From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
If ek = exp_find_key() is not an error, then ek->ek_export should be set; no point in checking if it's NULL. From: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Neil Brown authored
The "offset" in an entry in an nfs3 readdir response is 64 bits long and, as it has only a 32 bit alignment, it can fall half in one page of the response and half in another. This patch adds a second offset pointer (offset1) which points to the second half in the unusual case of the offset being split between pages, and sets and uses it accordingly. From: Olaf Kirch <okir@suse.de> Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
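Writing a 64-bit offset through two pointers, as described above, might look like the following sketch. The function names are hypothetical and byte buffers stand in for the response pages. It assumes XDR's big-endian encoding, with the high 32 bits written first, and uses a NULL second pointer to mean the value is not split.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in big-endian (XDR) byte order. */
static void sketch_put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Hypothetical helper: "offset" points at the first 4 bytes of the
 * 64-bit field. "offset1" points at the second 4 bytes when they live
 * in a different page, or is NULL when the field is contiguous.
 */
void sketch_encode_offset(unsigned char *offset, unsigned char *offset1,
                          uint64_t value)
{
    assert(offset != NULL);

    unsigned char *low = offset1 ? offset1 : offset + 4;

    sketch_put_be32(offset, (uint32_t)(value >> 32));
    sketch_put_be32(low, (uint32_t)(value & 0xffffffffu));
}
```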
-
Andreas Dilger authored
A problem with htree was recently discovered during Lustre testing when files were being renamed within the same directory. In some cases the addition of the new name caused a directory block split and the old dir_entry was pointing at the wrong entry, and the wrong entry was removed. This would seem entirely possible in a Maildir directory, since the MTA will be doing a lot of renames within the same directory. If old_de is pointing to the newly-added entry (i_ino is the same) we end up deleting the new entry instead of the old one. It looks as if the rename never happened. We need to verify that the name we are unlinking is what we expect. It is also possible that old_de is pointing to the now-unused space at the end of a newly-split leaf block, so we still need to try ext3_delete_entry() (which will skip the stale entry and return ENOENT) instead of just relying on the inum + name check. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
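The "inum + name" check mentioned above can be illustrated with a simplified directory entry. The struct layout and function name are hypothetical and do not match the on-disk ext3 format. The sketch covers only the comparison, not the fallback to ext3_delete_entry().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a directory entry; not the ext3 layout. */
struct sketch_dirent {
    unsigned long inode;
    unsigned char name_len;
    char name[255];
};

/* Returns true only if the entry still refers to the inode and the
 * exact name we intend to unlink. After a block split the pointer may
 * refer to a different entry for the same inode, so the inode number
 * alone is not enough. */
bool sketch_entry_still_matches(const struct sketch_dirent *de,
                                unsigned long ino,
                                const char *name, size_t name_len)
{
    assert(de != NULL && name != NULL);

    return de->inode == ino &&
           (size_t)de->name_len == name_len &&
           memcmp(de->name, name, name_len) == 0;
}
```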
-
Hugh Dickins authored
I've seen no warnings, nor heard any reports of warnings, that anon_vma ever misses ptes (nor anonmm before it). That WARN_ON (with its useless stack dump) was okay to goad developers into making reports, but would mainly be an irritation if it ever appears on user systems: kill it now. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
Andrea Arcangeli's fix to an ironic weakness with get_user_pages. try_to_unmap_one must check page_count against page->mapcount before unmapping a swapcache page: because the raised pagecount by which get_user_pages ensures the page cannot be freed, will cause any write fault to see that page as not exclusively owned, and therefore a copy page will be substituted for it - the reverse of what's intended. rmap.c was entirely free of such page_count heuristics before, I tried hard to avoid putting this in. But Andrea's fix rarely gives a false positive; and although it might be nicer to change exclusive_swap_page etc. to rely on page->mapcount instead, it seems likely that we'll want to get rid of page->mapcount later, so better not to entrench its use. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
For those arches (arm and parisc) which use the i_mmap tree to implement flush_dcache_page, during split_vma there's a small window in vma_adjust when flush_dcache_mmap_lock is dropped, and pages in the split-off part of the vma might for an instant be invisible to __flush_dcache_page. Though we're more solid there than ever before, I guess it's a bad idea to leave that window: so (with regret, it was structurally nicer before) take __vma_link_file (and vma_prio_tree_init) out of __vma_link. vma_prio_tree_init (which NULLs a few fields) is actually only needed when copying a vma, not when a new one has just been memset to 0. __insert_vm_struct is used by nothing but vma_adjust's split_vma case: comment it accordingly, remove its mark_mm_hugetlb (it can never create a new kind of vma) and its validate_mm (another follows immediately). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
Fix vma_adjust adjust_next wrapping: Rajesh V. pointed out that if end were 2GB or more beyond next->vm_start (on 32-bit), then next->vm_pgoff would have been negatively adjusted. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Rajesh Venkatasubramanian <vrajesh@umich.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
The follow_page write-access case is relying on pte_page before checking pfn_valid: rearrange that - and we don't need three struct page *pages. (I notice mempolicy.c's verify_pages is also relying on pte_page, but I'll leave that to Andi: maybe it ought to be failing on, or skipping over, VM_IO or VM_RESERVED vmas?) Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
Initialize swapper_space.i_mmap_nonlinear, so mapping_mapped reports false on it (as it used to do). Update comment on swapper_space, now more fields are used than those initialized explicitly. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andi Kleen authored
This patch fixes the problem some people had with their systems crashing early at boot. Also fix a problem in the LDT/TSS setup noticed by Paul Menage. And some other random fixes. - Update defconfig - Remove some unnecessary printks - Enlarge kernel mapping to 40MB - Fix acpi=ht (Suresh Siddha) - Use KERN_ALERT for more important oops lines - Fix LDT/TSS limit (Paul Menage) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 04 Jun, 2004 3 commits
-
-
Paul Mackerras authored
Brown-paper bag time... I put a typo in the asm for _raw_write_trylock (left in a spurious \n\). This patch fixes it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
bk://bk.arm.linux.org.uk/linux-2.6-pcmcia
Linus Torvalds authored
into ppc970.osdl.org:/home/torvalds/v2.6/linux
-
Russell King authored
-