- 01 Feb, 2006 40 commits
-
-
Christoph Lameter authored
Migrate a page with buffers without requiring writeback This introduces a new address space operation migratepage() that may be used by a filesystem to implement its own version of page migration. A version is provided that migrates buffers attached to pages. Some filesystems (ext2, ext3, xfs) are modified to utilize this feature. The swapper address space operation are modified so that a regular migrate_page() will occur for anonymous pages without writeback (migrate_pages forces every anonymous page to have a swap entry). Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Modify policy layer to support direct page migration - Add migrate_pages_to() allowing the migration of a list of pages to a a specified node or to vma with a specific allocation policy in sets of MIGRATE_CHUNK_SIZE pages - Modify do_migrate_pages() to do a staged move of pages from the source nodes to the target nodes. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Add remove_from_swap remove_from_swap() allows the restoration of the pte entries that existed before page migration occurred for anonymous pages by walking the reverse maps. This reduces swap use and establishes regular pte's without the need for page faults. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Add direct migration support with fall back to swap. Direct migration support on top of the swap based page migration facility. This allows the direct migration of anonymous pages and the migration of file backed pages by dropping the associated buffers (requires writeout). Fall back to swap out if necessary. The patch is based on lots of patches from the hotplug project but the code was restructured, documented and simplified as much as possible. Note that an additional patch that defines the migrate_page() method for filesystems is necessary in order to avoid writeback for anonymous and file backed pages. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Check for PageSwapCache after looking up and locking a swap page. The page migration code may change a swap pte to point to a different page under lock_page(). If that happens then the vm must retry the lookup operation in the swap space to find the correct page number. There are a couple of locations in the VM where a lock_page() is done on a swap page. In these locations we need to check afterwards if the page was migrated. If the page was migrated then the old page that was looked up before was freed and no longer has the PageSwapCache bit set. Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Christoph Lameter <clameter@@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
If large amounts of zone memory are used by empty slabs then zone_reclaim becomes uneffective. This patch shakes the slab a bit. The problem with this patch is that the slab reclaim is not containable to a zone. Thus slab reclaim may affect the whole system and be extremely slow. This also means that we cannot determine how many pages were freed in this zone. Thus we need to go off node for at least one allocation. The functionality is disabled by default. We could modify the shrinkers to take a zone parameter but that would be quite invasive. Better ideas are welcome. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
In some situations one may want zone_reclaim to behave differently. For example a process writing large amounts of memory will spew unto other nodes to cache the writes if many pages in a zone become dirty. This may impact the performance of processes running on other nodes. Allowing writes during reclaim puts a stop to that behavior and throttles the process by restricting the pages to the local zone. Similarly one may want to contain processes to local memory by enabling regular swap behavior during zone_reclaim. Off node memory allocation can then be controlled through memory policies and cpusets. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Currently the zone_reclaim code has a fixed window of 30 seconds of off node allocations should a local zone have no unused pagecache pages left. Reclaim will be attempted again after this timeout period to avoid repeated useless scans for memory. This is also useful to established sufficiently large off node allocation chunks to relieve the local node. It may be beneficial to adjust that time period for some special situations. For example if memory use was exceeding node capacity one may want to give up for longer periods of time. If memory spikes intermittendly then one may want to shorten the time period to reduce the number of off node allocations. This patch allows just that.... Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Instead of scanning all the pages in a zone, imitate real swap and scan only a portion of the pages and gradually scan more if we do not free up enough pages. This avoids a zone suddenly loosing all unused pagecache pages (we may after all access some of these again so they deserve another chance) but it still frees up large chunks of memory if a zone only contains unused pagecache pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
zone_reclaim should leave that to the real swapper. We are only interested in evicting unmapped pages. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
2.6.15's hugepage faulting introduced huge_pages_needed accounting into hugetlbfs: to count how many pages are already in cache, for spot check on how far a new mapping may be allowed to extend the file. But it's muddled: each hugepage found covers HPAGE_SIZE, not PAGE_SIZE. Once pages were already in cache, it would overshoot, wrap its hugepages count backwards, and so fail a harmless repeat mapping with -ENOMEM. Fixes the problem found by Don Dupuis. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-By: Adam Litke <agl@us.ibm.com> Acked-by: William Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Benjamin LaHaise authored
Improve the performance of slab_put_obj(). Without the cast, gcc considers ptrdiff_t a 64 bit signed integer and ends up emitting code to use a full signed 128 bit divide on EM64T, which is substantially slower than a 32 bit unsigned divide. I noticed this when looking at the profile of a case where the slab balance is just on edge and thrashes back and forth freeing a block. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
- If we only reclaim nr_pages then its okay to stay on node. Switch from > to >= for the comparison. - vm_table[] entry for zone_reclaim_mode is a bit screwed up. - Add empty lines around shrink_zone to show that this is the central function to be called. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Make sc->may_writepage control the writeout behavior of shrink_list. Remove the laptop_mode trick from shrink_list and instead set may_writepage in try_to_free_pages properly. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andy Whitcroft authored
GFP_ZONETYPES calculate from GFP_ZONEMASK GFP_ZONETYPES's value is directly related to the value of GFP_ZONEMASK. It takes one of two forms depending whether the top bit of GFP_ZONEMASK is a 'loner'. Supply both forms, enabling the loner. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andy Whitcroft authored
GFP_ZONETYPES define using GFP_ZONEMASK and add commentry Add commentry explaining the optimisation that we can apply to GFP_ZONETYPES when the leftmost bit is a 'loaner', it can only be set in isolation. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Zone reclaim is usually only run on the local node. Headless nodes do not have any local processors. This patch checks for headless nodes and performs zone reclaim on them. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Lameter authored
Ensure that the performance of off node pages stays the same as before. Off node pagefault tests showed an 18% drop in performance without this patch. - Increase the timeout to 30 seconds to reduce the overhead. - Move all code possible out of the off node hot path for zone reclaim (Sorry Andrew, the struct initialization had to be sacrificed). The read_page_state() bit us there. - Check first for the timeout before any other checks. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Alexey Dobriyan authored
OpenBSD doesn't see "." correctly in directories created by Linux. Copying files over several KB will buy you infinite loop in __getblk_slow(). Copying files smaller than 1 KB seems to be OK. Sometimes files will be filled with zeros. Sometimes incorrectly copied file will reappear after next file with truncated size. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Gibson authored
The flattened device tree is the only supported way of booting ARCH=powerpc kernels on non Open Firmware machines. The documentation for the flattened tree format and contents has been discussed on mailing lists and lately has been living in the dtc git tree. Really, it ought to go in the kernel's Documentation directory for maximum visibility. Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Rafael J. Wysocki authored
Prevent the kernel from setting the log level to 10 unconditionally during suspend/resume which was needed in the past for debugging, but generally is undesirable. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Dave Jones authored
Whoops. kobject_register failed for hexium HV-PCI6/Orion (-13) [<c01d3eb6>] kobject_register+0x31/0x47 [<c023a996>] bus_add_driver+0x4a/0xfd [<c01de3c1>] __pci_register_driver+0x82/0xa4 [<d083400a>] hexium_init_module+0xa/0x47 [hexium_orion] [<c013bdae>] sys_init_module+0x167b/0x1822 [<c01633f7>] do_sync_read+0xb8/0xf3 [<c0133fa3>] autoremove_wake_function+0x0/0x2d [<c0145390>] audit_syscall_entry+0x118/0x13f [<c0106ae2>] do_syscall_trace+0x104/0x14a [<c0103d21>] syscall_call+0x7/0xb slashes in kobject names aren't allowed. Signed-off-by: Dave Jones <davej@redhat.com> Cc: Mauro Carvalho Chehab <mchehab@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
V. Ananda Krishnan authored
Scott Kilau <Scott_Kilau@digi.com> Digi serial port console doesn't work when baud rates are set higher than 38400. So the lookup table and code in jsm_neo.c has been modified and tested. Please let me have the feed-back. Signed-off-by: V.Ananda Krishnan <mansarov@us.ibm.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
john stultz authored
Avoid lost tick compensation early in boot before the TSCs are synchronized. Currently timekeeping is enabled before the TSCs are synchronized, thus when the TSCs are synched (reset to zero), it appears that a number of lost ticks have occurred. This can cause premature expiry of timers and in extreme cases can cause the soft lockup detection to fire. This resolves issues reported by Andy Whitcroft as well as bug #5366 reported by Tim Mann. Signed-off-by: John Stultz <johnstul@us.ibm.com> Acked-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jack Steiner authored
Change sched_getaffinity() so that it returns a bitmap that indicates the legally schedulable cpus that a task is allowed to run on. Without this patch, if CONFIG_HOTPLUG_CPU is enabled, sched_getaffinity() unconditionally returns (at least on IA64) a mask with NR_CPUS bits set. This conveys no useful infornmation except for a kernel compile option. This fixes a breakage we obseved running recent kernels. We have MPI jobs that use sched_getaffinity() to determine where to place their threads. Placing them on non-existant cpus is problematic :-) Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Nathan Lynch <ntl@pobox.com> Cc: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Bryan O'Sullivan authored
This assembly version is measurably faster than the generic version in lib/iomap_copy.c. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Bryan O'Sullivan authored
This arch-independent routine copies data to a memory-mapped I/O region, using 32-bit accesses. The naming is double-underscored to make it clear that it does not guarantee write ordering, nor does it perform a memory barrier afterwards; the kernel doc also explicitly states this. This style of access is required by some devices. This change also introduces include/linux/io.h, at Andrew's suggestion. It only has one occupant at the moment, but is a logical destination for oft-replicated contents of include/asm-*/{io,iomap}.h to migrate to. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Bryan O'Sullivan authored
This can make the intent behind some arithmetic expressions clearer. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Adrian Bunk authored
This function is neither used nor has any real contents. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Thomas Gleixner authored
The expiry time for relative timers with SIGEV_NONE set was never updated to the correct value. Pointed out by George Anzinger. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Thomas Gleixner authored
At some point we added credits to people who actively helped to bring k/hr-timers along. This was lost in the big code revamp. Add it back. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
George Anzinger authored
Clean up the interface to hrtimers by changing the init code to pass the mode as well as the clock. This allow the init code to select the correct base and eliminates extra timer re-init code in posix-timers. We also simplify the restart interface nanosleep use. Signed-off-by: George Anzinger <george@mvista.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
akpm@osdl.org authored
From: Steven Rostedtrostedt@goodmis.org <rostedt@goodmis.org> CPU0 expires a posix-timer and runs the callback function. The signal is queued. After releasing the posix-timer lock and before returning to hrtimer_run_queue CPU0 gets interrupted. CPU1 delivers the queued signal and rearms the timer. CPU0 comes back to hrtimer_run_queue and sets the timer state to expired. The next modification of the timer can result in an oops, because the state information is wrong. Keep track of state = RUNNING and check if the state has been in the return path of hrtimer_run_queue. In case the state has been changed, ignore a restart request and do not touch the state variable. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Thomas Gleixner authored
This resolves bugzilla bug#5617. The oldvalue of the timer was read after the timer was cancelled, so the remaining time was always zero. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Thomas Gleixner authored
Fixup the conversion of posix-timers to hrtimers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Thomas Gleixner authored
The itimer conversion removed the locking which protects the timer and variables in the shared signal structure. Steven Rostedt found the problem in the latest -rt patches. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Rafael J. Wysocki authored
Make swsusp use bytes as the image size units, which is needed for future compatibility. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Pekka Enberg authored
CC arch/um/sys-i386/ldt.o arch/um/sys-i386/ldt.c:19:21: proc_mm.h: No such file or directory make[1]: *** [arch/um/sys-i386/ldt.o] Error 1 Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Alexey Dobriyan authored
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Kylene Jo Hall authored
Remove event_data_size since it was pointed out in tpm_bios-indexing- fix.patch that is was ugly and it wasn't actually being used. Signed-off-by: Kylene Hall <kjhall@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-