- 04 Jan, 2005 40 commits
-
-
Andrew Morton authored
We hold i_sem during the various sync() operations to prevent livelocks: if another thread is dirtying the file, a sync() may never return. Or at least, that used to be true when we were using the per-address_space page lists. Since writeback has used radix tree traversal it is not possible to livelock the sync() operations, because they only visit each page a single time. sync_page_range() (used by O_SYNC writes) has not been holding i_sem for quite some time, for the above reasons. The patch converts fsync(), fdatasync() and msync() to also not hold i_sem during the radix-tree-based writeback. Now, we _do_ still need to hold i_sem across the file->f_op->fsync() call, because that is still based on a list_head walk, and is still livelockable. But in the case of msync() I deliberately left i_sem untaken. This is because we're currently deadlockable in msync, because mmap_sem is already held, and mmap_sem nexts inside i_sem, due to direct-io.c. And yes, the ranking of down_read() veruss down() does matter: Task A Task B Task C down_read(rwsem) down(sem) down_write(rwsem) down(sem) down_read(rwsem) C's down_write() will cause B's down_read to block. B holds `sem', so A will never release `rwsem'. So the patch fixes a hard-to-hit triple-task deadlock, but adds a possible livelock in msync(). It is possible to fix sys_msync() so that it takes i_sem outside i_mmap_sem. Later. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
We can call might_sleep() functions on the oops handling path (under do_exit). There seem little point in emitting spurious might_sleep() warnings into the logs after the kernel has oopsed. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Prasanna Meda authored
Bring the total_forks under tasklist_lock. When most of the fork code icluding nr_threads is moved to copy_process() from do_fork() code in 2.6, this is left out. Althought accuracy of total_forks is not important, it would be nice to add this. It does not involve additional cost, and the code will be cleaner if it is grouped with nr_threads. The difference is, total_forks will increase on fork, but nr_threads will increase on fork and decrease on the exit. I also moved extern decleration to sched.h from proc_misc.c. Signed-off-by: Prasanna Meda <pmeda@akamai.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Li Shaohua authored
After resume from S3, 'date' shows time run too fast. Signed-off-by: Li Shaohua <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Matthew Dobson authored
In the course of another patch I've been working on, I stumbled across some weirdness with some of the SD_*_INIT sched_domains initializers. A day or so of digging narrowed it down to the CPU_MASK_NONE initializer nested inside the sched_domain initializers. The errors I got were: kernel/sched.c:4812: error: initializer element is not constant kernel/sched.c:4812: error: (near initialization for `sched_domain_dummy') kernel/sched.c:4812: error: initializer element is not constant which was this line: static struct sched_domain sched_domain_dummy = SD_CPU_INIT; Janis Johnson, a GCC hacker, told me the following:
-
Stephen C. Tweedie authored
This patch improves ext3's ability to deal with corruption on-disk. If we try to delete a metadata block twice, we confuse ext3's internal revoke error-checking, resulting in a BUG(). But this can occur in practice due to a corrupt indirect block, so we should attempt to fail gracefully. Downgrade the assert failure to a JH_EXPECT_BH failure, and return EIO when it occurs. This is easily reproduced with a sample ext3 fs image containing an inode which references the same indirect block more than once. Deleting that inode will BUG() an unfixed kernel with: Assertion failure in journal_revoke() at fs/jbd/revoke.c:379: "!buffer_revoked(bh)" With the fix, ext3 recovers gracefully. Signed-off-by: Stephen Tweedie <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Stephen C. Tweedie authored
This patch improves ext3's ability to deal with corruption on-disk. If we ever get a corrupt inode or indirect block, then an attempt to delete it can end up trying to remove any block on the fs, including bitmap blocks. This can cause ext3 to assert-fail as we end up trying to do an ext3_forget on a buffer with b_committed_data set. The fix is to downgrade this to an IO error and journal abort, so that we take the filesystem readonly but don't bring down the whole kernel. Make J_EXPECT_JH() return a value so it can be easily tested and yet still retained as an assert failure if we build ext3 with full internal debugging enabled. Make journal_forget() return an error code so that in this case the error can be passed up to the caller. This is easily reproduced with a sample ext3 fs image containing an inode whose direct and indirect blocks refer to a block bitmap block. Allocating new blocks and then deleting that inode will BUG() with: Assertion failure in journal_forget() at fs/jbd/transaction.c:1228: "!jh->b_committed_data" With the fix, ext3 recovers gracefully. Signed-off-by: Stephen Tweedie <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Stephen C. Tweedie authored
This patch improves ext3's error logging when we encounter an on-disk corruption. Previously, a transaction (such as a truncate) which encountered many corruptions (eg. a single highly-corrupt indirect block) would emit copious "aborting transaction" errors to the log. Even worse, encountering an aborted journal can count as such an error, leading to a flood of spurious "aborting transaction: Journal has aborted" errors. With the fix, only emit that message on the first error. The patch also restores a missing \n in that printk path. Signed-off-by: Stephen Tweedie <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Adrian Bunk authored
All blk.h users were converted in 2.5, and at the same time blk.h began giving a warning. The patch below removes this obsolete file. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Corey Minyard authored
This patch removes some unneeded cruft that Adrian found, and also turns off the shutdown of the timer when removing the module. Since the timer is shutdown when the driver is closed (unless no way out is specified) this is unnecessary and defeats the no way out option. - remove some completely unused code - make some needlessly global code static - removal of some EXPORT_SYMBOL'ed code with zero users. - Removal of the timer shutdown on module removal Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Robin Holt authored
Testing revealed long pauses of the entire system while autofs initiated umounts as a result of timing out the mounts. It was noticed that during a umount, the BKL is held while scanning the inode_list and removing and inodes that are candidates. This patch moves locking until after the first pass had gone through the inode_list. Testing revelead that on an ia64 machine with a filesystem that had 8.4 Million inodes, there were no observable pauses during the umount. This was down from over 4 seconds without this patch. Signed-Off-By: Robin Holt <holt@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Hellwig authored
The only common field in irq_cpustat is __softirq_pending, i386 and ppc have some of their own. Remove all unused obsolete fields from various architectures. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Christoph Hellwig authored
This code is the same for all architectures with the following invariants: - arm gurantees irqs are disabled when calling irq_exit so it can call __do_softirq directly instead of do_softirq - arm26 is totally broken for about half a year, I didn't care for it - some architectures use softirq_pending(smp_processor_id()) instead of local_softirq_pending, but they always evaluate to the same This patch moves the out of line irq_exit implementation from kernel/irq/handle.c which depends on CONFIG_GENERIC_HARDIRQS to kernel/softirq.c which is always compiled, tweaks it for the arm special case and moves the irq_enter/irq_exit/nmi_enter/nmi_exit bits from asm-*/hardirq.h to linux/hardirq.h Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Randy Dunlap authored
Fix module parameter quote handling. Module parameter strings (with spaces) are quoted like so: "modprm=this test" and not like this: modprm="this test" Signed-off-by: Randy Dunlap <rddunlap@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Kirill Korotaev authored
This patch fixes incorrect address range check in do_getname(). Theoretically this can lead to do_getname() failure on kernel address space string on the TASK_SIZE boundary addresses when 4GB split is ON. (akpm: I don't see why this check exists at all, actually. afaict the only effect of removing it is that we'll then generate -EFAULT on a non-null-terminated pathname which ends exactly at TASK_SIZE). Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jay Lan authored
This patch is to offer common accounting data collection method at memory usage for various accounting packages including BSD accounting, ELSA, CSA and any other acct packages that use a common layer of data collection. New struct fields are added to mm_struct to save high watermarks of rss usage as well as virtual memory usage. New struct fields are added to task_struct to collect accumulated rss usage and vm usages. These data are collected on per process basis. Signed-off-by: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Jay Lan authored
This patch is to offer common accounting data collection method at I/O for various accounting packages including BSD accounting, ELSA, CSA and any other acct packages that use a common layer of data collection. Patch is made to fs/read_write.c to collect per process data on character read/written in bytes and number of read/write syscalls made. New struct fields are added to task_struct to store the data. These data are collected on per process basis. Signed-off-by: Jay Lan <jlan@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch includes prio-tree support and adds cross-referencing of VMAs with address spaces back in, as is done under normal MMU Linux. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch applies some further fixes and extensions to the nommu mmap implementation: (1) /proc/maps distinguishes shareable private mappings and real shared mappings by marking the former with 's' and the latter with 'S'. (2) Rearrange and optimise the checking portion of do_mmap_pgoff() to make it easier to follow. (3) Only set VM_SHARED on MAP_SHARED mappings. Its presence indicates that the backing memory is supplied by the underlying file or chardev. VM_MAYSHARE indicates that a VMA may be shared if it's a private VMA. The memory for a private VMA is allocated by do_mmap_pgoff() from a kmalloc slab and then the file contents are read into it before returning. (4) Permit MAP_SHARED + PROT_WRITE on memory-backed files[*] and chardevs to indicate a contiguous area of memory when its get_unmapped_area() is called if the backing fs/chardev is willing. [*] file->f_mapping->backing_dev_info->memory_backed == 1 (5) Require chardevs and files that support to provide a get_unmapped_area() file operation. (6) Made sure a private mapping of /dev/zero is possible. Shared mappings of /dev/zero are not currently supported because this'd need greater interaction of mmap with the chardev driver than is currently supported. (7) Add in some extra checks from mm/mmap.c: security, file having write access for a writable shared mapping, file not being in append mode. (8) Only account the mapping memory if it's allocated here; memory belonging to a shared chardev or file is not accounted. With this patch it should be possible to map contiguous flash files directly out of ROM simply by providing get_unmapped_area() for a read-only/shared mapping. I think that it might be worth splitting do_mmap_pgoff() up into smaller subfunctions: one to handle the checking, one to handle shared mappings and one to handle private mappings. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch does the following things: (1) It uniquifies permitted overlapping VMAs (eg: MAP_SHARED on chardevs) in nommu_vma_tree. Identical entries break the assumptions on which rbtrees work. Since we don't need to share VMAs in this case, we uniquify such VMAs by using the pointer to the VMA. They're only kept in the tree for /proc/maps visibility. (2) Extracts VMA unlinking into its own function so that the source is adjacent to the VMA linking function. (3) No longer releases memory belonging to a shared chardev or file (the underlying driver is expected to provide mappable memory). (4) Frees the file attached to a VMA whether or not that VMA is shared or is a memory-mapped I/O mapping. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch implements a nommu version of find_vma(). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch changes the PML4 bits of the FRV arch to the new PUD way. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch fixes the following issues with support for the FR55x CPUs: (1) The FR555 has a 64-byte cacheline size; everything else that we've come across has a 32-byte cacheline size. (2) Fix machine_restart() for FR55x. (3) Fix frv_cpu_suspend() for FR55x. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Mark Salter <msalter@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch makes the following fixes to the frv arch: (1) pte_offset() should no longer be around; the fault handler should use pte_offset_kernel() instead when fixing up vmalloc misses. (2) The PGEs/PMEs do not hold PTEs. They have greater address resolution and fewer control bits. (3) The data access error pattern in ESR15.EC should be 10000 not 10100. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch stops the FRV kernel-instruction-TLB-miss handler from setting the write-protect bit on a mapping entry when punting an entry from the mapping fast cache registers (DAMR1/IAMR1) to the TLB. This patch derives the WP value from the DAMPR1 register (which actually has a WP bit) rather than the IAMPR1 register (which does not). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch updates the FRV trap tables comment to make it more appropriate. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch gets rid of the perfctr_info syscall from the FRV arch now that its implementation has gone and it has been removed from the i386 arch and the i386 syscalls have been renumbered. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch does two things: (1) Implements the ext2/ext3 bitops in terms of the main bitops functions. (2) Changes the Minix bitops to use the ext2 bitops (LE) rather than the main bitops (BE). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch fixes three debugging problems in the frv arch: (1) Single-stepping in userspace steps through into the kernel-mode interrupt handler when a hardware interrupt happens, and sometimes it gets past where the debug-mode handler would normally catch it. This patch extends the range of detected PC values. (2) When setting up the kernel-mode exception frame from the debug-mode handler for a userspace debugging event, we weren't setting the LR register to generate a return to the exception handler epilogue. (3) sys_ptrace() now needs to "put" the inferior task_struct not "free" it as was done in 2.4. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch makes more syscalls available for the FR-V arch. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch changes the nommu bits of the FRV arch to incorporate the name changes made to the nommu core stuff. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch changes the nommu procfs routines to match the nommu changes in patch 1/1. This is an exercise in structure renaming and handling the fact that the list of VMAs in the system is now held together by vma->vm_rb. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch further changes the nommu stuff previously changed. These new changes do the following: (0) Some additional variables have been defined to make nommu even compile. (1) Get rid of the alternate vm_area_struct. The nommu mmap now uses the normal one. There's a refcount field added to the normal one, contingent on !CONFIG_MMU. (2) vm_rb is now used to keep track of the VMAs in an rbtree rather than adding a separate list. (3) mm_tblock_struct is now vm_list_struct. (4) put_vma() now calls vma->vm_ops->close() if available on nommu. (5) A dummy generic_file_vm_ops has been provided. It does nothing, but permits tiny-shmem to compile. tiny-shmem and ramfs still need attention, such that files contained therein can be mmapped shared-writably to some extent on nommu. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch fixes the following problems in the ELF-FDPIC binfmt driver: (1) elf_fdpic_map_file() should be passed an mm_struct pointer, not NULL. (2) do_mmap() should be called with the mmap_sem held. (3) mm_struct::end_brk doesn't exist in 2.6 (debugging only). (4) Avoid debugging warnings by casting certain values to unsigned long before printing them. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch adds a new binary format driver that allows a special variety of ELF to be used that permits the dynamic sections that comprise an executable, its dynamic loader and its shared libaries and its stack and data to be located anywhere within the address space. This is used to provide shared libraries and shared executables (at least, as far as the read-only dynamic sections go) on uClinux. Not only that, but the same binaries can be run on MMU linux without a problem. This is achieved by: (1) Passing loadmaps to the dynamic loader (or to a statically linked executable) to indicate the whereabouts of the various dynamic sections. (2) Using a GOT inside the program. (3) Passing setup_arg_pages() the stack pointer to be. (4) Allowing the arch greated control over how an executable is laid out in memory in MMU Linux. (5) Rewriting mm/nommu.c to support MAP_PRIVATE on files, thus allowing _mmap_ to handle sharing of private-readonly mappings. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch fixes the usage of setup_arg_pages() in the IA64, MIPS, S390 and Sparc64 arches. This function now takes an extra parameter: the initial top of stack. This is useful in uClinux when there's no fixed location to which the stack pointer can be initialised. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch changes setup_arg_pages() to take the proposed initial stack top for the new executable image. This makes it easier for the binfmt to place the stack at a non-fixed location, such as happens in !MMU configurations. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch splits some memory-related procfs files into MMU and !MMU versions and places them in separate conditionally-compiled files. A header file local to the fs/proc/ directory is used to declare functions and the like. Additionally, a !MMU-only proc file (/proc/maps) is provided so that master VMA list in a uClinux kernel is viewable. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch changes mm/nommu.c to better support mmap() when MMU support is disabled (as it is in uClinux). This was discussed on the uclibc mailing list in a thread revolving around the following message: Date: Thu, 1 Apr 2004 12:05:50 +1000 From: David McCullough <davidm@snapgear.com> To: David Howells <dhowells@redhat.com> Cc: Alexandre Oliva <aoliva@redhat.com>, uclibc@uclibc.org Subject: Re: [uClibc] mmaps for malloc should be private Message-ID: <20040401020550.GG3150@beast> The revised rules are: (1) Anonymous mappings can be shared or private, read or write. (2) Chardevs can be mapped shared, provided they supply a get_unmapped_area() file operation and use that to set the address of the mapping (as a frame buffer driver might do, for instance). (3) Files (and blockdevs) cannot be mapped shared since it is not really possible to honour this by writing any changes back to the backing device. (4) Files (or sections thereof) can be mapped read-only private, in which case the mapped bit will be read into memory and shared, and its address will be returned. Any excess beyond EOF will be cleared. (5) Files (or sections thereof) can be mapped writable private, in which case a private copy of the mapped bit will be read into a new bit memory, and its address will be returned. Any excess beyond EOF will be cleared. Mappings are per MM structure still. You can only unmap what you've mapped. Fork semantics are irrelevant, since there's no fork. A global list of VMA's is maintained to keep track of the bits of memory currently mapped on the system. The new binfmt makes use of (4) to implement shared libraries. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
David Howells authored
The attached patch makes calibrate_delay() optional. In this architecture, it's a waste of time since we can predict exactly what it's going to come up with just by looking at the CPU's hardware clock registers. Thus far, we haven't seen a board with any clock not dependent on the CPU's clock. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-