- 29 Dec, 2003 40 commits
-
-
Andrew Morton authored
From: Badari Pulavarty <pbadari@us.ibm.com> I found the problem with the O_DIRECT memory leak. When we do a DIO read that crosses the end of file, we don't release the references on all the pages we got from get_user_pages() (since it is a success case). The fix is to call dio_cleanup() even for success cases.
-
Andrew Morton authored
From: Roland McGrath <roland@redhat.com> The following test program will crash every time if dynamically linked. I think this bites all 32-bit platforms, including 32-bit executables on 64-bit platforms that support them (and could in theory bite 64-bit platforms with bss sizes beyond the bounds of comprehension).

    volatile char hugebss[1080000000];

    main()
    {
        printf("%p..%p\n", &hugebss[0], &hugebss[sizeof hugebss]);
        system("cat /proc/$PPID/maps");
        hugebss[sizeof hugebss - 1] = 1;
        return 23;
    }

The problem is that the kernel maps ld.so at 0x40000000 or some such place, before it maps the bss. Here the bss is so large that it overlaps and clobbers that mapping. I've changed it to map the bss before it loads the interpreter, so that part of the address space is reserved before ld.so's mapping (which doesn't really care where it goes) is done.

This patch also adds error checking to the bss setup (and interpreter's bss setup). With the aforementioned change but no error checking, "ulimit -v 65536; ./hugebss" will crash in the store after the `system' call, because the kernel will have failed to allocate the bss and ignored the error, so the program runs without those pages being mapped at all. With this change it dies with a SIGKILL as for a failure to set up stack pages.

It might be even better to try to detect the case earlier so that execve can return an error before it has wiped out the address space. But that seems like it would always be fragile and miss some corner cases, so I did not try to add such complexity.
-
Andrew Morton authored
From: Joe Korty <joe.korty@ccur.com> do_gettimeofday() is using tick_usec which is defined in terms of USER_HZ not HZ.
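The unit mismatch can be illustrated with a little arithmetic. This is only a sketch: the values HZ=1000 and USER_HZ=100 are assumed (they are not stated in the message), and the `SKETCH_*` constants and helper names are hypothetical, not kernel code.

```c
#include <assert.h>

/* Assumed tick rates for illustration only. */
#define SKETCH_HZ       1000UL  /* internal timer interrupt rate */
#define SKETCH_USER_HZ  100UL   /* rate exposed to user space */

/* Length of one tick in microseconds for a given tick rate (rounded). */
static unsigned long sketch_usec_per_tick(unsigned long hz)
{
    assert(hz > 0);
    return (1000000UL + hz / 2) / hz;
}

/* Microseconds to add for `lost` missed timer interrupts at rate `hz`. */
static unsigned long sketch_lost_tick_usec(unsigned long lost, unsigned long hz)
{
    return lost * sketch_usec_per_tick(hz);
}
```

Under these assumptions a per-tick value derived from USER_HZ is ten times larger than the real interval between timer interrupts, so any correction computed with it overshoots by the same factor.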
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> stripe to be effective. This patch sets ra_pages appropriately.
-
Andrew Morton authored
From: NeilBrown <neilb@cse.unsw.edu.au> As no md personalities honour the merge_bvec_fn of underlying devices, we must make sure never to submit a bio larger than 1 page when a merge_bvec_fn is defined. raid5 already does this (it never submits bios larger than one page). With this patch, all other raid personalities limit their max_sectors when a merge_bvec_fn is present.
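A minimal sketch of the clamping rule described above, assuming 4096-byte pages and 512-byte sectors. The `sketch_queue` structure and function names are hypothetical stand-ins, not the md or block-layer API.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE   4096U  /* assumed page size */
#define SKETCH_SECTOR_SIZE 512U

/* Hypothetical stand-in for a request queue's limits. */
struct sketch_queue {
    unsigned int max_sectors;
    int (*merge_bvec_fn)(void);  /* non-NULL if the device restricts merging */
};

/* Example callback so a test can mark a device as "restricts merging". */
static int sketch_dummy_merge(void)
{
    return 0;
}

/*
 * If the underlying device defines a merge_bvec_fn that we will not
 * honour, never build requests larger than one page.
 */
static void sketch_limit_to_page(struct sketch_queue *top,
                                 const struct sketch_queue *lower)
{
    unsigned int one_page = SKETCH_PAGE_SIZE / SKETCH_SECTOR_SIZE;

    assert(top != NULL && lower != NULL);
    if (lower->merge_bvec_fn != NULL && top->max_sectors > one_page)
        top->max_sectors = one_page;
}
```

With the assumed sizes, one page is 8 sectors, so a stacked device with a default of, say, 256 sectors would be clamped to 8.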
-
Andrew Morton authored
From: glee@gnupilgrims.org I think Adrian had forgotten to update the help text.
-
Andrew Morton authored
From: Jan Kara <jack@ucw.cz> Here's a patch which should fix the deadlock with quotas+ext3 reported in 2.4 (the same problem existed in 2.6 but nobody found it).
-
Andrew Morton authored
From: Jan Kara <jack@ucw.cz> I'm sending you a fix for a possible Oops in vfs_quota_sync(). Nobody has actually run into it; I found it while looking through the code.
-
Andrew Morton authored
From: Geoffrey Lee <glee@gnupilgrims.org> This fixes what seems to be an obvious = vs == bug in the init301.c sis file.
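For readers unfamiliar with this class of bug, here is a generic illustration. It is not the init301.c code; the function names and the value 2 are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Buggy form: `=` assigns 2 to *mode, and the condition then tests the
 * assigned value, which is always non-zero. The caller's data is
 * silently overwritten as well.
 */
static int sketch_is_mode_two_buggy(int *mode)
{
    assert(mode != NULL);
    if ((*mode = 2))
        return 1;
    return 0;
}

/* Fixed form: `==` compares and leaves *mode untouched. */
static int sketch_is_mode_two_fixed(const int *mode)
{
    assert(mode != NULL);
    return *mode == 2;
}
```

The extra parentheses in the buggy version only silence the compiler warning that would normally flag this pattern.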
-
Andrew Morton authored
From: William Lee Irwin III <wli@holomorphy.com> This field is 100% unused. This patch removes it.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> 32bit siginfo would sometimes get passed incorrectly on x86-64. This change fixes the conversion function to be a bit dumber, but more correct.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Merge i386 fix. Don't panic in MP table parsing when the table is bad.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Merge signal race fixes from i386 to x86-64. Fix a bug in system call restart, noted by John Blackwood.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Merge the i386 fix for the page fault from Linus to x86-64 (I'm not actually sure what it fixes, but if it's good for 32bit it is likely good for 64bit too)
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Make sure we never access anything in kernel mapping while doing the prefetch workaround checks on x86-64. Originally suggested by Jamie Lokier.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Another potential data corruption fix. The 32bit truncate64 on x86-64 did silently truncate offsets >32bit. That broke mysql for example. Fix that. From Chris Wilson
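A sketch of the general failure mode, assuming the 32-bit caller passes the 64-bit length as two 32-bit halves. The helper names are hypothetical, and the exact cause in the x86-64 emulation code is not spelled out in the message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rebuild a 64-bit file offset from two 32-bit halves.
 * A file offset is signed, so this sketch requires the top bit of the
 * high word to be clear (real code would return an error instead).
 */
static int64_t sketch_merge_offset(uint32_t low, uint32_t high)
{
    assert(high <= (uint32_t)INT32_MAX);
    return (int64_t)(((uint64_t)high << 32) | low);
}

/* The kind of mistake described: the high half is dropped. */
static int64_t sketch_merge_offset_truncating(uint32_t low, uint32_t high)
{
    (void)high;
    return (int64_t)low;
}
```

With the truncating variant, a request to set a file to exactly 4 GiB would instead truncate it to zero bytes, which is the sort of silent data loss the message warns about.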
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> From Badari Pulavarty. Without this, sysrq-t shows the same backtrace for all processes on x86-64.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> A lot of people have run into this: the x86-64 cpuid driver didn't compile as a module. Using a kludge suggested by Sam Ravnborg.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Please consider applying this patch, I would consider it critical for x86-64. The 2.6.0 x86-64 IOMMU code unfortunately had a few problems, leading to non-booting systems and in a few cases to data corruption. It fixes two serious bugs in handling special kinds of scatter gather lists in pci_map_sg. AGP was completely broken with IOMMU because of a wrong #ifdef. Fix that. One TLB flush optimization I did a long time ago seems to break on some 3ware boards (which require IOMMU because they don't support 64bit addresses). The breakage led to data corruption. This patch disables the optimization for now and fixes a potential SMP race in the flush code too. The TLB flush is done in a slower, but more reliable way now too. Please consider applying, because some of these problems hit quite a few people. This also disables the IOMMU_DEBUG in the defconfig. A lot of people were using the IOMMU when they didn't need to, which multiplied the problems. IOMMU merge is disabled for now. This was an experimental optimization which helped with some block devices, but for production it seems to be better to disable it for now because there are some questionable corner cases when the IOMMU aperture fragments. The same is done for IOMMU SAC force, which was related to that. i386 has quite broken semantics for pci_alloc_consistent(). It uses the standard device DMA mask instead of the consistent mask. Make us bug-to-bug compatible here. This fixes problems with some sound drivers that don't support full 32bit addressing.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> Add 32bit a.out support for x86-64. Not exactly an important bug fix, but maybe it will help someone. This should increase the current 98% compatibility to i386 to perhaps 98.1% @) I tested an old a.out SuSE 4.2 installation in chroot and it worked. It also ran some very old linux binaries from '92 found on ftp.funet.fi. The only program that didn't was the SuSE a.out GNU emacs, but I was too lazy to track that down. Core dumps are not supported.
-
Andrew Morton authored
From: Andi Kleen <ak@muc.de> It fixes the statfs64 emulation on x86-64. The problem is that x86-64 needs an __attribute__((aligned)) on the compat_statfs64 structure. The conclusion last time this was discussed was that the structure should be duplicated. Essentially it is the old shared structure copied to every user and x86-64 uses __attribute__((packed)).
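The effect of packing on structure size can be sketched as follows. The two-member structures are hypothetical and much smaller than the real compat_statfs64; `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Natural layout. On x86-64 a 64-bit member is 8-byte aligned, so the
 * structure is padded out to a multiple of 8 bytes. On i386 the same
 * member is only 4-byte aligned inside a structure, so no tail padding
 * is added there.
 */
struct sketch_natural {
    uint64_t blocks;
    uint32_t namelen;
};

/* Packed layout: no padding, so the size matches the i386 view. */
struct sketch_packed {
    uint64_t blocks;
    uint32_t namelen;
} __attribute__((packed));

/* Sanity check a 64-bit kernel could run on its 32-bit compat layout. */
static void sketch_check_layout(void)
{
    assert(sizeof(struct sketch_packed) == 12);
    assert(offsetof(struct sketch_packed, namelen) == 8);
}
```

A size mismatch of this kind matters whenever the structure size is part of the user/kernel contract, which is why an emulation layer needs the 32-bit layout reproduced exactly.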
-
Andrew Morton authored
From: Mark Haverkamp <markh@osdl.org> About three weeks ago markw at osdl posted a mail about a panic that he was seeing: http://marc.theaimsgroup.com/?l=linux-kernel&m=106737176716474&w=2 I believe what is happening, is that the dm __clone_and_map function is generating bio structures with the bi_idx field non-zero. When __blk_queue_bounce creates a new bio with bounce pages, it sets the bi_idx field to 0 rather than the bi_idx of the original. This causes trouble since bv_page pointers will be dereferenced later that are zero. The following uses the original bio structure's bi_idx in the new bio structure and in copy_to_high_bio_irq and bounce_end_io. This has cleared up the panic when using the volume. (acked by Joe Thornber)
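The idea behind the fix can be sketched with a simplified model. The structures below are hypothetical stand-ins that only borrow the roles of bi_idx and bv_page from the description; this is not the block-layer code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_VECS 8

struct sketch_vec {
    void *page;                 /* plays the role of bv_page */
};

struct sketch_bio {
    unsigned short idx;         /* first valid entry, like bi_idx */
    unsigned short vcnt;        /* one past the last valid entry */
    struct sketch_vec vecs[SKETCH_MAX_VECS];
};

/*
 * Build a bounce copy of `orig`. Only entries from orig->idx onwards
 * are populated, so the copy must start at the same index; starting at
 * 0 would make later code dereference NULL page pointers.
 */
static void sketch_bounce(const struct sketch_bio *orig,
                          struct sketch_bio *clone, void *bounce_page)
{
    unsigned short i;

    assert(orig->idx <= orig->vcnt && orig->vcnt <= SKETCH_MAX_VECS);
    for (i = 0; i < SKETCH_MAX_VECS; i++)
        clone->vecs[i].page = NULL;
    for (i = orig->idx; i < orig->vcnt; i++)
        clone->vecs[i].page = bounce_page;
    clone->vcnt = orig->vcnt;
    clone->idx = orig->idx;     /* the fix: inherit, do not reset to 0 */
}

/* First page a consumer of the bio would touch. */
static void *sketch_first_page(const struct sketch_bio *bio)
{
    return bio->vecs[bio->idx].page;
}
```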
-
Andrew Morton authored
From: Neil Brown <neilb@cse.unsw.edu.au> Change ext3 to run bd_claim() against external journal devices. It is significant only for those who have ext3 journals on a separate device, and gets exclusive access to that device.
-
Andrew Morton authored
From: Arnaldo Carvalho de Melo <acme@conectiva.com.br> pagemap.h, do not include thyself.
-
Andrew Morton authored
Remove pointless lock_kernel(), replace with the standard-but-still-odd i_sem-based lseek locking.
-
Andrew Morton authored
proc_kill_inodes() walks the s_files list, playing with ->f_dentry. But there is a window in which __fput() will leave a file on that list with a null f_dentry and f_vfsmnt. I'm not sure it was ever confirmed that this fixed the reported oops, but it seems much better to set those fields to null _after_ removing the filp from the list. (Actually, there's no need to null those pointers out at all. But whatever; it caught a bug).
-
Andrew Morton authored
From: William Lee Irwin III <wli@holomorphy.com> Our accounting of minor faults versus major faults is currently quite wrong. To fix it up we need to propagate the actual fault type back to the higher-level code. Repurpose the currently-unused third arg to ->nopage for this.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> The patch below removes the CLONE_FILES flag from the kernel_thread() call which starts init. This is to prevent other kernel threads from sharing file descriptors opened by init (try 'lsof /dev/initctl' on a 2.6 system :-). The reason this patch is being proposed is so that usermode helper apps launched via kernel threads (e.g. modprobe, hotplug) do not then inherit any such file descriptors. This is not a problem in itself so far (other than being messy), but it is a problem for SELinux, which will otherwise need to grant access to /dev/initctl by modprobe and hotplug, a somewhat undesirable scenario. As far as I can tell, there is no reason why init needs to be spawned with CLONE_FILES. Please let me know if there are any objections to the change, which I would like to propose for 2.6.0+ as a cleanup.
-
Andrew Morton authored
From: Aniket Malatpure <aniket@sgi.com> Adds support for the IOC4 IDE part.
-
Andrew Morton authored
From: Paul Jackson <pj@sgi.com> This patch is a followup to one from Bill Irwin. On Nov 17, he had consolidated the half-dozen chunks of code that displayed cpumasks in /proc/irq/prof_cpu_mask and /proc/irq/<pid>/smp_affinity into a single routine, which he called format_cpumask(). I believe that Andrew Morton has accepted Bill's patch into his 2.6.0-test10-mm1 patch set as the "format_cpumask" patch. I hope that the following patch will replace Bill's patch. I look forward to Bill's feedback on this patch. The following patch carries Bill's work further: 1) It also consolidates the input side (write syscalls). 2) It adapts a new format, same on input and output. 3) The core routines work for any multi-word bitmask, not just cpumasks. 4) The core routines avoid overrunning their output buffers. Note esp. for David Mosberger: The small patch I sent you and the linux-ia64 list yesterday entitled: "check user access ok writing /proc/irq/<pid>/smp_affinity" for arch ia64 only is _separate_ from the following patch. Neither presumes the other. However, they do collide on one line. Last one in is a Monkey's Uncle and will need an updated patch from me (or otherwise need to resolve the one obvious collision). Details of the following patch: Both the display and input of cpumasks on 9 arch's are consolidated into a single pair of routines, which use the same format for input and output, as recommended by Tony Luck. The two common routines work on any multi-word bitmask (array of unsigned longs). A pair of trivial inline wrappers cpumask_snprintf() and cpumask_parse() hide this generality for the common case of cpumask input and output. My real motivation for consolidating this code will become visible later - when I seek to add a nodemask_t that resembles cpumask_t (just a different length). 
These common underlying routines will be used there as well, following up on a suggestion of Christoph Hellwig that I investigate implementing nodemask_t as an ADT sharing infrastructure with cpumask_t. However, I believe that this patch stands on its own merit, consolidating a couple hundred lines of duplicated code, and making the cpumask display format usable on very large systems.

There are two exceptions to the consolidation - the alpha and sparc64 arch's manipulate bare unsigned longs, not cpumask_t's, on input (write syscall), and do stuff that was more funky than I could make sense of. So the input side of these two arch's was left as-is. I'd welcome someone with access to either of these systems to provide additional patches.

The new format consists of multiple 32 bit words, separated by commas, displayed and input in hex. The following comment from this patch describes this format further:

     * The ascii representation of multi-word bit masks displays each
     * 32bit word in hex (not zero filled), and for masks longer than
     * one word, uses a comma separator between words. Words are
     * displayed in big-endian order most significant first. And hex
     * digits within a word are also in big-endian order, of course.
     *
     * Examples:
     *   A mask with just bit 0 set displays as "1".
     *   A mask with just bit 127 set displays as "80000000,0,0,0".
     *   A mask with just bit 64 set displays as "1,0,0".
     *   A mask with bits 0, 1, 2, 4, 8, 16, 32 and 64 set displays
     *     as "1,1,10117". The first "1" is for bit 64, the second
     *     for bit 32, the third for bit 16, and so forth, to the
     *     "7", which is for bits 2, 1 and 0.
     *   A mask with bits 32 through 39 set displays as "ff,0".

The essential reason for adding the comma breaks was to make the long masks from our (SGI's) big 512 CPU systems parsable by humans. An unbroken string of 128 hex digits is pretty difficult to read. For those who are compiling systems with CONFIG_NR_CPUS of 32 or less, there should be no visible change in format.
There are of course a thousand possible output formats that meet similar criteria. If someone wants to lobby for and seek consensus behind another such format, that's fine. Now that the format is consolidated into a single pair of routines, it should be easy to adapt whatever we choose. Internally, the display routine uses snprintf to track the remaining space in its output buffer, to avoid the risk of overrunning it. A new file, lib/mask.c, is added to the lib directory, to hold the two common routines. I anticipate adding a few more common routines for generic support of multi-word bit masks to lib/mask.c, in subsequent patches that will add a nodemask_t type as an ADT sharing implementation with cpumask_t.
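The display format described in this message can be sketched in user-space C. This is an illustration only, not the code in lib/mask.c: it works on an array of 32-bit words rather than unsigned longs, and the `sketch_*` names are hypothetical. Like the routine described, it uses snprintf to track the remaining buffer space.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format `nwords` 32-bit words (words[0] is least significant) as hex,
 * most significant word first, comma separated, not zero filled.
 * Returns the number of characters written, or -1 if `buf` is too small.
 */
static int sketch_mask_snprintf(char *buf, size_t buflen,
                                const uint32_t *words, size_t nwords)
{
    size_t used = 0;
    size_t i;

    assert(buf != NULL && words != NULL && nwords > 0);
    for (i = nwords; i-- > 0; ) {
        /* No separator before the first (most significant) word. */
        int n = snprintf(buf + used, buflen - used, "%s%x",
                         i + 1 == nwords ? "" : ",",
                         (unsigned int)words[i]);
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;          /* would overrun the output buffer */
        used += (size_t)n;
    }
    return (int)used;
}

/* Convenience helper: format a mask and compare with the expected text. */
static int sketch_mask_displays_as(const uint32_t *words, size_t nwords,
                                   const char *expected)
{
    char buf[128];

    return sketch_mask_snprintf(buf, sizeof buf, words, nwords) >= 0 &&
           strcmp(buf, expected) == 0;
}
```

Feeding it the masks from the quoted comment reproduces the documented strings, for example the words {0x10117, 1, 1} give "1,1,10117".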
-
Andrew Morton authored
From: Paul Jackson <pj@sgi.com> Push the cpumask implementation from linux/cpumask.h into asm/cpumask.h, so that ia64 can do special things without breaking sparc64. 1) Each arch has its own include/asm-<arch>/cpumask.h file 2) That arch-specific header file can include <asm-generic/cpumask.h>, if it wants to make use of the generic cpumask implementation. 3) Using code should continue to include linux/cpumask.h, which in turn includes asm/cpumask.h. Some common implementation independent cpumask related items, such as the cpu_online_map, are declared directly in linux/cpumask.h.
-
Andrew Morton authored
From: Will Dyson <will_dyson@pobox.com> Add documentation and comments to lib/parser.c and include/linux/parser.h
-
Andrew Morton authored
From: Alan Cox <alan@redhat.com> Capability elevation bug in 2.6.0 IDE. Long fixed in 2.4.x, trivial to cure
-
Andrew Morton authored
From: Alan Cox <alan@redhat.com> IDE core code had the mmio==2 (ioremap) mode supported but two small changes had been missed for ide-dma.c. Without this fix mmio IDE controllers bomb if you have plenty of memory as it uses request_mem_region on an ioremap return.
-
Andrew Morton authored
From: Peter Chubb <peterc@gelato.unsw.edu.au> If you try to disable IDE DMA from Kconfig, you'll end up with an undefined symbol, ide_hwif_setup_dma(). The attached rather ugly patch fixes the problem by defining a dummy function.
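The dummy-function approach usually looks something like the sketch below. The `SKETCH_CONFIG_IDEDMA` macro, the `sketch_hwif` structure and the function names are hypothetical; the real ide_hwif_setup_dma() operates on kernel types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the interface structure. */
struct sketch_hwif {
    int dma_ready;
};

#ifdef SKETCH_CONFIG_IDEDMA
/* DMA support built in: do the real setup. */
static void sketch_hwif_setup_dma(struct sketch_hwif *hwif)
{
    assert(hwif != NULL);
    hwif->dma_ready = 1;
}
#else
/* DMA support compiled out: a no-op keeps callers linking. */
static inline void sketch_hwif_setup_dma(struct sketch_hwif *hwif)
{
    (void)hwif;
}
#endif

/* Caller that does not need its own #ifdef around the DMA setup. */
static void sketch_init_hwif(struct sketch_hwif *hwif)
{
    assert(hwif != NULL);
    hwif->dma_ready = 0;
    sketch_hwif_setup_dma(hwif);
}
```

The trade-off is that the stub hides the configuration difference from callers, at the cost of an extra conditional definition to maintain.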
-
Andrew Morton authored
From: Peter Chubb <peterc@gelato.unsw.edu.au> The PIIX5 IDE controller on I2000 IA64 boxen using the 460GX chipset will hang on startup if an ordinary harddrive is plugged into it (it seems to work for the LSI120 and the CDROM drives). This is because the 460GX chipset contains a PCI expansion bridge that works like the 450NX PXB, and has the same PCI ID (but a later revision). The PIIX driver, to work around interactions between PIIX4 and the 450NX PXB, tries to disable DMA. Unfortunately, the way it tries to disable DMA doesn't work, and the higher layers think that DMA is still on, and so time out waiting for DMA, and then hang on bootup. A simple workaround is to tighten the check for the buggy chipset, as in the attached patch. However, someone with more time (and who actually *understands* the IDE subsystem) needs to fix the real bug as well.
-
Andrew Morton authored
From: Bartlomiej Zolnierkiewicz <B.Zolnierkiewicz@elka.pw.edu.pl>, Stuart Hayes <stuart_hayes@dell.com>
- Check drive's write protect bit, try to return appropriate errors when attempting to write a write-protected tape.
- Moved "idetape_read_position" call in idetape_chrdev_open after the "wait_ready" call.
- Added IDETAPE_MEDIUM_PRESENT flag so driver would know not to rewind tape after ejecting it.
- Fixed bug with ide_abort_pipeline (it was deleting stages from tape->next_stage to end, instead of from new_last_stage->next (tape->next_stage was set to NULL by idetape_discard_read_pipeline before calling!).
- Made improvements to idetape_wait_ready.
- Added a few comments here and there.
- Made MTOFFL unlock tape drive door before attempting to eject.
- Added fixes to get Seagate STT3401A Travan working:
  - Handle drives that don't support 0-length reads/writes
  - increased timeout (retension takes ~10 minutes before irq is returned).
  - Fixed request mode page packet command byte 3.
Also remove code depending on NO_LONGER_REQUIRED to match 2.4.x (me).
-
Andrew Morton authored
From: Arun Sharma <arun.sharma@intel.com>
- Several instances where we were using pid_t instead of uid_t
- If the caller passed a NULL `oldact' pointer into sys_sigprocmask then don't try to write the old sigmask there.
-
Andrew Morton authored
From: gleb@nbase.co.il (Gleb Natapov) The fops->write() implementation is inconsistent across watchdog drivers. Some of them return the number of bytes written while others return 1. I think the correct implementation should always return the number of bytes written (we examine the whole buffer after all), otherwise "echo V > /dev/watchdog" doesn't work as expected (it doesn't stop the watchdog).
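A sketch of the behaviour being asked for. The `sketch_watchdog` structure and `expect_close` flag are hypothetical, and the sketch assumes the handler clears the flag at the start of every write, as many drivers of this kind do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver state: set when the magic character was seen. */
struct sketch_watchdog {
    int expect_close;
};

/*
 * Scan the whole buffer for the magic 'V' and report every byte as
 * consumed. Returning 1 instead would make the caller issue a second
 * write for the remaining bytes, which (under the assumption above)
 * clears the flag again.
 */
static long sketch_wdt_write(struct sketch_watchdog *wdt,
                             const char *buf, size_t len)
{
    size_t i;

    assert(wdt != NULL && (buf != NULL || len == 0));
    wdt->expect_close = 0;
    for (i = 0; i < len; i++)
        if (buf[i] == 'V')
            wdt->expect_close = 1;
    return (long)len;
}
```

Since echo sends "V" followed by a newline, a handler that only acknowledges one byte ends up seeing the newline in a separate write, and the magic character is forgotten.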
-
Andrew Morton authored
From: Olaf Hering <olh@suse.de> We need to update `offset' here so that the subsequent push_pad() (which uses `offset') will do the right thing.
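Why the running offset matters for padding can be sketched as below. The 4-byte alignment, the `sketch_out` structure and the helper names are assumptions made for illustration; only push_pad() and `offset` are named in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output state: how many bytes have been emitted so far. */
struct sketch_out {
    size_t offset;
};

/* Emit `len` payload bytes; the offset must be advanced here too. */
static void sketch_push_bytes(struct sketch_out *out, size_t len)
{
    assert(out != NULL);
    out->offset += len;         /* the kind of update the fix adds */
}

/* Pad up to the next 4-byte boundary; returns the number of pad bytes. */
static size_t sketch_push_pad(struct sketch_out *out)
{
    size_t pad = 0;

    assert(out != NULL);
    while (out->offset & 3) {
        out->offset++;
        pad++;
    }
    return pad;
}
```

If the payload step forgot to advance the offset, the padding step would compute its pad count from a stale position and the following record would start misaligned.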
-