- 05 Feb, 2007 12 commits
-
-
Patrick Caulfield authored
I was a little over-enthusiastic turning schedule() calls int cond_sched() when fixing the DLM for Andrew Morton. These four should really be calls to schedule() or the dlm can busy-wait. Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
S. Wendy Cheng authored
Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This is partially derrived from a patch written by Russell Cattelan. It fixes a bug where there is a race between readpages and truncate by ignoring readpages for stuffed files. This is ok because a stuffed file will never be more than one block (minus sizeof(struct gfs2_dinode)) in size and block size is always less than page size, so we do not lose anything efficiency-wise by not doing readahead for stuffed files. They will have already been "read ahead" by the action of reading the inode in, in the first place. This is the remaining part of the fix for Red Hat bugzilla #218966 which had not yet made it upstream. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Russell Cattelan <cattelan@redhat.com>
-
Steven Whitehouse authored
This patch fixes Red Hat bugzilla #212627 in which a deadlock occurs due to trying to take the i_mutex while holding a glock. The correct locking order is defined as i_mutex -> glock in all cases. I've left dealing with allocating writes. I know that we need to do that, but for now this should do the trick. We don't need to take the i_mutex on write, because the VFS has already taken it for us. On read we don't need it since the glock is enough protection. The reason that I've made some of the checks into a separate function is that we'll need to do the checks again in the allocating write case eventually, so this is partly in preparation for this. Likewise the return value test of != 1 might look a bit odd and thats because we'll need a third return value in case of requiring an allocation. I've made the change to deferred mode on the glock to ensure flushing read caches on other nodes. I notice that (using blktrace to look at whats going on) we appear to do a better job of large I/Os than ext3 after this patch (in terms of not splitting up the I/Os). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Wendy Cheng <wcheng@redhat.com>
-
Adrian Bunk authored
Remove the following unused functions: - lowcomms_send_message() - lowcomms_max_buffer_size() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Patrick Caulfield <pcaulfie@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
When the dlm fakes an unlock/cancel reply from a failed node using a stub message struct, it wasn't setting the flags in the stub message. So, in the process of receiving the fake message the lkb flags would be updated and cleared from the zero flags in the message. The problem observed in tests was the loss of the USER flag which caused the dlm to think a user lock was a kernel lock and subsequently fail an assertion checking the validity of the ast/callback field. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
LVB's are not sent as part of new requests, but the code receiving the request was copying data into the lvb anyway. The space in the message where it mistakenly thought the lvb lived actually contained the resource name, so it wound up incorrectly copying this name data into the lvb. Fix is to just create the lvb, not copy junk into it. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
The send_args() function is used to copy parameters into a message for a number different message types. Only some of those types are set up beforehand (in create_message) to include space for sending lvb data. send_args was wrongly copying the lvb for all message types as long as the lock had an lvb. This means that the lvb data was being written past the end of the message into unknown space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
Check if we receive a message from another lockspace member running a version of the dlm with an incompatible inter-node message protocol. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
A reply to a recovery message will often be received after the relevant recovery sequence has aborted and the next recovery sequence has begun. We need to ignore replies to these old messages from the previous recovery. There's already a way to do this for synchronous recovery requests using the rc_id number, but not for async. Each recovery sequence already has a locally unique sequence number associated with it. This patch adds a field to the rcom (recovery message) structure where this recovery sequence number can be placed, rc_seq. When a node sends a reply to a recovery request, it copies the rc_seq number it received into rc_seq_reply. When the first node receives the reply to its recovery message, it will check whether rc_seq_reply matches the current recovery sequence number, ls_recover_seq, and if not then it ignores the old reply. An old, inadequate approach to filtering out old replies (checking if the current stage of recovery has moved back to the start) has been removed from two spots. The protocol version number is changed to reflect the different rcom structures. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
There's a chance the new master of resource hasn't learned it's the new master before another node sends it a lock during recovery. The node sending the lock needs to resend if this happens. - A sends a master lookup for resource R to C - B sends a master lookup for resource R to C - C receives A's lookup, assigns A to be master of R and sends a reply back to A - C receives B's lookup and sends a reply back to B saying that A is the master - B receives lookup reply from C and sends its lock for R to A - A receives lock from B, doesn't think it's the master of R and sends an error back to B - A receives lookup reply from C and becomes master of R - B gets error back from A and resends its lock back to A (this resending is what this patch does) - A receives lock from B, it now sees it's the master of R and takes the lock Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
David Teigland authored
If an fs has already been shut down, a lockfs callback should do nothing. An fs that's been shut down can't acquire locks or do anything with respect to the cluster. Also, remove FIXME comment in withdraw function. The missing bits of the withdraw procedure are now all done by user space. Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
-
- 04 Feb, 2007 3 commits
-
-
Linus Torvalds authored
-
Frédéric Riss authored
When calling into the EFI firmware, the parameters need to be passed on the stack. The recent change to use -mregparm=3 breaks x86 EFI support. This patch is needed to allow the new Intel-based Macs to suspend to ram (efi.get_time is called during the suspend phase). Signed-off-by: Frederic Riss <frederic.riss@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Al Viro authored
That code doesn't do what its author apparently thought it would do... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Feb, 2007 11 commits
-
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect [SCSI] qla4xxx: bug fixes [SCSI] Fix scsi_add_device() for async scanning
-
Jeff Garzik authored
x86-64 is missing these: Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
John Keller authored
The SN Altix platform does not conform to the IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi() to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and return. Due to an oversight, this code was not added previously when similar code was added to acpi_register_gsi(). http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2Signed-off-by: John Keller <jpk@sgi.com> Acked-by: Len Brown <lenb@kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Frysinger authored
We went and named them __NR_sys_foo instead of __NR_foo. It may be too late to change this, but we can at least add the proper names now. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Korsgaard authored
smc911x_phy_configure's error handling unconditionally unlocks the spinlock even if it wasn't locked. Patch fixes it. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Cc: Jeff Garzik <jeff@garzik.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Magnus Damm authored
This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this by skipping migration of already disabled irqs. This is most likely a problem on other ia64 platforms as well, but I've only been able to reproduce it on one machine so far. The full story is that handle_bad_irq() gets invoked before starting the new kernel without this patch. This seems to happen when fixup_irqs() calls generic_handle_irq() on already migrated (and disabled) irqs. So by avoiding migration of disabled irqs we stay away of handle_bad_irq(). The code has been tested on three different ia64 machines, all with good results. It is possible to trigger the same bug by offlining a processor using echo 0 > /sys/devices/system/cpu/cpuX/online. More detailed information is available in the following mail thread: http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Zou, Nanhai <nanhai.zou@intel.com> Acked-by: Jay Lan <jlan@sgi.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Ken Chen authored
An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Adrian Bunk authored
Fix this by letting NF_CONNTRACK_H323 depend on (IPV6 || IPV6=n). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Patrick McHardy authored
CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event': net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark' make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nagendra Singh Tomar authored
sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the the allow_restart attribute of the newly created scsi device. This is resulting in a crash as the show function for allow_restart (i.e sd_show_allow_restart) returns the attribute value by reading the sdkp->device->allow_restart variable. As the sdkp->device is not initialized before calling the user mode hotplug helper, this results in a crash. The patch below solves it by calling class_device_add() only after the necessary fields in the scsi_disk structure are initialized properly. Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
-
- 02 Feb, 2007 14 commits
-
-
Linus Torvalds authored
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: Initialize nbytes for internal sg commands libata: Fix ata_busy_wait() kernel docs pata_via: Correct missing comments pata_atiixp: propogate cable detection hack from drivers/ide to the new driver ahci/pata_jmicron: fix JMicron quirk
-
Brian King authored
Some LLDDs, like ipr, use nbytes and pad_len to determine the total data transfer length of a command. Make sure nbytes gets initialized for internally generated commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
> Looks like you should use ata_busy_wait() here, rather than reproducing > the same code again. It waits in 10uS chunks while 1uS chunks were used in the workaround. Could indeed do that once I know the fix is right. While I'm at it the ata_busy_wait kerneldoc is borked so here's a fix Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
The 8237S was added to the chipsets but not to the comments. Fix this Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Alan authored
Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Tejun Heo authored
For all JMicrons except for 361 and 368, AHCI mode enable bits in the Control(1) should be set. This used to be done in both ahci and pata_jmicron but while moving programming to PCI quirk, it was removed from ahci part while still left in pata_jmicron. The implemented JMicron PCI quirk was incorrect in that it didn't program AHCI mode enable bits. If pata_jmicron is loaded first and programs those bits, the ahci ports work; otherwise, ahci device detection fails miserably. This patch makes JMicron PCI quirk clear SATA IDE mode bits and set AHCI mode bits and remove the respective part from pata_jmicron. Tested on JMB361, 363 and 368. Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Linus Torvalds authored
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: spidernet : fix memory leak in spider_net_stop e100: fix napi ifdefs removing needed code netxen patches
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/bnx2-2.6: [BNX2]: PHY workaround for 5709 A0.
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [NET_SCHED]: act_ipt: fix regression in ipt action
-
Linus Torvalds authored
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC32]: Fix over-optimization by GCC near ip_fast_csum.
-
Evgeniy Dushistov authored
Mark ufs file system as maintainable, and add me as maintainer, to help people find appropriate person to assign bugs. Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
This reverts commit e4f0ae0e. It's not wrong, but it's not right either, and everybody seems to agree that the right fix is probably to do the ccr3 write after the ccr4 one (and that we also should clean it up a bit). And after that we need to really validate that all the bits that we write to ccr4 actually do work. The old 2.6.19 code was insane, and basically didn't change ccr4 at all (even though it certainly looks like it was the *intent* to do so). So let's revert the change that may fix things, just because it's not what was actually ever tested when the code was written, even if it _was_ the intent. There's a discussion on http://lkml.org/lkml/2007/1/9/63 that was started by the patch that now gets reverted, and that discussion may well contain the proper long-term fix. Suggested-by: Adrian Bunk <bunk@stusta.de> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jens Osterkamp authored
We forget to call spider_net_free_rx_chain_contents which does the actual dev_kfree_skb. New skbs are allocated from skbuff_head_cache on each "ifconfig up" letting the cache grow infinitely. This patch fixes it. Signed-off-by: Jens Osterkamp <jens@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-
Auke Kok authored
e100: fix napi ifdefs removing needed code From: Auke Kok <auke-jan.h.kok@intel.com> The e100 driver is NAPI mode only. We need to netif_poll_disable during suspend and shutdown. The non-NAPI driver code was removed and is only avaiable in the out-of-tree e100 kernel driver. Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
-