- 03 Feb, 2010 1 commit
-
-
Mikael Pettersson authored
This fixes a futex key reference count bug in futex_lock_pi(), where a key's reference count is incremented twice but decremented only once, causing the backing object to not be released. If the futex is created in a temporary file in an ext3 file system, this bug causes the file's inode to become an "undead" orphan, which causes an oops from a BUG_ON() in ext3_put_super() when the file system is unmounted. glibc's test suite is known to trigger this, see <http://bugzilla.kernel.org/show_bug.cgi?id=14256>. The bug is a regression from 2.6.28-git3, namely Peter Zijlstra's 38d47c1b "[PATCH] futex: rely on get_user_pages() for shared futexes". That commit made get_futex_key() also increment the reference count of the futex key, and updated its callers to decrement the key's reference count before returning. Unfortunately the normal exit path in futex_lock_pi() wasn't corrected: the reference count is incremented by get_futex_key() and queue_lock(), but the normal exit path only decrements once, via unqueue_me_pi(). The fix is to put_futex_key() after unqueue_me_pi(), since 2.6.31 this is easily done by 'goto out_put_key' rather than 'goto out'. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Darren Hart <dvhltc@us.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org>
-
- 01 Feb, 2010 1 commit
-
-
Jason Wessel authored
When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets the time from hardware such as the TSC on x86. In this configuration kgdb will report a softlock warning message on resuming or detaching from a debug session. Sequence of events in the problem case: 1) "cpu sched clock" and "hardware time" are at 100 sec prior to a call to kgdb_handle_exception() 2) Debugger waits in kgdb_handle_exception() for 80 sec and on exit the following is called ... touch_softlockup_watchdog() --> __raw_get_cpu_var(touch_timestamp) = 0; 3) "cpu sched clock" = 100s (it was not updated, because the interrupt was disabled in kgdb) but the "hardware time" = 180 sec 4) The first timer interrupt after resuming from kgdb_handle_exception updates the watchdog from the "cpu sched clock" update_process_times() { ... run_local_timers() --> softlockup_tick() --> check (touch_timestamp == 0) (it is "YES" here, we have set "touch_timestamp = 0" at kgdb) --> __touch_softlockup_watchdog() ***(A)--> reset "touch_timestamp" to "get_timestamp()" (Here, the "touch_timestamp" will still be set to 100s.) ... scheduler_tick() ***(B)--> sched_clock_tick() (update "cpu sched clock" to "hardware time" = 180s) ... } 5) The Second timer interrupt handler appears to have a large jump and trips the softlockup warning. update_process_times() { ... run_local_timers() --> softlockup_tick() --> "cpu sched clock" - "touch_timestamp" = 180s-100s > 60s --> printk "soft lockup error messages" ... } note: ***(A) reset "touch_timestamp" to "get_timestamp(this_cpu)" Why is "touch_timestamp" 100 sec, instead of 180 sec? When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of get_timestamp() is: get_timestamp(this_cpu) -->cpu_clock(this_cpu) -->sched_clock_cpu(this_cpu) -->__update_sched_clock(sched_clock_data, now) The __update_sched_clock() function uses the GTOD tick value to create a window to normalize the "now" values. So if "now" value is too big for sched_clock_data, it will be ignored. The fix is to invoke sched_clock_tick() to update "cpu sched clock" in order to recover from this state. This is done by introducing the function touch_softlockup_watchdog_sync(). This allows kgdb to request that the sched clock is updated when the watchdog thread runs the first time after a resume from kgdb. [yong.zhang0@gmail.com: Use per cpu instead of an array] Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Dongdong Deng <Dongdong.Deng@windriver.com> Cc: kgdb-bugreport@lists.sourceforge.net Cc: peterz@infradead.org LKML-Reference: <1264631124-4837-2-git-send-email-jason.wessel@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
- 27 Jan, 2010 2 commits
-
-
Oleg Nesterov authored
Lockdep has found the real bug, but the output doesn't look right to me: > ========================================================= > [ INFO: possible irq lock inversion dependency detected ] > 2.6.33-rc5 #77 > --------------------------------------------------------- > emacs/1609 just changed the state of lock: > (&(&tty->ctrl_lock)->rlock){+.....}, at: [<ffffffff8127c648>] tty_fasync+0xe8/0x190 > but this lock took another, HARDIRQ-unsafe lock in the past: > (&(&sighand->siglock)->rlock){-.....} "HARDIRQ-unsafe" and "this lock took another" looks wrong, afaics. > ... key at: [<ffffffff81c054a4>] __key.46539+0x0/0x8 > ... acquired at: > [<ffffffff81089af6>] __lock_acquire+0x1056/0x15a0 > [<ffffffff8108a0df>] lock_acquire+0x9f/0x120 > [<ffffffff81423012>] _raw_spin_lock_irqsave+0x52/0x90 > [<ffffffff8127c1be>] __proc_set_tty+0x3e/0x150 > [<ffffffff8127e01d>] tty_open+0x51d/0x5e0 The stack-trace shows that this lock (ctrl_lock) was taken under ->siglock (which is hopefully irq-safe). This is a clear typo in check_usage_backwards() where we tell the print a fancy routine we're forwards. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100126181641.GA10460@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
-
Greg Kroah-Hartman authored
Commit 70362511 exposed that f_modown() should call write_lock_irqsave instead of just write_lock_irq so that because a caller could have a spinlock held and it would not be good to renable interrupts. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 26 Jan, 2010 10 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag ext4: Fix quota accounting error with fallocate ext4: Handle -EDQUOT error on write
-
git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: [WATCHDOG] sbc_fitpc2_wdt: fix I/O space access technique. [WATCHDOG] ixp2000: Fix build failure caused by missing include
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: fix a memory-leak in wm8903 ALSA: hda - add possibility to choose speakers configuration for 4930g ALSA: hda - Fix HP T5735 automute ALSA: hda - Turn on EAPD only if available for Realtek codecs ALSA: hda - Fix parsing pin node 0x21 on ALC259
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init() KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks KVM: S390: fix potential array overrun in intercept handling KVM: fix spurious interrupt with irqfd eventfd - allow atomic read and waitqueue remove KVM: MMU: bail out pagewalk on kvm_read_guest error KVM: properly check max PIC pin in irq route setup KVM: only allow one gsi per fd KVM: x86: Fix host_mapping_level() KVM: powerpc: Show timing option only on embedded KVM: Fix race between APIC TMR and IRR
-
git://git.infradead.org/ubi-2.6Linus Torvalds authored
* 'linux-next' of git://git.infradead.org/ubi-2.6: UBI: fix memory leak in update path UBI: add more checks to chdev open UBI: initialise update marker
-
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/stagingLinus Torvalds authored
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging: hwmon: (fschmd) Fix a memleak on multiple opens of /dev/watchdog hwmon: (asus_atk0110) Do not fail if MBIF is missing hwmon: (amc6821) Double unlock bug hwmon: (smsc47m1) Fix section mismatch
-
git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6Linus Torvalds authored
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (95 commits) drm/radeon/kms: preface warning printk with driver name drm/radeon/kms: drop unnecessary printks. drm: fix regression in fb blank handling drm/radeon/kms: make hibernate work on IGPs drm/vmwgfx: Optimize memory footprint for DMA buffers. drm/ttm: Allow system memory as a busy placement. drm/ttm: Fix race condition in ttm_bo_delayed_delete (v3, final) drm/nv50: prevent switching off SOR when in use for DVI-over-DP drm/nv50: fail auxch transaction if reply count not what we expect drm/nouveau: fix failure path if userspace specifies no valid memtypes drm/nouveau: report LVDS as disconnected if lid closed drm/radeon/kms: fix legacy get_engine/memory clock drm/radeon/kms/atom: atom parser fixes drm/radeon/kms: clean up atombios pll code drm/radeon/kms: clean up pll struct drm/radeon/kms/atom: fix crtc lock ordering drm/radeon: r6xx/r7xx possible security issue, system ram access drm/radeon/kms: r600/r700 don't test ib if ib initialization fails drm/radeon/kms: Forbid creation of framebuffer with no valid GEM object drm/radeon/kms: r600 handle irq vector ring overflow ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix IRQ ->set_affinity() methods. sparc: cpumask_of_node() should handle -1 as a node sparc64: Update defconfig. sparc: Add missing SW perf fault events. sparc64: Fully support both performance counters. sparc64: Add perf callchain support. sparc: convert to arch_gettimeoffset() sparc: leds_resource.end assigned to itself in clock_board_probe() sparc32: Fix page_to_phys(). sparc: Simplify param.h by simply including <asm-generic/param.h> sparc32: Update defconfig. SPARC: use helpers for rlimits sparc: copy_from_user() should not return -EFAULT
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits) virtio_net: Make delayed refill more reliable sfc: Use fixed-size buffers for MCDI NVRAM requests sfc: Add workspace for GMAC bug workaround to MCDI MAC_STATS buffer tcp_probe: avoid modulus operation and wrap fix qlge: Only free resources if they were allocated netns xfrm: deal with dst entries in netns sky2: revert config space change vlan: fix vlan_skb_recv() netns xfrm: fix "ip xfrm state|policy count" misreport sky2: Enable/disable WOL per hardware device net: Fix IPv6 GSO type checks in Intel ethernet drivers igb/igbvf: cleanup exception handling in tx_map_adv MAINTAINERS: Add Intel igbvf maintainer e1000/e1000e: don't use small hardware rx buffers fmvj18x_cs: add new id (Panasonic lan & modem card) be2net: swap only first 2 fields of mcc_wrb Please add support for Microsoft MN-120 PCMCIA network card be2net: fix bug in rx page posting wimax/i2400m: Add support for more i6x50 SKUs e1000e: enhance frame fragment detection ...
-
Linus Torvalds authored
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (25 commits) OMAP2/3: DMTIMER: Clear pending interrupts when stopping a timer PM debug: Fix warning when no CONFIG_DEBUG_FS OMAP3: PM: DSS PM_WKEN to refill DMA OMAP: timekeeping: time should not stop during suspend OMAP3: PM: Force write last pad config register into save area OMAP: omap3_pm_get_suspend_state() error ignored in pwrdm_suspend_get() OMAP3: PM: Enable wake-up from McBSP2, 3 and 4 modules OMAP3: PM debug: fix build error when !CONFIG_DEBUG_FS OMAP3: PM: Removing redundant and potentially dangerous PRCM configration OMAP3: Fixed ARM aux ctrl register save/restore OMAP3: CPUidle: Fixed timer resolution OMAP3: PM: Remove duplicate code blocks OMAP3: PM: Disable interrupt controller AUTOIDLE before WFI OMAP3: PM: Enable system control module autoidle OMAP3: PM: Ack pending interrupts before entering suspend omap: Enable GPMC clock in gpmc_init OMAP1 clock: fix for "BUG: spinlock lockup on CPU#0" OMAP4: clocks: Fix the clksel_rate struct DPLL divs OMAP4: PRCM: Fix the base address for CHIRONSS reg defines OMAP: dma_chan[lch_head].flag & OMAP_DMA_ACTIVE tested twice in omap_dma_unlink_lch() ...
-
- 25 Jan, 2010 26 commits
-
-
Herbert Xu authored
I have seen RX stalls on a machine that experienced a suspected OOM. After the stall, the RX buffer is empty on the guest side and there are exactly 16 entries available on the host side. As the number of entries is less than that required by a maximal skb, the host cannot proceed. The guest did not have a refill job scheduled. My diagnosis is that an OOM had occured, with the delayed refill job scheduled. The job was able to allocate at least one skb, but not enough to overcome the minimum required by the host to proceed. As the refill job would only reschedule itself if it failed completely to allocate any skbs, this would lead to an RX stall. The following patch removes this stall possibility by always rescheduling the refill job until the ring is totally refilled. Testing has shown that the RX stall no longer occurs whereas previously it would occur within a day. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ben Hutchings authored
The low-level MCDI code always uses 32-bit MMIO operations, and callers must pad input and output buffers to multiples of 4 bytes. The MCDI NVRAM functions are not doing this. Also, their buffers are declared as variable-length arrays with no explicit maximum length. Switch to a fixed buffer size based on the chunk size used by the MTD driver (which is a multiple of 4). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guido Barzini authored
Due to a hardware bug in the SFC9000 family, the firmware must transfer raw GMAC statistics to host memory before aggregating them into the cooked (speed-independent) MAC statistics. Extend the stats buffer to support this. The length of the buffer is explicit in the MAC_STATS command, so this change is backward-compatible on both sides. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
By rounding up the buffer size to power of 2, several expensive modulus operations can be avoided. This patch also solves a bug where the gap need when ring gets full was not being accounted for. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Breno Leitao authored
Currently qlge tries to release regions even if they were not allocated. This causes messages like the following in the kernel log Trying to free nonexistent resource <00000000006af400-00000000006af4ff> Trying to free nonexistent resource <00003c04ff9f4000-00003c04ff9f7fff> Trying to free nonexistent resource <00003c04ffc00000-00003c04ffcfffff> This patch fixes the goto logic in order to not release the resources if they were not allocated. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Denis Turischev authored
The mdelay function was used between I/O access commands, that causes peak in CPU usage. Fix it by substitution mdelay to msleep. Expand usage on fitPC2 compatible boards according to DMI identification. Signed-off-by: Denis Turischev <denis@compulab.co.il> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Takashi Iwai authored
-
Guennadi Liakhovetski authored
Remember to free the temporary register-cache. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: stable@kernel.org
-
Wei Yongjun authored
In function kvm_arch_vcpu_init(), if the memory malloc for vcpu->arch.mce_banks is fail, it does not free the memory of lapic date. This patch fixed it. Cc: stable@kernel.org Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Wei Yongjun authored
vcpu->arch.mce_banks is malloc in kvm_arch_vcpu_init(), but never free in any place, this may cause memory leak. So this patch fixed to free it in kvm_arch_vcpu_uninit(). Cc: stable@kernel.org Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Christian Borntraeger authored
kvm_handle_sie_intercept uses a jump table to get the intercept handler for a SIE intercept. Static code analysis revealed a potential problem: the intercept_funcs jump table was defined to contain (0x48 >> 2) entries, but we only checked for code > 0x48 which would cause an off-by-one array overflow if code == 0x48. Use the compiler and ARRAY_SIZE to automatically set the limits. Cc: stable@kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Michael S. Tsirkin authored
kvm didn't clear irqfd counter on deassign, as a result we could get a spurious interrupt when irqfd is assigned back. this leads to poor performance and, in theory, guest crash. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Davide Libenzi authored
KVM needs a wait to atomically remove themselves from the eventfd ->poll() wait queue head, in order to handle correctly their IRQfd deassign operation. This patch introduces such API, plus a way to read an eventfd from its context. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Marcelo Tosatti authored
Exit the guest pagetable walk loop if reading gpte failed. Otherwise its possible to enter an endless loop processing the previous present pte. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Marcelo Tosatti authored
Otherwise memory beyond irq_states[16] might be accessed. Noticed by Juan Quintela. Cc: stable@kernel.org Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Michael S. Tsirkin authored
Looks like repeatedly binding same fd to multiple gsi's with irqfd can use up a ton of kernel memory for irqfd structures. A simple fix is to allow each fd to only trigger one gsi: triggering a storm of interrupts in guest is likely useless anyway, and we can do it by binding a single gsi to many interrupts if we really want to. Cc: stable@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Acked-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Sheng Yang authored
When found a error hva, should not return PAGE_SIZE but the level... Also clean up the coding style of the following loop. Cc: stable@kernel.org Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Alexander Graf authored
Embedded PowerPC KVM has an exit timing implementation to track and evaluate how much time was spent in which exit path. For Book3S, we don't implement it. So let's not expose it as a config option either. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
-
Avi Kivity authored
When we queue an interrupt to the local apic, we set the IRR before the TMR. The vcpu can pick up the IRR and inject the interrupt before setting the TMR, and perhaps even EOI it, causing incorrect behaviour. The race is really insignificant since it can only occur on the first interrupt (usually following interrupts will not change TMR), but it's better closed than open. Fixed by reordering setting the TMR vs IRR. Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
Hans de Goede authored
When /dev/watchdog gets opened a second time we return -EBUSY, but we already have got a kref then, so we end up leaking our data struct. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: stable@kernel.org
-
Luca Tettamanti authored
MBIF (motherboard identification) is only used to print the name of the board, it's not essential for the driver; do not fail if it's missing. Based on Juan's patch. Signed-off-by: Luca Tettamanti <kronos.it@gmail.com> Acked-by: Juan RP <xtraeme@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
-
Dan Carpenter authored
The mutex gets unlocked after we goto EXIT. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
-
Jeff Mahoney authored
smsc47m1_restore is called from sm_smsc47m1_exit, which is an __exit function, so it can't be __init. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
-
Łukasz Wojniłowicz authored
Now one can choose speaker configuration in e.g. PulseAudio mixer Signed-off-by: Łukasz Wojniłowicz <lukasz.wojnilowicz@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
-
Alexey Dobriyan authored
GC is non-existent in netns, so after you hit GC threshold, no new dst entries will be created until someone triggers cleanup in init_net. Make xfrm4_dst_ops and xfrm6_dst_ops per-netns. This is not done in a generic way, because it woule waste (AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns. Reorder GC threshold initialization so it'd be done before registering XFRM policies. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
Obviously, this register had some other impact that is causing the regression. Either it is masking some other access or needs to be reset in some path. Either, way it is best to just revert the change for 2.6.33 This reverts commit 166a0fd4. Signed-off-by: David S. Miller <davem@davemloft.net>
-