Commits · 24993d53495d1f9b844f8eb3ebd1b9efd3521617 · Kirill Smelkov / linux

04 Mar, 2008 4 commits

KVM: make MMU_DEBUG compile again · 24993d53

Marcelo Tosatti authored Feb 14, 2008

the cr3 variable is now inside the vcpu->arch structure.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

24993d53

KVM: move alloc_apic_access_page() outside of non-preemptable region · 5e4a0b3c

Marcelo Tosatti authored Feb 14, 2008

alloc_apic_access_page() can sleep, while vmx_vcpu_setup is called
inside a non preemptable region. Move it after put_cpu().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

5e4a0b3c

KVM: SVM: fix Windows XP 64 bit installation crash · a2938c80

Joerg Roedel authored Feb 13, 2008

While installing Windows XP 64 bit wants to access the DEBUGCTL and the last
branch record (LBR) MSRs. Don't allowing this in KVM causes the installation to
crash. This patch allow the access to these MSRs and fixes the issue.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

a2938c80

KVM: remove the usage of the mmap_sem for the protection of the memory slots. · 72dc67a6

Izik Eidus authored Feb 10, 2008

This patch replaces the mmap_sem lock for the memory slots with a new
kvm private lock, it is needed beacuse untill now there were cases where
kvm accesses user memory while holding the mmap semaphore.
Signed-off-by: Izik Eidus <izike@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

72dc67a6

03 Mar, 2008 5 commits

KVM: emulate access to MSR_IA32_MCG_CTL · c7ac679c

Joerg Roedel authored Feb 11, 2008

Injecting an GP when accessing this MSR lets Windows crash when running some
stress test tools in KVM.  So this patch emulates access to this MSR.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

c7ac679c

KVM: Make the supported cpuid list a host property rather than a vm property · 674eea0f

Avi Kivity authored Feb 11, 2008

One of the use cases for the supported cpuid list is to create a "greatest
common denominator" of cpu capabilities in a server farm. As such, it is
useful to be able to get the list without creating a virtual machine first.

Since the code does not depend on the vm in any way, all that is needed is
to move it to the device ioctl handler. The capability identifier is also
changed so that binaries made against -rc1 will fail gracefully.
Signed-off-by: Avi Kivity <avi@qumranet.com>

674eea0f

KVM: Fix kvm_arch_vcpu_ioctl_set_sregs so that set_cr0 works properly · d7306163

Paul Knowles authored Feb 06, 2008

Whilst working on getting a VM to initialize in to IA32e mode I found
this issue. set_cr0 relies on comparing the old cr0 to the new one to
work correctly.  Move the assignment below so the compare can work.
Signed-off-by: Paul Knowles <paul@transitive.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

d7306163

KVM: SVM: set NM intercept when enabling CR0.TS in the guest · 6b390b63

Joerg Roedel authored Jan 29, 2008

Explicitly enable the NM intercept in svm_set_cr0 if we enable TS in the guest
copy of CR0 for lazy FPU switching. This fixes guest SMP with Linux under SVM.
Without that patch Linux deadlocks or panics right after trying to boot the
other CPUs.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

6b390b63

KVM: SVM: Fix lazy FPU switching · 334df50a

Joerg Roedel authored Jan 21, 2008

If the guest writes to cr0 and leaves the TS flag at 0 while vcpu->fpu_active
is also 0, the TS flag in the guest's cr0 gets lost. This leads to corrupt FPU
state an causes Windows Vista 64bit to crash very soon after boot.  This patch
fixes this bug.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Markus Rechberger <markus.rechberger@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>

334df50a

02 Mar, 2008 4 commits

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 · 038f2f72

Linus Torvalds authored Mar 02, 2008

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: fix crash in automatic module unloading
  firewire: potentially invalid pointers used in fw_card_bm_work
  firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device

038f2f72

firewire: fix crash in automatic module unloading · 855c603d

Stefan Richter authored Feb 27, 2008

"modprobe firewire-ohci; sleep .1; modprobe -r firewire-ohci" used to
result in crashes like this:

    BUG: unable to handle kernel paging request at ffffffff8807b455
    IP: [<ffffffff8807b455>]
    PGD 203067 PUD 207063 PMD 7c170067 PTE 0
    Oops: 0010 [1] PREEMPT SMP
    CPU 0
    Modules linked in: i915 drm cpufreq_ondemand acpi_cpufreq freq_table applesmc input_polldev led_class coretemp hwmon eeprom snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss button thermal processor sg snd_hda_intel snd_pcm snd_timer snd snd_page_alloc sky2 i2c_i801 rtc [last unloaded: crc_itu_t]
    Pid: 9, comm: events/0 Not tainted 2.6.25-rc2 #3
    RIP: 0010:[<ffffffff8807b455>]  [<ffffffff8807b455>]
    RSP: 0018:ffff81007dcdde88  EFLAGS: 00010246
    RAX: ffff81007dc95040 RBX: ffff81007dee5390 RCX: 0000000000005e13
    RDX: 0000000000008c8b RSI: 0000000000000001 RDI: ffff81007dee5388
    RBP: ffff81007dc5eb40 R08: 0000000000000002 R09: ffffffff8022d05c
    R10: ffffffff8023b34c R11: ffffffff8041a353 R12: ffff81007dee5388
    R13: ffffffff8807b455 R14: ffffffff80593bc0 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:ffffffff8055a000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: ffffffff8807b455 CR3: 0000000000201000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process events/0 (pid: 9, threadinfo ffff81007dcdc000, task ffff81007dc95040)
    Stack:  ffffffff8023b396 ffffffff88082524 0000000000000000 ffffffff8807d9ae
    ffff81007dc5eb40 ffff81007dc9dce0 ffff81007dc5eb40 ffff81007dc5eb80
    ffff81007dc9dce0 ffffffffffffffff ffffffff8023be87 0000000000000000
    Call Trace:
    [<ffffffff8023b396>] ? run_workqueue+0xdf/0x1df
    [<ffffffff8023be87>] ? worker_thread+0xd8/0xe3
    [<ffffffff8023e917>] ? autoremove_wake_function+0x0/0x2e
    [<ffffffff8023bdaf>] ? worker_thread+0x0/0xe3
    [<ffffffff8023e813>] ? kthread+0x47/0x74
    [<ffffffff804198e0>] ? trace_hardirqs_on_thunk+0x35/0x3a
    [<ffffffff8020c008>] ? child_rip+0xa/0x12
    [<ffffffff8020b6e3>] ? restore_args+0x0/0x3d
    [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
    [<ffffffff8023e68a>] ? kthreadd+0x14c/0x171
    [<ffffffff8023e7cc>] ? kthread+0x0/0x74
    [<ffffffff8020bffe>] ? child_rip+0x0/0x12

    Code:  Bad RIP value.
    RIP  [<ffffffff8807b455>]
    RSP <ffff81007dcdde88>
    CR2: ffffffff8807b455
    ---[ end trace c7366c6657fe5bed ]---

Note that this crash happened _after_ firewire-core was unloaded.  The
shared workqueue tried to run firewire-core's device initialization jobs
or similar jobs.

The fix makes sure that firewire-ohci and hence firewire-core is not
unloaded before all device shutdown jobs have been completed.  This is
determined by the count of device initializations minus device releases.

Also skip useless retries in the node initialization job if the node is
to be shut down.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>

855c603d

firewire: potentially invalid pointers used in fw_card_bm_work · 15803478

Stefan Richter authored Feb 24, 2008

The bus management workqueue job was in danger to dereference NULL
pointers.  Also, after having temporarily lifted card->lock, a few node
pointers and a device pointer may have become invalid.

Add NULL pointer checks and get the necessary references.  Also, move
card->local_node out of fw_card_bm_work's sight during shutdown of the
card.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>

15803478

firewire: fw-sbp2: better fix for NULL pointer dereference in scsi_remove_device · f8436158

Stefan Richter authored Feb 26, 2008

Patch "firewire: fw-sbp2: fix NULL pointer deref. in scsi_remove_device"
had the unintended effect that firewire-sbp2 could not be unloaded
anymore until all SBP-2 devices were unplugged.

We now fix the NULL pointer bug by reacquiring a reference to the sdev
instead of holding a reference to the sdev (and to the module) all the
time.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Jarod Wilson <jwilson@redhat.com>

f8436158

01 Mar, 2008 5 commits

[PATCH] drop EOE records from printk · 8d07a67c

Steve Grubb authored Feb 21, 2008

Hi,

While we are looking at the printk issue, I see that its printk'ing the EOE
(end of event) records which is really not something that we need in syslog.
Its really intended for the realtime audit event stream handled by the audit
daemon. So, lets avoid printk'ing that record type.
Signed-off-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

8d07a67c

[RFC] AUDIT: do not panic when printk loses messages · b29ee87e

Eric Paris authored Feb 21, 2008

On the latest kernels if one was to load about 15 rules, set the failure
state to panic, and then run service auditd stop the kernel will panic.
This is because auditd stops, then the script deletes all of the rules.
These deletions are sent as audit messages out of the printk kernel
interface which is already known to be lossy. These will overun the
default kernel rate limiting (10 really fast messages) and will call
audit_panic(). The same effect can happen if a slew of avc's come
through while auditd is stopped.

This can be fixed a number of ways but this patch fixes the problem by
just not panicing if auditd is not running. We know printk is lossy and
if the user chooses to set the failure mode to panic and tries to use
printk we can't make any promises no matter how hard we try, so why try?
At least in this way we continue to get lost message accounting and will
eventually know that things went bad.

The other change is to add a new call to audit_log_lost() if auditd
disappears. We already pulled the skb off the queue and couldn't send
it so that message is lost. At least this way we will account for the
last message and panic if the machine is configured to panic. This code
path should only be run if auditd dies for unforeseen reasons. If
auditd closes correctly audit_pid will get set to 0 and we won't walk
this code path.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

b29ee87e

[PATCH] Audit: Fix the format type for size_t variables · 422b03cf

Paul Moore authored Feb 27, 2008

Fix the following compiler warning by using "%zu" as defined in C99.

  CC      kernel/auditsc.o
  kernel/auditsc.c: In function 'audit_log_single_execve_arg':
  kernel/auditsc.c:1074: warning: format '%ld' expects type 'long int', but
  argument 4 has type 'size_t'
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

422b03cf

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev · d395991c

Linus Torvalds authored Feb 29, 2008

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] wrap kmap_atomic(KM_IRQ0) with local_irq_save/restore()
  sata_svw: Add support for HT1100 SATA controller

d395991c

[libata] wrap kmap_atomic(KM_IRQ0) with local_irq_save/restore() · b445c568

Jeff Garzik authored Feb 29, 2008

Interrupts must be disabled if using kmap_atomic(KM_IRQ0), but that was
not the case in a few code paths coming directly from ATA driver
interrupt handlers (which use spin_lock rather than spin_lock_irqsave).
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

b445c568

29 Feb, 2008 22 commits

Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm · b73384f0

Linus Torvalds authored Feb 29, 2008

* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 4843/1: Add GCR_CLKBPB for PXA3xx
  [ARM] 4842/1: pxa: remove redundant IRQ saving/restoring in clk_pxa3xx_cken_*
  [ARM] 4841/1: pxa: fix typo in LCD platform data definition code for zylonite
  [ARM] 4840/1: pxa: fix the typo in get_irqnr_and_base
  [ARM] 4839/1: fixes kernel Oops in /dev/mem device driver for memory map with PHYS_OFF
  [ARM] eliminate MODULE_PARM() usage
  [ARM] 4838/1: Fix kexec for SA1100 machines
  [ARM] 4837/1: make __get_unaligned_*() return unsigned types
  [ARM] 4836/1: Make ATAGS_PROC depend on KEXEC

b73384f0

[ARM] 4843/1: Add GCR_CLKBPB for PXA3xx · d862ccc5

Mark Brown authored Feb 27, 2008

The PXA3xx AC97 controller has an additional control bit GCR_CLKBPB
which must be used during cold reset.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

d862ccc5

[ARM] 4842/1: pxa: remove redundant IRQ saving/restoring in clk_pxa3xx_cken_* · ceee4f98

eric miao authored Feb 27, 2008

This is unnecessary since it is already protected by
spin_lock_irq{save, restore} in clock.c.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ceee4f98

[ARM] 4841/1: pxa: fix typo in LCD platform data definition code for zylonite · 7a987e82
eric miao authored Feb 27, 2008
```
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
```
7a987e82

[ARM] 4840/1: pxa: fix the typo in get_irqnr_and_base · a3359e21

eric miao authored Feb 27, 2008

This typo causes the incorrect calculation of the IRQ numbers
in the ICIP2 registers.
Signed-off-by: eric miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

a3359e21

[ARM] 4839/1: fixes kernel Oops in /dev/mem device driver for memory map with PHYS_OFF · 9ae3ae0b

Alexandre Rusev authored Feb 26, 2008

"cat /dev/mem" may cause kernel Oops for boards with PHYS_OFFSET != 0
because character device is mapped to addresses starting from zero
and there is no protection against such situation.
Patch just add this.
Signed-off-by: Alexandre Rusev <arusev@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

9ae3ae0b

[ARM] eliminate MODULE_PARM() usage · c710e39c

Randy Dunlap authored Feb 27, 2008

Convert debug-only (and removed) MODULE_PARM() to module_param().
Compiles cleanly (with DEBUG=1).
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

c710e39c

[ARM] 4838/1: Fix kexec for SA1100 machines · 5ce94e9e

Thomas Kunze authored Feb 24, 2008

This patch sets KEXEC_CONTROL_MEMORY_LIMIT to (-1)UL. As the value is
compared with physical addresses TASK_SIZE makes no sense. Machines
where the RAM addresses start above TASK_SIZE kexecs eats all memory
and crashes the kernel without this patch.
Signed-off-by: Thomas Kunze <thommycheck@gmx.de>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

5ce94e9e

[ARM] 4837/1: make __get_unaligned_*() return unsigned types · 94a3f785

Lennert Buytenhek authored Feb 23, 2008

Eric Sandeen tracked an XFS on ARM corruption bug down to a function
under fs/xfs/ involving some get_unaligned() calls on u64 pointers.
As it turns out, calling ARM's get_unaligned() on a u64 pointer
pointing to the following byte sequence:

	80 81 82 83 84 85 86 87

would return ffffffff83828180 (LE mode.)  This turns out to be
because of implicit u8 -> int promotion in ARM's implementation of
various helpers for get_unaligned(), causing them to accidentally
return signed instead of unsigned values, which in turn caused the
subsequent casts to unsigned long long in __get_unaligned_8_[bl]e()
to sign-extend the lower words.

Fix by casting the return values of __get_unaligned_[24]_[bl]e()
to unsigned int.

Cc: Eric Sandeen <sandeen@sandeen.net>
Cc: Rabeeh Khoury <rabeeh@marvell.com>
Cc: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

94a3f785

[ARM] 4836/1: Make ATAGS_PROC depend on KEXEC · b98d7291

Uli Luckas authored Feb 22, 2008

On Wed, Feb 20, 2008 at 11:50:33AM +0100, Guennadi Liakhovetski wrote:
> arch/arm/kernel/atags.c uses for some reason the
> KEXEC_BOOT_PARAMS_SIZE macro, which is only defined if CONFIG_KEXEC
> is set. So, either this macro should be defined always, or another
> macro should be used, or ATAGS_PROC should depend on KEXEC.

As the procfs export of ATAGS is not meant as a stable, general purpose
ABI it shouldn't be an independent, general configuration option.

This patch make ATAGS_PROC depend on KEXEC
Signed-off-by: Uli Luckas <u.luckas@road.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

b98d7291

rcupreempt: remove never-migrates assumption from rcu_process_callbacks() · c9e71002

Paul E. McKenney authored Feb 28, 2008

This patch fixes a potentially invalid access to a per-CPU variable in
rcu_process_callbacks().

This per-CPU access needs to be done in such a way as to guarantee that
the code using it cannot move to some other CPU before all uses of the
value accessed have completed.  Even though this code is currently only
invoked from softirq context, which currrently cannot migrate to some
other CPU, life would be better if this code did not silently make such
an assumption.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

c9e71002

rcupreempt: fix hibernate/resume in presence of PREEMPT_RCU and hotplug · ae778869

Paul E. McKenney authored Feb 27, 2008

This fixes a oops encountered when doing hibernate/resume in presence of
PREEMPT_RCU.

The problem was that the code failed to disable preemption when
accessing a per-CPU variable.  This is OK when called from code that
already has preemption disabled, but such is not the case from the
suspend/resume code path.
Reported-by: Dave Young <hidave.darkstar@gmail.com>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ae778869

Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched · 076d84bb

Linus Torvalds authored Feb 29, 2008

* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
  softlockup: fix task state setting
  rcu: add support for dynamic ticks and preempt rcu

076d84bb

xen: mask out SEP from CPUID · d40e7059

Jeremy Fitzhardinge authored Feb 29, 2008

Fix 32-on-64 pvops kernel:

we don't want userspace using syscall/sysenter, even if the hypervisor
supports it, so mask it out from CPUID.
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d40e7059

x86 ptrace: fix ptrace_bts_config structure declaration · 53c58588

Dave Anderson authored Feb 21, 2008

The 2.6.25 ptrace_bts_config structure in asm-x86/ptrace-abi.h
is defined with u32 types:

   #include <asm/types.h>

   /* configuration/status structure used in PTRACE_BTS_CONFIG and
      PTRACE_BTS_STATUS commands.
   */
   struct ptrace_bts_config {
           /* requested or actual size of BTS buffer in bytes */
           u32 size;
           /* bitmask of below flags */
           u32 flags;
           /* buffer overflow signal */
           u32 signal;
           /* actual size of bts_struct in bytes */
           u32 bts_size;
   };
   #endif

But u32 is only accessible in asm-x86/types.h if __KERNEL__,
leading to compile errors when ptrace.h is included from
user-space. The double-underscore versions that are exported
to user-space in asm-x86/types.h should be used instead.
Signed-off-by: Dave Anderson <anderson@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

53c58588

x86: disable BTS ptrace extensions for now · b4ef95de

Ingo Molnar authored Feb 26, 2008

revert the BTS ptrace extension for now.

based on general objections from Roland McGrath:

    http://lkml.org/lkml/2008/2/21/323

we'll let the BTS functionality cook some more and re-enable
it in v2.6.26. We'll leave the dead code around to help the
development of this code.

(X86_BTS is not defined at the moment)
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b4ef95de

x86: CPA: avoid split of alias mappings · 8be8f54b

Thomas Gleixner authored Feb 23, 2008

avoid over-eager large page splitup.

When the target area needs to be split or is split already (ioremap)
then the current code enforces the split of large mappings in the alias
regions even if we could avoid it.

Use a separate variable processed in the cpa_data structure to carry
the number of pages which have been processed instead of reusing the
numpages variable. This keeps numpages intact and gives the alias code
a chance to keep large mappings intact.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

8be8f54b

x86: delay the export removal of init_mm · 757265b8

Ingo Molnar authored Feb 28, 2008

delay the removal of this symbol export by one more kernel release,
giving external modules such as VirtualBox a chance to stop using it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

757265b8

x86: fix leak un ioremap_page_range() failure · b16bf712

Ingo Molnar authored Feb 28, 2008

Jan Beulich noticed it during code review that if a driver's ioremap()
fails (say due to -ENOMEM) then we might leak the struct vm_area.

Free it properly.
Signed-off-by: Ingo Molnar <mingo@elte.hu>

b16bf712

x86 vdso: fix build locale dependency · f2dbe03d

Roland McGrath authored Feb 27, 2008

Priit Laes discovered that the sed command processing nm output was
sensitive to locale settings.  This was addressed in commit
03994f01 by using [:alnum:] in place of
[a-zA-Z0-9].

But that solution too is locale-dependent and may not always match
the identifiers it needs to.  The better fix is just to run sed et al
with a fixed locale setting in all builds.
Signed-off-by: Roland McGrath <roland@redhat.com>
CC: Priit Laes <plaes@plaes.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

f2dbe03d

x86: restore vsyscall64 prochandler · d67bbacb

Thomas Gleixner authored Feb 27, 2008

a recent fix:

  commit ce28b986
  Author: Thomas Gleixner <tglx@linutronix.de>
  Date:   Wed Feb 20 23:57:30 2008 +0100

    x86: fix vsyscall wreckage

removed the broken /kernel/vsyscall64 handler completely.
This triggers the following debug check:

  sysctl table check failed: /kernel/vsyscall64  No proc_handler

Restore the sane part of the proc handler.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

d67bbacb

x86: fix pmd_bad and pud_bad to support huge pages · cded932b

Hans Rosenfeld authored Feb 18, 2008

I recently stumbled upon a problem in the support for huge pages. If a
program using huge pages does not explicitly unmap them, they remain
mapped (and therefore, are lost) after the program exits.

I observed that the free huge page count in /proc/meminfo decreased when
running my program, and it did not increase after the program exited.
After running the program a few times, no more huge pages could be
allocated.

The reason for this seems to be that the x86 pmd_bad and pud_bad
consider pmd/pud entries having the PSE bit set invalid. I think there
is nothing wrong with this bit being set, it just indicates that the
lowest level of translation has been reached. This bit has to be (and
is) checked after the basic validity of the entry has been checked, like
in this fragment from follow_page() in mm/memory.c:

  if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
          goto no_page_table;

  if (pmd_huge(*pmd)) {
          BUG_ON(flags & FOLL_GET);
          page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
          goto out;
  }

Note that this code currently doesn't work as intended if the pmd refers
to a huge page, the pmd_huge() check can not be reached if the page is
huge.

Extending pmd_bad() (and, for future 1GB page support, pud_bad()) to
allow for the PSE bit being set fixes this. For similar reasons,
allowing the NX bit being set is necessary, too. I have seen huge pages
having the NX bit set in their pmd entry, which would cause the same
problem.
Signed-Off-By: Hans Rosenfeld <hans.rosenfeld@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

cded932b