Commits · f38e098ff3a315bb74abbb4a35cba11bbea8e2fa · Kirill Smelkov / linux

24 Oct, 2010 40 commits

KVM: x86: TSC reset compensation · f38e098f

Zachary Amsden authored Aug 19, 2010

Attempt to synchronize TSCs which are reset to the same value.  In the
case of a reliable hardware TSC, we can just re-use the same offset, but
on non-reliable hardware, we can get closer by adjusting the offset to
match the elapsed time.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
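
The compensation described above can be sketched in user-space C. This is a simplified model under stated assumptions, not the kernel code: the structure, the function name, the exact-match test and the 5-second window are all hypothetical choices made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed window: writes further apart than this are treated as unrelated. */
#define SYNC_WINDOW_NS (5ULL * 1000000000ULL)

/* Hypothetical per-VM bookkeeping; not the kernel's actual structure. */
struct tsc_sync_state {
    bool     valid;        /* false until the first TSC write is recorded */
    uint64_t last_write;   /* guest TSC value most recently written */
    int64_t  last_offset;  /* offset chosen for that write */
    uint64_t last_ns;      /* host time of that write, in nanoseconds */
};

/* Returns the offset to program so that guest TSC = host TSC + offset. */
static int64_t tsc_write_offset(struct tsc_sync_state *s, uint64_t guest_tsc,
                                uint64_t host_tsc, uint64_t now_ns,
                                uint64_t tsc_khz, bool host_tsc_reliable)
{
    int64_t offset = (int64_t)(guest_tsc - host_tsc);
    uint64_t elapsed_ns = now_ns - s->last_ns;

    if (s->valid && guest_tsc == s->last_write &&
        elapsed_ns < SYNC_WINDOW_NS) {
        if (host_tsc_reliable) {
            /* Host TSCs agree across CPUs: the earlier offset still fits. */
            offset = s->last_offset;
        } else {
            /* Advance by the cycles that elapsed since the first write. */
            offset += (int64_t)(elapsed_ns * tsc_khz / 1000000ULL);
        }
        now_ns = s->last_ns;  /* keep the first write as the reference */
    }

    s->valid = true;
    s->last_write = guest_tsc;
    s->last_offset = offset;
    s->last_ns = now_ns;
    return offset;
}
```

In this model, a second vCPU that writes the same value 1 ms later on a 1 GHz host either reuses the first offset or has 1,000,000 cycles added to its naive offset.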

KVM: x86: Move TSC offset writes to common code · 99e3e30a

Zachary Amsden authored Aug 19, 2010

Also, ensure that the storing of the offset and the reading of the TSC
are never preempted by taking a spinlock.  While the lock is overkill
now, it is useful later in this patch series.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Convert TSC writes to TSC offset writes · f4e1b3c8

Zachary Amsden authored Aug 19, 2010

Change svm / vmx to be the same internally and write TSC offset
instead of bare TSC in helper functions.  Isolated as a single
patch to contain code movement.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: x86: Drop vm_init_tsc · ae38436b

Zachary Amsden authored Aug 19, 2010

This is used only by the VMX code, and is not done properly;
if the TSC is indeed backwards, it is out of sync, and will
need proper handling in the logic at each and every CPU change.
For now, drop this test during init as misguided.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix missing percpu counter destroy · 45bf21a8

Wei Yongjun authored Aug 23, 2010

Commit ad05c88266b4cce1c820928ce8a0fb7690912ba1
(KVM: create aggregate kvm_total_used_mmu_pages value)
introduced the percpu counter kvm_total_used_mmu_pages but never
destroys it, which may cause an oops on rmmod followed by modprobe.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: MMU: fix regression from rework mmu_shrink() code · 80b63faf

Xiaotian Feng authored Aug 24, 2010

The latest kvm mmu_shrink code rework changes kvm->arch.n_used_mmu_pages/
kvm->arch.n_max_mmu_pages at kvm_mmu_free_page/kvm_mmu_alloc_page, which is called
by kvm_mmu_commit_zap_page. So kvm->arch.n_used_mmu_pages and
kvm_mmu_available_pages(vcpu->kvm) are unchanged after kvm_mmu_prepare_zap_page().
This caused kvm_mmu_change_mmu_pages/__kvm_mmu_free_some_pages to loop forever.
Moving kvm_mmu_commit_zap_page makes the while loop behave normally again.
Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
Tested-by: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
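
The failure mode can be modelled in a few lines of C. This is only a sketch: the structure and function names are hypothetical, and it assumes that the fix places the commit step inside the loop so that the loop condition observes progress.

```c
/* Hypothetical model: 'used' only changes when zapped pages are committed. */
struct mmu_model {
    unsigned int used;     /* pages counted as in use */
    unsigned int pending;  /* pages prepared for zapping, not yet freed */
};

/* Marks one more page for zapping; the 'used' count is not touched. */
static void model_prepare_zap(struct mmu_model *m)
{
    if (m->pending < m->used)
        m->pending++;
}

/* Frees every prepared page; this is the only place 'used' decreases. */
static void model_commit_zap(struct mmu_model *m)
{
    m->used -= m->pending;
    m->pending = 0;
}

/* Shrinks until at most 'goal' pages are used; returns iterations run. */
static unsigned int model_shrink_to(struct mmu_model *m, unsigned int goal)
{
    unsigned int iterations = 0;

    while (m->used > goal) {
        model_prepare_zap(m);
        /* With the commit outside the loop, 'used' would never drop and
         * the condition above would stay true forever. */
        model_commit_zap(m);
        iterations++;
    }
    return iterations;
}
```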

KVM: x86 emulator: add JrCXZ instruction emulation · e4abac67

Wei Yongjun authored Aug 19, 2010

Add JrCXZ instruction emulation (opcode 0xe3)
Used by FreeBSD boot loader.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
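
For reference, the architectural behaviour of JCXZ/JECXZ/JRCXZ can be sketched as follows. The function name and parameters are hypothetical and this is not the emulator's code; 'ip' is assumed to be the address of the instruction that follows the jump.

```c
#include <stdint.h>

/* Returns the next instruction pointer for opcode 0xe3.
 * ad_bytes is the address size (2, 4 or 8) and selects CX, ECX or RCX. */
static uint64_t jrcxz_next_ip(uint64_t ip, int8_t rel, uint64_t rcx,
                              unsigned int ad_bytes)
{
    uint64_t mask = (ad_bytes == 8) ? ~0ULL
                                    : (1ULL << (ad_bytes * 8)) - 1;

    if ((rcx & mask) == 0)
        return ip + (uint64_t)(int64_t)rel;  /* sign-extended displacement */
    return ip;                               /* counter non-zero: fall through */
}
```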

KVM: x86 emulator: add LDS/LES/LFS/LGS/LSS instruction emulation · 09b5f4d3

Wei Yongjun authored Aug 23, 2010

Add LDS/LES/LFS/LGS/LSS instruction emulation.
(opcode 0xc4, 0xc5, 0x0f 0xb2, 0x0f 0xb4~0xb5)
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

KVM: create aggregate kvm_total_used_mmu_pages value · 45221ab6

Dave Hansen authored Aug 19, 2010

Of slab shrinkers, the VM code says:

 * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
 * querying the cache size, so a fastpath for that case is appropriate.

and it *means* it.  Look at how it calls the shrinkers:

    nr_before = (*shrinker->shrink)(0, gfp_mask);
    shrink_ret = (*shrinker->shrink)(this_scan, gfp_mask);

So, if you do anything stupid in your shrinker, the VM will doubly
punish you.

The mmu_shrink() function takes the global kvm_lock, then acquires
every VM's kvm->mmu_lock in sequence.  If we have 100 VMs, then
we're going to take 101 locks.  We do it twice, so each call takes
202 locks.  If we're under memory pressure, we can have each cpu
trying to do this.  It can get really hairy, and we've seen lock
spinning in mmu_shrink() be the dominant entry in profiles.

This is guaranteed to optimize at least half of those lock
acquisitions away.  It removes the need to take any of the locks
when simply trying to count objects.

A 'percpu_counter' can be a large object, but we only have one
of these for the entire system.  There are not any better
alternatives at the moment, especially ones that handle CPU
hotplug.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
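
The effect of the query fast path can be illustrated with a small user-space model. Everything here is hypothetical: a plain counter stands in for the percpu_counter, and 'locks_taken' merely counts the lock acquisitions a real shrinker would perform.

```c
#include <stddef.h>

struct vm_model {
    long used_pages;            /* shadow pages held by this VM */
};

struct shrink_model {
    struct vm_model *vms;
    size_t nr_vms;
    long total_used;            /* aggregate, updated on every alloc/free */
    unsigned long locks_taken;  /* instrumentation for this sketch only */
};

/* Returns the number of objects still in the cache, as a shrinker would. */
static long model_shrink(struct shrink_model *m, long nr_to_scan)
{
    if (nr_to_scan == 0)
        return m->total_used;   /* size query: no lock is needed */

    m->locks_taken++;           /* the global list lock */
    for (size_t i = 0; i < m->nr_vms; i++) {
        long freed;

        m->locks_taken++;       /* this VM's own lock */
        freed = m->vms[i].used_pages < nr_to_scan ? m->vms[i].used_pages
                                                  : nr_to_scan;
        m->vms[i].used_pages -= freed;
        m->total_used -= freed;
        nr_to_scan -= freed;
    }
    return m->total_used;
}
```

With 100 VMs, the query call in this model takes no locks and only the scanning call takes 101, which matches the "at least half" estimate in the message.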

KVM: replace x86 kvm n_free_mmu_pages with n_used_mmu_pages · 49d5ca26

Dave Hansen authored Aug 19, 2010

Doing this makes the code much more readable.  That's
borne out by the fact that this patch removes code.  "used"
also happens to be the number that we need to return back to
the slab code when our shrinker gets called.  Keeping this
value as opposed to free makes the next patch simpler.

So, 'struct kvm' is kzalloc()'d.  'struct kvm_arch' is a
structure member (and not a pointer) of 'struct kvm'.  That
means they start out zeroed.  I _think_ they get initialized
properly by kvm_mmu_change_mmu_pages().  But, that only happens
via kvm ioctls.

Another benefit of storing 'used' instead of 'free' is
that the values are consistent from the moment the structure is
allocated: no negative "used" value.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
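
A minimal sketch of why a 'used' counter is consistent from allocation time, assuming the available figure is derived as the maximum minus the used count. The structure and helper below are hypothetical stand-ins, and the clamp to zero is a choice made for this sketch.

```c
/* Hypothetical stand-in for the two counters discussed above. */
struct mmu_counts {
    unsigned int n_used_mmu_pages;  /* pages currently allocated */
    unsigned int n_max_mmu_pages;   /* high watermark */
};

/* Pages that may still be allocated; never wraps below zero. */
static unsigned int mmu_available_pages(const struct mmu_counts *c)
{
    if (c->n_max_mmu_pages > c->n_used_mmu_pages)
        return c->n_max_mmu_pages - c->n_used_mmu_pages;
    return 0;
}
```

A zero-filled structure reports zero used and zero available pages, so no state ever has to be read as a negative "used" value.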

KVM: rename x86 kvm->arch.n_alloc_mmu_pages · 39de71ec

Dave Hansen authored Aug 19, 2010

arch.n_alloc_mmu_pages is a poor choice of name. This value truly
means, "the number of pages which _may_ be allocated".  But,
reading the name, "n_alloc_mmu_pages" implies "the number of allocated
mmu pages", which is dead wrong.

It's really the high watermark, so let's give it a name to match:
n_max_mmu_pages.  This change will make the next few patches
much more obvious and easy to read.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: abstract kvm x86 mmu->n_free_mmu_pages · e0df7b9f

Dave Hansen authored Aug 19, 2010

"free" is a poor name for this value.  In this context, it means,
"the number of mmu pages which this kvm instance should be able to
allocate."  But "free" implies much more: that the objects are there
and ready for use.  "available" is a much better description, especially
when you see how it is calculated.

In this patch, we abstract its use into a function.  We'll soon
replace the function's contents by calculating the value in a
different way.

All of the reads of n_free_mmu_pages are taken care of in this
patch.  The modification sites will be handled in a patch
later in the series.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Tim Pepper <lnxninja@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement CWD (opcode 99) · 61429142

Avi Kivity authored Aug 19, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement IMUL REG, R/M, IMM (opcode 69) · d46164db

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add Src2Imm decoding · 7db41eb7

Avi Kivity authored Aug 18, 2010

Needed for 3-operand IMUL.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: consolidate immediate decode into a function · 39f21ee5

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement RDTSC (opcode 0F 31) · 48bb5d3c

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: remove SrcImplicit · 7077aec0

Avi Kivity authored Aug 18, 2010

Useless.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement IMUL REG, R/M (opcode 0F AF) · 5c82aa29

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement IMUL REG, R/M, imm8 (opcode 6B) · f3a1b9f4

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement RET imm16 (opcode C2) · 40ece7c7

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add SrcImmU16 operand type · b250e605

Avi Kivity authored Aug 18, 2010

Used for RET NEAR instructions.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement CALL FAR (FF /3) · 0ef753b8

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: implement DAS (opcode 2F) · 7af04fc0

Avi Kivity authored Aug 18, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: Use a register for ____emulate_2op() destination · fb2c2641

Avi Kivity authored Aug 16, 2010

Most x86 two operand instructions allow the destination to be a memory operand,
but IMUL (for example) requires that the destination be a register.  Change
____emulate_2op() to take a register for both source and destination so we
can invoke IMUL.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: pass destination type to ____emulate_2op() · b3b3d25a

Avi Kivity authored Aug 16, 2010

We'll need it later so we can use a register for the destination.
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add LOOP/LOOPcc instruction emulation · f2f31845

Wei Yongjun authored Aug 18, 2010

Add LOOP/LOOPcc instruction emulation (opcode 0xe0~0xe2).
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
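
The architectural semantics of the three opcodes can be sketched as below. The function is a hypothetical illustration, not the emulator's code, and it assumes a 16-bit address size so that CX is the counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decrements CX and reports whether the branch is taken.
 * 0xe0 = LOOPNE/LOOPNZ, 0xe1 = LOOPE/LOOPZ, 0xe2 = LOOP. */
static bool loop_taken(uint8_t opcode, uint16_t *cx, bool zf)
{
    *cx = (uint16_t)(*cx - 1);   /* the decrement does not modify flags */
    if (*cx == 0)
        return false;            /* counter exhausted: fall through */

    switch (opcode) {
    case 0xe0:
        return !zf;              /* loop while not equal */
    case 0xe1:
        return zf;               /* loop while equal */
    default:
        return true;             /* plain LOOP ignores ZF */
    }
}
```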

KVM: x86 emulator: add CBW/CWDE/CDQE instruction emulation · e8b6fa70

Wei Yongjun authored Aug 18, 2010

Add CBW/CWDE/CDQE instruction emulation (opcode 0x98).
Used by FreeBSD's boot loader.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
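
Opcode 0x98 sign-extends the lower half of the accumulator, with the operand size selecting the variant. A minimal sketch follows; the function name is hypothetical, and the narrowing casts rely on the two's-complement conversion behaviour of common compilers.

```c
#include <stdint.h>

/* Returns the new RAX value after CBW (op_bytes 2), CWDE (4) or CDQE (8). */
static uint64_t sign_extend_acc(uint64_t rax, unsigned int op_bytes)
{
    switch (op_bytes) {
    case 2:  /* CBW: AX = sign-extended AL, upper bits of RAX preserved */
        return (rax & ~0xffffULL) | (uint16_t)(int16_t)(int8_t)rax;
    case 4:  /* CWDE: EAX = sign-extended AX, upper 32 bits cleared */
        return (uint32_t)(int32_t)(int16_t)rax;
    case 8:  /* CDQE: RAX = sign-extended EAX */
        return (uint64_t)(int64_t)(int32_t)rax;
    default:
        return rax;  /* unsupported size: leave the register untouched */
    }
}
```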

KVM: x86 emulator: fix REPZ/REPNZ termination condition · 0fa6ccbd

Avi Kivity authored Aug 17, 2010

EFLAGS.ZF needs to be checked after each iteration, not before.
Signed-off-by: Avi Kivity <avi@redhat.com>
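
The ordering matters because ZF may hold a stale value from an earlier instruction when the string operation starts. A user-space sketch of a repeated byte scan follows; the function is hypothetical and only models the termination rule, with the count checked before each iteration and ZF checked after it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Models REPE/REPNE SCASB over 'buf'. Returns the number of bytes consumed.
 * '*zf' carries the flag in and out, like EFLAGS.ZF. */
static size_t rep_scasb(const uint8_t *buf, size_t count, uint8_t al,
                        bool repne, bool *zf)
{
    size_t i = 0;

    while (count != 0) {            /* count is tested before the iteration */
        *zf = (buf[i] == al);       /* the compare sets ZF */
        i++;
        count--;
        /* ZF is tested after the iteration: REPNE stops on a match,
         * REPE stops on a mismatch. */
        if (repne ? *zf : !*zf)
            break;
    }
    return i;
}
```

If ZF were tested before the first iteration, a stale ZF=1 would end a REPNE scan without examining a single byte.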

KVM: x86 emulator: implement SCAS (opcodes AE, AF) · f6b33fc5

Avi Kivity authored Aug 17, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: fix INTn emulation not pushing EFLAGS and CS · 5c56e1cf

Avi Kivity authored Aug 17, 2010

emulate_push() only schedules a push; it doesn't actually push anything.
Call writeback() to flush out the write.
Signed-off-by: Avi Kivity <avi@redhat.com>
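
The bug class can be shown with a toy model of a deferred write. All names are hypothetical stand-ins for emulate_push() and writeback(); the sketch assumes a single pending slot, so a second push overwrites the first unless the write is flushed in between.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy emulator state with one pending (scheduled) stack write. */
struct emu_model {
    uint16_t stack[8];
    unsigned int sp;         /* index of the top of stack; grows downwards */
    bool pending;
    unsigned int pending_slot;
    uint16_t pending_val;
};

/* Schedules a push: moves SP but does not touch memory yet. */
static void model_push(struct emu_model *e, uint16_t val)
{
    e->sp--;                 /* assumes the caller leaves room on the stack */
    e->pending_slot = e->sp;
    e->pending_val = val;
    e->pending = true;
}

/* Performs the scheduled write, if any. */
static void model_writeback(struct emu_model *e)
{
    if (e->pending) {
        e->stack[e->pending_slot] = e->pending_val;
        e->pending = false;
    }
}

/* INT n pushes FLAGS, CS and IP; each push is flushed before the next. */
static void model_int(struct emu_model *e, uint16_t flags, uint16_t cs,
                      uint16_t ip)
{
    model_push(e, flags);
    model_writeback(e);
    model_push(e, cs);
    model_writeback(e);
    model_push(e, ip);       /* left pending for the caller's final writeback */
}
```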

KVM: x86 emulator: remove dup code of in/out instruction · a13a63fa

Wei Yongjun authored Aug 06, 2010

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: change OUT instruction to use dst instead of src · 41167be5

Wei Yongjun authored Aug 06, 2010

Change the OUT instruction to use dst instead of src, so we can
reuse that code for all OUT instructions.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: introduce DstImmUByte for dst operand decode · 943858e2

Wei Yongjun authored Aug 06, 2010

Introduce DstImmUByte for dst operand decode, which
will be used for the OUT instruction.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: remove useless label from x86_emulate_insn() · c483c02a

Wei Yongjun authored Aug 06, 2010

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add setcc instruction emulation · ee45b58e

Wei Yongjun authored Aug 06, 2010

Add setcc instruction emulation (opcode 0x0f 0x90~0x9f)
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
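
SETcc stores 1 or 0 depending on the condition encoded in the low four bits of the second opcode byte. The sketch below evaluates that condition from an EFLAGS value; the function name is hypothetical, while the flag bit positions and condition encodings are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the byte that SETcc (0x0f 0x90..0x9f) would store. */
static uint8_t setcc_result(uint8_t opcode2, uint32_t eflags)
{
    bool cf = (eflags & (1u << 0)) != 0;
    bool pf = (eflags & (1u << 2)) != 0;
    bool zf = (eflags & (1u << 6)) != 0;
    bool sf = (eflags & (1u << 7)) != 0;
    bool of = (eflags & (1u << 11)) != 0;
    bool rc;

    switch ((opcode2 & 0xf) >> 1) {
    case 0:  rc = of; break;                 /* O  / NO */
    case 1:  rc = cf; break;                 /* B  / AE */
    case 2:  rc = zf; break;                 /* E  / NE */
    case 3:  rc = cf || zf; break;           /* BE / A  */
    case 4:  rc = sf; break;                 /* S  / NS */
    case 5:  rc = pf; break;                 /* P  / NP */
    case 6:  rc = (sf != of); break;         /* L  / GE */
    default: rc = zf || (sf != of); break;   /* LE / G  */
    }
    /* Odd condition codes are the negated form of the even ones. */
    return (uint8_t)(rc != ((opcode2 & 1) != 0));
}
```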

KVM: x86: explain 'no-kvmclock' kernel parameter · 9cf4c4fc

Jiri Kosina authored Aug 16, 2010

The no-kvmclock kernel parameter is missing its explanation in
Documentation/kernel-parameters.txt. Add it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: x86 emulator: add XADD instruction emulation · 92f738a5

Wei Yongjun authored Aug 17, 2010

Add XADD instruction emulation (opcode 0x0f 0xc0~0xc1)
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
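
XADD exchanges the two operands and stores their sum in the destination. A minimal 32-bit sketch follows; the function name is hypothetical, and only the carry flag is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* dst receives dst + src, src receives the old dst. Returns the carry out. */
static bool xadd32(uint32_t *dst, uint32_t *src)
{
    uint32_t old = *dst;

    *dst = old + *src;   /* unsigned addition wraps modulo 2^32 */
    *src = old;
    return *dst < old;   /* a wrapped sum is smaller than the old value */
}
```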

KVM: x86 emulator: put register operand write back to a function · 31be40b3

Wei Yongjun authored Aug 17, 2010

Introduce function write_register_operand() to write back the
register operand.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

KVM: PPC: fix leakage of error page in kvmppc_patch_dcbz() · 646bab55

Wei Yongjun authored Aug 17, 2010

Add kvm_release_page_clean() after is_error_page() to avoid
leaking the error page.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
