Commits · f495c6e5e8fdc972162241df5bdff5bcebb4dc33 · Kirill Smelkov / linux

01 Aug, 2010 40 commits

KVM: VMX: Fix incorrect rcu deref in rmode_tss_base() · f495c6e5

Avi Kivity authored Jun 10, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

f495c6e5

KVM: Fix unused but set warnings · a24e8099

Andi Kleen authored Jun 10, 2010

No real bugs in this one.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

a24e8099

KVM: Fix KVM_SET_SIGNAL_MASK with arg == NULL · 376d41ff

Andi Kleen authored Jun 10, 2010

When the user passes in a NULL mask, pass this on from the ioctl
handler.

Found by gcc 4.6's new warnings.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

376d41ff

KVM: MMU: delay local tlb flush · 3b5d1321

Xiao Guangrong authored Jun 08, 2010

Delay the local TLB flush until entering guest mode. This reduces the vpid
flush frequency and the number of remote TLB flush IPIs (if the
KVM_REQ_TLB_FLUSH bit is already set, no IPI is sent).
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3b5d1321
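
The deferred flush described in 3b5d1321 can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_*` names and the single request bit are hypothetical stand-ins, not the kernel's vcpu structures or its request API. It shows how several flush requests collapse into one flush at guest entry, and how a remote requester can skip the IPI when a request is already pending.

```c
#include <stdbool.h>

/* Hypothetical request bit number; not the kernel's KVM_REQ_TLB_FLUSH value. */
#define TOY_REQ_TLB_FLUSH 0

/* Hypothetical stand-in for a vcpu, reduced to what the sketch needs. */
struct toy_vcpu {
    unsigned long requests;  /* pending request bits */
    unsigned int flushes;    /* flushes actually performed */
    unsigned int ipis;       /* IPIs a remote requester had to send */
};

/* Local path: record that a flush is needed, but do not flush yet. */
void toy_request_flush(struct toy_vcpu *v)
{
    v->requests |= 1UL << TOY_REQ_TLB_FLUSH;
}

/* Remote path: send an IPI only if no flush request was already pending. */
void toy_remote_flush(struct toy_vcpu *v)
{
    bool pending = (v->requests & (1UL << TOY_REQ_TLB_FLUSH)) != 0;

    toy_request_flush(v);
    if (!pending)
        v->ipis++;
}

/* Guest entry: perform at most one flush for all pending requests. */
void toy_enter_guest(struct toy_vcpu *v)
{
    if (v->requests & (1UL << TOY_REQ_TLB_FLUSH)) {
        v->requests &= ~(1UL << TOY_REQ_TLB_FLUSH);
        v->flushes++;
    }
}
```

In this model, two local requests followed by a remote request cost one flush and no IPI, because the remote requester finds the bit already set.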

KVM: MMU: use wrapper function to flush local tlb · 5304efde

Xiao Guangrong authored Jun 08, 2010

Use kvm_mmu_flush_tlb() function instead of calling
kvm_x86_ops->tlb_flush(vcpu) directly.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

5304efde

KVM: MMU: remove unnecessary remote tlb flush · 4f78fd08

Xiao Guangrong authored Jun 08, 2010

This remote TLB flush is not necessary since we have already synced while
the sp was zapped.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4f78fd08

KVM: VMX: fix rcu usage warning in init_rmode() · 4b9d3a04

Xiao Guangrong authored Jun 08, 2010

fix:

[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
include/linux/kvm_host.h:258 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 1
1 lock held by qemu-system-x86/3796:
 #0:  (&vcpu->mutex){+.+.+.}, at: [<ffffffffa0217fd8>] vcpu_load+0x1a/0x66 [kvm]

stack backtrace:
Pid: 3796, comm: qemu-system-x86 Not tainted 2.6.34 #25
Call Trace:
 [<ffffffff81070ed1>] lockdep_rcu_dereference+0x9d/0xa5
 [<ffffffffa0214fdf>] gfn_to_memslot_unaliased+0x65/0xa0 [kvm]
 [<ffffffffa0216139>] gfn_to_hva+0x22/0x4c [kvm]
 [<ffffffffa0216217>] kvm_write_guest_page+0x2a/0x7f [kvm]
 [<ffffffffa0216286>] kvm_clear_guest_page+0x1a/0x1c [kvm]
 [<ffffffffa0278239>] init_rmode+0x3b/0x180 [kvm_intel]
 [<ffffffffa02786ce>] vmx_set_cr0+0x350/0x4d3 [kvm_intel]
 [<ffffffffa02274ff>] kvm_arch_vcpu_ioctl_set_sregs+0x122/0x31a [kvm]
 [<ffffffffa021859c>] kvm_vcpu_ioctl+0x578/0xa3d [kvm]
 [<ffffffff8106624c>] ? cpu_clock+0x2d/0x40
 [<ffffffff810f7d86>] ? fget_light+0x244/0x28e
 [<ffffffff810709b9>] ? trace_hardirqs_off_caller+0x1f/0x10e
 [<ffffffff8110501b>] vfs_ioctl+0x32/0xa6
 [<ffffffff81105597>] do_vfs_ioctl+0x47f/0x4b8
 [<ffffffff813ae654>] ? sub_preempt_count+0xa3/0xb7
 [<ffffffff810f7da8>] ? fget_light+0x266/0x28e
 [<ffffffff810f7c53>] ? fget_light+0x111/0x28e
 [<ffffffff81105617>] sys_ioctl+0x47/0x6a
 [<ffffffff81002c1b>] system_call_fastpath+0x16/0x1b
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

4b9d3a04

KVM: VMX: rename vpid_sync_vcpu_all() to vpid_sync_vcpu_single() · 1760dd49

Gui Jianfeng authored Jun 07, 2010

The name "vpid_sync_vcpu_all" isn't appropriate since it just affects
a single vpid, so rename it to vpid_sync_vcpu_single().
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

1760dd49

KVM: VMX: Add all-context INVVPID type support · b9d762fa

Gui Jianfeng authored Jun 07, 2010

Add all-context INVVPID type support.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

b9d762fa

KVM: MMU: reduce remote tlb flush in kvm_mmu_pte_write() · 0671a8e7

Xiao Guangrong authored Jun 04, 2010

Collect remote TLB flushes in the kvm_mmu_pte_write() path.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

0671a8e7

KVM: MMU: traverse sp hlist safely · f41d335a

Xiao Guangrong authored Jun 04, 2010

Now we can safely traverse the sp hlist.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f41d335a

KVM: MMU: gather remote tlb flush which occurs during page zapped · d98ba053

Xiao Guangrong authored Jun 04, 2010

Use kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page() instead of
kvm_mmu_zap_page(); this reduces remote TLB flush IPIs.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

d98ba053

KVM: MMU: don't get free page number in the loop · 103ad25a

Xiao Guangrong authored Jun 04, 2010

In a later patch, we will change the way sps are zapped, as below:

	kvm_mmu_prepare_zap_page A
	kvm_mmu_prepare_zap_page B
	kvm_mmu_prepare_zap_page C
	....
	kvm_mmu_commit_zap_page

[ zapping multiple sps only requires calling kvm_mmu_commit_zap_page once ]

In __kvm_mmu_free_some_pages(), the free page count is read from
'vcpu->kvm->arch.n_free_mmu_pages' inside the loop. This prevents us from
applying kvm_mmu_prepare_zap_page() and kvm_mmu_commit_zap_page(), since
kvm_mmu_prepare_zap_page() does not free the sp.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

103ad25a
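
The prepare/commit sequence sketched in 103ad25a can be modelled in a few lines. This is a toy illustration, not the kernel implementation: `toy_sp`, `toy_mmu` and both functions are hypothetical names, and the "flush" is just a counter. It shows why preparing pages A, B and C and committing once costs a single remote flush, and why a free-page counter cannot change until the commit step.

```c
#include <stddef.h>

/* Hypothetical shadow page, reduced to what the sketch needs. */
struct toy_sp {
    struct toy_sp *next_invalid;  /* link on the pending-zap list */
    int freed;                    /* set once the commit step frees the page */
};

/* Hypothetical MMU state. */
struct toy_mmu {
    struct toy_sp *invalid_list;  /* pages prepared but not yet committed */
    unsigned int remote_flushes;  /* remote TLB flushes issued so far */
    unsigned int free_pages;      /* only grows when pages are really freed */
};

/* Phase 1: queue the page for zapping; no flush and no free happen here. */
void toy_prepare_zap(struct toy_mmu *mmu, struct toy_sp *sp)
{
    sp->next_invalid = mmu->invalid_list;
    mmu->invalid_list = sp;
}

/* Phase 2: one remote flush covers every queued page, then free them all. */
void toy_commit_zap(struct toy_mmu *mmu)
{
    struct toy_sp *sp = mmu->invalid_list;

    if (sp == NULL)
        return;  /* nothing queued, so no flush is needed */

    mmu->remote_flushes++;
    while (sp != NULL) {
        struct toy_sp *next = sp->next_invalid;

        sp->next_invalid = NULL;
        sp->freed = 1;
        mmu->free_pages++;
        sp = next;
    }
    mmu->invalid_list = NULL;
}
```

Because `free_pages` stays unchanged between the prepare calls, a loop that re-reads the free count after each prepare would never see progress, which is the problem this commit addresses.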

KVM: MMU: split the operations of kvm_mmu_zap_page() · 7775834a

Xiao Guangrong authored Jun 04, 2010

Split kvm_mmu_zap_page() into kvm_mmu_prepare_zap_page() and
kvm_mmu_commit_zap_page(), so that we can:

- traverse the hlist safely
- easily gather the remote TLB flushes that occur while pages are zapped

These features are used in later patches.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

7775834a

KVM: MMU: introduce some macros to clean up hlist traversal · 7ae680eb

Xiao Guangrong authored Jun 04, 2010

Introduce for_each_gfn_sp() and for_each_gfn_indirect_valid_sp() to
clean up hlist traversal.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

7ae680eb

KVM: MMU: skip invalid sp when unprotect page · 03116aa5

Xiao Guangrong authored Jun 04, 2010

In kvm_mmu_unprotect_page(), the invalid sp can be skipped
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

03116aa5

KVM: VMX: Make sure single type invvpid is supported before issuing invvpid instruction · 518c8aee

Gui Jianfeng authored Jun 04, 2010

According to the SDM, we need to check whether the single-context INVVPID type
is supported before issuing the invvpid instruction.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

518c8aee
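
The capability check described in 518c8aee might look roughly like the sketch below. Everything here is an assumption for illustration: the `TOY_*` names are hypothetical, the bit positions are my reading of the IA32_VMX_EPT_VPID_CAP layout in the SDM rather than anything stated in this commit, and the fallback to the all-context type is an illustrative choice, not necessarily what the patch does.

```c
#include <stdint.h>

/* Assumed capability bits (my reading of the SDM, not taken from this commit). */
#define TOY_CAP_INVVPID         (1ULL << 32)  /* INVVPID instruction supported */
#define TOY_CAP_INVVPID_SINGLE  (1ULL << 41)  /* single-context type supported */
#define TOY_CAP_INVVPID_ALL     (1ULL << 42)  /* all-context type supported */

/* INVVPID type operands; TOY_INVVPID_NONE means "do not issue the instruction". */
enum toy_invvpid_type {
    TOY_INVVPID_NONE = -1,
    TOY_INVVPID_SINGLE_CONTEXT = 1,
    TOY_INVVPID_ALL_CONTEXT = 2
};

/* Pick a type the hardware advertises, preferring the narrower flush. */
enum toy_invvpid_type toy_pick_invvpid_type(uint64_t cap)
{
    if (!(cap & TOY_CAP_INVVPID))
        return TOY_INVVPID_NONE;
    if (cap & TOY_CAP_INVVPID_SINGLE)
        return TOY_INVVPID_SINGLE_CONTEXT;
    if (cap & TOY_CAP_INVVPID_ALL)
        return TOY_INVVPID_ALL_CONTEXT;  /* assumed fallback, see lead-in */
    return TOY_INVVPID_NONE;
}
```

The point of the sketch is only the ordering: the capability word is consulted first, and the instruction is issued with a given type only when that type is advertised.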

KVM: x86: use linux/uaccess.h instead of asm/uaccess.h · 7bee342a

Lai Jiangshan authored Jun 02, 2010

Should use linux/uaccess.h instead of asm/uaccess.h
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

7bee342a

KVM: cleanup "*new.rmap" type · 3bd89007

Lai Jiangshan authored Jun 02, 2010

The type of '*new.rmap' is not 'struct page *', fix it
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

3bd89007

KVM: VMX: Enforce EPT pagetable level checking · 4bc9b982

Sheng Yang authored Jun 02, 2010

We only support 4-level EPT page tables for now.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

4bc9b982

KVM: Add Documentation/kvm/msr.txt · d2d7a611

Glauber Costa authored Jun 01, 2010

This patch adds a file that documents the usage of KVM-specific
MSRs.
Signed-off-by: Glauber Costa <glommer@redhat.com>
Reviewed-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

d2d7a611

KVM: PPC: elide struct thread_struct instances from stack · 49f6be8e

Andreas Schwab authored May 31, 2010

Instead of instantiating a whole thread_struct on the stack use only the
required parts of it.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

49f6be8e

KVM: VMX: Properly return error to userspace on vmentry failure · 5120702e

Mohammed Gamal authored May 31, 2010

The vmexit handler returns KVM_EXIT_UNKNOWN since there is no handler
for vmentry failures. This intercepts vmentry failures and returns
KVM_EXIT_FAIL_ENTRY to userspace instead.
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

5120702e

KVM: MMU: Don't calculate quadrant if tdp_enabled · b66d8000

Gui Jianfeng authored May 31, 2010

There's no need to calculate quadrant if tdp is enabled.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

b66d8000

KVM: MMU: Document large pages · 316b9521

Avi Kivity authored May 27, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

316b9521

KVM: MMU: Document cr0.wp emulation · ec87fe2a

Avi Kivity authored May 27, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

ec87fe2a

KVM: MMU: Allow spte.w=1 for gpte.w=0 and cr0.wp=0 only in shadow mode · 8184dd38

Avi Kivity authored May 27, 2010

When tdp is enabled, the guest's cr0.wp shouldn't have any effect on spte
permissions.
Signed-off-by: Avi Kivity <avi@redhat.com>

8184dd38

KVM: x86: Propagate fpu_alloc errors · 10ab25cd

Jan Kiszka authored May 25, 2010

Memory allocation may fail. Propagate such errors.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

10ab25cd

KVM: SVM: Fix EFER.LME being stripped · 6dc696d4

Zachary Amsden authored May 26, 2010

Must set VCPU register to be the guest notion of EFER even if that
setting is not valid on hardware.  This was masked by the set in
set_efer until 7657fd5ace88e8092f5f3a84117e093d7b893f26 broke that.
Fix is simply to set the VCPU register before stripping bits.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

6dc696d4

KVM: MMU: don't check PT_WRITABLE_MASK directly · 01c168ac

Gui Jianfeng authored May 27, 2010

Since we have is_writable_pte(), make use of it.
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

01c168ac
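
For readers unfamiliar with the helper named in 01c168ac, a minimal sketch of such a predicate follows. The `toy_` names are hypothetical, and the mask value assumes the x86 convention that bit 1 of a page table entry is the read/write bit; this is an illustration, not a copy of the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: bit 1 of an x86 page table entry is the R/W (writable) bit. */
#define TOY_PT_WRITABLE_MASK (1ULL << 1)

/* Wrap the raw mask test so callers do not open-code it. */
bool toy_is_writable_pte(uint64_t pte)
{
    return (pte & TOY_PT_WRITABLE_MASK) != 0;
}
```

Using the wrapper instead of testing the mask directly keeps the writability rule in one place if it ever needs to change.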

KVM: MMU: calculate correct gfn for small host pages backing large guest pages · 3af1817a

Lai Jiangshan authored May 26, 2010

In Documentation/kvm/mmu.txt:
  gfn:
    Either the guest page table containing the translations shadowed by this
    page, or the base page frame for linear translations. See role.direct.

But in FNAME(fetch)(), sp->gfn is incorrect when one of the following
situations occurs:

 1) the guest uses 32-bit paging and the guest PDE maps a 4-MByte page
    (backed by 4k host pages); FNAME(fetch)() fails to handle the quadrant.

    And if the guest uses pse-36, "table_gfn = gpte_to_gfn(gw->ptes[level - delta]);"
    is incorrect.

 2) the guest uses long mode paging and the guest PDPTE maps a 1-GByte page
    (backed by 4k or 2M host pages).

So fix it to match the documentation and the code that requires
sp->gfn to be correct when sp->role.direct=1.

We use the target mapping gfn (gw->gfn) to calculate the base page frame
for linear translations; this is simple and easy to understand.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

3af1817a

KVM: MMU: Calculate correct base gfn for direct non-DIR level · c9fa0b3b

Lai Jiangshan authored May 26, 2010

In Documentation/kvm/mmu.txt:
  gfn:
    Either the guest page table containing the translations shadowed by this
    page, or the base page frame for linear translations. See role.direct.

But in __direct_map(), the base gfn calculation is incorrect
when level=3 or 4.

Fix by using PT64_LVL_ADDR_MASK() which accounts for all levels correctly.
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c9fa0b3b
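
The level-dependent masking behind c9fa0b3b can be illustrated on frame numbers. This is a sketch under assumptions: it works on gfns rather than addresses, assumes 9 index bits per level (512 entries per 64-bit page table), and `toy_table_base_gfn` is a hypothetical helper, not the kernel's PT64_LVL_ADDR_MASK() macro.

```c
#include <stdint.h>

#define TOY_PT_LEVEL_BITS 9  /* assumed: 512 entries per page table level */

/*
 * Base frame of the region spanned by one page table at the given level.
 * Level 1 is a last-level table spanning 512 frames, level 2 spans 512^2
 * frames, and so on. A mask that is only correct for one level would give
 * the wrong base at the others.
 */
uint64_t toy_table_base_gfn(uint64_t gfn, int level)
{
    uint64_t span = (uint64_t)1 << (level * TOY_PT_LEVEL_BITS);

    return gfn & ~(span - 1);
}
```

For example, frame 0x7654321 has base 0x7654200 for a level-1 table but 0x7640000 for a level-2 table, so the mask has to follow the level.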

KVM: MMU: Don't allocate gfns page for direct mmu pages · 2032a93d

Lai Jiangshan authored May 26, 2010

When sp->role.direct is set, sp->gfns does not contain any essential
information: the leaf sptes reachable from this sp cover a contiguous
guest physical memory range (a linear range).
So sp->gfns[i] (if it was set) equals sp->gfn + i (at PT_PAGE_TABLE_LEVEL).
Since this is not essential information, we can calculate it when needed.

This means we don't need sp->gfns when sp->role.direct=1,
and we can save one page for every such kvm_mmu_page.

Note:
  Access to sp->gfns must be wrapped by kvm_mmu_page_get_gfn()
  or kvm_mmu_page_set_gfn().
  It is only exposed in FNAME(sync_page).
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

2032a93d
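
The accessor pair mentioned in 2032a93d can be sketched as follows. This is a toy model: `toy_mmu_page` and the `toy_page_*` functions are hypothetical, and the shift for levels above 1 is my assumption (the commit message only states the level-1 case, sp->gfn + i).

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_gfn_t;

#define TOY_LEVEL_BITS 9  /* assumed: 512 entries per page table level */

/* Hypothetical shadow page, reduced to the fields the sketch needs. */
struct toy_mmu_page {
    toy_gfn_t gfn;    /* base frame (direct) or shadowed table frame */
    int direct;       /* nonzero for linear translations */
    int level;        /* 1 = last-level page table */
    toy_gfn_t *gfns;  /* per-entry gfns; NULL when direct */
};

/* Direct pages compute the gfn; indirect pages look it up. */
toy_gfn_t toy_page_get_gfn(const struct toy_mmu_page *sp, unsigned int index)
{
    if (!sp->direct)
        return sp->gfns[index];
    /* Level 1 reduces to sp->gfn + index; higher levels are an assumption. */
    return sp->gfn + ((toy_gfn_t)index << ((sp->level - 1) * TOY_LEVEL_BITS));
}

/* Direct pages store nothing, so no gfns array has to be allocated. */
void toy_page_set_gfn(struct toy_mmu_page *sp, unsigned int index, toy_gfn_t gfn)
{
    if (!sp->direct)
        sp->gfns[index] = gfn;
}
```

Routing every access through the two helpers is what lets the direct case drop the array entirely, which is where the one-page saving comes from.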

KVM: VMX: Add constant for invalid guest state exit reason · c8174f7b

Mohammed Gamal authored May 24, 2010

For the sake of completeness, this patch adds a symbolic
constant for VMX exit reason 0x21 (invalid guest state).
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

c8174f7b

KVM: MMU: allow more page become unsync at getting sp time · 9f1a122f

Xiao Guangrong authored May 24, 2010

Allow more pages to become unsync at sp-get time: if a new shadow page
must be created for a gfn but it is not allowed to be unsync (level > 1),
we should sync all of the gfn's unsync pages.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9f1a122f

KVM: MMU: allow more page become unsync at gfn mapping time · 9cf5cf5a

Xiao Guangrong authored May 24, 2010

In the current code, a shadow page can become unsync only if it is the
only shadow page for a gfn. This rule is too strict; in fact, we can
let every last-level mapping page (i.e. the pte page) become unsync,
and sync them at invlpg or TLB flush time.

This patch allows more pages to become unsync at gfn mapping time.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9cf5cf5a

KVM: Update Red Hat copyrights · 221d059d

Avi Kivity authored May 23, 2010

Signed-off-by: Avi Kivity <avi@redhat.com>

221d059d

KVM: SVM: correctly trace irq injection · 9fb2d2b4

Gleb Natapov authored May 23, 2010

On SVM, interrupts are injected by svm_set_irq(), not svm_inject_irq().
The latter is used only to wait for the irq window.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

9fb2d2b4

KVM: MMU: only update unsync page in invlpg path · f78978aa

Xiao Guangrong authored May 15, 2010

Only unsync pages need to be updated at invlpg time, since other shadow
pages are write-protected.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

f78978aa

KVM: MMU: don't write-protect if have new mapping to unsync page · e02aa901

Xiao Guangrong authored May 15, 2010

Two cases may occur in kvm_mmu_get_page():

- In the first case, the target sp is already in the cache. If the sp is
  unsync, we only need to update it to ensure that this mapping is valid;
  we neither mark it sync nor write-protect sp->gfn, since this does not
  break the unsync rule (one shadow page per gfn).

- In the other case, the target sp does not exist and we need to create a
  new sp for the gfn, i.e. the gfn may have another shadow page. To keep the
  unsync rule, we should sync (mark sync and write-protect) the gfn's unsync
  shadow page. After enabling multiple unsync shadows, we sync those shadow
  pages only when the new sp is not allowed to become unsync (also for the
  unsync rule; the new rule is: allow all pte pages to become unsync).
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

e02aa901