Commit ecefbd94 authored by Linus Torvalds

Merge tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Avi Kivity:
 "Highlights of the changes for this release include support for vfio
  level triggered interrupts, improved big real mode support on older
  Intels, a streamlined guest page table walker, guest APIC speedups,
  PIO optimizations, better overcommit handling, and read-only memory."

* tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits)
  KVM: s390: Fix vcpu_load handling in interrupt code
  KVM: x86: Fix guest debug across vcpu INIT reset
  KVM: Add resampling irqfds for level triggered interrupts
  KVM: optimize apic interrupt delivery
  KVM: MMU: Eliminate pointless temporary 'ac'
  KVM: MMU: Avoid access/dirty update loop if all is well
  KVM: MMU: Eliminate eperm temporary
  KVM: MMU: Optimize is_last_gpte()
  KVM: MMU: Simplify walk_addr_generic() loop
  KVM: MMU: Optimize pte permission checks
  KVM: MMU: Update accessed and dirty bits after guest pagetable walk
  KVM: MMU: Move gpte_access() out of paging_tmpl.h
  KVM: MMU: Optimize gpte_access() slightly
  KVM: MMU: Push clean gpte write protection out of gpte_access()
  KVM: clarify kvmclock documentation
  KVM: make processes waiting on vcpu mutex killable
  KVM: SVM: Make use of asm.h
  KVM: VMX: Make use of asm.h
  KVM: VMX: Make lto-friendly
  KVM: x86: lapic: Clean up find_highest_vector() and count_vectors()
  ...

Conflicts:
	arch/s390/include/asm/processor.h
	arch/x86/kvm/i8259.c
parents ce57e981 3d11df7a
......@@ -857,7 +857,8 @@ struct kvm_userspace_memory_region {
};
/* for kvm_memory_region::flags */
#define KVM_MEM_LOG_DIRTY_PAGES 1UL
#define KVM_MEM_LOG_DIRTY_PAGES (1UL << 0)
#define KVM_MEM_READONLY (1UL << 1)
This ioctl allows the user to create or modify a guest physical memory
slot. When changing an existing slot, it may be moved in the guest
......@@ -873,14 +874,17 @@ It is recommended that the lower 21 bits of guest_phys_addr and userspace_addr
be identical. This allows large pages in the guest to be backed by large
pages in the host.
The flags field supports just one flag, KVM_MEM_LOG_DIRTY_PAGES, which
instructs kvm to keep track of writes to memory within the slot. See
the KVM_GET_DIRTY_LOG ioctl.
The flags field supports two flags. KVM_MEM_LOG_DIRTY_PAGES instructs
kvm to keep track of writes to memory within the slot; see the KVM_GET_DIRTY_LOG
ioctl. The KVM_CAP_READONLY_MEM capability indicates the availability of the
KVM_MEM_READONLY flag. When this flag is set for a memory region, KVM only
allows read accesses. Writes will be posted to userspace as KVM_EXIT_MMIO
exits.
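For illustration only (not part of this change), registering such a read-only
slot from userspace could look roughly like the sketch below; the slot number,
addresses and the vm_fd/rom_backing names are made-up placeholders, and
KVM_CAP_READONLY_MEM should be checked before using the flag:

    struct kvm_userspace_memory_region region = {
        .slot            = 3,                    /* example slot number */
        .flags           = KVM_MEM_READONLY,     /* guest writes exit as KVM_EXIT_MMIO */
        .guest_phys_addr = 0xffff0000,           /* example guest physical base */
        .memory_size     = 0x10000,
        .userspace_addr  = (__u64)rom_backing,   /* host mapping of the ROM image */
    };

    if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0)
        perror("KVM_SET_USER_MEMORY_REGION");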
When the KVM_CAP_SYNC_MMU capability, changes in the backing of the memory
region are automatically reflected into the guest. For example, an mmap()
that affects the region will be made visible immediately. Another example
is madvise(MADV_DROP).
When the KVM_CAP_SYNC_MMU capability is available, changes in the backing of
the memory region are automatically reflected into the guest. For example, an
mmap() that affects the region will be made visible immediately. Another
example is madvise(MADV_DROP).
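As an illustrative sketch (not from this patch; addr, size and new_fd stand in
for the values used when the slot was registered), replacing the backing of an
existing slot might look like:

    /* remap the range previously passed as userspace_addr/memory_size */
    void *p = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, new_fd, 0);
    if (p == MAP_FAILED)
        perror("mmap");
    /* with KVM_CAP_SYNC_MMU the guest now sees the new backing immediately */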
It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl.
The KVM_SET_MEMORY_REGION does not allow fine grained control over memory
......@@ -1946,6 +1950,19 @@ the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.
With KVM_CAP_IRQFD_RESAMPLE, KVM_IRQFD supports a de-assert and notify
mechanism allowing emulation of level-triggered, irqfd-based
interrupts. When KVM_IRQFD_FLAG_RESAMPLE is set the user must pass an
additional eventfd in the kvm_irqfd.resamplefd field. When operating
in resample mode, posting of an interrupt through kvm_irqfd.fd asserts
the specified gsi in the irqchip. When the irqchip is resampled, such
as from an EOI, the gsi is de-asserted and the user is notified via
kvm_irqfd.resamplefd. It is the user's responsibility to re-queue
the interrupt if the device making use of it still requires service.
Note that closing the resamplefd is not sufficient to disable the
irqfd. The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
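A minimal assignment sketch (vm_fd and the gsi value are assumptions made for
the example) could look like:

    int irqfd = eventfd(0, EFD_CLOEXEC);      /* signalled to assert the gsi */
    int resamplefd = eventfd(0, EFD_CLOEXEC); /* signalled on EOI, after de-assert */

    struct kvm_irqfd data = {
        .fd         = irqfd,
        .gsi        = 10,                        /* example gsi */
        .flags      = KVM_IRQFD_FLAG_RESAMPLE,   /* only needed on assignment */
        .resamplefd = resamplefd,
    };

    if (ioctl(vm_fd, KVM_IRQFD, &data) < 0)
        perror("KVM_IRQFD");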
4.76 KVM_PPC_ALLOCATE_HTAB
Capability: KVM_CAP_PPC_ALLOC_HTAB
......
Linux KVM Hypercall:
===================
X86:
KVM hypercalls use a three-byte sequence: either the vmcall or the vmmcall
instruction. The hypervisor can replace it with instructions that are
guaranteed to be supported.
Up to four arguments may be passed in rbx, rcx, rdx, and rsi respectively.
The hypercall number should be placed in rax and the return value will be
placed in rax. No other registers will be clobbered unless explicitly stated
by the particular hypercall.
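As a guest-side sketch of that convention (mirroring the kernel's kvm_hypercall
helpers, not code added by this series), a four-argument hypercall can be issued
like this:

    static inline long kvm_hypercall4(unsigned int nr, unsigned long p1,
                                      unsigned long p2, unsigned long p3,
                                      unsigned long p4)
    {
        long ret;

        /* KVM_HYPERCALL is the three-byte vmcall sequence; the host may
         * patch it to vmmcall on AMD hardware. */
        asm volatile(KVM_HYPERCALL
                     : "=a"(ret)
                     : "a"(nr), "b"(p1), "c"(p2), "d"(p3), "S"(p4)
                     : "memory");
        return ret;
    }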
S390:
S390 uses the diagnose instruction (0x500) as its hypercall. The hypercall
number is passed in R1 and parameters 1-6 in R2-R7. The return value is
written to R2.
PowerPC:
Parameters are passed in R3-R10 and the hypercall number in R11. R4-R11 are
used as output registers and the return value is placed in R3.
KVM hypercalls use a 4-byte opcode sequence that is patched via the
'hypercall-instructions' property inside the device tree's /hypervisor node.
For more information refer to Documentation/virtual/kvm/ppc-pv.txt
KVM Hypercalls Documentation
===========================
The template for each hypercall is:
1. Hypercall name.
2. Architecture(s)
3. Status (deprecated, obsolete, active)
4. Purpose
1. KVM_HC_VAPIC_POLL_IRQ
------------------------
Architecture: x86
Status: active
Purpose: Trigger guest exit so that the host can check for pending
interrupts on reentry.
2. KVM_HC_MMU_OP
------------------------
Architecture: x86
Status: deprecated.
Purpose: Support MMU operations such as writing to PTE,
flushing TLB, release PT.
3. KVM_HC_FEATURES
------------------------
Architecture: PPC
Status: active
Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
is used to enumerate which hypercalls are available. On PPC, either a device
tree based lookup (which is also what ePAPR dictates) or the KVM-specific
enumeration mechanism (this hypercall) can be used.
4. KVM_HC_PPC_MAP_MAGIC_PAGE
------------------------
Architecture: PPC
Status: active
Purpose: To enable communication between the hypervisor and guest, there is a
shared page that contains parts of supervisor-visible register state.
The guest can map this shared page to access its supervisor registers through
memory using this hypercall.
......@@ -34,9 +34,12 @@ MSR_KVM_WALL_CLOCK_NEW: 0x4b564d00
time information and check that they are both equal and even.
An odd version indicates an in-progress update.
sec: number of seconds for wallclock.
sec: number of seconds for wallclock at time of boot.
nsec: number of nanoseconds for wallclock.
nsec: number of nanoseconds for wallclock at time of boot.
In order to get the current wallclock time, the system_time from
MSR_KVM_SYSTEM_TIME_NEW needs to be added.
Note that although MSRs are per-CPU entities, the effect of this
particular MSR is global.
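Expressed as a guest-side sketch (assuming the wall clock structure and the
per-CPU system time have already been read out):

    /* wall_clock.sec/nsec hold the wall clock at boot; system_time_ns is the
     * value derived from MSR_KVM_SYSTEM_TIME_NEW */
    u64 now_ns = (u64)wall_clock.sec * NSEC_PER_SEC + wall_clock.nsec +
                 system_time_ns;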
......@@ -82,20 +85,25 @@ MSR_KVM_SYSTEM_TIME_NEW: 0x4b564d01
time at the time this structure was last updated. Unit is
nanoseconds.
tsc_to_system_mul: a function of the tsc frequency. One has
to multiply any tsc-related quantity by this value to get
a value in nanoseconds, besides dividing by 2^tsc_shift
tsc_to_system_mul: multiplier to be used when converting
tsc-related quantity to nanoseconds
tsc_shift: cycle to nanosecond divider, as a power of two, to
allow for shift rights. One has to shift right any tsc-related
quantity by this value to get a value in nanoseconds, besides
multiplying by tsc_to_system_mul.
tsc_shift: shift to be used when converting tsc-related
quantity to nanoseconds. This shift will ensure that
multiplication with tsc_to_system_mul does not overflow.
A positive value denotes a left shift, a negative value
a right shift.
With this information, guests can derive per-CPU time by
doing:
The conversion from tsc to nanoseconds involves an additional
right shift by 32 bits. With this information, guests can
derive per-CPU time by doing:
time = (current_tsc - tsc_timestamp)
time = (time * tsc_to_system_mul) >> tsc_shift
if (tsc_shift >= 0)
time <<= tsc_shift;
else
time >>= -tsc_shift;
time = (time * tsc_to_system_mul) >> 32
time = time + system_time
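A C rendering of that algorithm (a sketch only; hv_clock stands for the guest's
copy of struct pvclock_vcpu_time_info) would be:

    u64 delta = current_tsc - hv_clock.tsc_timestamp;

    if (hv_clock.tsc_shift >= 0)
        delta <<= hv_clock.tsc_shift;
    else
        delta >>= -hv_clock.tsc_shift;

    /* 64x32 multiply at full precision, then drop the lower 32 bits */
    time_ns = (u64)(((unsigned __int128)delta * hv_clock.tsc_to_system_mul) >> 32);
    time_ns += hv_clock.system_time;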
flags: bits in this field indicate extended capabilities
......
......@@ -174,3 +174,25 @@ following:
That way we can inject an arbitrary amount of code as replacement for a single
instruction. This allows us to check for pending interrupts when setting EE=1
for example.
Hypercall ABIs in KVM on PowerPC
=================================
1) KVM hypercalls (ePAPR)
This is the ePAPR-compliant hypercall implementation (mentioned above). Even
generic hypercalls are implemented here, like the ePAPR idle hcall. These are
available on all targets.
2) PAPR hypercalls
PAPR hypercalls are needed to run server PowerPC PAPR guests (-M pseries in QEMU).
These are the same hypercalls that pHyp, the POWER hypervisor, implements. Some of
them are handled in the kernel, some are handled in user space. This is only
available on book3s_64.
3) OSI hypercalls
Mac-on-Linux is another user of KVM on PowerPC, which has had its own hypercall
interface since long before KVM. This is supported to maintain compatibility. All these hypercalls get
forwarded to user space. This is only useful on book3s_32, but can be used with
book3s_64 as well.
......@@ -924,6 +924,16 @@ int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
return 0;
}
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event)
{
if (!irqchip_in_kernel(kvm))
return -ENXIO;
irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
irq_event->irq, irq_event->level);
return 0;
}
long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
......@@ -963,29 +973,6 @@ long kvm_arch_vm_ioctl(struct file *filp,
goto out;
}
break;
case KVM_IRQ_LINE_STATUS:
case KVM_IRQ_LINE: {
struct kvm_irq_level irq_event;
r = -EFAULT;
if (copy_from_user(&irq_event, argp, sizeof irq_event))
goto out;
r = -ENXIO;
if (irqchip_in_kernel(kvm)) {
__s32 status;
status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
irq_event.irq, irq_event.level);
if (ioctl == KVM_IRQ_LINE_STATUS) {
r = -EFAULT;
irq_event.status = status;
if (copy_to_user(argp, &irq_event,
sizeof irq_event))
goto out;
}
r = 0;
}
break;
}
case KVM_GET_IRQCHIP: {
/* 0: PIC master, 1: PIC slave, 2: IOAPIC */
struct kvm_irqchip chip;
......@@ -1626,11 +1613,17 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
return;
}
void kvm_arch_flush_shadow(struct kvm *kvm)
void kvm_arch_flush_shadow_all(struct kvm *kvm)
{
kvm_flush_remote_tlbs(kvm);
}
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot)
{
kvm_arch_flush_shadow_all();
}
long kvm_arch_dev_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
{
......
......@@ -53,6 +53,8 @@
struct kvm;
extern int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
extern int kvm_unmap_hva_range(struct kvm *kvm,
unsigned long start, unsigned long end);
extern int kvm_age_hva(struct kvm *kvm, unsigned long hva);
extern int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
extern void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
......@@ -220,6 +222,7 @@ struct revmap_entry {
#define KVMPPC_GOT_PAGE 0x80
struct kvm_arch_memory_slot {
unsigned long *rmap;
};
struct kvm_arch {
......
......@@ -319,7 +319,6 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 gvaddr, gpa_t gpaddr,
if (is_error_page(new_page)) {
printk(KERN_ERR "Couldn't get guest page for gfn %llx!\n",
(unsigned long long)gfn);
kvm_release_page_clean(new_page);
return;
}
hpaddr = page_to_phys(new_page);
......
......@@ -705,7 +705,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
goto out_unlock;
hpte[0] = (hpte[0] & ~HPTE_V_ABSENT) | HPTE_V_VALID;
rmap = &memslot->rmap[gfn - memslot->base_gfn];
rmap = &memslot->arch.rmap[gfn - memslot->base_gfn];
lock_rmap(rmap);
/* Check if we might have been invalidated; let the guest retry if so */
......@@ -756,8 +756,11 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct kvm_vcpu *vcpu,
goto out_put;
}
static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
int (*handler)(struct kvm *kvm, unsigned long *rmapp,
static int kvm_handle_hva_range(struct kvm *kvm,
unsigned long start,
unsigned long end,
int (*handler)(struct kvm *kvm,
unsigned long *rmapp,
unsigned long gfn))
{
int ret;
......@@ -767,15 +770,25 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
slots = kvm_memslots(kvm);
kvm_for_each_memslot(memslot, slots) {
unsigned long start = memslot->userspace_addr;
unsigned long end;
unsigned long hva_start, hva_end;
gfn_t gfn, gfn_end;
end = start + (memslot->npages << PAGE_SHIFT);
if (hva >= start && hva < end) {
gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
hva_start = max(start, memslot->userspace_addr);
hva_end = min(end, memslot->userspace_addr +
(memslot->npages << PAGE_SHIFT));
if (hva_start >= hva_end)
continue;
/*
* {gfn(page) | page intersects with [hva_start, hva_end)} =
* {gfn, gfn+1, ..., gfn_end-1}.
*/
gfn = hva_to_gfn_memslot(hva_start, memslot);
gfn_end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, memslot);
ret = handler(kvm, &memslot->rmap[gfn_offset],
memslot->base_gfn + gfn_offset);
for (; gfn < gfn_end; ++gfn) {
gfn_t gfn_offset = gfn - memslot->base_gfn;
ret = handler(kvm, &memslot->arch.rmap[gfn_offset], gfn);
retval |= ret;
}
}
......@@ -783,6 +796,13 @@ static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
return retval;
}
static int kvm_handle_hva(struct kvm *kvm, unsigned long hva,
int (*handler)(struct kvm *kvm, unsigned long *rmapp,
unsigned long gfn))
{
return kvm_handle_hva_range(kvm, hva, hva + 1, handler);
}
static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
unsigned long gfn)
{
......@@ -850,6 +870,13 @@ int kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
return 0;
}
int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end)
{
if (kvm->arch.using_mmu_notifiers)
kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
return 0;
}
static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp,
unsigned long gfn)
{
......@@ -1009,7 +1036,7 @@ long kvmppc_hv_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
unsigned long *rmapp, *map;
preempt_disable();
rmapp = memslot->rmap;
rmapp = memslot->arch.rmap;
map = memslot->dirty_bitmap;
for (i = 0; i < memslot->npages; ++i) {
if (kvm_test_clear_dirty(kvm, rmapp))
......
......@@ -84,7 +84,7 @@ static void remove_revmap_chain(struct kvm *kvm, long pte_index,
if (!memslot || (memslot->flags & KVM_MEMSLOT_INVALID))
return;
rmap = real_vmalloc_addr(&memslot->rmap[gfn - memslot->base_gfn]);
rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
lock_rmap(rmap);
head = *rmap & KVMPPC_RMAP_INDEX;
......@@ -180,7 +180,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
if (!slot_is_aligned(memslot, psize))
return H_PARAMETER;
slot_fn = gfn - memslot->base_gfn;
rmap = &memslot->rmap[slot_fn];
rmap = &memslot->arch.rmap[slot_fn];
if (!kvm->arch.using_mmu_notifiers) {
physp = kvm->arch.slot_phys[memslot->id];
......@@ -197,7 +197,7 @@ long kvmppc_h_enter(struct kvm_vcpu *vcpu, unsigned long flags,
pa &= PAGE_MASK;
} else {
/* Translate to host virtual address */
hva = gfn_to_hva_memslot(memslot, gfn);
hva = __gfn_to_hva_memslot(memslot, gfn);
/* Look up the Linux PTE for the backing page */
pte_size = psize;
......
......@@ -242,10 +242,8 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
int i;
hpage = gfn_to_page(vcpu->kvm, pte->raddr >> PAGE_SHIFT);
if (is_error_page(hpage)) {
kvm_release_page_clean(hpage);
if (is_error_page(hpage))
return;
}
hpage_offset = pte->raddr & ~PAGE_MASK;
hpage_offset &= ~0xFFFULL;
......
......@@ -520,11 +520,10 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500,
if (likely(!pfnmap)) {
unsigned long tsize_pages = 1 << (tsize + 10 - PAGE_SHIFT);
pfn = gfn_to_pfn_memslot(vcpu_e500->vcpu.kvm, slot, gfn);
pfn = gfn_to_pfn_memslot(slot, gfn);
if (is_error_pfn(pfn)) {
printk(KERN_ERR "Couldn't get real page for gfn %lx!\n",
(long)gfn);
kvm_release_pfn_clean(pfn);
return;
}
......
......@@ -302,10 +302,18 @@ long kvm_arch_dev_ioctl(struct file *filp,
void kvm_arch_free_memslot(struct kvm_memory_slot *free,
struct kvm_memory_slot *dont)
{
if (!dont || free->arch.rmap != dont->arch.rmap) {
vfree(free->arch.rmap);
free->arch.rmap = NULL;
}
}
int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
{
slot->arch.rmap = vzalloc(npages * sizeof(*slot->arch.rmap));
if (!slot->arch.rmap)
return -ENOMEM;
return 0;
}
......@@ -326,8 +334,12 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
kvmppc_core_commit_memory_region(kvm, mem);
}
void kvm_arch_flush_shadow_all(struct kvm *kvm)
{
}
void kvm_arch_flush_shadow(struct kvm *kvm)
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot)
{
}
......
......@@ -159,6 +159,7 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
extern void show_code(struct pt_regs *regs);
extern void print_fn_code(unsigned char *code, unsigned long len);
extern int insn_to_mnemonic(unsigned char *instruction, char buf[8]);
unsigned long get_wchan(struct task_struct *p);
#define task_pt_regs(tsk) ((struct pt_regs *) \
......
......@@ -1501,6 +1501,33 @@ static struct insn *find_insn(unsigned char *code)
return NULL;
}
/**
* insn_to_mnemonic - decode an s390 instruction
* @instruction: instruction to decode
* @buf: buffer to fill with mnemonic
*
* Decode the instruction at @instruction and store the corresponding
* mnemonic into @buf.
* @buf is left unchanged if the instruction could not be decoded.
* Returns:
* %0 on success, %-ENOENT if the instruction was not found.
*/
int insn_to_mnemonic(unsigned char *instruction, char buf[8])
{
struct insn *insn;
insn = find_insn(instruction);
if (!insn)
return -ENOENT;
if (insn->name[0] == '\0')
snprintf(buf, sizeof(buf), "%s",
long_insn_name[(int) insn->name[1]]);
else
snprintf(buf, sizeof(buf), "%.5s", insn->name);
return 0;
}
EXPORT_SYMBOL_GPL(insn_to_mnemonic);
static int print_insn(char *buffer, unsigned char *code, unsigned long addr)
{
struct insn *insn;
......
......@@ -21,6 +21,7 @@ config KVM
depends on HAVE_KVM && EXPERIMENTAL
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_CPU_RELAX_INTERCEPT
---help---
Support hosting paravirtualized guest machines using the SIE
virtualization capability on the mainframe. This should work
......
......@@ -14,6 +14,8 @@
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include "kvm-s390.h"
#include "trace.h"
#include "trace-s390.h"
static int diag_release_pages(struct kvm_vcpu *vcpu)
{
......@@ -98,6 +100,7 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
vcpu->run->exit_reason = KVM_EXIT_S390_RESET;
VCPU_EVENT(vcpu, 3, "requesting userspace resets %llx",
vcpu->run->s390_reset_flags);
trace_kvm_s390_request_resets(vcpu->run->s390_reset_flags);
return -EREMOTE;
}
......@@ -105,6 +108,7 @@ int kvm_s390_handle_diag(struct kvm_vcpu *vcpu)
{
int code = (vcpu->arch.sie_block->ipb & 0xfff0000) >> 16;
trace_kvm_s390_handle_diag(vcpu, code);
switch (code) {
case 0x10:
return diag_release_pages(vcpu);
......
......@@ -19,6 +19,8 @@
#include "kvm-s390.h"
#include "gaccess.h"
#include "trace.h"
#include "trace-s390.h"
static int handle_lctlg(struct kvm_vcpu *vcpu)
{
......@@ -45,6 +47,7 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
VCPU_EVENT(vcpu, 5, "lctlg r1:%x, r3:%x,b2:%x,d2:%x", reg1, reg3, base2,
disp2);
trace_kvm_s390_handle_lctl(vcpu, 1, reg1, reg3, useraddr);
do {
rc = get_guest_u64(vcpu, useraddr,
......@@ -82,6 +85,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
VCPU_EVENT(vcpu, 5, "lctl r1:%x, r3:%x,b2:%x,d2:%x", reg1, reg3, base2,
disp2);
trace_kvm_s390_handle_lctl(vcpu, 0, reg1, reg3, useraddr);
reg = reg1;
do {
......@@ -135,6 +139,8 @@ static int handle_stop(struct kvm_vcpu *vcpu)
vcpu->stat.exit_stop_request++;
spin_lock_bh(&vcpu->arch.local_int.lock);
trace_kvm_s390_stop_request(vcpu->arch.local_int.action_bits);
if (vcpu->arch.local_int.action_bits & ACTION_RELOADVCPU_ON_STOP) {
vcpu->arch.local_int.action_bits &= ~ACTION_RELOADVCPU_ON_STOP;
rc = SIE_INTERCEPT_RERUNVCPU;
......@@ -171,6 +177,7 @@ static int handle_validity(struct kvm_vcpu *vcpu)
int rc;
vcpu->stat.exit_validity++;
trace_kvm_s390_intercept_validity(vcpu, viwhy);
if (viwhy == 0x37) {
vmaddr = gmap_fault(vcpu->arch.sie_block->prefix,
vcpu->arch.gmap);
......@@ -213,6 +220,9 @@ static int handle_instruction(struct kvm_vcpu *vcpu)
intercept_handler_t handler;
vcpu->stat.exit_instruction++;
trace_kvm_s390_intercept_instruction(vcpu,
vcpu->arch.sie_block->ipa,
vcpu->arch.sie_block->ipb);
handler = instruction_handlers[vcpu->arch.sie_block->ipa >> 8];
if (handler)
return handler(vcpu);
......@@ -222,6 +232,7 @@ static int handle_instruction(struct kvm_vcpu *vcpu)
static int handle_prog(struct kvm_vcpu *vcpu)
{
vcpu->stat.exit_program_interruption++;
trace_kvm_s390_intercept_prog(vcpu, vcpu->arch.sie_block->iprcc);
return kvm_s390_inject_program_int(vcpu, vcpu->arch.sie_block->iprcc);
}
......
......@@ -19,6 +19,7 @@
#include <asm/uaccess.h>
#include "kvm-s390.h"
#include "gaccess.h"
#include "trace-s390.h"
static int psw_extint_disabled(struct kvm_vcpu *vcpu)
{
......@@ -130,6 +131,8 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
case KVM_S390_INT_EMERGENCY:
VCPU_EVENT(vcpu, 4, "%s", "interrupt: sigp emerg");
vcpu->stat.deliver_emergency_signal++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->emerg.code, 0);
rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x1201);
if (rc == -EFAULT)
exception = 1;
......@@ -152,6 +155,8 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
case KVM_S390_INT_EXTERNAL_CALL:
VCPU_EVENT(vcpu, 4, "%s", "interrupt: sigp ext call");
vcpu->stat.deliver_external_call++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->extcall.code, 0);
rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x1202);
if (rc == -EFAULT)
exception = 1;
......@@ -175,6 +180,8 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
VCPU_EVENT(vcpu, 4, "interrupt: sclp parm:%x",
inti->ext.ext_params);
vcpu->stat.deliver_service_signal++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->ext.ext_params, 0);
rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x2401);
if (rc == -EFAULT)
exception = 1;
......@@ -198,6 +205,9 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
VCPU_EVENT(vcpu, 4, "interrupt: virtio parm:%x,parm64:%llx",
inti->ext.ext_params, inti->ext.ext_params2);
vcpu->stat.deliver_virtio_interrupt++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->ext.ext_params,
inti->ext.ext_params2);
rc = put_guest_u16(vcpu, __LC_EXT_INT_CODE, 0x2603);
if (rc == -EFAULT)
exception = 1;
......@@ -229,6 +239,8 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
case KVM_S390_SIGP_STOP:
VCPU_EVENT(vcpu, 4, "%s", "interrupt: cpu stop");
vcpu->stat.deliver_stop_signal++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
0, 0);
__set_intercept_indicator(vcpu, inti);
break;
......@@ -236,12 +248,16 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
VCPU_EVENT(vcpu, 4, "interrupt: set prefix to %x",
inti->prefix.address);
vcpu->stat.deliver_prefix_signal++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->prefix.address, 0);
kvm_s390_set_prefix(vcpu, inti->prefix.address);
break;
case KVM_S390_RESTART:
VCPU_EVENT(vcpu, 4, "%s", "interrupt: cpu restart");
vcpu->stat.deliver_restart_signal++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
0, 0);
rc = copy_to_guest(vcpu, offsetof(struct _lowcore,
restart_old_psw), &vcpu->arch.sie_block->gpsw, sizeof(psw_t));
if (rc == -EFAULT)
......@@ -259,6 +275,8 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
inti->pgm.code,
table[vcpu->arch.sie_block->ipa >> 14]);
vcpu->stat.deliver_program_int++;
trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, inti->type,
inti->pgm.code, 0);
rc = put_guest_u16(vcpu, __LC_PGM_INT_CODE, inti->pgm.code);
if (rc == -EFAULT)
exception = 1;
......@@ -405,9 +423,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
set_current_state(TASK_INTERRUPTIBLE);
spin_unlock_bh(&vcpu->arch.local_int.lock);
spin_unlock(&vcpu->arch.local_int.float_int->lock);
vcpu_put(vcpu);
schedule();
vcpu_load(vcpu);
spin_lock(&vcpu->arch.local_int.float_int->lock);
spin_lock_bh(&vcpu->arch.local_int.lock);
}
......@@ -515,6 +531,7 @@ int kvm_s390_inject_program_int(struct kvm_vcpu *vcpu, u16 code)
inti->pgm.code = code;
VCPU_EVENT(vcpu, 3, "inject: program check %d (from kernel)", code);
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, inti->type, code, 0, 1);
spin_lock_bh(&li->lock);
list_add(&inti->list, &li->list);
atomic_set(&li->active, 1);
......@@ -556,6 +573,8 @@ int kvm_s390_inject_vm(struct kvm *kvm,
kfree(inti);
return -EINVAL;
}
trace_kvm_s390_inject_vm(s390int->type, s390int->parm, s390int->parm64,
2);
mutex_lock(&kvm->lock);
fi = &kvm->arch.float_int;
......@@ -621,6 +640,8 @@ int kvm_s390_inject_vcpu(struct kvm_vcpu *vcpu,
kfree(inti);
return -EINVAL;
}
trace_kvm_s390_inject_vcpu(vcpu->vcpu_id, s390int->type, s390int->parm,
s390int->parm64, 2);
mutex_lock(&vcpu->kvm->lock);
li = &vcpu->arch.local_int;
......
......@@ -32,6 +32,10 @@
#include "kvm-s390.h"
#include "gaccess.h"
#define CREATE_TRACE_POINTS
#include "trace.h"
#include "trace-s390.h"
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU
struct kvm_stats_debugfs_item debugfs_entries[] = {
......@@ -242,6 +246,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
{
VCPU_EVENT(vcpu, 3, "%s", "free cpu");
trace_kvm_s390_destroy_vcpu(vcpu->vcpu_id);
if (!kvm_is_ucontrol(vcpu->kvm)) {
clear_bit(63 - vcpu->vcpu_id,
(unsigned long *) &vcpu->kvm->arch.sca->mcn);
......@@ -417,6 +422,7 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
goto out_free_sie_block;
VM_EVENT(kvm, 3, "create cpu %d at %p, sie block at %p", id, vcpu,
vcpu->arch.sie_block);
trace_kvm_s390_create_vcpu(id, vcpu, vcpu->arch.sie_block);
return vcpu;
out_free_sie_block:
......@@ -607,18 +613,22 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
local_irq_enable();
VCPU_EVENT(vcpu, 6, "entering sie flags %x",
atomic_read(&vcpu->arch.sie_block->cpuflags));
trace_kvm_s390_sie_enter(vcpu,
atomic_read(&vcpu->arch.sie_block->cpuflags));
rc = sie64a(vcpu->arch.sie_block, vcpu->run->s.regs.gprs);
if (rc) {
if (kvm_is_ucontrol(vcpu->kvm)) {
rc = SIE_INTERCEPT_UCONTROL;
} else {
VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction");
trace_kvm_s390_sie_fault(vcpu);
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
rc = 0;
}
}
VCPU_EVENT(vcpu, 6, "exit sie icptcode %d",
vcpu->arch.sie_block->icptcode);
trace_kvm_s390_sie_exit(vcpu, vcpu->arch.sie_block->icptcode);
local_irq_disable();
kvm_guest_exit();
local_irq_enable();
......@@ -959,7 +969,12 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
return;
}
void kvm_arch_flush_shadow(struct kvm *kvm)
void kvm_arch_flush_shadow_all(struct kvm *kvm)
{
}
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot)
{
}
......
......@@ -20,6 +20,7 @@
#include <asm/sysinfo.h>
#include "gaccess.h"
#include "kvm-s390.h"
#include "trace.h"
static int handle_set_prefix(struct kvm_vcpu *vcpu)
{
......@@ -59,6 +60,7 @@ static int handle_set_prefix(struct kvm_vcpu *vcpu)
kvm_s390_set_prefix(vcpu, address);
VCPU_EVENT(vcpu, 5, "setting prefix to %x", address);
trace_kvm_s390_handle_prefix(vcpu, 1, address);
out:
return 0;
}
......@@ -91,6 +93,7 @@ static int handle_store_prefix(struct kvm_vcpu *vcpu)
}
VCPU_EVENT(vcpu, 5, "storing prefix to %x", address);
trace_kvm_s390_handle_prefix(vcpu, 0, address);
out:
return 0;
}
......@@ -119,6 +122,7 @@ static int handle_store_cpu_address(struct kvm_vcpu *vcpu)
}
VCPU_EVENT(vcpu, 5, "storing cpu address to %llx", useraddr);
trace_kvm_s390_handle_stap(vcpu, useraddr);
out:
return 0;
}
......@@ -164,9 +168,11 @@ static int handle_stfl(struct kvm_vcpu *vcpu)
&facility_list, sizeof(facility_list));
if (rc == -EFAULT)
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
else
else {
VCPU_EVENT(vcpu, 5, "store facility list value %x",
facility_list);
trace_kvm_s390_handle_stfl(vcpu, facility_list);
}
return 0;
}
......@@ -278,6 +284,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
goto out_mem;
}
trace_kvm_s390_handle_stsi(vcpu, fc, sel1, sel2, operand2);
free_page(mem);
vcpu->arch.sie_block->gpsw.mask &= ~(3ul << 44);
vcpu->run->s.regs.gprs[0] = 0;
......
......@@ -18,6 +18,7 @@
#include <asm/sigp.h>
#include "gaccess.h"
#include "kvm-s390.h"
#include "trace.h"
static int __sigp_sense(struct kvm_vcpu *vcpu, u16 cpu_addr,
u64 *reg)
......@@ -344,6 +345,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
else
parameter = vcpu->run->s.regs.gprs[r1 + 1];
trace_kvm_s390_handle_sigp(vcpu, order_code, cpu_addr, parameter);
switch (order_code) {
case SIGP_SENSE:
vcpu->stat.instruction_sigp_sense++;
......
#if !defined(_TRACE_KVMS390_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_KVMS390_H
#include <linux/tracepoint.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm-s390
#define TRACE_INCLUDE_PATH .
#undef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_FILE trace-s390
/*
* Trace point for the creation of the kvm instance.
*/
TRACE_EVENT(kvm_s390_create_vm,
TP_PROTO(unsigned long type),
TP_ARGS(type),
TP_STRUCT__entry(
__field(unsigned long, type)
),
TP_fast_assign(
__entry->type = type;
),
TP_printk("create vm%s",
__entry->type & KVM_VM_S390_UCONTROL ? " (UCONTROL)" : "")
);
/*
* Trace points for creation and destruction of vpcus.
*/
TRACE_EVENT(kvm_s390_create_vcpu,
TP_PROTO(unsigned int id, struct kvm_vcpu *vcpu,
struct kvm_s390_sie_block *sie_block),
TP_ARGS(id, vcpu, sie_block),
TP_STRUCT__entry(
__field(unsigned int, id)
__field(struct kvm_vcpu *, vcpu)
__field(struct kvm_s390_sie_block *, sie_block)
),
TP_fast_assign(
__entry->id = id;
__entry->vcpu = vcpu;
__entry->sie_block = sie_block;
),
TP_printk("create cpu %d at %p, sie block at %p", __entry->id,
__entry->vcpu, __entry->sie_block)
);
TRACE_EVENT(kvm_s390_destroy_vcpu,
TP_PROTO(unsigned int id),
TP_ARGS(id),
TP_STRUCT__entry(
__field(unsigned int, id)
),
TP_fast_assign(
__entry->id = id;
),
TP_printk("destroy cpu %d", __entry->id)
);
/*
* Trace points for injection of interrupts, either per machine or
* per vcpu.
*/
#define kvm_s390_int_type \
{KVM_S390_SIGP_STOP, "sigp stop"}, \
{KVM_S390_PROGRAM_INT, "program interrupt"}, \
{KVM_S390_SIGP_SET_PREFIX, "sigp set prefix"}, \
{KVM_S390_RESTART, "sigp restart"}, \
{KVM_S390_INT_VIRTIO, "virtio interrupt"}, \
{KVM_S390_INT_SERVICE, "sclp interrupt"}, \
{KVM_S390_INT_EMERGENCY, "sigp emergency"}, \
{KVM_S390_INT_EXTERNAL_CALL, "sigp ext call"}
TRACE_EVENT(kvm_s390_inject_vm,
TP_PROTO(__u64 type, __u32 parm, __u64 parm64, int who),
TP_ARGS(type, parm, parm64, who),
TP_STRUCT__entry(
__field(__u32, inttype)
__field(__u32, parm)
__field(__u64, parm64)
__field(int, who)
),
TP_fast_assign(
__entry->inttype = type & 0x00000000ffffffff;
__entry->parm = parm;
__entry->parm64 = parm64;
__entry->who = who;
),
TP_printk("inject%s: type:%x (%s) parm:%x parm64:%llx",
(__entry->who == 1) ? " (from kernel)" :
(__entry->who == 2) ? " (from user)" : "",
__entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->parm, __entry->parm64)
);
TRACE_EVENT(kvm_s390_inject_vcpu,
TP_PROTO(unsigned int id, __u64 type, __u32 parm, __u64 parm64, \
int who),
TP_ARGS(id, type, parm, parm64, who),
TP_STRUCT__entry(
__field(int, id)
__field(__u32, inttype)
__field(__u32, parm)
__field(__u64, parm64)
__field(int, who)
),
TP_fast_assign(
__entry->id = id;
__entry->inttype = type & 0x00000000ffffffff;
__entry->parm = parm;
__entry->parm64 = parm64;
__entry->who = who;
),
TP_printk("inject%s (vcpu %d): type:%x (%s) parm:%x parm64:%llx",
(__entry->who == 1) ? " (from kernel)" :
(__entry->who == 2) ? " (from user)" : "",
__entry->id, __entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->parm, __entry->parm64)
);
/*
* Trace point for the actual delivery of interrupts.
*/
TRACE_EVENT(kvm_s390_deliver_interrupt,
TP_PROTO(unsigned int id, __u64 type, __u32 data0, __u64 data1),
TP_ARGS(id, type, data0, data1),
TP_STRUCT__entry(
__field(int, id)
__field(__u32, inttype)
__field(__u32, data0)
__field(__u64, data1)
),
TP_fast_assign(
__entry->id = id;
__entry->inttype = type & 0x00000000ffffffff;
__entry->data0 = data0;
__entry->data1 = data1;
),
TP_printk("deliver interrupt (vcpu %d): type:%x (%s) " \
"data:%08x %016llx",
__entry->id, __entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->data0, __entry->data1)
);
/*
* Trace point for resets that may be requested from userspace.
*/
TRACE_EVENT(kvm_s390_request_resets,
TP_PROTO(__u64 resets),
TP_ARGS(resets),
TP_STRUCT__entry(
__field(__u64, resets)
),
TP_fast_assign(
__entry->resets = resets;
),
TP_printk("requesting userspace resets %llx",
__entry->resets)
);
/*
* Trace point for a vcpu's stop requests.
*/
TRACE_EVENT(kvm_s390_stop_request,
TP_PROTO(unsigned int action_bits),
TP_ARGS(action_bits),
TP_STRUCT__entry(
__field(unsigned int, action_bits)
),
TP_fast_assign(
__entry->action_bits = action_bits;
),
TP_printk("stop request, action_bits = %08x",
__entry->action_bits)
);
#endif /* _TRACE_KVMS390_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
#if !defined(_TRACE_KVM_H) || defined(TRACE_HEADER_MULTI_READ)
#define _TRACE_KVM_H
#include <linux/tracepoint.h>
#include <asm/sigp.h>
#include <asm/debug.h>
#undef TRACE_SYSTEM
#define TRACE_SYSTEM kvm
#define TRACE_INCLUDE_PATH .
#undef TRACE_INCLUDE_FILE
#define TRACE_INCLUDE_FILE trace
/*
* Helpers for vcpu-specific tracepoints containing the same information
* as s390dbf VCPU_EVENTs.
*/
#define VCPU_PROTO_COMMON struct kvm_vcpu *vcpu
#define VCPU_ARGS_COMMON vcpu
#define VCPU_FIELD_COMMON __field(int, id) \
__field(unsigned long, pswmask) \
__field(unsigned long, pswaddr)
#define VCPU_ASSIGN_COMMON do { \
__entry->id = vcpu->vcpu_id; \
__entry->pswmask = vcpu->arch.sie_block->gpsw.mask; \
__entry->pswaddr = vcpu->arch.sie_block->gpsw.addr; \
} while (0);
#define VCPU_TP_PRINTK(p_str, p_args...) \
TP_printk("%02d[%016lx-%016lx]: " p_str, __entry->id, \
__entry->pswmask, __entry->pswaddr, p_args)
/*
* Tracepoints for SIE entry and exit.
*/
TRACE_EVENT(kvm_s390_sie_enter,
TP_PROTO(VCPU_PROTO_COMMON, int cpuflags),
TP_ARGS(VCPU_ARGS_COMMON, cpuflags),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(int, cpuflags)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->cpuflags = cpuflags;
),
VCPU_TP_PRINTK("entering sie flags %x", __entry->cpuflags)
);
TRACE_EVENT(kvm_s390_sie_fault,
TP_PROTO(VCPU_PROTO_COMMON),
TP_ARGS(VCPU_ARGS_COMMON),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
),
VCPU_TP_PRINTK("%s", "fault in sie instruction")
);
#define sie_intercept_code \
{0x04, "Instruction"}, \
{0x08, "Program interruption"}, \
{0x0C, "Instruction and program interuption"}, \
{0x10, "External request"}, \
{0x14, "External interruption"}, \
{0x18, "I/O request"}, \
{0x1C, "Wait state"}, \
{0x20, "Validity"}, \
{0x28, "Stop request"}
TRACE_EVENT(kvm_s390_sie_exit,
TP_PROTO(VCPU_PROTO_COMMON, u8 icptcode),
TP_ARGS(VCPU_ARGS_COMMON, icptcode),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(u8, icptcode)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->icptcode = icptcode;
),
VCPU_TP_PRINTK("exit sie icptcode %d (%s)", __entry->icptcode,
__print_symbolic(__entry->icptcode,
sie_intercept_code))
);
/*
* Trace point for intercepted instructions.
*/
TRACE_EVENT(kvm_s390_intercept_instruction,
TP_PROTO(VCPU_PROTO_COMMON, __u16 ipa, __u32 ipb),
TP_ARGS(VCPU_ARGS_COMMON, ipa, ipb),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(__u64, instruction)
__field(char, insn[8])
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->instruction = ((__u64)ipa << 48) |
((__u64)ipb << 16);
),
VCPU_TP_PRINTK("intercepted instruction %016llx (%s)",
__entry->instruction,
insn_to_mnemonic((unsigned char *)
&__entry->instruction,
__entry->insn) ?
"unknown" : __entry->insn)
);
/*
* Trace point for intercepted program interruptions.
*/
TRACE_EVENT(kvm_s390_intercept_prog,
TP_PROTO(VCPU_PROTO_COMMON, __u16 code),
TP_ARGS(VCPU_ARGS_COMMON, code),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(__u16, code)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->code = code;
),
VCPU_TP_PRINTK("intercepted program interruption %04x",
__entry->code)
);
/*
* Trace point for validity intercepts.
*/
TRACE_EVENT(kvm_s390_intercept_validity,
TP_PROTO(VCPU_PROTO_COMMON, __u16 viwhy),
TP_ARGS(VCPU_ARGS_COMMON, viwhy),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(__u16, viwhy)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->viwhy = viwhy;
),
VCPU_TP_PRINTK("got validity intercept %04x", __entry->viwhy)
);
/*
* Trace points for instructions that are of special interest.
*/
#define sigp_order_codes \
{SIGP_SENSE, "sense"}, \
{SIGP_EXTERNAL_CALL, "external call"}, \
{SIGP_EMERGENCY_SIGNAL, "emergency signal"}, \
{SIGP_STOP, "stop"}, \
{SIGP_STOP_AND_STORE_STATUS, "stop and store status"}, \
{SIGP_SET_ARCHITECTURE, "set architecture"}, \
{SIGP_SET_PREFIX, "set prefix"}, \
{SIGP_SENSE_RUNNING, "sense running"}, \
{SIGP_RESTART, "restart"}
TRACE_EVENT(kvm_s390_handle_sigp,
TP_PROTO(VCPU_PROTO_COMMON, __u8 order_code, __u16 cpu_addr, \
__u32 parameter),
TP_ARGS(VCPU_ARGS_COMMON, order_code, cpu_addr, parameter),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(__u8, order_code)
__field(__u16, cpu_addr)
__field(__u32, parameter)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->order_code = order_code;
__entry->cpu_addr = cpu_addr;
__entry->parameter = parameter;
),
VCPU_TP_PRINTK("handle sigp order %02x (%s), cpu address %04x, " \
"parameter %08x", __entry->order_code,
__print_symbolic(__entry->order_code,
sigp_order_codes),
__entry->cpu_addr, __entry->parameter)
);
#define diagnose_codes \
{0x10, "release pages"}, \
{0x44, "time slice end"}, \
{0x308, "ipl functions"}, \
{0x500, "kvm hypercall"}, \
{0x501, "kvm breakpoint"}
TRACE_EVENT(kvm_s390_handle_diag,
TP_PROTO(VCPU_PROTO_COMMON, __u16 code),
TP_ARGS(VCPU_ARGS_COMMON, code),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(__u16, code)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->code = code;
),
VCPU_TP_PRINTK("handle diagnose call %04x (%s)", __entry->code,
__print_symbolic(__entry->code, diagnose_codes))
);
TRACE_EVENT(kvm_s390_handle_lctl,
TP_PROTO(VCPU_PROTO_COMMON, int g, int reg1, int reg3, u64 addr),
TP_ARGS(VCPU_ARGS_COMMON, g, reg1, reg3, addr),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(int, g)
__field(int, reg1)
__field(int, reg3)
__field(u64, addr)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->g = g;
__entry->reg1 = reg1;
__entry->reg3 = reg3;
__entry->addr = addr;
),
VCPU_TP_PRINTK("%s: loading cr %x-%x from %016llx",
__entry->g ? "lctlg" : "lctl",
__entry->reg1, __entry->reg3, __entry->addr)
);
TRACE_EVENT(kvm_s390_handle_prefix,
TP_PROTO(VCPU_PROTO_COMMON, int set, u32 address),
TP_ARGS(VCPU_ARGS_COMMON, set, address),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(int, set)
__field(u32, address)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->set = set;
__entry->address = address;
),
VCPU_TP_PRINTK("%s prefix to %08x",
__entry->set ? "setting" : "storing",
__entry->address)
);
TRACE_EVENT(kvm_s390_handle_stap,
TP_PROTO(VCPU_PROTO_COMMON, u64 address),
TP_ARGS(VCPU_ARGS_COMMON, address),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(u64, address)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->address = address;
),
VCPU_TP_PRINTK("storing cpu address to %016llx",
__entry->address)
);
TRACE_EVENT(kvm_s390_handle_stfl,
TP_PROTO(VCPU_PROTO_COMMON, unsigned int facility_list),
TP_ARGS(VCPU_ARGS_COMMON, facility_list),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(unsigned int, facility_list)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->facility_list = facility_list;
),
VCPU_TP_PRINTK("store facility list value %08x",
__entry->facility_list)
);
TRACE_EVENT(kvm_s390_handle_stsi,
TP_PROTO(VCPU_PROTO_COMMON, int fc, int sel1, int sel2, u64 addr),
TP_ARGS(VCPU_ARGS_COMMON, fc, sel1, sel2, addr),
TP_STRUCT__entry(
VCPU_FIELD_COMMON
__field(int, fc)
__field(int, sel1)
__field(int, sel2)
__field(u64, addr)
),
TP_fast_assign(
VCPU_ASSIGN_COMMON
__entry->fc = fc;
__entry->sel1 = sel1;
__entry->sel2 = sel2;
__entry->addr = addr;
),
VCPU_TP_PRINTK("STSI %d.%d.%d information stored to %016llx",
__entry->fc, __entry->sel1, __entry->sel2,
__entry->addr)
);
#endif /* _TRACE_KVM_H */
/* This part must be outside protection */
#include <trace/define_trace.h>
......@@ -586,23 +586,18 @@ config PARAVIRT_TIME_ACCOUNTING
source "arch/x86/xen/Kconfig"
config KVM_CLOCK
bool "KVM paravirtualized clock"
select PARAVIRT
select PARAVIRT_CLOCK
---help---
Turning on this option will allow you to run a paravirtualized clock
when running over the KVM hypervisor. Instead of relying on a PIT
(or probably other) emulation by the underlying device model, the host
provides the guest with timing infrastructure such as time of day, and
system time
config KVM_GUEST
bool "KVM Guest support"
bool "KVM Guest support (including kvmclock)"
select PARAVIRT
select PARAVIRT
select PARAVIRT_CLOCK
default y if PARAVIRT_GUEST
---help---
This option enables various optimizations for running under the KVM
hypervisor.
hypervisor. It includes a paravirtualized clock, so that instead
of relying on a PIT (or probably other) emulation by the
underlying device model, the host provides the guest with
timing infrastructure such as time of day, and system time
source "arch/x86/lguest/Kconfig"
......
......@@ -41,6 +41,7 @@
#define __KVM_HAVE_DEBUGREGS
#define __KVM_HAVE_XSAVE
#define __KVM_HAVE_XCRS
#define __KVM_HAVE_READONLY_MEM
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
......
......@@ -85,6 +85,19 @@ struct x86_instruction_info {
#define X86EMUL_INTERCEPTED 6 /* Intercepted by nested VMCB/VMCS */
struct x86_emulate_ops {
/*
* read_gpr: read a general purpose register (rax - r15)
*
* @reg: gpr number.
*/
ulong (*read_gpr)(struct x86_emulate_ctxt *ctxt, unsigned reg);
/*
* write_gpr: write a general purpose register (rax - r15)
*
* @reg: gpr number.
* @val: value to write.
*/
void (*write_gpr)(struct x86_emulate_ctxt *ctxt, unsigned reg, ulong val);
/*
* read_std: Read bytes of standard (non-emulated/special) memory.
* Used for descriptor reading.
......@@ -200,8 +213,9 @@ typedef u32 __attribute__((vector_size(16))) sse128_t;
/* Type, address-of, and value of an instruction's operand. */
struct operand {
enum { OP_REG, OP_MEM, OP_IMM, OP_XMM, OP_MM, OP_NONE } type;
enum { OP_REG, OP_MEM, OP_MEM_STR, OP_IMM, OP_XMM, OP_MM, OP_NONE } type;
unsigned int bytes;
unsigned int count;
union {
unsigned long orig_val;
u64 orig_val64;
......@@ -221,6 +235,7 @@ struct operand {
char valptr[sizeof(unsigned long) + 2];
sse128_t vec_val;
u64 mm_val;
void *data;
};
};
......@@ -236,14 +251,23 @@ struct read_cache {
unsigned long end;
};
/* Execution mode, passed to the emulator. */
enum x86emul_mode {
X86EMUL_MODE_REAL, /* Real mode. */
X86EMUL_MODE_VM86, /* Virtual 8086 mode. */
X86EMUL_MODE_PROT16, /* 16-bit protected mode. */
X86EMUL_MODE_PROT32, /* 32-bit protected mode. */
X86EMUL_MODE_PROT64, /* 64-bit (long) mode. */
};
struct x86_emulate_ctxt {
struct x86_emulate_ops *ops;
const struct x86_emulate_ops *ops;
/* Register state before/after emulation. */
unsigned long eflags;
unsigned long eip; /* eip before instruction emulation */
/* Emulated execution mode, represented by an X86EMUL_MODE value. */
int mode;
enum x86emul_mode mode;
/* interruptibility state, as a result of execution of STI or MOV SS */
int interruptibility;
......@@ -281,8 +305,10 @@ struct x86_emulate_ctxt {
bool rip_relative;
unsigned long _eip;
struct operand memop;
u32 regs_valid; /* bitmaps of registers in _regs[] that can be read */
u32 regs_dirty; /* bitmaps of registers in _regs[] that have been written */
/* Fields above regs are cleared together. */
unsigned long regs[NR_VCPU_REGS];
unsigned long _regs[NR_VCPU_REGS];
struct operand *memopp;
struct fetch_cache fetch;
struct read_cache io_read;
......@@ -293,17 +319,6 @@ struct x86_emulate_ctxt {
#define REPE_PREFIX 0xf3
#define REPNE_PREFIX 0xf2
/* Execution mode, passed to the emulator. */
#define X86EMUL_MODE_REAL 0 /* Real mode. */
#define X86EMUL_MODE_VM86 1 /* Virtual 8086 mode. */
#define X86EMUL_MODE_PROT16 2 /* 16-bit protected mode. */
#define X86EMUL_MODE_PROT32 4 /* 32-bit protected mode. */
#define X86EMUL_MODE_PROT64 8 /* 64-bit (long) mode. */
/* any protected mode */
#define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
X86EMUL_MODE_PROT64)
/* CPUID vendors */
#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
......@@ -394,4 +409,7 @@ int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
u16 tss_selector, int idt_index, int reason,
bool has_error_code, u32 error_code);
int emulate_int_real(struct x86_emulate_ctxt *ctxt, int irq);
void emulator_invalidate_register_cache(struct x86_emulate_ctxt *ctxt);
void emulator_writeback_register_cache(struct x86_emulate_ctxt *ctxt);
#endif /* _ASM_X86_KVM_X86_EMULATE_H */
......@@ -271,10 +271,24 @@ struct kvm_mmu {
union kvm_mmu_page_role base_role;
bool direct_map;
/*
* Bitmap; bit set = permission fault
* Byte index: page fault error code [4:1]
* Bit index: pte permissions in ACC_* format
*/
u8 permissions[16];
u64 *pae_root;
u64 *lm_root;
u64 rsvd_bits_mask[2][4];
/*
* Bitmap: bit set = last pte in walk
* index[0:1]: level (zero-based)
* index[2]: pte.ps
*/
u8 last_pte_bitmap;
bool nx;
u64 pdptrs[4]; /* pae */
......@@ -398,12 +412,15 @@ struct kvm_vcpu_arch {
struct x86_emulate_ctxt emulate_ctxt;
bool emulate_regs_need_sync_to_vcpu;
bool emulate_regs_need_sync_from_vcpu;
int (*complete_userspace_io)(struct kvm_vcpu *vcpu);
gpa_t time;
struct pvclock_vcpu_time_info hv_clock;
unsigned int hw_tsc_khz;
unsigned int time_offset;
struct page *time_page;
/* set guest stopped flag in pvclock flags field */
bool pvclock_set_guest_stopped_request;
struct {
u64 msr_val;
......@@ -438,6 +455,7 @@ struct kvm_vcpu_arch {
unsigned long dr6;
unsigned long dr7;
unsigned long eff_db[KVM_NR_DB_REGS];
unsigned long guest_debug_dr7;
u64 mcg_cap;
u64 mcg_status;
......@@ -484,14 +502,24 @@ struct kvm_vcpu_arch {
};
struct kvm_lpage_info {
unsigned long rmap_pde;
int write_count;
};
struct kvm_arch_memory_slot {
unsigned long *rmap[KVM_NR_PAGE_SIZES];
struct kvm_lpage_info *lpage_info[KVM_NR_PAGE_SIZES - 1];
};
struct kvm_apic_map {
struct rcu_head rcu;
u8 ldr_bits;
/* fields below are used to decode ldr values in different modes */
u32 cid_shift, cid_mask, lid_mask;
struct kvm_lapic *phys_map[256];
/* first index is cluster id second is cpu id in a cluster */
struct kvm_lapic *logical_map[16][16];
};
struct kvm_arch {
unsigned int n_used_mmu_pages;
unsigned int n_requested_mmu_pages;
......@@ -509,6 +537,8 @@ struct kvm_arch {
struct kvm_ioapic *vioapic;
struct kvm_pit *vpit;
int vapics_in_nmi_mode;
struct mutex apic_map_lock;
struct kvm_apic_map *apic_map;
unsigned int tss_addr;
struct page *apic_access_page;
......@@ -602,8 +632,7 @@ struct kvm_x86_ops {
void (*vcpu_load)(struct kvm_vcpu *vcpu, int cpu);
void (*vcpu_put)(struct kvm_vcpu *vcpu);
void (*set_guest_debug)(struct kvm_vcpu *vcpu,
struct kvm_guest_debug *dbg);
void (*update_db_bp_intercept)(struct kvm_vcpu *vcpu);
int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
int (*set_msr)(struct kvm_vcpu *vcpu, u32 msr_index, u64 data);
u64 (*get_segment_base)(struct kvm_vcpu *vcpu, int seg);
......@@ -941,6 +970,7 @@ extern bool kvm_rebooting;
#define KVM_ARCH_WANT_MMU_NOTIFIER
int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end);
int kvm_age_hva(struct kvm *kvm, unsigned long hva);
int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
......
......@@ -102,21 +102,21 @@ struct kvm_vcpu_pv_apf_data {
extern void kvmclock_init(void);
extern int kvm_register_clock(char *txt);
#ifdef CONFIG_KVM_CLOCK
#ifdef CONFIG_KVM_GUEST
bool kvm_check_and_clear_guest_paused(void);
#else
static inline bool kvm_check_and_clear_guest_paused(void)
{
return false;
}
#endif /* CONFIG_KVMCLOCK */
#endif /* CONFIG_KVM_GUEST */
/* This instruction is vmcall. On non-VT architectures, it will generate a
* trap that we will then rewrite to the appropriate instruction.
*/
#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
/* For KVM hypercalls, a three-byte sequence of either the vmrun or the vmmrun
/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
* instruction. The hypervisor may replace it with something else but only the
* instructions are guaranteed to be supported.
*
......
......@@ -81,8 +81,7 @@ obj-$(CONFIG_DEBUG_RODATA_TEST) += test_rodata.o
obj-$(CONFIG_DEBUG_NX_TEST) += test_nx.o
obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
obj-$(CONFIG_KVM_GUEST) += kvm.o
obj-$(CONFIG_KVM_CLOCK) += kvmclock.o
obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch_$(BITS).o
obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
obj-$(CONFIG_PARAVIRT_CLOCK) += pvclock.o
......
......@@ -354,6 +354,7 @@ static void kvm_pv_guest_cpu_reboot(void *unused)
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
kvm_pv_disable_apf();
kvm_disable_steal_time();
}
static int kvm_pv_reboot_notify(struct notifier_block *nb,
......@@ -396,9 +397,7 @@ void kvm_disable_steal_time(void)
#ifdef CONFIG_SMP
static void __init kvm_smp_prepare_boot_cpu(void)
{
#ifdef CONFIG_KVM_CLOCK
WARN_ON(kvm_register_clock("primary cpu clock"));
#endif
kvm_guest_cpu_init();
native_smp_prepare_boot_cpu();
}
......
......@@ -957,7 +957,7 @@ void __init setup_arch(char **cmdline_p)
initmem_init();
memblock_find_dma_reserve();
#ifdef CONFIG_KVM_CLOCK
#ifdef CONFIG_KVM_GUEST
kvmclock_init();
#endif
......
......@@ -20,6 +20,7 @@ if VIRTUALIZATION
config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM
depends on HIGH_RES_TIMERS
# for device assignment:
depends on PCI
# for TASKSTATS/TASK_DELAY_ACCT:
......@@ -37,6 +38,7 @@ config KVM
select TASK_DELAY_ACCT
select PERF_EVENTS
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
---help---
Support hosting fully virtualized guest machines using hardware
virtualization extensions. You will need a fairly recent
......
......@@ -12,7 +12,7 @@ kvm-$(CONFIG_IOMMU_API) += $(addprefix ../../../virt/kvm/, iommu.o)
kvm-$(CONFIG_KVM_ASYNC_PF) += $(addprefix ../../../virt/kvm/, async_pf.o)
kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
i8254.o timer.o cpuid.o pmu.o
i8254.o cpuid.o pmu.o
kvm-intel-y += vmx.o
kvm-amd-y += svm.o
......
......@@ -316,7 +316,7 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
}
case 7: {
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
/* Mask ebx against host capbability word 9 */
/* Mask ebx against host capability word 9 */
if (index == 0) {
entry->ebx &= kvm_supported_word9_x86_features;
cpuid_mask(&entry->ebx, 9);
......@@ -397,8 +397,8 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
break;
}
case KVM_CPUID_SIGNATURE: {
char signature[12] = "KVMKVMKVM\0\0";
u32 *sigptr = (u32 *)signature;
static const char signature[12] = "KVMKVMKVM\0\0";
const u32 *sigptr = (const u32 *)signature;
entry->eax = KVM_CPUID_FEATURES;
entry->ebx = sigptr[0];
entry->ecx = sigptr[1];
......@@ -484,10 +484,10 @@ struct kvm_cpuid_param {
u32 func;
u32 idx;
bool has_leaf_count;
bool (*qualifier)(struct kvm_cpuid_param *param);
bool (*qualifier)(const struct kvm_cpuid_param *param);
};
static bool is_centaur_cpu(struct kvm_cpuid_param *param)
static bool is_centaur_cpu(const struct kvm_cpuid_param *param)
{
return boot_cpu_data.x86_vendor == X86_VENDOR_CENTAUR;
}
......@@ -498,7 +498,7 @@ int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
struct kvm_cpuid_entry2 *cpuid_entries;
int limit, nent = 0, r = -E2BIG, i;
u32 func;
static struct kvm_cpuid_param param[] = {
static const struct kvm_cpuid_param param[] = {
{ .func = 0, .has_leaf_count = true },
{ .func = 0x80000000, .has_leaf_count = true },
{ .func = 0xC0000000, .qualifier = is_centaur_cpu, .has_leaf_count = true },
......@@ -517,7 +517,7 @@ int kvm_dev_ioctl_get_supported_cpuid(struct kvm_cpuid2 *cpuid,
r = 0;
for (i = 0; i < ARRAY_SIZE(param); i++) {
struct kvm_cpuid_param *ent = &param[i];
const struct kvm_cpuid_param *ent = &param[i];
if (ent->qualifier && !ent->qualifier(ent))
continue;
......
......@@ -108,7 +108,7 @@ static s64 __kpit_elapsed(struct kvm *kvm)
ktime_t remaining;
struct kvm_kpit_state *ps = &kvm->arch.vpit->pit_state;
if (!ps->pit_timer.period)
if (!ps->period)
return 0;
/*
......@@ -120,9 +120,9 @@ static s64 __kpit_elapsed(struct kvm *kvm)
* itself with the initial count and continues counting
* from there.
*/
remaining = hrtimer_get_remaining(&ps->pit_timer.timer);
elapsed = ps->pit_timer.period - ktime_to_ns(remaining);
elapsed = mod_64(elapsed, ps->pit_timer.period);
remaining = hrtimer_get_remaining(&ps->timer);
elapsed = ps->period - ktime_to_ns(remaining);
elapsed = mod_64(elapsed, ps->period);
return elapsed;
}
......@@ -238,12 +238,12 @@ static void kvm_pit_ack_irq(struct kvm_irq_ack_notifier *kian)
int value;
spin_lock(&ps->inject_lock);
value = atomic_dec_return(&ps->pit_timer.pending);
value = atomic_dec_return(&ps->pending);
if (value < 0)
/* spurious acks can be generated if, for example, the
* PIC is being reset. Handle it gracefully here
*/
atomic_inc(&ps->pit_timer.pending);
atomic_inc(&ps->pending);
else if (value > 0)
/* in this case, we had multiple outstanding pit interrupts
* that we needed to inject. Reinject
......@@ -261,28 +261,17 @@ void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu)
if (!kvm_vcpu_is_bsp(vcpu) || !pit)
return;
timer = &pit->pit_state.pit_timer.timer;
timer = &pit->pit_state.timer;
if (hrtimer_cancel(timer))
hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
}
static void destroy_pit_timer(struct kvm_pit *pit)
{
hrtimer_cancel(&pit->pit_state.pit_timer.timer);
hrtimer_cancel(&pit->pit_state.timer);
flush_kthread_work(&pit->expired);
}
static bool kpit_is_periodic(struct kvm_timer *ktimer)
{
struct kvm_kpit_state *ps = container_of(ktimer, struct kvm_kpit_state,
pit_timer);
return ps->is_periodic;
}
static struct kvm_timer_ops kpit_ops = {
.is_periodic = kpit_is_periodic,
};
static void pit_do_work(struct kthread_work *work)
{
struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
......@@ -322,16 +311,16 @@ static void pit_do_work(struct kthread_work *work)
static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
{
struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
struct kvm_pit *pt = ktimer->kvm->arch.vpit;
struct kvm_kpit_state *ps = container_of(data, struct kvm_kpit_state, timer);
struct kvm_pit *pt = ps->kvm->arch.vpit;
if (ktimer->reinject || !atomic_read(&ktimer->pending)) {
atomic_inc(&ktimer->pending);
if (ps->reinject || !atomic_read(&ps->pending)) {
atomic_inc(&ps->pending);
queue_kthread_work(&pt->worker, &pt->expired);
}
if (ktimer->t_ops->is_periodic(ktimer)) {
hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
if (ps->is_periodic) {
hrtimer_add_expires_ns(&ps->timer, ps->period);
return HRTIMER_RESTART;
} else
return HRTIMER_NORESTART;
......@@ -340,7 +329,6 @@ static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
static void create_pit_timer(struct kvm *kvm, u32 val, int is_period)
{
struct kvm_kpit_state *ps = &kvm->arch.vpit->pit_state;
struct kvm_timer *pt = &ps->pit_timer;
s64 interval;
if (!irqchip_in_kernel(kvm) || ps->flags & KVM_PIT_FLAGS_HPET_LEGACY)
......@@ -351,19 +339,18 @@ static void create_pit_timer(struct kvm *kvm, u32 val, int is_period)
pr_debug("create pit timer, interval is %llu nsec\n", interval);
/* TODO The new value only affected after the retriggered */
hrtimer_cancel(&pt->timer);
hrtimer_cancel(&ps->timer);
flush_kthread_work(&ps->pit->expired);
pt->period = interval;
ps->period = interval;
ps->is_periodic = is_period;
pt->timer.function = pit_timer_fn;
pt->t_ops = &kpit_ops;
pt->kvm = ps->pit->kvm;
ps->timer.function = pit_timer_fn;
ps->kvm = ps->pit->kvm;
atomic_set(&pt->pending, 0);
atomic_set(&ps->pending, 0);
ps->irq_ack = 1;
hrtimer_start(&pt->timer, ktime_add_ns(ktime_get(), interval),
hrtimer_start(&ps->timer, ktime_add_ns(ktime_get(), interval),
HRTIMER_MODE_ABS);
}
......@@ -639,7 +626,7 @@ void kvm_pit_reset(struct kvm_pit *pit)
}
mutex_unlock(&pit->pit_state.lock);
atomic_set(&pit->pit_state.pit_timer.pending, 0);
atomic_set(&pit->pit_state.pending, 0);
pit->pit_state.irq_ack = 1;
}
......@@ -648,7 +635,7 @@ static void pit_mask_notifer(struct kvm_irq_mask_notifier *kimn, bool mask)
struct kvm_pit *pit = container_of(kimn, struct kvm_pit, mask_notifier);
if (!mask) {
atomic_set(&pit->pit_state.pit_timer.pending, 0);
atomic_set(&pit->pit_state.pending, 0);
pit->pit_state.irq_ack = 1;
}
}
......@@ -706,12 +693,11 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
pit_state = &pit->pit_state;
pit_state->pit = pit;
hrtimer_init(&pit_state->pit_timer.timer,
CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
hrtimer_init(&pit_state->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
pit_state->irq_ack_notifier.gsi = 0;
pit_state->irq_ack_notifier.irq_acked = kvm_pit_ack_irq;
kvm_register_irq_ack_notifier(kvm, &pit_state->irq_ack_notifier);
pit_state->pit_timer.reinject = true;
pit_state->reinject = true;
mutex_unlock(&pit->pit_state.lock);
kvm_pit_reset(pit);
......@@ -761,7 +747,7 @@ void kvm_free_pit(struct kvm *kvm)
kvm_unregister_irq_ack_notifier(kvm,
&kvm->arch.vpit->pit_state.irq_ack_notifier);
mutex_lock(&kvm->arch.vpit->pit_state.lock);
timer = &kvm->arch.vpit->pit_state.pit_timer.timer;
timer = &kvm->arch.vpit->pit_state.timer;
hrtimer_cancel(timer);
flush_kthread_work(&kvm->arch.vpit->expired);
kthread_stop(kvm->arch.vpit->worker_task);
......
......@@ -24,8 +24,12 @@ struct kvm_kpit_channel_state {
struct kvm_kpit_state {
struct kvm_kpit_channel_state channels[3];
u32 flags;
struct kvm_timer pit_timer;
bool is_periodic;
s64 period; /* unit: ns */
struct hrtimer timer;
atomic_t pending; /* accumulated triggered timers */
bool reinject;
struct kvm *kvm;
u32 speaker_data_on;
struct mutex lock;
struct kvm_pit *pit;
......
......@@ -190,17 +190,17 @@ void kvm_pic_update_irq(struct kvm_pic *s)
int kvm_pic_set_irq(struct kvm_pic *s, int irq, int irq_source_id, int level)
{
int ret = -1;
int ret, irq_level;
BUG_ON(irq < 0 || irq >= PIC_NUM_PINS);
pic_lock(s);
if (irq >= 0 && irq < PIC_NUM_PINS) {
int irq_level = __kvm_irq_line_state(&s->irq_states[irq],
irq_level = __kvm_irq_line_state(&s->irq_states[irq],
irq_source_id, level);
ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, irq_level);
pic_update_irq(s);
trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
s->pics[irq >> 3].imr, ret == 0);
}
pic_unlock(s);
return ret;
......@@ -275,23 +275,20 @@ void kvm_pic_reset(struct kvm_kpic_state *s)
{
int irq, i;
struct kvm_vcpu *vcpu;
u8 irr = s->irr, isr = s->imr;
u8 edge_irr = s->irr & ~s->elcr;
bool found = false;
s->last_irr = 0;
s->irr = 0;
s->irr &= s->elcr;
s->imr = 0;
s->isr = 0;
s->priority_add = 0;
s->irq_base = 0;
s->read_reg_select = 0;
s->poll = 0;
s->special_mask = 0;
s->init_state = 0;
s->auto_eoi = 0;
s->rotate_on_auto_eoi = 0;
s->read_reg_select = 0;
if (!s->init4) {
s->special_fully_nested_mode = 0;
s->init4 = 0;
s->auto_eoi = 0;
}
s->init_state = 1;
kvm_for_each_vcpu(i, vcpu, s->pics_state->kvm)
if (kvm_apic_accept_pic_intr(vcpu)) {
......@@ -304,7 +301,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s)
return;
for (irq = 0; irq < PIC_NUM_PINS/2; irq++)
if (irr & (1 << irq) || isr & (1 << irq))
if (edge_irr & (1 << irq))
pic_clear_isr(s, irq);
}
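The rewritten reset path above clears ISR bits only for edge-triggered requests (edge_irr) and lets level-triggered requests survive in IRR. A stand-alone sketch of that bit arithmetic, with the elcr/irr values chosen purely for illustration and not taken from the patch:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint8_t elcr = 0x08;             /* IRQ3 programmed level-triggered */
	uint8_t irr  = 0x0a;             /* IRQ1 and IRQ3 currently asserted */
	uint8_t edge_irr = irr & ~elcr;  /* 0x02: only the edge-triggered request */

	irr &= elcr;                     /* the level-triggered request survives the reset */
	printf("edge_irr=%#x irr=%#x\n", edge_irr, irr);  /* prints edge_irr=0x2 irr=0x8 */
	return 0;
}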
......@@ -316,40 +313,13 @@ static void pic_ioport_write(void *opaque, u32 addr, u32 val)
addr &= 1;
if (addr == 0) {
if (val & 0x10) {
u8 edge_irr = s->irr & ~s->elcr;
int i;
bool found = false;
struct kvm_vcpu *vcpu;
s->init4 = val & 1;
s->last_irr = 0;
s->irr &= s->elcr;
s->imr = 0;
s->priority_add = 0;
s->special_mask = 0;
s->read_reg_select = 0;
if (!s->init4) {
s->special_fully_nested_mode = 0;
s->auto_eoi = 0;
}
s->init_state = 1;
if (val & 0x02)
pr_pic_unimpl("single mode not supported");
if (val & 0x08)
pr_pic_unimpl(
"level sensitive irq not supported");
kvm_for_each_vcpu(i, vcpu, s->pics_state->kvm)
if (kvm_apic_accept_pic_intr(vcpu)) {
found = true;
break;
}
if (found)
for (irq = 0; irq < PIC_NUM_PINS/2; irq++)
if (edge_irr & (1 << irq))
pic_clear_isr(s, irq);
kvm_pic_reset(s);
} else if (val & 0x08) {
if (val & 0x04)
s->poll = 1;
......
......@@ -70,7 +70,7 @@ struct kvm_pic {
struct kvm_io_device dev_slave;
struct kvm_io_device dev_eclr;
void (*ack_notifier)(void *opaque, int irq);
unsigned long irq_states[16];
unsigned long irq_states[PIC_NUM_PINS];
};
struct kvm_pic *kvm_create_pic(struct kvm *kvm);
......
struct kvm_timer {
struct hrtimer timer;
s64 period; /* unit: ns */
u32 timer_mode_mask;
u64 tscdeadline;
atomic_t pending; /* accumulated triggered timers */
bool reinject;
struct kvm_timer_ops *t_ops;
struct kvm *kvm;
struct kvm_vcpu *vcpu;
};
struct kvm_timer_ops {
bool (*is_periodic)(struct kvm_timer *);
};
enum hrtimer_restart kvm_timer_fn(struct hrtimer *data);
This diff is collapsed.
......@@ -2,10 +2,17 @@
#define __KVM_X86_LAPIC_H
#include "iodev.h"
#include "kvm_timer.h"
#include <linux/kvm_host.h>
struct kvm_timer {
struct hrtimer timer;
s64 period; /* unit: ns */
u32 timer_mode_mask;
u64 tscdeadline;
atomic_t pending; /* accumulated triggered timers */
};
struct kvm_lapic {
unsigned long base_address;
struct kvm_io_device dev;
......@@ -45,11 +52,13 @@ int kvm_apic_match_logical_addr(struct kvm_lapic *apic, u8 mda);
int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq);
int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
struct kvm_lapic_irq *irq, int *r);
u64 kvm_get_apic_base(struct kvm_vcpu *vcpu);
void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data);
void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu);
int kvm_lapic_enabled(struct kvm_vcpu *vcpu);
bool kvm_apic_present(struct kvm_vcpu *vcpu);
void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
struct kvm_lapic_state *s);
int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu);
u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu);
......@@ -71,4 +80,48 @@ static inline bool kvm_hv_vapic_assist_page_enabled(struct kvm_vcpu *vcpu)
}
int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data);
void kvm_lapic_init(void);
static inline u32 kvm_apic_get_reg(struct kvm_lapic *apic, int reg_off)
{
return *((u32 *) (apic->regs + reg_off));
}
extern struct static_key kvm_no_apic_vcpu;
static inline bool kvm_vcpu_has_lapic(struct kvm_vcpu *vcpu)
{
if (static_key_false(&kvm_no_apic_vcpu))
return vcpu->arch.apic;
return true;
}
extern struct static_key_deferred apic_hw_disabled;
static inline int kvm_apic_hw_enabled(struct kvm_lapic *apic)
{
if (static_key_false(&apic_hw_disabled.key))
return apic->vcpu->arch.apic_base & MSR_IA32_APICBASE_ENABLE;
return MSR_IA32_APICBASE_ENABLE;
}
extern struct static_key_deferred apic_sw_disabled;
static inline int kvm_apic_sw_enabled(struct kvm_lapic *apic)
{
if (static_key_false(&apic_sw_disabled.key))
return kvm_apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_APIC_ENABLED;
return APIC_SPIV_APIC_ENABLED;
}
static inline bool kvm_apic_present(struct kvm_vcpu *vcpu)
{
return kvm_vcpu_has_lapic(vcpu) && kvm_apic_hw_enabled(vcpu->arch.apic);
}
static inline int kvm_lapic_enabled(struct kvm_vcpu *vcpu)
{
return kvm_apic_present(vcpu) && kvm_apic_sw_enabled(vcpu->arch.apic);
}
#endif
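The new inline helpers above lean on the apic_hw_disabled / apic_sw_disabled jump labels so that the common case (every vCPU has its APIC enabled) compiles down to a patched-out branch returning a constant. A hedged sketch of how such a deferred key might be driven; the example_* function names and the HZ timeout are illustrative assumptions, while jump_label_rate_limit(), static_key_slow_inc() and static_key_slow_dec_deferred() are the existing jump-label interfaces:

#include <linux/jump_label.h>
#include <linux/jiffies.h>
#include <linux/types.h>

struct static_key_deferred apic_hw_disabled;

/* rate-limit the expensive code patching if guests toggle the APIC often */
static void example_lapic_init(void)
{
	jump_label_rate_limit(&apic_hw_disabled, HZ);
}

static void example_hw_enable_changed(bool now_enabled)
{
	if (now_enabled)
		static_key_slow_dec_deferred(&apic_hw_disabled);
	else
		static_key_slow_inc(&apic_hw_disabled.key);
}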
This diff is collapsed.
......@@ -18,8 +18,10 @@
#define PT_PCD_MASK (1ULL << 4)
#define PT_ACCESSED_SHIFT 5
#define PT_ACCESSED_MASK (1ULL << PT_ACCESSED_SHIFT)
#define PT_DIRTY_MASK (1ULL << 6)
#define PT_PAGE_SIZE_MASK (1ULL << 7)
#define PT_DIRTY_SHIFT 6
#define PT_DIRTY_MASK (1ULL << PT_DIRTY_SHIFT)
#define PT_PAGE_SIZE_SHIFT 7
#define PT_PAGE_SIZE_MASK (1ULL << PT_PAGE_SIZE_SHIFT)
#define PT_PAT_MASK (1ULL << 7)
#define PT_GLOBAL_MASK (1ULL << 8)
#define PT64_NX_SHIFT 63
......@@ -88,17 +90,14 @@ static inline bool is_write_protection(struct kvm_vcpu *vcpu)
return kvm_read_cr0_bits(vcpu, X86_CR0_WP);
}
static inline bool check_write_user_access(struct kvm_vcpu *vcpu,
bool write_fault, bool user_fault,
unsigned long pte)
/*
* Will a fault with a given page-fault error code (pfec) cause a permission
* fault with the given access (in ACC_* format)?
*/
static inline bool permission_fault(struct kvm_mmu *mmu, unsigned pte_access,
unsigned pfec)
{
if (unlikely(write_fault && !is_writable_pte(pte)
&& (user_fault || is_write_protection(vcpu))))
return false;
if (unlikely(user_fault && !(pte & PT_USER_MASK)))
return false;
return true;
return (mmu->permissions[pfec >> 1] >> pte_access) & 1;
}
#endif
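permission_fault() above replaces the old per-walk flag checks with a single table lookup: one precomputed byte per page-fault error code, one bit per ACC_* combination of the gpte. A stand-alone sketch of how such a table can be filled and consulted; the mask values and the cr0.wp/nx handling are assumptions for illustration, not the kernel's exact update code:

#include <stdint.h>
#include <stdio.h>

#define ACC_EXEC    1u   /* illustrative encodings of the gpte access bits */
#define ACC_WRITE   2u
#define ACC_USER    4u

#define PFERR_WRITE 2u   /* page-fault error code bits consulted below */
#define PFERR_USER  4u
#define PFERR_FETCH 16u

static uint8_t permissions[16];          /* indexed by pfec >> 1 */

static void build_permission_table(int cr0_wp, int nx_enabled)
{
	for (unsigned pfec = 0; pfec < 32; pfec += 2) {
		unsigned wf = pfec & PFERR_WRITE, uf = pfec & PFERR_USER, ff = pfec & PFERR_FETCH;
		uint8_t map = 0;

		for (unsigned acc = 0; acc < 8; acc++) {
			int x = (acc & ACC_EXEC)  || !nx_enabled;
			int w = (acc & ACC_WRITE) || (!cr0_wp && !uf);
			int u = (acc & ACC_USER)  != 0;
			int fault = (ff && !x) || (wf && !w) || (uf && !u);

			map |= fault << acc;
		}
		permissions[pfec >> 1] = map;
	}
}

static int permission_fault(unsigned pte_access, unsigned pfec)
{
	return (permissions[pfec >> 1] >> pte_access) & 1;   /* same lookup as the hunk above */
}

int main(void)
{
	build_permission_table(1, 1);
	/* user-mode write to a read-only user page: expect a fault (prints 1) */
	printf("%d\n", permission_fault(ACC_USER | ACC_EXEC, PFERR_USER | PFERR_WRITE));
	return 0;
}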
......@@ -116,10 +116,8 @@ static void audit_mappings(struct kvm_vcpu *vcpu, u64 *sptep, int level)
gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
pfn = gfn_to_pfn_atomic(vcpu->kvm, gfn);
if (is_error_pfn(pfn)) {
kvm_release_pfn_clean(pfn);
if (is_error_pfn(pfn))
return;
}
hpa = pfn << PAGE_SHIFT;
if ((*sptep & PT64_BASE_ADDR_MASK) != hpa)
......@@ -190,7 +188,6 @@ static void check_mappings_rmap(struct kvm *kvm, struct kvm_mmu_page *sp)
static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
{
struct kvm_memory_slot *slot;
unsigned long *rmapp;
u64 *sptep;
struct rmap_iterator iter;
......@@ -198,8 +195,7 @@ static void audit_write_protection(struct kvm *kvm, struct kvm_mmu_page *sp)
if (sp->role.direct || sp->unsync || sp->role.invalid)
return;
slot = gfn_to_memslot(kvm, sp->gfn);
rmapp = &slot->rmap[sp->gfn - slot->base_gfn];
rmapp = gfn_to_rmap(kvm, sp->gfn, PT_PAGE_TABLE_LEVEL);
for (sptep = rmap_get_first(*rmapp, &iter); sptep;
sptep = rmap_get_next(&iter)) {
......
......@@ -63,10 +63,12 @@
*/
struct guest_walker {
int level;
unsigned max_level;
gfn_t table_gfn[PT_MAX_FULL_LEVELS];
pt_element_t ptes[PT_MAX_FULL_LEVELS];
pt_element_t prefetch_ptes[PTE_PREFETCH_NUM];
gpa_t pte_gpa[PT_MAX_FULL_LEVELS];
pt_element_t __user *ptep_user[PT_MAX_FULL_LEVELS];
unsigned pt_access;
unsigned pte_access;
gfn_t gfn;
......@@ -101,38 +103,41 @@ static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
return (ret != orig_pte);
}
static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte,
bool last)
static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu,
struct kvm_mmu *mmu,
struct guest_walker *walker,
int write_fault)
{
unsigned access;
access = (gpte & (PT_WRITABLE_MASK | PT_USER_MASK)) | ACC_EXEC_MASK;
if (last && !is_dirty_gpte(gpte))
access &= ~ACC_WRITE_MASK;
#if PTTYPE == 64
if (vcpu->arch.mmu.nx)
access &= ~(gpte >> PT64_NX_SHIFT);
#endif
return access;
}
static bool FNAME(is_last_gpte)(struct guest_walker *walker,
struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
pt_element_t gpte)
{
if (walker->level == PT_PAGE_TABLE_LEVEL)
return true;
unsigned level, index;
pt_element_t pte, orig_pte;
pt_element_t __user *ptep_user;
gfn_t table_gfn;
int ret;
if ((walker->level == PT_DIRECTORY_LEVEL) && is_large_pte(gpte) &&
(PTTYPE == 64 || is_pse(vcpu)))
return true;
for (level = walker->max_level; level >= walker->level; --level) {
pte = orig_pte = walker->ptes[level - 1];
table_gfn = walker->table_gfn[level - 1];
ptep_user = walker->ptep_user[level - 1];
index = offset_in_page(ptep_user) / sizeof(pt_element_t);
if (!(pte & PT_ACCESSED_MASK)) {
trace_kvm_mmu_set_accessed_bit(table_gfn, index, sizeof(pte));
pte |= PT_ACCESSED_MASK;
}
if (level == walker->level && write_fault && !is_dirty_gpte(pte)) {
trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte));
pte |= PT_DIRTY_MASK;
}
if (pte == orig_pte)
continue;
if ((walker->level == PT_PDPE_LEVEL) && is_large_pte(gpte) &&
(mmu->root_level == PT64_ROOT_LEVEL))
return true;
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
if (ret)
return ret;
return false;
mark_page_dirty(vcpu->kvm, table_gfn);
walker->ptes[level] = pte;
}
return 0;
}
/*
......@@ -142,21 +147,22 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
gva_t addr, u32 access)
{
int ret;
pt_element_t pte;
pt_element_t __user *uninitialized_var(ptep_user);
gfn_t table_gfn;
unsigned index, pt_access, uninitialized_var(pte_access);
unsigned index, pt_access, pte_access, accessed_dirty, shift;
gpa_t pte_gpa;
bool eperm, last_gpte;
int offset;
const int write_fault = access & PFERR_WRITE_MASK;
const int user_fault = access & PFERR_USER_MASK;
const int fetch_fault = access & PFERR_FETCH_MASK;
u16 errcode = 0;
gpa_t real_gpa;
gfn_t gfn;
trace_kvm_mmu_pagetable_walk(addr, access);
retry_walk:
eperm = false;
walker->level = mmu->root_level;
pte = mmu->get_cr3(vcpu);
......@@ -169,15 +175,21 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
--walker->level;
}
#endif
walker->max_level = walker->level;
ASSERT((!is_long_mode(vcpu) && is_pae(vcpu)) ||
(mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
pt_access = ACC_ALL;
accessed_dirty = PT_ACCESSED_MASK;
pt_access = pte_access = ACC_ALL;
++walker->level;
for (;;) {
do {
gfn_t real_gfn;
unsigned long host_addr;
pt_access &= pte_access;
--walker->level;
index = PT_INDEX(addr, walker->level);
table_gfn = gpte_to_gfn(pte);
......@@ -199,6 +211,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
ptep_user = (pt_element_t __user *)((void *)host_addr + offset);
if (unlikely(__copy_from_user(&pte, ptep_user, sizeof(pte))))
goto error;
walker->ptep_user[walker->level - 1] = ptep_user;
trace_kvm_mmu_paging_element(pte, walker->level);
......@@ -211,92 +224,48 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
goto error;
}
if (!check_write_user_access(vcpu, write_fault, user_fault,
pte))
eperm = true;
#if PTTYPE == 64
if (unlikely(fetch_fault && (pte & PT64_NX_MASK)))
eperm = true;
#endif
accessed_dirty &= pte;
pte_access = pt_access & gpte_access(vcpu, pte);
last_gpte = FNAME(is_last_gpte)(walker, vcpu, mmu, pte);
if (last_gpte) {
pte_access = pt_access &
FNAME(gpte_access)(vcpu, pte, true);
/* check if the kernel is fetching from user page */
if (unlikely(pte_access & PT_USER_MASK) &&
kvm_read_cr4_bits(vcpu, X86_CR4_SMEP))
if (fetch_fault && !user_fault)
eperm = true;
}
walker->ptes[walker->level - 1] = pte;
} while (!is_last_gpte(mmu, walker->level, pte));
if (!eperm && unlikely(!(pte & PT_ACCESSED_MASK))) {
int ret;
trace_kvm_mmu_set_accessed_bit(table_gfn, index,
sizeof(pte));
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index,
pte, pte|PT_ACCESSED_MASK);
if (unlikely(ret < 0))
if (unlikely(permission_fault(mmu, pte_access, access))) {
errcode |= PFERR_PRESENT_MASK;
goto error;
else if (ret)
goto retry_walk;
mark_page_dirty(vcpu->kvm, table_gfn);
pte |= PT_ACCESSED_MASK;
}
walker->ptes[walker->level - 1] = pte;
if (last_gpte) {
int lvl = walker->level;
gpa_t real_gpa;
gfn_t gfn;
u32 ac;
gfn = gpte_to_gfn_lvl(pte, lvl);
gfn += (addr & PT_LVL_OFFSET_MASK(lvl)) >> PAGE_SHIFT;
gfn = gpte_to_gfn_lvl(pte, walker->level);
gfn += (addr & PT_LVL_OFFSET_MASK(walker->level)) >> PAGE_SHIFT;
if (PTTYPE == 32 &&
walker->level == PT_DIRECTORY_LEVEL &&
is_cpuid_PSE36())
if (PTTYPE == 32 && walker->level == PT_DIRECTORY_LEVEL && is_cpuid_PSE36())
gfn += pse36_gfn_delta(pte);
ac = write_fault | fetch_fault | user_fault;
real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn),
ac);
real_gpa = mmu->translate_gpa(vcpu, gfn_to_gpa(gfn), access);
if (real_gpa == UNMAPPED_GVA)
return 0;
walker->gfn = real_gpa >> PAGE_SHIFT;
break;
}
pt_access &= FNAME(gpte_access)(vcpu, pte, false);
--walker->level;
}
if (!write_fault)
protect_clean_gpte(&pte_access, pte);
if (unlikely(eperm)) {
errcode |= PFERR_PRESENT_MASK;
goto error;
}
if (write_fault && unlikely(!is_dirty_gpte(pte))) {
int ret;
/*
* On a write fault, fold the dirty bit into accessed_dirty by shifting it one
* place right.
*
* On a read fault, do nothing.
*/
shift = write_fault >> ilog2(PFERR_WRITE_MASK);
shift *= PT_DIRTY_SHIFT - PT_ACCESSED_SHIFT;
accessed_dirty &= pte >> shift;
trace_kvm_mmu_set_dirty_bit(table_gfn, index, sizeof(pte));
ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index,
pte, pte|PT_DIRTY_MASK);
if (unlikely(!accessed_dirty)) {
ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault);
if (unlikely(ret < 0))
goto error;
else if (ret)
goto retry_walk;
mark_page_dirty(vcpu->kvm, table_gfn);
pte |= PT_DIRTY_MASK;
walker->ptes[walker->level - 1] = pte;
}
walker->pt_access = pt_access;
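The shift trick in the hunk above only falls back to the slow update_accessed_dirty_bits() path when a bit is actually missing: on a read fault accessed_dirty keeps tracking the accessed bits collected over the walk, on a write fault the dirty bit of the final gpte is first shifted down into the same position. A stand-alone sketch using the shift values from this file (PFERR_WRITE_MASK == 2, PT_ACCESSED_SHIFT == 5, PT_DIRTY_SHIFT == 6):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t pte = 1ULL << 5;            /* accessed set, dirty still clear */
	uint64_t accessed_dirty = 1ULL << 5; /* PT_ACCESSED_MASK, accumulated over the walk */
	unsigned write_fault = 2;            /* PFERR_WRITE_MASK on a write fault, 0 otherwise */

	unsigned shift = (write_fault >> 1) * (6 - 5);  /* 1 on a write fault, 0 on a read */
	accessed_dirty &= pte >> shift;      /* write fault: the dirty bit lands in the accessed slot */

	/* 0 here means the walker must call update_accessed_dirty_bits() */
	printf("accessed_dirty=%#llx\n", (unsigned long long)accessed_dirty);
	return 0;
}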
......@@ -368,12 +337,11 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
return;
pgprintk("%s: gpte %llx spte %p\n", __func__, (u64)gpte, spte);
pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte, true);
pte_access = sp->role.access & gpte_access(vcpu, gpte);
protect_clean_gpte(&pte_access, gpte);
pfn = gfn_to_pfn_atomic(vcpu->kvm, gpte_to_gfn(gpte));
if (mmu_invalid_pfn(pfn)) {
kvm_release_pfn_clean(pfn);
if (mmu_invalid_pfn(pfn))
return;
}
/*
* we call mmu_set_spte() with host_writable = true because that
......@@ -443,15 +411,13 @@ static void FNAME(pte_prefetch)(struct kvm_vcpu *vcpu, struct guest_walker *gw,
if (FNAME(prefetch_invalid_gpte)(vcpu, sp, spte, gpte))
continue;
pte_access = sp->role.access & FNAME(gpte_access)(vcpu, gpte,
true);
pte_access = sp->role.access & gpte_access(vcpu, gpte);
protect_clean_gpte(&pte_access, gpte);
gfn = gpte_to_gfn(gpte);
pfn = pte_prefetch_gfn_to_pfn(vcpu, gfn,
pte_access & ACC_WRITE_MASK);
if (mmu_invalid_pfn(pfn)) {
kvm_release_pfn_clean(pfn);
if (mmu_invalid_pfn(pfn))
break;
}
mmu_set_spte(vcpu, spte, sp->role.access, pte_access, 0, 0,
NULL, PT_PAGE_TABLE_LEVEL, gfn,
......@@ -798,7 +764,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
gfn = gpte_to_gfn(gpte);
pte_access = sp->role.access;
pte_access &= FNAME(gpte_access)(vcpu, gpte, true);
pte_access &= gpte_access(vcpu, gpte);
protect_clean_gpte(&pte_access, gpte);
if (sync_mmio_spte(&sp->spt[i], gfn, pte_access, &nr_present))
continue;
......
/*
* Kernel-based Virtual Machine -- Performane Monitoring Unit support
* Kernel-based Virtual Machine -- Performance Monitoring Unit support
*
* Copyright 2011 Red Hat, Inc. and/or its affiliates.
*
......
This diff is collapsed.
/*
* Kernel-based Virtual Machine driver for Linux
*
* This module enables machines with Intel VT-x extensions to run virtual
* machines without emulation or binary translation.
*
* timer support
*
* Copyright 2010 Red Hat, Inc. and/or its affiliates.
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*/
#include <linux/kvm_host.h>
#include <linux/kvm.h>
#include <linux/hrtimer.h>
#include <linux/atomic.h>
#include "kvm_timer.h"
enum hrtimer_restart kvm_timer_fn(struct hrtimer *data)
{
struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
struct kvm_vcpu *vcpu = ktimer->vcpu;
wait_queue_head_t *q = &vcpu->wq;
/*
* There is a race window between reading and incrementing, but we do
* not care about potentially losing timer events in the !reinject
* case anyway. Note: KVM_REQ_PENDING_TIMER is implicitly checked
* in vcpu_enter_guest.
*/
if (ktimer->reinject || !atomic_read(&ktimer->pending)) {
atomic_inc(&ktimer->pending);
/* FIXME: this code should not know anything about vcpus */
kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
}
if (waitqueue_active(q))
wake_up_interruptible(q);
if (ktimer->t_ops->is_periodic(ktimer)) {
hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
return HRTIMER_RESTART;
} else
return HRTIMER_NORESTART;
}
This diff is collapsed.
This diff is collapsed.
......@@ -124,4 +124,5 @@ int kvm_write_guest_virt_system(struct x86_emulate_ctxt *ctxt,
extern u64 host_xcr0;
extern struct static_key kvm_no_apic_vcpu;
#endif
This diff is collapsed.
This diff is collapsed.
......@@ -118,6 +118,7 @@ void jump_label_rate_limit(struct static_key_deferred *key,
key->timeout = rl;
INIT_DELAYED_WORK(&key->work, jump_label_update_timeout);
}
EXPORT_SYMBOL_GPL(jump_label_rate_limit);
static int addr_conflict(struct jump_entry *entry, void *start, void *end)
{
......
......@@ -21,3 +21,6 @@ config KVM_ASYNC_PF
config HAVE_KVM_MSI
bool
config HAVE_KVM_CPU_RELAX_INTERCEPT
bool
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.