- 10 Mar, 2015 1 commit
Joel Schopp authored
Currently kvm_emulate() skips the instruction, but kvm_emulate_* sometimes don't. The end result is that the caller ends up doing the skip itself. Let's make them consistent.
Signed-off-by: Joel Schopp <joel.schopp@amd.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
-
- 23 Feb, 2015 1 commit
Radim Krčmář authored
'apic' is not defined if !CONFIG_X86_64 && !CONFIG_X86_LOCAL_APIC. Posted interrupt makes no sense without CONFIG_SMP, and CONFIG_X86_LOCAL_APIC will be set with it.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 10 Feb, 2015 1 commit
Radim Krčmář authored
<asm/apic.h> wasn't included directly, and without CONFIG_SMP no option that automagically pulls it in can be enabled.
Reported-by: Jim Davis <jim.epost@gmail.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 Feb, 2015 2 commits
Andy Lutomirski authored
Context switches and TLB flushes can change individual bits of CR4. CR4 reads take several cycles, so store a shadow copy of CR4 in a per-cpu variable. To avoid wasting a cache line, I added the CR4 shadow to cpu_tlbstate, which is already touched in switch_mm. The heaviest users of the cr4 shadow will be switch_mm and __switch_to_xtra, and __switch_to_xtra is called shortly after switch_mm during context switch, so the cacheline is likely to be hot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
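For reference, a minimal sketch of the shadow accessors this change introduces (based on the description above; the exact declarations in arch/x86/include/asm/tlbflush.h may differ):

    /* cpu_tlbstate gains an 'unsigned long cr4' field holding the shadow. */

    /* Seed the shadow from the real register, once per CPU during boot. */
    static inline void cr4_init_shadow(void)
    {
            this_cpu_write(cpu_tlbstate.cr4, __read_cr4());
    }

    /* Read the cached copy instead of the multi-cycle CR4 register. */
    static inline unsigned long cr4_read_shadow(void)
    {
            return this_cpu_read(cpu_tlbstate.cr4);
    }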
-
Andy Lutomirski authored
CR4 manipulation was split, seemingly at random, between direct (write_cr4) and using a helper (set/clear_in_cr4). Unfortunately, the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code, which only a small subset of users actually wanted. This patch replaces all cr4 access in functions that don't leave cr4 exactly the way they found it with new helpers cr4_set_bits, cr4_clear_bits, and cr4_set_bits_and_update_boot.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Vince Weaver <vince@deater.net>
Cc: "hillf.zj" <hillf.zj@alibaba-inc.com>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
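A sketch of the new helpers as described (shown without the per-cpu shadow, which the companion patch above adds; read_cr4/write_cr4 naming assumed for this point in the series):

    static inline void cr4_set_bits(unsigned long mask)
    {
            unsigned long cr4 = read_cr4();

            if ((cr4 | mask) != cr4)        /* skip the costly write if no change */
                    write_cr4(cr4 | mask);
    }

    static inline void cr4_clear_bits(unsigned long mask)
    {
            unsigned long cr4 = read_cr4();

            if ((cr4 & ~mask) != cr4)
                    write_cr4(cr4 & ~mask);
    }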
-
- 03 Feb, 2015 6 commits
Wincy Van authored
If a vCPU has an interrupt pending in vmx non-root mode, injecting that interrupt requires a vmexit. With posted interrupt processing, the vmexit is not needed, and interrupts are fully taken care of by hardware. In nested vmx, this feature avoids far more vmexits than in non-nested vmx. When L1 asks L0 to deliver L1's posted interrupt vector and the target vCPU is in non-root mode, we use a physical IPI to deliver POSTED_INTR_NV to the target vCPU. Using POSTED_INTR_NV avoids unexpected interrupts if a concurrent vmexit happens and L1's vector differs from L0's. The IPI triggers posted interrupt processing in the target physical CPU. In case the target vCPU was not in guest mode, complete the posted interrupt delivery on the next entry to L2.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
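A condensed sketch of the delivery decision described above (field and helper names approximate the vmx.c code; locking and error paths elided):

    static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
                                                   int vector)
    {
            struct vcpu_vmx *vmx = to_vmx(vcpu);

            if (is_guest_mode(vcpu) && vector == vmx->nested.posted_intr_nv) {
                    /* L1's PIR/ON are already set; if the target is running
                     * L2 right now, the physical IPI makes the hardware do
                     * the posted-interrupt processing without a vmexit. */
                    if (vcpu->mode == IN_GUEST_MODE)
                            apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
                                                POSTED_INTR_VECTOR);
                    else    /* otherwise finish it on the next entry to L2 */
                            vmx->nested.pi_pending = true;
                    return 0;
            }
            return -1;      /* not a nested posted interrupt */
    }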
-
Wincy Van authored
With virtual interrupt delivery, the hardware lets KVM use a more efficient mechanism for interrupt injection. This is an important feature for nested VMX, because it reduces vmexits substantially and they are much more expensive with nested virtualization. This is especially important for throughput-bound scenarios.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wincy Van authored
This feature reduces the cost of APIC register virtualization; it is also a requirement for virtual interrupt delivery and posted interrupt processing.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wincy Van authored
To enable nested apicv support, we need per-vCPU vmx control MSRs:
1. If the in-kernel irqchip is enabled, we can enable nested posted interrupts and should set the posted-interrupt bit in nested_vmx_pinbased_ctls_high.
2. If the in-kernel irqchip is disabled, we cannot enable nested posted interrupts, and the posted-interrupt bit in nested_vmx_pinbased_ctls_high will be cleared.
Since the in-kernel irqchip setting can differ between VMs, different nested control MSRs are needed.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
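A sketch of the resulting setup logic (names approximate; only the posted-interrupt bit is shown):

    /* called per vCPU now, so the answer can differ between VMs */
    static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
    {
            /* common always-on pin-based bits elided */
            if (vmx_vm_has_apicv(vmx->vcpu.kvm))    /* in-kernel irqchip? */
                    vmx->nested.nested_vmx_pinbased_ctls_high |=
                            PIN_BASED_POSTED_INTR;
    }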
-
Wincy Van authored
When L2 is using x2apic, we can use virtualize x2apic mode to gain higher performance, especially in the apicv case. This patch also introduces nested_vmx_check_apicv_controls for the following nested apicv patches.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wincy Van authored
Currently, if L1 enables MSR_BITMAP, we emulate this feature: all of L2's MSR accesses are intercepted by L0. Features like "virtualize x2apic mode" require the MSR bitmap to be enabled, or the hardware will exit and, for example, not virtualize the x2apic MSRs. In order to let L1 use these features, we need to build a merged bitmap that avoids a vmexit only if 1) L1 does not itself require one and 2) the bit is not required by the processor for APIC virtualization. For now the guests still run with the MSR bitmap disabled, but this patch already introduces nested_vmx_merge_msr_bitmap for future use.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
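In VMX MSR bitmaps a set bit means "intercept", so the merge rule reduces to an OR. A hypothetical helper just to illustrate the rule (not the kernel's actual function, which works on the real split read/write bitmap pages):

    /* L2 avoids a vmexit on an MSR access only when neither L0's bitmap
     * nor L1's bitmap asks to intercept that MSR */
    static bool msr_causes_vmexit(unsigned long *l0_bitmap,
                                  unsigned long *l1_bitmap, u32 msr)
    {
            return test_bit(msr, l0_bitmap) || test_bit(msr, l1_bitmap);
    }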
-
- 02 Feb, 2015 1 commit
Marcelo Tosatti authored
Revert 7c6a98df, given that testing PIR is not necessary anymore.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 30 Jan, 2015 2 commits
Paolo Bonzini authored
A function pointer was not NULLed, causing kvm_vcpu_reload_apic_access_page to go down the wrong path and OOPS when doing put_page(NULL). This did not happen on old processors, only when setting the module option explicitly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Kai Huang authored
This patch adds PML support in VMX. A new module parameter 'enable_pml' is added to allow the user to enable/disable it manually.
Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
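A sketch of the parameter declaration (the on-by-default value and the exposed parameter name are assumptions inferred from the description):

    /* vmx.c: PML on by default; 'pml=0' on the kvm_intel module disables it */
    static bool __read_mostly enable_pml = 1;
    module_param_named(pml, enable_pml, bool, S_IRUGO);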
-
- 19 Jan, 2015 1 commit
Rickard Strandqvist authored
Removes some functions that are not used anywhere: cpu_has_vmx_eptp_writeback() and cpu_has_vmx_eptp_uncacheable(). This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 08 Jan, 2015 6 commits
Paolo Bonzini authored
The initialization function in mmu.c can always use walk_mmu, which is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to initialize vcpu->arch.nested_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Marcelo Tosatti authored
kvm_x86_ops->test_posted_interrupt() returns true/false depending on whether 'vector' is set. The next patch makes use of this interface.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Tiejun Chen authored
In most cases, callers of hwapic_isr_update() first check kvm_apic_vid_enabled() == 1, but actually:
    kvm_apic_vid_enabled()
        -> kvm_x86_ops->vm_has_apicv()
        -> vmx_vm_has_apicv() (or '0' in the svm case)
        -> return enable_apicv && irqchip_in_kernel(kvm)
So it costs a little to re-evaluate vmx_vm_has_apicv() around every hwapic_isr_update() call. Instead, just NULL out hwapic_isr_update in hardware_setup() when !enable_apicv, and make all related code follow that. Note we don't fold in the irqchip_in_kernel() condition, since we should make sure no caller works without an in-kernel irqchip.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
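The shape of the change, per the description (a sketch; call sites approximate):

    /* hardware_setup(): disable the hook once, instead of re-checking
     * apicv at every call site */
    if (!enable_apicv)
            kvm_x86_ops->hwapic_isr_update = NULL;

    /* callers (e.g. in lapic.c) then simply test the pointer */
    if (kvm_x86_ops->hwapic_isr_update)
            kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
                                           apic_find_highest_isr(apic));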
-
Eugene Korenevsky authored
When generating a #PF VM-exit, check the equality (PFEC & PFEC_MASK) == PFEC_MATCH. If they are equal, bit 14 of the exception bitmap is used to decide whether to generate the #PF VM-exit; if they are not equal, the inverted bit 14 is used.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
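Written out, the rule looks like the following hypothetical helper (PF_VECTOR is 14, so bit 14 of the exception bitmap is the #PF bit; in the patch the logic sits inline in the nested exception check):

    static bool nested_vmx_pf_causes_vmexit(struct vmcs12 *vmcs12,
                                            u32 error_code)
    {
            bool pf_bit = vmcs12->exception_bitmap & (1u << PF_VECTOR);
            bool match = (error_code & vmcs12->page_fault_error_code_mask) ==
                         vmcs12->page_fault_error_code_match;

            /* equality: use bit 14 as-is; inequality: use its inverse */
            return match ? pf_bit : !pf_bit;
    }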
-
Eugene Korenevsky authored
This patch improves the checks required by the Intel Software Developer Manual:
- SMM MSRs are not allowed.
- microcode MSRs are not allowed.
- x2apic MSRs are checked only when the LAPIC is in x2apic mode.
- MSR switch areas must be aligned to 16 bytes.
- the addresses of the first and last byte in the MSR switch areas should not set any bits beyond the processor's physical-address width.
It also adds warning messages on failures during MSR switching. These messages are useful for people who debug their VMMs in nVMX.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
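A sketch of the area validation described by the last two points (names approximate; struct vmx_msr_entry is 16 bytes):

    static int nested_vmx_check_msr_switch_area(struct kvm_vcpu *vcpu,
                                                u64 addr, u32 count)
    {
            int maxphyaddr = cpuid_maxphyaddr(vcpu);

            if (addr & 0xf)                 /* must be 16-byte aligned */
                    return -EINVAL;
            /* neither the first nor the last byte may set bits beyond
             * the guest's physical-address width */
            if (addr >> maxphyaddr ||
                (addr + count * sizeof(struct vmx_msr_entry) - 1) >> maxphyaddr)
                    return -EINVAL;
            return 0;
    }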
-
Wincy Van authored
Several hypervisors need the MSR auto load/restore feature. We read MSRs from the VM-entry MSR load area specified by L1 and load them via kvm_set_msr on nested entry. When a nested exit occurs, we get MSRs via kvm_get_msr and write them to L1's MSR store area. After this, we read MSRs from the VM-exit MSR load area and load them via kvm_set_msr.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
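A condensed sketch of the VM-entry load walk (kvm_set_msr of this era takes a struct msr_data; the patch's per-MSR sanity checks are elided):

    static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
    {
            struct vmx_msr_entry e;
            struct msr_data msr = { .host_initiated = false };
            u32 i;

            for (i = 0; i < count; i++) {
                    if (kvm_read_guest(vcpu->kvm, gpa + i * sizeof(e),
                                       &e, sizeof(e)))
                            return i + 1;   /* 1-based index of failing entry */
                    msr.index = e.index;
                    msr.data = e.value;
                    if (kvm_set_msr(vcpu, &msr))
                            return i + 1;
            }
            return 0;                       /* all loaded */
    }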
-
- 27 Dec, 2014 1 commit
Tiejun Chen authored
The commit 34a1cd60, "x86: vmx: move some vmx setting from vmx_init() to hardware_setup()", tried to refactor some code specific to vmx hardware setup into hardware_setup(), but some MSR writes must depend on settings established earlier, such as enable_apicv, enable_ept and so on.
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Dec, 2014 1 commit
Bandan Das authored
If L0 has disabled EPT, don't advertise unrestricted mode at all, since it depends on EPT to run real mode code.
Fixes: 92fbc7b1
Cc: stable@vger.kernel.org
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 05 Dec, 2014 4 commits
Wanpeng Li authored
Add nested virtualization support for xsaves.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wanpeng Li authored
Add logic to get/set the XSS model-specific register.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
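A sketch of the vmx_set_msr() arm (simplified; a cached host_xss value is assumed):

    case MSR_IA32_XSS:
            if (!vmx_xsaves_supported())
                    return 1;               /* #GP: XSAVES not available */
            vcpu->arch.ia32_xss = data;
            /* only pay for an atomic entry/exit switch when it differs */
            if (vcpu->arch.ia32_xss != host_xss)
                    add_atomic_switch_msr(vmx, MSR_IA32_XSS,
                                          vcpu->arch.ia32_xss, host_xss);
            else
                    clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
            break;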
-
Wanpeng Li authored
Initialize the XSS exit bitmap. It is zero so there should be no XSAVES or XRSTORS exits.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
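The initialization itself is a single VMCS write (sketch; an all-zero bitmap means no XSS bit triggers XSAVES/XRSTORS exits):

    #define VMX_XSS_EXIT_BITMAP 0

    /* during vCPU setup: */
    if (vmx_xsaves_supported())
            vmcs_write64(XSS_EXIT_BITMAP, VMX_XSS_EXIT_BITMAP);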
-
Wanpeng Li authored
Expose the XSAVES feature to the guest if the kvm_x86_ops say it is available.
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 18 Nov, 2014 1 commit
Tiejun Chen authored
Instead, just use PFERR_{FETCH,PRESENT,WRITE}_MASK inside handle_ept_violation() for slightly better code.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
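The resulting code reads roughly as follows (exit-qualification bits 0-2 encode the access type and bit 3 the EPT entry's readability, so small shifts line them up with the PFERR masks):

    /* inside handle_ept_violation(), sketch */
    error_code = exit_qualification & PFERR_WRITE_MASK;            /* write fault? */
    error_code |= (exit_qualification << 2) & PFERR_FETCH_MASK;    /* fetch fault? */
    error_code |= (exit_qualification >> 3) & PFERR_PRESENT_MASK;  /* entry present? */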
-
- 12 Nov, 2014 2 commits
Andy Lutomirski authored
There's nothing to switch if the host and guest values are the same. I am unable to find evidence that this makes any difference whatsoever.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[I could see a difference on Nehalem. From 5 runs:
    userspace exit, guest!=host    12200 11772 12130 12164 12327
    userspace exit, guest=host     11983 11780 11920 11919 12040
    lightweight exit, guest!=host   3214  3220  3238  3218  3337
    lightweight exit, guest=host    3178  3193  3193  3187  3220
This passes the t-test with 99% confidence for userspace exit, 98.5% confidence for lightweight exit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
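The idea, as a hypothetical sketch (the real change sits in the atomic-switch MSR bookkeeping; names follow the existing add/clear_atomic_switch_msr helpers):

    /* when host and guest agree there is nothing to switch: drop the
     * MSR from the autoload/autostore slots instead of programming it */
    if (guest_val == host_val) {
            clear_atomic_switch_msr(vmx, msr);
            return;
    }
    add_atomic_switch_msr(vmx, msr, guest_val, host_val);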
-
Andy Lutomirski authored
At least on Sandy Bridge, letting the CPU switch IA32_EFER is much faster than switching it manually. I benchmarked this using the vmexit kvm-unit-test (single run, but GOAL multiplied by 5 to do more iterations):

    Test                               Before    After   Change
    cpuid                                2000     1932   -3.40%
    vmcall                               1914     1817   -5.07%
    mov_from_cr8                           13       13    0.00%
    mov_to_cr8                             19       19    0.00%
    inl_from_pmtimer                    19164    10619  -44.59%
    inl_from_qemu                       15662    10302  -34.22%
    inl_from_kernel                      3916     3802   -2.91%
    outl_to_kernel                       2230     2194   -1.61%
    mov_dr                                172      176    2.33%
    ipi                             (skipped) (skipped)
    ipi+halt                        (skipped) (skipped)
    ple-round-robin                        13       13    0.00%
    wr_tsc_adjust_msr                    1920     1845   -3.91%
    rd_tsc_adjust_msr                    1892     1814   -4.12%
    mmio-no-eventfd:pci-mem             16394    11165  -31.90%
    mmio-wildcard-eventfd:pci-mem        4607     4645    0.82%
    mmio-datamatch-eventfd:pci-mem       4601     4610    0.20%
    portio-no-eventfd:pci-io            11507     7942  -30.98%
    portio-wildcard-eventfd:pci-io       2239     2225   -0.63%
    portio-datamatch-eventfd:pci-io      2250     2234   -0.71%

I haven't explicitly computed the significance of these numbers, but this isn't subtle.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
[The results were reproducible on all of Nehalem, Sandy Bridge and Ivy Bridge. The slowness of manual switching is because writing to EFER with WRMSR triggers a TLB flush, even if the only bit you're touching is SCE (so the page table format is not affected). Doing the write as part of vmentry/vmexit, instead, does not flush the TLB, probably because all processors that have EPT also have VPID. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 07 Nov, 2014 6 commits
Nadav Amit authored
x86 debug registers hold a linear address. Therefore, breakpoint detection should consider CS.base and check whether the instruction's linear address equals (CS.base + RIP). This patch introduces a function to evaluate the RIP linear address and uses it for breakpoint detection.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
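A sketch of the helper (in 64-bit mode CS.base is forced to 0, so RIP is already linear; otherwise the sum is truncated to 32 bits):

    unsigned long kvm_get_linear_rip(struct kvm_vcpu *vcpu)
    {
            if (is_64_bit_mode(vcpu))
                    return kvm_rip_read(vcpu);
            return (u32)(get_segment_base(vcpu, VCPU_SREG_CS) +
                         kvm_rip_read(vcpu));
    }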
-
Nadav Amit authored
DR6[0:3] (previous breakpoint indications) are cleared when #DB is injected during handle_exception, just as real hardware does. Similarly, handle_dr should clear DR6[0:3].
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Wei Wang authored
A bug was reported as follows: when running Windows 7 32-bit guests on qemu-kvm, sometimes the guests run into a blue screen during reboot. The problem was that a guest's RVI was not cleared when it rebooted. This patch fixes the problem.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Yang Zhang <yang.z.zhang@intel.com>
Tested-by: Rongrong Liu <rongrongx.liu@intel.com>, Da Chun <ngugc@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Return a negative error code instead, and WARN() when we should be covering the entire 2-bit space of vmcs_field_type's return value. For increased robustness, add a BUILD_BUG_ON checking the range of vmcs_field_to_offset.
Suggested-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
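A sketch of the hardened lookup described (the table maps VMCS field encodings to vmcs12 offsets; details may differ from the patch):

    static inline short vmcs_field_to_offset(unsigned long field)
    {
            /* offsets must fit in the short return type */
            BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);

            if (field >= ARRAY_SIZE(vmcs_field_to_offset_table) ||
                vmcs_field_to_offset_table[field] == 0)
                    return -ENOENT;         /* was: a silently bogus offset */
            return vmcs_field_to_offset_table[field];
    }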
-
Tiejun Chen authored
It makes more sense to do anything specific to vmx hardware setup in vmx_x86_ops->hardware_setup() rather than in vmx_init().
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Tiejun Chen authored
Just move this pair of functions down so that code added later can depend on functions defined above it.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 Nov, 2014 4 commits
Nadav Amit authored
If DR4/5 is accessed while it is unavailable (because CR4.DE is set), then #UD should be generated even if CPL > 0. This follows Intel SDM Table 6-2: "Priority Among Simultaneous Exceptions and Interrupts". Note that this may happen on the first DR access, even if the host does not set debug breakpoints; obviously, it occurs when the host debugs the guest. This patch moves the DR4/5 checks from __kvm_set_dr/_kvm_get_dr to handle_dr. The emulator already checks DR4/5 availability in check_dr_read. Nested-virtualization-related calls to kvm_set_dr/kvm_get_dr should not inject exceptions into the guest. As for SVM, the patch follows the previous logic as much as possible. Anyhow, it appears the DR interception code might be buggy: even if the DR access may cause an exception, the instruction is skipped.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
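On the VMX side the moved check looks roughly like this, early in handle_dr() (a sketch):

    /* DR4/5 alias DR6/7 only when CR4.DE is clear; with CR4.DE set,
     * accessing them is #UD regardless of CPL */
    if (kvm_read_cr4_bits(vcpu, X86_CR4_DE) && (dr == 4 || dr == 5)) {
            kvm_queue_exception(vcpu, UD_VECTOR);
            return 1;
    }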
-
Nadav Amit authored
DR7.LE should be cleared during task-switch. This feature is poorly documented. For reference, see http://pdos.csail.mit.edu/6.828/2005/readings/i386/s12_02.htm and:
SDM [17.2.4]: This feature is not supported in the P6 family processors, later IA-32 processors, and Intel 64 processors.
AMD [2:13.1.1.4]: This bit is ignored by implementations of the AMD64 architecture.
Intel's formulation could mean that it isn't even zeroed, but current hardware indeed does not behave like that.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Nadav Amit authored
Intel SDM 17.2.4 (Debug Control Register (DR7)) says: "The processor clears the GD flag upon entering to the debug exception handler." This sentence may be misunderstood as if it happens only on a #DB due to debug-register protection, but it happens regardless of the cause of the #DB. Fix the behavior to match both real hardware and Bochs.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Andy Lutomirski authored
CR4.TSD is guest-owned; don't trap writes to it in VMX guests. This avoids a VM exit on context switches into or out of a PR_TSC_SIGSEGV task. I think that this fixes an unintentional side-effect of:
4c38609a KVM: VMX: Make guest cr4 mask more conservative
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
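The change amounts to adding X86_CR4_TSD to the guest-owned CR4 mask, roughly (the companion bits shown are illustrative; the exact set varies by kernel version):

    #define KVM_CR4_GUEST_OWNED_BITS                            \
            (X86_CR4_PVI | X86_CR4_DE | X86_CR4_PSE |           \
             X86_CR4_OSXMMEXCPT | X86_CR4_TSD /* newly guest-owned */)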
-