1. 24 Aug, 2017 1 commit
    • KVM: VMX: cache secondary exec controls · 80154d77
      Paolo Bonzini authored
      Currently, secondary execution controls are divided into three groups:
      
      - static, depending mostly on the module arguments or the processor
        (vmx_secondary_exec_control)
      
      - static, depending on CPUID (vmx_cpuid_update)
      
      - dynamic, depending on nested VMX or local APIC state
      
      Because walking CPUID is expensive, prepare_vmcs02 is using only
      the first group.  This however is unnecessarily complicated.  Just
      cache the static secondary execution controls, and then prepare_vmcs02
      does not need to compute them every time.  Computation of all static
      secondary execution controls is now kept in a single function,
      vmx_compute_secondary_exec_control.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
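The caching idea in this commit can be sketched in user-space C. This is only an illustration under stated assumptions: the control bits, `struct vcpu_sketch`, and all function names below are hypothetical and are not the kernel's definitions. The static controls are computed in one function and cached, and the per-entry path only adds the dynamic bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control bits, for illustration only. */
#define CTL_STATIC_MODULE (1u << 0) /* depends on module args or processor */
#define CTL_STATIC_CPUID  (1u << 1) /* depends on guest CPUID */
#define CTL_DYNAMIC_APIC  (1u << 2) /* depends on local APIC state */

struct vcpu_sketch {
    bool module_opt;        /* stands in for a module argument */
    bool cpuid_feature;     /* stands in for a guest CPUID bit */
    uint32_t cached_static; /* cached static controls */
    unsigned cpuid_walks;   /* counts the expensive computations */
};

/* Compute every static control in a single place (the expensive step). */
static uint32_t compute_static_controls(struct vcpu_sketch *v)
{
    uint32_t ctl = 0;

    assert(v != NULL);
    v->cpuid_walks++;
    if (v->module_opt)
        ctl |= CTL_STATIC_MODULE;
    if (v->cpuid_feature)
        ctl |= CTL_STATIC_CPUID;
    return ctl;
}

/* Called when CPUID changes: recompute once and cache the result. */
void cpuid_update(struct vcpu_sketch *v)
{
    v->cached_static = compute_static_controls(v);
}

/* Called on every nested entry: reuse the cache, add dynamic bits only. */
uint32_t prepare_controls(const struct vcpu_sketch *v, bool apic_active)
{
    uint32_t ctl = v->cached_static;

    if (apic_active)
        ctl |= CTL_DYNAMIC_APIC;
    return ctl;
}
```

In this sketch, repeated calls to `prepare_controls` never trigger the expensive computation; only `cpuid_update` does.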
  2. 23 Aug, 2017 2 commits
  3. 18 Aug, 2017 6 commits
  4. 15 Aug, 2017 1 commit
    • kvm: avoid uninitialized-variable warnings · 076b925d
      Arnd Bergmann authored
      When PAGE_OFFSET is not a compile-time constant, we run into
      warnings from the use of kvm_is_error_hva() that the compiler
      cannot optimize out:
      
      arch/arm/kvm/../../../virt/kvm/kvm_main.c: In function '__kvm_gfn_to_hva_cache_init':
      arch/arm/kvm/../../../virt/kvm/kvm_main.c:1978:14: error: 'nr_pages_avail' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      arch/arm/kvm/../../../virt/kvm/kvm_main.c: In function 'gfn_to_page_many_atomic':
      arch/arm/kvm/../../../virt/kvm/kvm_main.c:1660:5: error: 'entry' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      This adds fake initializations to the two instances I ran into.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
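The general pattern behind such a "fake initialization" can be sketched as follows. This is a hypothetical user-space example, not the kernel code touched by the patch; `lookup` and `count_avail` are made-up names. Whether a given compiler actually emits -Wmaybe-uninitialized here depends on its version and optimization level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: writes *avail only on the success path. */
static bool lookup(unsigned long key, unsigned long *avail)
{
    assert(avail != NULL);
    if (key == 0)
        return false; /* error path leaves *avail untouched */
    *avail = key * 2;
    return true;
}

unsigned long count_avail(unsigned long key)
{
    /*
     * The initializer is never observed: every path that reads
     * 'avail' runs after lookup() has stored to it. It exists only
     * so the compiler need not prove that on its own.
     */
    unsigned long avail = 0;

    if (!lookup(key, &avail))
        return 0;
    return avail;
}
```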
  5. 11 Aug, 2017 5 commits
    • kvm: x86: Disallow illegal IA32_APIC_BASE MSR values · d3802286
      Jim Mattson authored
      Host-initiated writes to the IA32_APIC_BASE MSR do not have to follow
      local APIC state transition constraints, but the value written must be
      valid.
      Signed-off-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
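A value check of this kind might look like the sketch below. The bit layout follows the architectural description of IA32_APIC_BASE (bit 8 BSP, bit 10 x2APIC enable, bit 11 global enable, base address from bit 12 up to the physical address width). The function name and parameters are hypothetical, and this is not the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_BSP  (1ULL << 8)  /* bootstrap processor flag */
#define APIC_BASE_EXTD (1ULL << 10) /* x2APIC mode enable */
#define APIC_BASE_EN   (1ULL << 11) /* APIC global enable */

/* Returns true if 'val' contains no reserved bits and no illegal mode. */
bool apic_base_valid(uint64_t val, unsigned maxphyaddr, bool has_x2apic)
{
    /* Bits 0-7 and bit 9 are reserved. */
    uint64_t reserved = 0x2ffULL;

    assert(maxphyaddr < 64);
    /* Address bits at or above the physical address width are reserved. */
    reserved |= ~0ULL << maxphyaddr;
    /* Without x2APIC support, the x2APIC enable bit is reserved too. */
    if (!has_x2apic)
        reserved |= APIC_BASE_EXTD;

    if (val & reserved)
        return false;
    /* x2APIC mode without the global enable is an invalid combination. */
    if ((val & APIC_BASE_EXTD) && !(val & APIC_BASE_EN))
        return false;
    return true;
}
```

Note that the check looks only at the value itself, not at the previous APIC state, which matches the distinction the commit message draws for host-initiated writes.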
    • KVM: MMU: Bail out immediately if there is no available mmu page · 26eeb53c
      Wanpeng Li authored
      Bail out immediately if there is no available mmu page to allocate.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: MMU: Fix softlockup due to mmu_lock is held too long · 42bcbebf
      Wanpeng Li authored
      watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [warn_test:3089]
       irq event stamp: 20532
       hardirqs last  enabled at (20531): [<ffffffff8e9b6908>] restore_regs_and_iret+0x0/0x1d
       hardirqs last disabled at (20532): [<ffffffff8e9b7ae8>] apic_timer_interrupt+0x98/0xb0
       softirqs last  enabled at (8266): [<ffffffff8e9badc6>] __do_softirq+0x206/0x4c1
       softirqs last disabled at (8253): [<ffffffff8e083918>] irq_exit+0xf8/0x100
       CPU: 5 PID: 3089 Comm: warn_test Tainted: G           OE   4.13.0-rc3+ #8
       RIP: 0010:kvm_mmu_prepare_zap_page+0x72/0x4b0 [kvm]
       Call Trace:
        make_mmu_pages_available.isra.120+0x71/0xc0 [kvm]
        kvm_mmu_load+0x1cf/0x410 [kvm]
        kvm_arch_vcpu_ioctl_run+0x1316/0x1bf0 [kvm]
        kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? __fget+0xfc/0x210
        do_vfs_ioctl+0xa4/0x6a0
        ? __fget+0x11d/0x210
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc2
        ? __this_cpu_preempt_check+0x13/0x20
      
      This is readily reproduced with ept=N while running syzkaller tests, since
      many syzkaller testcases don't set up any memory regions. With ept=Y, however,
      the rmode identity map is created and kvm_mmu_calculate_mmu_pages() then
      extends the number of the VM's mmu pages to at least KVM_MIN_ALLOC_MMU_PAGES,
      which just hides the issue.
      
      I saw the scenario kvm->arch.n_max_mmu_pages == 0 && kvm->arch.n_used_mmu_pages == 1:
      there is one active mmu page on the list and kvm_mmu_prepare_zap_page() fails
      to zap any pages, yet prepare_zap_oldest_mmu_page() always returns true.
      This causes an infinite loop in make_mmu_pages_available(), which holds
      mmu_lock and triggers the soft lockup.
      
      This patch fixes it by making the return value of prepare_zap_oldest_mmu_page()
      reflect whether an mmu page was actually zapped.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
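The loop-termination problem described in this commit can be modeled with a small user-space sketch. The structure and function names are hypothetical stand-ins, not the kernel's MMU code. The point is that the reclaim helper must report whether it really freed a page, so the caller can stop instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mmu_sketch {
    unsigned used;     /* pages currently on the active list */
    unsigned max;      /* page budget */
    unsigned zappable; /* how many used pages can actually be freed */
};

/* Returns true only if a page was really zapped. */
static bool zap_oldest(struct mmu_sketch *m)
{
    if (m->used == 0 || m->zappable == 0)
        return false;
    m->used--;
    m->zappable--;
    return true;
}

/* Returns 0 once 'want' pages fit in the budget, -1 if that is impossible. */
int make_pages_available(struct mmu_sketch *m, unsigned want)
{
    assert(m != NULL);
    while (m->max < m->used + want) {
        /* If nothing could be zapped, looping again cannot help. */
        if (!zap_oldest(m))
            break;
    }
    return (m->max >= m->used + want) ? 0 : -1;
}
```

If `zap_oldest` always returned true, the case with a budget of 0 and one unzappable page would never leave the loop, which is the shape of the reported lockup.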
    • KVM: nVMX: validate eptp pointer · a057e0e2
      David Hildenbrand authored
      Let's reuse the function introduced with eptp switching.
      
      We don't explicitly have to check against enable_ept_ad_bits, as this
      is implicitly done when checking against nested_vmx_ept_caps in
      valid_ept_address().
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
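An EPTP validity check against a set of advertised capabilities might be sketched as below. The field layout (memory type in bits 2:0, page-walk length in bits 5:3, accessed/dirty enable in bit 6) follows the architectural EPTP format. Treating bits 11:7 as reserved is an assumption of this sketch, and `struct ept_caps_sketch` and `eptp_valid` are hypothetical names rather than the kernel's `valid_ept_address()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EPTP_MT_MASK   0x7ULL
#define EPTP_MT_UC     0x0ULL
#define EPTP_MT_WB     0x6ULL
#define EPTP_WALK_MASK 0x38ULL
#define EPTP_WALK_4    0x18ULL      /* walk length 4, encoded as 3 << 3 */
#define EPTP_AD_ENABLE (1ULL << 6)
#define EPTP_RSVD_LOW  0xf80ULL     /* bits 11:7, assumed reserved here */

/* Hypothetical summary of what the (virtual) CPU advertises. */
struct ept_caps_sketch {
    bool uc, wb, walk4, ad;
    unsigned maxphyaddr;
};

bool eptp_valid(uint64_t eptp, const struct ept_caps_sketch *caps)
{
    assert(caps != NULL && caps->maxphyaddr < 64);

    /* The memory type must be one the capabilities advertise. */
    switch (eptp & EPTP_MT_MASK) {
    case EPTP_MT_UC:
        if (!caps->uc)
            return false;
        break;
    case EPTP_MT_WB:
        if (!caps->wb)
            return false;
        break;
    default:
        return false;
    }
    /* Only a 4-level page walk is accepted. */
    if ((eptp & EPTP_WALK_MASK) != EPTP_WALK_4 || !caps->walk4)
        return false;
    /* Reserved low bits and address bits beyond the width must be zero. */
    if (eptp & (EPTP_RSVD_LOW | (~0ULL << caps->maxphyaddr)))
        return false;
    /* The A/D enable bit is legal only if A/D support is advertised. */
    if ((eptp & EPTP_AD_ENABLE) && !caps->ad)
        return false;
    return true;
}
```

Because the A/D bit is checked against the capability set, no separate check of a global A/D switch is needed in this sketch, which mirrors the reasoning in the commit message.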
    • KVM: MAINTAINERS improvements · a170504f
      Andrew Jones authored
      Remove nonexistent files, allow less awkward expressions when
      extracting arch-specific information, and only return relevant
      information when using arch-specific expressions. Additionally
      add include/trace/events/kvm.h, arch/*/include/uapi/asm/kvm*,
      and arch/powerpc/kernel/kvm* to appropriate sections. The arch-
      specific expressions are now:
      
       /KVM/                                        -- All KVM
       /\(KVM\)|\(KVM\/x86\)/                       -- X86
       /\(KVM\)|\(KVM\/x86\)|\(KVM\/amd\)/          -- X86 plus AMD
       /\(KVM\)|\(KVM\/arm\)/                       -- ARM
       /\(KVM\)|\(KVM\/arm\)|\(KVM\/arm64\)/        -- ARM plus ARM64
       /\(KVM\)|\(KVM\/powerpc\)/                   -- POWERPC
       /\(KVM\)|\(KVM\/s390\)/                      -- S390
       /\(KVM\)|\(KVM\/mips\)/                      -- MIPS
      Signed-off-by: Andrew Jones <drjones@redhat.com>
      Acked-by: Cornelia Huck <cohuck@redhat.com>
      Acked-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 10 Aug, 2017 3 commits
    • kvm: nVMX: Add support for fast unprotection of nested guest page tables · eebed243
      Paolo Bonzini authored
      This is the same as commit 14727754 ("kvm: svm: Add support for
      additional SVM NPF error codes", 2016-11-23), but for Intel processors.
      In this case, the exit qualification field's bit 8 says whether the
      EPT violation occurred while translating the guest's final physical
      address or rather while translating the guest page tables.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
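The mapping from exit qualification bit 8 to a page-fault error code can be sketched as follows. The mask names mirror KVM's naming, but their bit positions (32 and 33) are stated here as an assumption, and `ept_qual_to_pferr` is a hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

#define EPT_QUAL_FINAL_ACCESS  (1ULL << 8)  /* exit qualification bit 8 */

/* Assumed positions of the error-code bits; check against the kernel headers. */
#define PFERR_GUEST_FINAL_MASK (1ULL << 32) /* fault on the final address */
#define PFERR_GUEST_PAGE_MASK  (1ULL << 33) /* fault on a guest page table */

/* Classify an EPT violation by what was being translated. */
uint64_t ept_qual_to_pferr(uint64_t exit_qualification)
{
    return (exit_qualification & EPT_QUAL_FINAL_ACCESS)
        ? PFERR_GUEST_FINAL_MASK
        : PFERR_GUEST_PAGE_MASK;
}
```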
    • KVM: SVM: Limit PFERR_NESTED_GUEST_PAGE error_code check to L1 guest · 64531a3b
      Brijesh Singh authored
      Commit 14727754 ("kvm: svm: Add support for additional SVM NPF error
      codes", 2016-11-23) added a new error code to aid nested page fault
      handling.  The commit unprotects (kvm_mmu_unprotect_page) the page when
      we get a NPF due to guest page table walk where the page was marked RO.
      
      However, an L0->L2 shadow nested page table can also be marked read-only
      when a page is read-only in L1's nested page table.  If such a page
      is accessed by L2 while walking page tables it can cause a nested
      page fault (page table walks are write accesses).  After
      kvm_mmu_unprotect_page we may then get another page fault, and so on in an
      endless stream.
      
      To cover this use case, we qualify the new error_code check with
      vcpu->arch.mmu_direct_map so that the error_code check would run on L1
      guest, and not the L2 guest.  This avoids hitting the above scenario.
      
      Fixes: 14727754
      Cc: stable@vger.kernel.org
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
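The qualified check can be illustrated with a small sketch. The composition of `PFERR_NESTED_GUEST_PAGE` (guest-page, write, and present bits) and the bit positions are assumptions of this example, and `should_unprotect` is a hypothetical helper. The `direct_map` parameter stands in for vcpu->arch.mmu_direct_map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; check against the kernel headers. */
#define PFERR_PRESENT_MASK    (1ULL << 0)
#define PFERR_WRITE_MASK      (1ULL << 1)
#define PFERR_GUEST_PAGE_MASK (1ULL << 33)
#define PFERR_NESTED_GUEST_PAGE \
    (PFERR_GUEST_PAGE_MASK | PFERR_WRITE_MASK | PFERR_PRESENT_MASK)

/*
 * Take the fast unprotect path only when the MMU maps guest memory
 * directly (the L1 case), never while running the L2 guest.
 */
bool should_unprotect(uint64_t error_code, bool direct_map)
{
    return direct_map &&
           (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE;
}
```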
    • KVM: X86: Fix residual mmio emulation request to userspace · bbeac283
      Wanpeng Li authored
      Reported by syzkaller:
      
      With kvm-intel.unrestricted_guest=0:
      
         WARNING: CPU: 5 PID: 1014 at /home/kernel/data/kvm/arch/x86/kvm//x86.c:7227 kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
         CPU: 5 PID: 1014 Comm: warn_test Tainted: G        W  OE   4.13.0-rc3+ #8
         RIP: 0010:kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
         Call Trace:
          ? put_pid+0x3a/0x50
          ? rcu_read_lock_sched_held+0x79/0x80
          ? kmem_cache_free+0x2f2/0x350
          kvm_vcpu_ioctl+0x340/0x700 [kvm]
          ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
          ? __fget+0xfc/0x210
          do_vfs_ioctl+0xa4/0x6a0
          ? __fget+0x11d/0x210
          SyS_ioctl+0x79/0x90
          entry_SYSCALL_64_fastpath+0x23/0xc2
          ? __this_cpu_preempt_check+0x13/0x20
      
      The syzkaller folks reported a residual mmio emulation request to userspace:
      vm86 fails to emulate injecting a real mode interrupt (it fails to read CS) and
      incurs a triple fault. The vCPU returns to userspace with vcpu->mmio_needed == true
      and the KVM_EXIT_SHUTDOWN exit reason. However, the syzkaller testcase creates
      several threads that launch the same vCPU; a thread which launches this vCPU after
      the thread which gets vcpu->mmio_needed == true and KVM_EXIT_SHUTDOWN will
      trigger the warning.
      
         #define _GNU_SOURCE
         #include <pthread.h>
         #include <stdio.h>
         #include <stdlib.h>
         #include <string.h>
         #include <sys/wait.h>
         #include <sys/types.h>
         #include <sys/stat.h>
         #include <sys/mman.h>
         #include <fcntl.h>
         #include <unistd.h>
         #include <linux/kvm.h>
         #include <sys/ioctl.h>
      
         int kvmcpu;
         struct kvm_run *run;
      
         void* thr(void* arg)
         {
           int res;
           res = ioctl(kvmcpu, KVM_RUN, 0);
           printf("ret1=%d exit_reason=%d suberror=%d\n",
               res, run->exit_reason, run->internal.suberror);
           return 0;
         }
      
         void test()
         {
           int i, kvm, kvmvm;
           pthread_t th[4];
      
           kvm = open("/dev/kvm", O_RDWR);
           kvmvm = ioctl(kvm, KVM_CREATE_VM, 0);
           kvmcpu = ioctl(kvmvm, KVM_CREATE_VCPU, 0);
           run = (struct kvm_run*)mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, kvmcpu, 0);
           srand(getpid());
           for (i = 0; i < 4; i++) {
             pthread_create(&th[i], 0, thr, 0);
             usleep(rand() % 10000);
           }
           for (i = 0; i < 4; i++)
             pthread_join(th[i], 0);
         }
      
         int main()
         {
           for (;;) {
             int pid = fork();
             if (pid < 0)
               exit(1);
             if (pid == 0) {
               test();
               exit(0);
             }
             int status;
             while (waitpid(pid, &status, __WALL) != pid) {}
           }
           return 0;
         }
      
      This patch fixes it by resetting vcpu->mmio_needed once we receive
      the triple fault, so that no stale request is left behind.
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
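The stale-state problem and its fix can be modeled in a few lines of user-space C. All names below are hypothetical stand-ins for the vCPU state, not KVM's structures. The idea is that the triple-fault path clears the pending MMIO flag, so a later entry on the same vCPU does not see a leftover request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum exit_sketch { EXIT_NONE, EXIT_MMIO, EXIT_SHUTDOWN };

struct vcpu_run_sketch {
    bool mmio_needed;             /* an MMIO request awaits completion */
    enum exit_sketch exit_reason; /* why the vCPU returned to userspace */
};

/* Triple fault: report shutdown and drop any half-finished MMIO request. */
void handle_triple_fault(struct vcpu_run_sketch *v)
{
    assert(v != NULL);
    v->exit_reason = EXIT_SHUTDOWN;
    v->mmio_needed = false;
}

/* Entry check: a fresh run must not inherit a pending MMIO request. */
bool run_entry_ok(const struct vcpu_run_sketch *v)
{
    assert(v != NULL);
    return !v->mmio_needed;
}
```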
  7. 08 Aug, 2017 4 commits
  8. 07 Aug, 2017 11 commits
  9. 06 Aug, 2017 7 commits