  1. 28 Mar, 2018 8 commits
      KVM: SVM: Add pause filter threshold · 1d8fb44a
      Babu Moger authored
      This patch adds support for the pause filter threshold. Support for this
      feature is indicated by CPUID Fn8000_000A_EDX. See AMD APM Vol 2 Section
      15.14.4 Pause Intercept Filtering for more details.
      
      In this mode, a 16-bit pause filter threshold field is added to the VMCB.
      The threshold value is a cycle count that is used to reset the pause
      counter. As with simple pause filtering, VMRUN loads the pause count
      value from the VMCB into an internal counter. Then, on each pause instruction,
      the hardware checks the number of cycles elapsed since the most recent
      pause instruction against the pause filter threshold. If the elapsed cycle
      count is greater than the pause filter threshold, the internal pause
      count is reloaded from the VMCB and execution continues. If the elapsed cycle
      count is less than the pause filter threshold, the internal pause
      count is decremented. If the count value is less than zero and the pause
      intercept is enabled, a #VMEXIT is triggered. If advanced pause filtering
      is supported and the pause filter threshold field is set to zero, the filter
      operates in the simpler, count-only mode.
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
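The filtering rules described in the commit message above can be modelled in a few lines of user-space C. This is only a sketch of the described behaviour, not the hardware or the KVM code: the struct, field, and function names are hypothetical, the cycle stamp is assumed to start at VMRUN, and an elapsed count exactly equal to the threshold is assumed to decrement (the message does not cover that case).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the filter state; names are illustrative only. */
struct pause_filter {
    uint16_t vmcb_count;   /* pause filter count stored in the VMCB */
    uint16_t vmcb_thresh;  /* pause filter threshold (0 = count-only mode) */
    int32_t  counter;      /* internal counter, loaded on VMRUN */
    uint64_t last_pause;   /* cycle stamp of the most recent PAUSE */
};

/* VMRUN loads the internal counter from the VMCB. */
static void pf_vmrun(struct pause_filter *pf, uint64_t now)
{
    pf->counter = pf->vmcb_count;
    pf->last_pause = now;  /* assumption: the cycle stamp starts at VMRUN */
}

/* Returns true when the PAUSE executed at cycle `now` would cause a #VMEXIT. */
static bool pf_on_pause(struct pause_filter *pf, uint64_t now,
                        bool intercept_enabled)
{
    uint64_t elapsed = now - pf->last_pause;

    pf->last_pause = now;

    /* A long gap since the previous PAUSE reloads the counter. */
    if (pf->vmcb_thresh != 0 && elapsed > pf->vmcb_thresh) {
        pf->counter = pf->vmcb_count;
        return false;
    }

    /* Otherwise (or in count-only mode) the counter is decremented. */
    pf->counter--;
    return pf->counter < 0 && intercept_enabled;
}
```

With a threshold of zero the reload branch is never taken, so the model falls back to plain count-based filtering.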
      KVM: VMX: Bring the common code to header file · c8e88717
      Babu Moger authored
      This patch moves some of the code from vmx to the x86.h header file so
      that it can be shared between vmx and svm. A couple of functions were
      modified to make them common.
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      KVM: VMX: Remove ple_window_actual_max · 18abdc34
      Babu Moger authored
      Get rid of ple_window_actual_max, because its benefits are really
      minuscule and the logic is complicated.
      
      Overflow and underflow are controlled in __ple_window_grow
      and _ple_window_shrink, respectively.
      Suggested-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      [Fixed potential wraparound and changed the max to UINT_MAX. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
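To illustrate what controlling overflow and underflow inside the grow and shrink helpers can look like once the separate actual-max value is gone, here is a minimal user-space sketch. The function names `ple_grow` and `ple_shrink`, and the rule that a modifier smaller than the base is multiplicative while a larger one is additive, are assumptions made for this example, not a copy of the kernel helpers.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: grow the window, saturating at `max` (e.g. UINT_MAX). */
static unsigned int ple_grow(unsigned int val, unsigned int base,
                             unsigned int modifier, unsigned int max)
{
    uint64_t ret = val;          /* widen so the arithmetic cannot wrap */

    if (modifier < 1)
        return base;
    if (modifier < base)
        ret *= modifier;         /* small modifier: multiplicative step */
    else
        ret += modifier;         /* large modifier: additive step */

    return ret > max ? max : (unsigned int)ret;
}

/* Hypothetical helper: shrink the window, never going below `min`. */
static unsigned int ple_shrink(unsigned int val, unsigned int base,
                               unsigned int modifier, unsigned int min)
{
    if (modifier < 1)
        return base;
    if (modifier < base)
        val /= modifier;
    else
        val = val > modifier ? val - modifier : 0;  /* no unsigned underflow */

    return val < min ? min : val;
}
```

Doing the growth in 64-bit arithmetic and clamping afterwards is one simple way to make the wraparound case impossible without a precomputed limit.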
      KVM: VMX: Fix the module parameters for vmx · 7fbc85a5
      Babu Moger authored
      The vmx module parameters are supposed to be unsigned variants.
      
      Also fixed checkpatch warnings such as the one below.
      
      WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
      +module_param(ple_gap, uint, S_IRUGO);
      Signed-off-by: Babu Moger <babu.moger@amd.com>
      [Expanded uint to unsigned int in code. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
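The checkpatch warning quoted in the commit message above relies on the symbolic and octal spellings naming the same mode bits. A small user-space check, with `perms_match` as a hypothetical helper name, makes that concrete. S_IRUGO is a kernel-only macro, so it is redefined here from the POSIX read bits.

```c
#include <sys/stat.h>

/* S_IRUGO is kernel-only; it is the union of the three POSIX read bits. */
#ifndef S_IRUGO
#define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#endif

/*
 * Returns nonzero when the symbolic and octal spellings agree, which is
 * why module_param(ple_gap, uint, 0444) is equivalent to the S_IRUGO form
 * that checkpatch complained about.
 */
static int perms_match(void)
{
    return S_IRUGO == 0444;
}
```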
      KVM: x86: Fix perf timer mode IP reporting · dd60d217
      Andi Kleen authored
      KVM and perf have a special backdoor mechanism to report the IP for interrupts
      re-executed after vm exit. This works for the NMIs that perf normally uses.
      
      However, when perf is in timer mode it doesn't work because the timer interrupt
      doesn't get this special treatment. This is common when KVM is running
      nested in another hypervisor which may not implement the PMU, so only
      timer mode is available.
      
      Call the functions to set up the backdoor IP also for non-NMI interrupts.
      
      I renamed the functions to set up the backdoor IP reporting to be more
      appropriate for their new use. The SVM change is only compile tested.
      
      v2: Moved the functions inline.
      For the normal interrupt case the before/after functions are now
      called from x86.c, not arch specific code.
      For the NMI case we still need to call it in the architecture
      specific code, because it's already needed in the low level *_run
      functions.
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      [Removed unnecessary calls from arch handle_external_intr. - Radim]
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
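The backdoor mechanism this commit message refers to can be pictured as a marker that is set before a pending interrupt is re-executed and cleared afterwards, so that a sampling handler running in between reports the guest IP instead of the host one. The sketch below is a user-space illustration under that assumption. All names are hypothetical, and the kernel keeps such state per CPU rather than in a single global.

```c
#include <stddef.h>

/* Hypothetical vCPU with the guest instruction pointer saved at VM exit. */
struct vcpu {
    unsigned long guest_ip;
};

/* In the kernel this would be per-CPU state; a global suffices here. */
static struct vcpu *current_vcpu;

/* Bracket the re-execution of an interrupt that arrived in guest mode. */
static void before_interrupt(struct vcpu *v) { current_vcpu = v; }
static void after_interrupt(void)            { current_vcpu = NULL; }

/* What a profiler's sampling handler would report as the interrupted IP. */
static unsigned long sample_ip(unsigned long host_ip)
{
    return current_vcpu ? current_vcpu->guest_ip : host_ip;
}
```

Applying the same bracket to timer interrupts, not only to NMIs, is what lets timer-mode samples be attributed to the guest.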
      Merge tag 'kvm-arm-for-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm · abe7a458
      Radim Krčmář authored
      KVM/ARM updates for v4.17
      
      - VHE optimizations
      - EL2 address space randomization
      - Variant 3a mitigation for Cortex-A57 and A72
      - The usual vgic fixes
      - Various minor tidying-up
      arm64: Add temporary ERRATA_MIDR_ALL_VERSIONS compatibility macro · dc6ed61d
      Marc Zyngier authored
      MIDR_ALL_VERSIONS is changing, and won't have the same meaning
      in 4.17, and the right thing to use will be ERRATA_MIDR_ALL_VERSIONS.
      
      In order to cope with the merge window, let's add a compatibility
      macro that will allow a relatively smooth transition, and that
      can be removed post 4.17-rc1.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Revert "arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening" · adc91ab7
      Marc Zyngier authored
      Creates far too many conflicts with arm64/for-next/core, to be
      resent post -rc1.
      
      This reverts commit f9f5dc19.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  2. 26 Mar, 2018 2 commits
      KVM: arm/arm64: vgic-its: Fix potential overrun in vgic_copy_lpi_list · 7d8b44c5
      Marc Zyngier authored
      vgic_copy_lpi_list() parses the LPI list and picks LPIs targeting
      a given vcpu. We allocate the array containing the intids before taking
      the lpi_list_lock, which means we can have an array size that is not
      equal to the number of LPIs.
      
      This is particularly obvious when looking at the path coming from
      vgic_enable_lpis, which is not a command, and thus can run in parallel
      with commands:
      
      vcpu 0:                                        vcpu 1:
      vgic_enable_lpis
        its_sync_lpi_pending_table
          vgic_copy_lpi_list
            intids = kmalloc_array(irq_count)
                                                     MAPI(lpi targeting vcpu 0)
            list_for_each_entry(lpi_list_head)
              intids[i++] = irq->intid;
      
      At that stage, we will happily overrun the intids array. Boo. An easy
      fix is to break once the array is full. The MAPI command will update
      the config anyway, and we won't miss a thing. We also make sure that
      lpi_list_count is read exactly once, so that further updates of that
      value will not affect the array bound check.
      
      Cc: stable@vger.kernel.org
      Fixes: ccb1d791 ("KVM: arm64: vgic-its: Fix pending table sync")
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
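The two parts of the fix described in this commit message (read the count exactly once, and stop copying once the array is full) can be sketched in user-space C as follows. The structures and the `copy_lpi_list` name are simplified stand-ins, not the kernel definitions, and the locking around the list walk is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct lpi {
    uint32_t intid;
    int target_vcpu;
    struct lpi *next;
};

struct lpi_list {
    struct lpi *head;
    volatile int count;   /* may change while a copy is in progress */
};

/*
 * Copy the intids targeting `vcpu` into a freshly allocated array.
 * Returns the number of entries copied, or -1 on allocation failure.
 */
static int copy_lpi_list(struct lpi_list *list, int vcpu, uint32_t **out)
{
    int irq_count = list->count;   /* read exactly once */
    uint32_t *intids;
    int i = 0;

    intids = calloc(irq_count > 0 ? (size_t)irq_count : 1, sizeof(*intids));
    if (!intids)
        return -1;

    for (struct lpi *irq = list->head; irq; irq = irq->next) {
        if (i == irq_count)        /* array is full: stop instead of overrunning */
            break;
        if (irq->target_vcpu != vcpu)
            continue;
        intids[i++] = irq->intid;
    }

    *out = intids;
    return i;
}
```

An LPI that was added after the count was sampled is simply left out of this copy.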
      KVM: arm/arm64: vgic: Disallow Active+Pending for level interrupts · 67b5b673
      Marc Zyngier authored
      It was recently reported that VFIO mediated devices, and anything
      that VFIO exposes as level interrupts, do not strictly follow the
      expected logic of such interrupts as it only lowers the input
      line when the guest has EOId the interrupt at the GIC level, rather
      than when it Acked the interrupt at the device level.
      
      The GIC's Active+Pending state is fundamentally incompatible with
      this behaviour, as it prevents KVM from observing the EOI, and in
      turn results in VFIO never dropping the line. This results in an
      interrupt storm in the guest, which it really never expected.
      
      As we cannot really change VFIO to follow the strict rules of level
      signalling, let's forbid the A+P state altogether, as it is in the
      end only an optimization. It ensures that we will transition via
      an invalid state, which we can use to notify VFIO of the EOI.
      Reviewed-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Eric Auger <eric.auger@redhat.com>
      Tested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
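As a rough picture of what forbidding the A+P state means when the state of a virtual interrupt is encoded for the guest, the sketch below withholds the pending bit whenever a level interrupt is already active. The bit values, the struct, and the function name are invented for illustration and do not match the GIC list register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative state bits; not the real list register encoding. */
#define LR_PENDING (1u << 0)
#define LR_ACTIVE  (1u << 1)

/* Hypothetical software view of a virtual interrupt. */
struct virq {
    bool level;     /* level-triggered (as opposed to edge) */
    bool active;
    bool pending;
};

/* Compute the state bits to present to the guest for this interrupt. */
static uint32_t lr_state(const struct virq *irq)
{
    uint32_t val = 0;
    bool allow_pending = true;

    if (irq->active) {
        val |= LR_ACTIVE;
        if (irq->level)
            allow_pending = false;   /* never Active+Pending for level */
    }
    if (allow_pending && irq->pending)
        val |= LR_PENDING;

    return val;
}
```

Edge interrupts are unaffected in this sketch; only the level case loses the combined state.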
  3. 23 Mar, 2018 5 commits
  4. 21 Mar, 2018 2 commits
      KVM: nVMX: fix vmentry failure code when L2 state would require emulation · 3184a995
      Paolo Bonzini authored
      Commit 2bb8cafe ("KVM: vVMX: signal failure for nested VMEntry if
      emulation_required", 2018-03-12) introduces a new error path which does
      not set *entry_failure_code.  Fix that to avoid a leak of L0 stack to L1.
      Reported-by: Radim Krčmář <rkrcmar@redhat.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      KVM: nVMX: Do not load EOI-exitmap while running L2 · e40ff1d6
      Liran Alon authored
      When the L1 IOAPIC redirection table is written, a KVM_REQ_SCAN_IOAPIC
      request is set on all vCPUs, so that each vCPU recalculates its
      IOAPIC-handled vectors and loads them into its EOI-exitmap.
      
      However, it could be that one of the vCPUs is currently running
      L2. In this case, load_eoi_exitmap() will be called which would
      write to vmcs02->eoi_exit_bitmap, which is wrong because
      vmcs02->eoi_exit_bitmap should always be equal to
      vmcs12->eoi_exit_bitmap. Furthermore, at this point
      KVM_REQ_SCAN_IOAPIC was already consumed and therefore we will
      never update vmcs01->eoi_exit_bitmap. This could lead to remote_irr
      of some IOAPIC level-triggered entry to remain set forever.
      
      Fix this issue by delaying the load of EOI-exitmap to when vCPU
      is running L1.
      
      One may wonder why the entire KVM_REQ_SCAN_IOAPIC processing is not
      simply delayed until the vCPU is running L1. The reason is to correctly
      handle the case where the LAPIC & IO-APIC of L1 are passed through to
      L2. In this case, vmcs12->virtual_interrupt_delivery should be 0. In
      the current nVMX implementation, that results in
      vmcs02->virtual_interrupt_delivery also being 0. Thus,
      vmcs02->eoi_exit_bitmap is not used. Therefore, every L2 EOI causes
      a #VMExit into L0 (either on MSR_WRITE to x2APIC MSR or
      APIC_ACCESS/APIC_WRITE/EPT_MISCONFIG to APIC MMIO page).
      In order for such an L2 EOI to be broadcast, if needed, from the LAPIC
      to the IO-APIC, vcpu->arch.ioapic_handled_vectors must be updated
      while L2 is running. Therefore, this patch delays only the
      loading of the EOI-exitmap but not the update of
      vcpu->arch.ioapic_handled_vectors.
      Reviewed-by: Arbel Moshe <arbel.moshe@oracle.com>
      Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
      Signed-off-by: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
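The split described in this commit message (always refresh the handled vectors, but defer loading the EOI-exitmap while L2 is running) can be sketched with a small user-space model. The struct layout, the `in_l2` and pending flags, and the function names are assumptions made for illustration, not the KVM implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define VEC_WORDS 4   /* 256 vectors as four 64-bit words */

/* Hypothetical vCPU model. */
struct vcpu {
    bool in_l2;                       /* currently running the nested guest */
    bool load_eoi_exitmap_pending;    /* deferred load to do on return to L1 */
    uint64_t ioapic_handled_vectors[VEC_WORDS];
    uint64_t vmcs01_eoi_exit_bitmap[VEC_WORDS];
};

/* Load the handled vectors into the L1 (vmcs01) EOI-exit bitmap. */
static void load_eoi_exitmap(struct vcpu *v)
{
    for (int i = 0; i < VEC_WORDS; i++)
        v->vmcs01_eoi_exit_bitmap[i] = v->ioapic_handled_vectors[i];
}

/* Handle a scan request: always update the vectors, defer the load in L2. */
static void scan_ioapic(struct vcpu *v, const uint64_t new_vectors[VEC_WORDS])
{
    for (int i = 0; i < VEC_WORDS; i++)
        v->ioapic_handled_vectors[i] = new_vectors[i];

    if (v->in_l2)
        v->load_eoi_exitmap_pending = true;
    else
        load_eoi_exitmap(v);
}

/* On the L2 -> L1 transition, perform any load that was deferred. */
static void leave_l2(struct vcpu *v)
{
    v->in_l2 = false;
    if (v->load_eoi_exitmap_pending) {
        v->load_eoi_exitmap_pending = false;
        load_eoi_exitmap(v);
    }
}
```

Because the handled vectors are updated immediately, EOIs intercepted while L2 runs can still be matched against current IOAPIC state.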
  5. 19 Mar, 2018 23 commits