  1. 08 Jun, 2022 6 commits
    • perf/x86/core: Pass "struct kvm_pmu *" to determine the guest values · 39a4d779
      Like Xu authored
      Splitting the logic for determining the guest values is unnecessarily
      confusing, and potentially fragile. Perf should have full knowledge and
      control of what values are loaded for the guest.
      
      If we change .guest_get_msrs() to take a struct kvm_pmu pointer, then it
      can generate the full set of guest values by grabbing guest ds_area and
      pebs_data_cfg. Alternatively, .guest_get_msrs() could take the desired
      guest MSR values directly (ds_area and pebs_data_cfg), but kvm_pmu is
      vendor agnostic, so we don't see any reason to not just pass the pointer.
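      The shape of the change can be sketched as follows. This is a simplified, self-contained model, not the actual perf/KVM code: only the ds_area and pebs_data_cfg fields come from the commit message; the struct layouts and helper names are illustrative.

```c
/* Simplified stand-ins for the real structures; only ds_area and
 * pebs_data_cfg come from the commit message, the rest is illustrative. */
struct kvm_pmu {
	unsigned long long ds_area;
	unsigned long long pebs_data_cfg;
};

struct guest_switch_msr {
	unsigned int msr;
	unsigned long long host, guest;
};

#define MSR_IA32_DS_AREA	0x600
#define MSR_PEBS_DATA_CFG	0x3f2

/* With a kvm_pmu pointer, the callback can assemble the full set of guest
 * values itself instead of KVM splicing values in after the fact. */
static int guest_get_msrs(struct kvm_pmu *pmu, struct guest_switch_msr *arr)
{
	int nr = 0;

	arr[nr].msr   = MSR_IA32_DS_AREA;
	arr[nr].host  = 0;	/* host values omitted in this sketch */
	arr[nr].guest = pmu ? pmu->ds_area : 0;
	nr++;

	arr[nr].msr   = MSR_PEBS_DATA_CFG;
	arr[nr].host  = 0;
	arr[nr].guest = pmu ? pmu->pebs_data_cfg : 0;
	nr++;

	return nr;
}
```

      Passing the whole struct rather than the two values keeps the callback signature stable if more guest state is needed later.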
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Like Xu <like.xu@linux.intel.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Message-Id: <20220411101946.20262-4-likexu@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: enable IPI virtualization · d588bb9b
      Chao Gao authored
      With IPI virtualization enabled, the processor emulates writes to
      APIC registers that would send IPIs. The processor sets the bit
      corresponding to the vector in target vCPU's PIR and may send a
      notification (IPI) specified by NDST and NV fields in target vCPU's
      Posted-Interrupt Descriptor (PID). It is similar to what IOMMU
      engine does when dealing with posted interrupt from devices.
      
      A PID-pointer table is used by the processor to locate the PID of a
      vCPU from the vCPU's APIC ID. The table size depends on the maximum
      APIC ID assigned to the current VM session from userspace. Allocating
      memory for the PID-pointer table is deferred to vCPU creation, because
      the irqchip mode and the VM-scope maximum APIC ID are settled at that
      point. KVM can skip PID-pointer table allocation if !irqchip_in_kernel().
      
      Like VT-d PI, if a vCPU goes to the blocked state, the VMM needs to
      switch its notification vector to the wakeup vector. This ensures that
      when an IPI for a blocked vCPU arrives, the VMM gets control and can
      wake up the vCPU. And if a vCPU is preempted, its posted-interrupt
      notification is suppressed.
      
      Note that IPI virtualization can only virtualize physical-addressing,
      flat-mode, unicast IPIs. Sending any other IPI still causes a trap-like
      APIC-write VM-exit that must be handled by the VMM.
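      The PID-pointer table described above can be modeled with a short sketch. This is purely illustrative C, not the KVM implementation: the struct layouts and function names are assumptions, but the sizing by maximum APIC ID and the !irqchip_in_kernel() skip mirror the commit message.

```c
#include <stdlib.h>
#include <stdint.h>

/* Minimal stand-in for a Posted-Interrupt Descriptor. */
struct pi_desc { unsigned long long pir[4]; /* ... NDST/NV fields ... */ };

struct vm {
	unsigned int max_apic_id;
	/* entry i holds the PID address of the vCPU with APIC ID i */
	unsigned long long *pid_table;
};

/* Deferred to vCPU creation; skipped entirely for userspace irqchips. */
static int vm_alloc_pid_table(struct vm *vm, int irqchip_in_kernel)
{
	if (!irqchip_in_kernel)
		return 0;
	vm->pid_table = calloc(vm->max_apic_id + 1, sizeof(*vm->pid_table));
	return vm->pid_table ? 0 : -1;
}

/* The processor indexes this table by APIC ID to find the target PID. */
static void vm_set_pid(struct vm *vm, unsigned int apic_id, struct pi_desc *pid)
{
	if (vm->pid_table && apic_id <= vm->max_apic_id)
		vm->pid_table[apic_id] = (unsigned long long)(uintptr_t)pid;
}
```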
      Signed-off-by: Chao Gao <chao.gao@intel.com>
      Signed-off-by: Zeng Guang <guang.zeng@intel.com>
      Message-Id: <20220419154510.11938-1-guang.zeng@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Clean up vmx_refresh_apicv_exec_ctrl() · f08a06c9
      Zeng Guang authored
      Remove the cpu_has_secondary_exec_ctrls() condition check. Calling
      vmx_refresh_apicv_exec_ctrl() presumes that secondary execution
      controls are activated and that the APICv-related VMCS fields are
      valid. If it is invoked in the wrong circumstances, the worst case is
      that a VMX operation reports a VMfailValid error with no further
      harmful impact, and the CPU behaves as if all the secondary controls
      were 0.
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Zeng Guang <guang.zeng@intel.com>
      Message-Id: <20220419153604.11786-1-guang.zeng@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Report tertiary_exec_control field in dump_vmcs() · 0b85baa5
      Robert Hoo authored
      Add the tertiary_exec_control field to the dump_vmcs() report, and
      reorganize the dump output of the VMCS control-state category as
      follows.
      
      Before change:
      *** Control State ***
       PinBased=0x000000ff CPUBased=0xb5a26dfa SecondaryExec=0x061037eb
       EntryControls=0000d1ff ExitControls=002befff
      
      After change:
      *** Control State ***
       CPUBased=0xb5a26dfa SecondaryExec=0x061037eb TertiaryExec=0x0000000000000010
       PinBased=0x000000ff EntryControls=0000d1ff ExitControls=002befff
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
      Signed-off-by: Zeng Guang <guang.zeng@intel.com>
      Message-Id: <20220419153441.11687-1-guang.zeng@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Detect Tertiary VM-Execution control when setup VMCS config · 1ad4e543
      Robert Hoo authored
      Check the VMX features for tertiary execution control during VMCS
      config setup. The sub-features to be enabled in tertiary execution
      control are adjusted according to hardware capabilities, although no
      sub-feature is enabled by this patch.
      
      EVMCSv1 doesn't support tertiary VM-execution control, so disable it
      when EVMCSv1 is in use. Also define the auxiliary functions for the
      tertiary control field here, using the new BUILD_CONTROLS_SHADOW().
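      A rough model of what a BUILD_CONTROLS_SHADOW()-style macro generates: per-field get/set/clear helpers around a cached ("shadow") copy of a VMCS control field. This is a simplified sketch, not the kernel macro; the struct, field, and helper names are illustrative, and the real helpers also perform the VMWRITE when the shadow changes.

```c
/* Shadow copy of one 64-bit control field; illustrative only. */
struct loaded_vmcs_shadow { unsigned long long tertiary_exec; };

/* Generate cached accessors for a named control field. */
#define BUILD_CONTROLS_SHADOW(lname)					\
static unsigned long long lname##_get(struct loaded_vmcs_shadow *s)	\
{									\
	return s->lname;						\
}									\
static void lname##_setbit(struct loaded_vmcs_shadow *s,		\
			   unsigned long long bit)			\
{									\
	s->lname |= bit;	/* real code would also VMWRITE here */	\
}									\
static void lname##_clearbit(struct loaded_vmcs_shadow *s,		\
			     unsigned long long bit)			\
{									\
	s->lname &= ~bit;						\
}

/* One invocation stamps out tertiary_exec_get/_setbit/_clearbit. */
BUILD_CONTROLS_SHADOW(tertiary_exec)
```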
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
      Signed-off-by: Zeng Guang <guang.zeng@intel.com>
      Message-Id: <20220419153400.11642-1-guang.zeng@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Differentiate Soft vs. Hard IRQs vs. reinjected in tracepoint · 2d613912
      Sean Christopherson authored
      In the IRQ injection tracepoint, differentiate between Hard IRQs and Soft
      "IRQs", i.e. interrupts that are reinjected after incomplete delivery of
      a software interrupt from an INTn instruction.  Tag reinjected interrupts
      as such, even though the information is usually redundant since soft
      interrupts are only ever reinjected by KVM.  Though rare in practice, a
      hard IRQ can be reinjected.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      [MSS: change "kvm_inj_virq" event "reinjected" field type to bool]
      Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
      Message-Id: <9664d49b3bd21e227caa501cff77b0569bebffe2.1651440202.git.maciej.szmigiero@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 27 May, 2022 1 commit
  3. 25 May, 2022 3 commits
  4. 12 May, 2022 1 commit
  5. 06 May, 2022 1 commit
    • KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state · 053d2290
      Sean Christopherson authored
      Exit to userspace with an emulation error if KVM encounters an injected
      exception with invalid guest state, in addition to the existing check of
      bailing if there's a pending exception (KVM doesn't support emulating
      exceptions except when emulating real mode via vm86).
      
      In theory, KVM should never get to such a situation as KVM is supposed to
      exit to userspace before injecting an exception with invalid guest state.
      But in practice, userspace can intervene and manually inject an exception
      and/or stuff registers to force invalid guest state while a previously
      injected exception is awaiting reinjection.
      
      Fixes: fc4fad79 ("KVM: VMX: Reject KVM_RUN if emulation is required with pending exception")
      Reported-by: syzbot+cfafed3bb76d3e37581b@syzkaller.appspotmail.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220502221850.131873-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 29 Apr, 2022 1 commit
  7. 21 Apr, 2022 1 commit
    • KVM: nVMX: Defer APICv updates while L2 is active until L1 is active · 7c69661e
      Sean Christopherson authored
      Defer APICv updates that occur while L2 is active until nested VM-Exit,
      i.e. until L1 regains control.  vmx_refresh_apicv_exec_ctrl() assumes L1
      is active and (a) stomps all over vmcs02 and (b) neglects to ever update
      vmcs01.  E.g. if vmcs12 doesn't enable the TPR shadow for L2 (and thus no
      APICv controls), L1 performs nested VM-Enter APICv inhibited, and APICv
      becomes uninhibited while L2 is active, KVM will set various APICv controls
      in vmcs02 and trigger a failed VM-Entry.  The kicker is that, unless
      running with nested_early_check=1, KVM blames L1 and chaos ensues.
      
      In all cases, ignoring vmcs02 and always deferring the inhibition change
      to vmcs01 is correct (or at least acceptable).  The ABSENT and DISABLE
      inhibitions cannot truly change while L2 is active (see below).
      
      IRQ_BLOCKING can change, but it is firmly a best effort debug feature.
      Furthermore, only L2's APIC is accelerated/virtualized to the full extent
      possible, e.g. even if L1 passes through its APIC to L2, normal MMIO/MSR
      interception will apply to the virtual APIC managed by KVM.
      The exception is the SELF_IPI register when x2APIC is enabled, but that's
      an acceptable hole.
      
      Lastly, Hyper-V's Auto EOI can technically be toggled if L1 exposes the
      MSRs to L2, but for that to work in any sane capacity, L1 would need to
      pass through IRQs to L2 as well, and IRQs must be intercepted to enable
      virtual interrupt delivery.  I.e. exposing Auto EOI to L2 and enabling
      VID for L2 are, for all intents and purposes, mutually exclusive.
      
      Lack of dynamic toggling is also why this scenario is all but impossible
      to encounter in KVM's current form.  But a future patch will pend an
      APICv update request _during_ vCPU creation to plug a race where a vCPU
      that's being created doesn't get included in the "all vCPUs request"
      because it's not yet visible to other vCPUs.  If userspace restores L2
      after VM creation (hello, KVM selftests), the first KVM_RUN will occur
      while L2 is active and thus service the APICv update request made during
      VM creation.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220420013732.3308816-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. 13 Apr, 2022 3 commits
  9. 02 Apr, 2022 5 commits
  10. 01 Mar, 2022 1 commit
  11. 25 Feb, 2022 6 commits
    • KVM: x86: use struct kvm_mmu_root_info for mmu->root · b9e5603c
      Paolo Bonzini authored
      The root_hpa and root_pgd fields form essentially a struct kvm_mmu_root_info.
      Use the struct to have more consistency between mmu->root and
      mmu->prev_roots.
      
      The patch is entirely search and replace except for cached_root_available,
      which does not need a temporary struct kvm_mmu_root_info anymore.
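      The before/after shape of the consolidation can be sketched as follows; this is a stripped-down illustration, not the full kernel structures (kvm_mmu has many more members).

```c
typedef unsigned long long hpa_t;	/* host physical address */
typedef unsigned long long gpa_t;	/* guest physical address */

/* One cached root: the paging structure's HPA plus the guest PGD that
 * produced it. */
struct kvm_mmu_root_info {
	gpa_t pgd;
	hpa_t hpa;
};

#define KVM_MMU_NUM_PREV_ROOTS 3

struct kvm_mmu {
	/* was: hpa_t root_hpa; gpa_t root_pgd;  Now the current root uses
	 * the same struct as the cached previous roots. */
	struct kvm_mmu_root_info root;
	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
};
```

      With the current root and prev_roots sharing one type, helpers like the root-cache lookup can swap entries directly instead of going through a temporary struct.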
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Remove scratch 'cpu' variable that shadows an identical scratch var · 0b8934d3
      Peng Hao authored
      Remove a redundant 'cpu' declaration from inside an if-statement that
      shadows an identical declaration at function scope.  Both variables are
      used as scratch variables in for_each_*_cpu() loops, thus there's no
      harm in sharing a variable.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peng Hao <flyingpeng@tencent.com>
      Message-Id: <20220222103954.70062-1-flyingpeng@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: vmx: Fix typos comment in __loaded_vmcs_clear() · 105e0c44
      Peng Hao authored
      Fix a comment documenting the memory barrier related to clearing a
      loaded_vmcs; loaded_vmcs tracks the host CPU the VMCS is loaded on via
      the field 'cpu', it doesn't have a 'vcpu' field.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peng Hao <flyingpeng@tencent.com>
      Message-Id: <20220222104029.70129-1-flyingpeng@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: Make setup/unsetup under the same conditions · fbc2dfe5
      Peng Hao authored
      Make sure nested_vmx_hardware_setup/unsetup() are called in pairs under
      the same conditions.  Calling nested_vmx_hardware_unsetup() when nested
      is false "works" right now because it only calls free_page() on zero-
      initialized pointers, but it's possible that more code will be added to
      nested_vmx_hardware_unsetup() in the future.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Peng Hao <flyingpeng@tencent.com>
      Message-Id: <20220222104054.70286-1-flyingpeng@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Revert "KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()" · 1a715810
      Sean Christopherson authored
      Revert back to refreshing vmcs.HOST_CR3 immediately prior to VM-Enter.
      The PCID (ASID) part of CR3 can be bumped without KVM being scheduled
      out, as the kernel will switch CR3 during __text_poke(), e.g. in response
      to a static key toggling.  If switch_mm_irqs_off() chooses a new ASID for
      the mm associated with KVM, KVM will do VM-Enter => VM-Exit with a stale
      vmcs.HOST_CR3.
      
      Add a comment to explain why KVM must wait until VM-Enter is imminent to
      refresh vmcs.HOST_CR3.
      
      The following splat was captured by stashing vmcs.HOST_CR3 in kvm_vcpu
      and adding a WARN in load_new_mm_cr3() to fire if a new ASID is being
      loaded for the KVM-associated mm while KVM has a "running" vCPU:
      
        static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush)
        {
      	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
      
      	...
      
      	WARN(vcpu && (vcpu->cr3 & GENMASK(11, 0)) != (new_mm_cr3 & GENMASK(11, 0)) &&
      	     (vcpu->cr3 & PHYSICAL_PAGE_MASK) == (new_mm_cr3 & PHYSICAL_PAGE_MASK),
      	     "KVM is hosed, loading CR3 = %lx, vmcs.HOST_CR3 = %lx", new_mm_cr3, vcpu->cr3);
        }
      
        ------------[ cut here ]------------
        KVM is hosed, loading CR3 = 8000000105393004, vmcs.HOST_CR3 = 105393003
        WARNING: CPU: 4 PID: 20717 at arch/x86/mm/tlb.c:291 load_new_mm_cr3+0x82/0xe0
        Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel
        CPU: 4 PID: 20717 Comm: stable Tainted: G        W         5.17.0-rc3+ #747
        Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
        RIP: 0010:load_new_mm_cr3+0x82/0xe0
        RSP: 0018:ffffc9000489fa98 EFLAGS: 00010082
        RAX: 0000000000000000 RBX: 8000000105393004 RCX: 0000000000000027
        RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff888277d1b788
        RBP: 0000000000000004 R08: ffff888277d1b780 R09: ffffc9000489f8b8
        R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
        R13: ffff88810678a800 R14: 0000000000000004 R15: 0000000000000c33
        FS:  00007fa9f0e72700(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 00000001001b5003 CR4: 0000000000172ea0
        Call Trace:
         <TASK>
         switch_mm_irqs_off+0x1cb/0x460
         __text_poke+0x308/0x3e0
         text_poke_bp_batch+0x168/0x220
         text_poke_finish+0x1b/0x30
         arch_jump_label_transform_apply+0x18/0x30
         static_key_slow_inc_cpuslocked+0x7c/0x90
         static_key_slow_inc+0x16/0x20
         kvm_lapic_set_base+0x116/0x190
         kvm_set_apic_base+0xa5/0xe0
         kvm_set_msr_common+0x2f4/0xf60
         vmx_set_msr+0x355/0xe70 [kvm_intel]
         kvm_set_msr_ignored_check+0x91/0x230
         kvm_emulate_wrmsr+0x36/0x120
         vmx_handle_exit+0x609/0x6c0 [kvm_intel]
         kvm_arch_vcpu_ioctl_run+0x146f/0x1b80
         kvm_vcpu_ioctl+0x279/0x690
         __x64_sys_ioctl+0x83/0xb0
         do_syscall_64+0x3b/0xc0
         entry_SYSCALL_64_after_hwframe+0x44/0xae
         </TASK>
        ---[ end trace 0000000000000000 ]---
      
      This reverts commit 15ad9762.
      
      Fixes: 15ad9762 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()")
      Reported-by: Wanpeng Li <kernellwp@gmail.com>
      Cc: Lai Jiangshan <laijs@linux.alibaba.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Acked-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Message-Id: <20220224191917.3508476-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Revert "KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()" · bca06b85
      Sean Christopherson authored
      Undo a nested VMX fix as a step toward reverting the commit it fixed,
      15ad9762 ("KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()"),
      as the underlying premise that "host CR3 in the vcpu thread can only be
      changed when scheduling" is wrong.
      
      This reverts commit a9f2705e.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220224191917.3508476-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  12. 18 Feb, 2022 1 commit
  13. 10 Feb, 2022 5 commits
    • KVM: VMX: Use local pointer to vcpu_vmx in vmx_vcpu_after_set_cpuid() · 48ebd0cf
      Oliver Upton authored
      There is already a local pointer to vcpu_vmx.  Use it to access the
      structure directly instead of doing pointer arithmetic.
      
      No functional change intended.
      Signed-off-by: Oliver Upton <oupton@google.com>
      Message-Id: <20220204204705.3538240-8-oupton@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Dont' send posted IRQ if vCPU == this vCPU and vCPU is IN_GUEST_MODE · 9b44423b
      Wanpeng Li authored
      When delivering a virtual interrupt, don't actually send a posted interrupt
      if the target vCPU is also the currently running vCPU and is IN_GUEST_MODE,
      in which case the interrupt is being sent from a VM-Exit fastpath and the
      core run loop in vcpu_enter_guest() will manually move the interrupt from
      the PIR to vmcs.GUEST_RVI.  IRQs are disabled while IN_GUEST_MODE, thus
      there's no possibility of the virtual interrupt being sent from anything
      other than KVM, i.e. KVM won't suppress a wake event from an IRQ handler
      (see commit fdba608f, "KVM: VMX: Wake vCPU when delivering posted IRQ
      even if vCPU == this vCPU").
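      The elision condition can be modeled with a tiny predicate. This is an illustrative sketch, not KVM's actual code; the names and types are assumptions, but the decision mirrors the paragraph above: skip the posted-interrupt IPI only when the target is the currently running vCPU and it is IN_GUEST_MODE.

```c
enum vcpu_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

struct vcpu {
	int id;
	enum vcpu_mode mode;
};

/* Return 0 (skip the IPI) when the interrupt is being delivered from a
 * VM-Exit fastpath on the target vCPU itself: vcpu_enter_guest() will
 * move the interrupt from the PIR to vmcs.GUEST_RVI before re-entry. */
static int need_send_posted_ipi(const struct vcpu *target,
				const struct vcpu *running)
{
	if (running && target->id == running->id &&
	    target->mode == IN_GUEST_MODE)
		return 0;
	return 1;
}
```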
      
      Eliding the posted interrupt restores the performance provided by the
      combination of commits 379a3c8e ("KVM: VMX: Optimize posted-interrupt
      delivery for timer fastpath") and 26efe2fd ("KVM: VMX: Handle
      preemption timer fastpath").
      
      Thanks Sean for better comments.
      Suggested-by: Chao Gao <chao.gao@intel.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <1643111979-36447-1-git-send-email-wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Rename VMX functions to conform to kvm_x86_ops names · 58fccda4
      Sean Christopherson authored
      Massage VMX's implementation names for kvm_x86_ops to maximize use of
      kvm-x86-ops.h.  Leave cpu_has_vmx_wbinvd_exit() as-is to preserve the
      cpu_has_vmx_*() pattern used for querying VMCS capabilities.  Keep
      pi_has_pending_interrupt() as vmx_dy_apicv_has_pending_interrupt() does
      a poor job of describing exactly what is being checked in VMX land.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220128005208.4008533-14-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: VMX: Call vmx_get_cpl() directly in handle_dr() · ef2d488c
      Sean Christopherson authored
      Use vmx_get_cpl() instead of bouncing through kvm_x86_ops.get_cpl() when
      performing a CPL check on MOV DR accesses.  This avoids a RETPOLINE (when
      enabled), and more importantly removes a vendor reference to kvm_x86_ops
      and helps pave the way for unexporting kvm_x86_ops.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220128005208.4008533-7-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Rename kvm_x86_ops pointers to align w/ preferred vendor names · e27bc044
      Sean Christopherson authored
      Rename a variety of kvm_x86_op function pointers so that preferred name
      for vendor implementations follows the pattern <vendor>_<function>, e.g.
      rename .run() to .vcpu_run() to match {svm,vmx}_vcpu_run().  This will
      allow vendor implementations to be wired up via the KVM_X86_OP macro.
      
      In many cases, VMX and SVM "disagree" on the preferred name, though in
      reality it's VMX and x86 that disagree as SVM blindly prepended _svm to
      the kvm_x86_ops name.  Justification for using the VMX nomenclature:
      
        - set_{irq,nmi} => inject_{irq,nmi} because the helper is injecting an
          event that has already been "set" in e.g. the vIRR.  SVM's relevant
          VMCB field is even named event_inj, and KVM's stat is irq_injections.
      
        - prepare_guest_switch => prepare_switch_to_guest because the former is
          ambiguous, e.g. it could mean switching between multiple guests,
          switching from the guest to host, etc...
      
        - update_pi_irte => pi_update_irte to match the rest of VMX's posted
          interrupt naming scheme, which is vmx_pi_<blah>().
      
        - start_assignment => pi_start_assignment to again follow VMX's posted
          interrupt naming scheme, and to provide context for what bit of code
          might care about an otherwise undescribed "assignment".
      
      The "tlb_flush" => "flush_tlb" creates an inconsistency with respect to
      Hyper-V's "tlb_remote_flush" hooks, but Hyper-V really is the one that's
      wrong.  x86, VMX, and SVM all use flush_tlb, and even common KVM is on a
      variant of the bandwagon with "kvm_flush_remote_tlbs", e.g. a more
      appropriate name for the Hyper-V hooks would be flush_remote_tlbs.  Leave
      that change for another time as the Hyper-V hooks always start as NULL,
      i.e. the name doesn't matter for using kvm-x86-ops.h, and changing all
      names requires an astounding amount of churn.
      
      VMX and SVM function names are intentionally left as is to minimize the
      diff.  Both VMX and SVM will need to rename even more functions in order
      to fully utilize KVM_X86_OPS, i.e. an additional patch for each is
      inevitable.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220128005208.4008533-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  14. 08 Feb, 2022 1 commit
  15. 01 Feb, 2022 2 commits
    • kvm/x86: rework guest entry logic · b2d2af7e
      Mark Rutland authored
      For consistency and clarity, migrate x86 over to the generic helpers for
      guest timing and lockdep/RCU/tracing management, and remove the
      x86-specific helpers.
      
      Prior to this patch, the guest timing was entered in
      kvm_guest_enter_irqoff() (called by vmx_vcpu_enter_exit() and
      svm_vcpu_enter_exit()), and was exited by the call to
      vtime_account_guest_exit() within vcpu_enter_guest().
      
      To minimize duplication and to more clearly balance entry and exit, both
      entry and exit of guest timing are placed in vcpu_enter_guest(), using
      the new guest_timing_{enter,exit}_irqoff() helpers.  When context
      tracking is used, a small amount of additional time will be accounted
      towards guests; tick-based accounting is unaffected as IRQs are
      disabled at this point and not enabled until after the return from the
      guest.
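      The balance the patch establishes can be shown with a toy model: entry and exit of guest time accounting happen at the same level, so every enter has exactly one matching exit even if the low-level run loop re-enters the guest. The names echo the helpers mentioned above, but the bodies are purely illustrative bookkeeping, not the kernel implementation.

```c
struct vcpu_acct {
	int in_guest;
	unsigned int enters, exits;
};

static void guest_timing_enter_irqoff(struct vcpu_acct *a)
{
	a->in_guest = 1;
	a->enters++;
}

static void guest_timing_exit_irqoff(struct vcpu_acct *a)
{
	a->in_guest = 0;
	a->exits++;
}

/* Both calls live at the same level, bracketing however many times the
 * low-level loop re-enters the guest (e.g. fastpath exits). */
static void vcpu_enter_guest(struct vcpu_acct *a, int fastpath_loops)
{
	guest_timing_enter_irqoff(a);
	while (fastpath_loops-- > 0)
		;	/* guest re-entries need no extra accounting calls */
	guest_timing_exit_irqoff(a);
}
```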
      
      This also corrects (benign) mis-balanced context tracking accounting
      introduced in commits:
      
        ae95f566 ("KVM: X86: TSCDEADLINE MSR emulation fastpath")
        26efe2fd ("KVM: VMX: Handle preemption timer fastpath")
      
      Where KVM can enter a guest multiple times, calling vtime_guest_enter()
      without a corresponding call to vtime_account_guest_exit(), and with
      vtime_account_system() called when vtime_account_guest() should be used.
      As account_system_time() checks PF_VCPU and calls account_guest_time(),
      this doesn't result in any functional problem, but is unnecessarily
      confusing.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Message-Id: <20220201132926.3301912-4-mark.rutland@arm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Move delivery of non-APICv interrupt into vendor code · 57dfd7b5
      Sean Christopherson authored
      Handle non-APICv interrupt delivery in vendor code, even though it means
      VMX and SVM will temporarily have duplicate code.  SVM's AVIC has a race
      condition that requires KVM to fall back to legacy interrupt injection
      _after_ the interrupt has been logged in the vIRR, i.e. to fix the race,
      SVM will need to open code the full flow anyways[*].  Refactor the code
      so that the SVM bug can be fixed without introducing other issues, e.g.
      SVM returning "success" and thus invoking trace_kvm_apicv_accept_irq()
      even when delivery through the AVIC failed, and to opportunistically
      prepare for using KVM_X86_OP to fill each vendor's kvm_x86_ops struct,
      which will rely on the vendor function matching the kvm_x86_op pointer
      name.
      
      No functional change intended.
      
      [*] https://lore.kernel.org/all/20211213104634.199141-4-mlevitsk@redhat.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220128005208.4008533-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  16. 26 Jan, 2022 2 commits