1. 13 Jan, 2023 15 commits
    • KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch · 8578e451
      Sean Christopherson authored
      Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
      vCPU ID, i.e. that there's no need to check for a destination match in
      the AVIC kick fast path.
      
      Opportunistically tweak comments to remove "guest bug", as that suggests
      KVM is punting on error handling, which is not the case.  Targeting a
      non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
      or not it's a guest bug is irrelevant.  Such behavior is architecturally
      legal and thus needs to be faithfully emulated by KVM (and it is).
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-16-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • Revert "KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible" · f9829c90
      Sean Christopherson authored
      Due to a likely mismerge of patches, KVM ended up with a superfluous
      commit to "enable" AVIC's fast path for x2AVIC mode.  Even worse, the
      superfluous commit has several bugs and creates a nasty local shadow
      variable.
      
      Rather than fix the bugs piece-by-piece[*] to achieve the same end
      result, revert the patch wholesale.
      
      Opportunistically add a comment documenting the x2AVIC dependencies.
      
      This reverts commit 8c9e639d.
      
      [*] https://lore.kernel.org/all/YxEP7ZBRIuFWhnYJ@google.com
      
      Fixes: 8c9e639d ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
      Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-15-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Fix x2APIC Logical ID calculation for avic_kick_target_vcpus_fast · da3fb46d
      Suravee Suthikulpanit authored
      For an x2APIC ID in cluster mode, the logical ID is bits [15:0].
      
      Fixes: 603ccef4 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-14-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
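As a rough illustration of the bit layout this fix relies on, the standalone C sketch below splits a 32-bit x2APIC cluster-mode logical destination into its cluster ID (bits 31:16) and its per-cluster logical ID bitmap (bits 15:0). The helper names are hypothetical and are not taken from KVM.

```c
#include <stdint.h>

/* Hypothetical helpers, not KVM code. */

/* Bits 31:16 of an x2APIC cluster-mode logical destination select the cluster. */
static inline uint32_t x2apic_cluster_id(uint32_t logical_dest)
{
	return logical_dest >> 16;
}

/* Bits 15:0 are a bitmap of logical IDs within that cluster. */
static inline uint32_t x2apic_logical_id_bitmap(uint32_t logical_dest)
{
	return logical_dest & 0xffff;
}
```

For example, a destination of 0x00030005 would decode to cluster 3 with logical IDs 0 and 2 set in the bitmap.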
    • KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick · a879a88e
      Sean Christopherson authored
      Compute the destination from ICRH using the sender's x2APIC status, not
      each (potential) target's x2APIC status.
      
      Fixes: c514d3a3 ("KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID")
      Cc: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Li RongQing <lirongqing@baidu.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-13-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
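The idea of decoding the destination according to the sender's mode can be sketched as follows. This is a minimal standalone model, not the KVM implementation: the function name is hypothetical, and it only assumes the architectural layout in which xAPIC mode keeps the destination in bits 31:24 of ICRH while x2APIC mode uses all 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not KVM code: extract the IPI destination from the
 * high half of the ICR based on the mode of the vCPU that wrote the ICR.
 */
static inline uint32_t ipi_dest_from_icrh(uint32_t icrh, bool sender_in_x2apic_mode)
{
	/* x2APIC: the whole upper dword is the 32-bit destination. */
	if (sender_in_x2apic_mode)
		return icrh;

	/* xAPIC: only bits 31:24 hold the 8-bit destination. */
	return (icrh >> 24) & 0xff;
}
```

The same ICRH value therefore yields different destinations depending only on the sender; the mode of any potential target plays no role in the decode.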
    • KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean · f628a34a
      Sean Christopherson authored
      Replace the "avic_mode" enum with a single bool to track whether or not
      x2AVIC is enabled.  KVM already has "apicv_enabled" that tracks if any
      flavor of AVIC is enabled, i.e. AVIC_MODE_NONE and AVIC_MODE_X1 are
      redundant and unnecessary noise.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-12-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled · 2008fab3
      Sean Christopherson authored
      Free the APIC access page memslot if any vCPU enables x2APIC and SVM's
      AVIC is enabled to prevent accesses to the virtual APIC on vCPUs with
      x2APIC enabled.  On AMD, if its "hybrid" mode is enabled (AVIC is enabled
      when x2APIC is enabled even without x2AVIC support), keeping the APIC
      access page memslot results in the guest being able to access the virtual
      APIC page as x2APIC is fully emulated by KVM.  I.e. hardware isn't aware
      that the guest is operating in x2APIC mode.
      
      Exempt nested SVM's update of APICv state from the new logic as x2APIC
      can't be toggled on VM-Exit.  In practice, invoking the x2APIC logic
      should be harmless precisely because it should be a glorified nop, but
      play it safe to avoid latent bugs, e.g. with dropping the vCPU's SRCU
      lock.
      
      Intel doesn't suffer from the same issue as APICv has fully independent
      VMCS controls for xAPIC vs. x2APIC virtualization.  Technically, KVM
      should provide bus error semantics and not memory semantics for the APIC
      page when x2APIC is enabled, but KVM already provides memory semantics in
      other scenarios, e.g. if APICv/AVIC is enabled and the APIC is hardware
      disabled (via APIC_BASE MSR).
      
      Note, checking apic_access_memslot_enabled without taking locks relies on
      it being set during vCPU creation (before kvm_vcpu_reset()).  vCPUs can
      race to set the inhibit and delete the memslot, i.e. can get false
      positives, but can't get false negatives as apic_access_memslot_enabled
      can't be toggled "on" once any vCPU reaches KVM_RUN.
      
      Opportunistically drop the "can" while updating avic_activate_vmcb()'s
      comment, i.e. to state that KVM _does_ support the hybrid mode.  Move
      the "Note:" down a line to conform to preferred kernel/KVM multi-line
      comment style.
      
      Opportunistically update the apicv_update_lock comment, as it isn't
      actually used to protect apic_access_memslot_enabled (which is protected
      by slots_lock).
      
      Fixes: 0e311d33 ("KVM: SVM: Introduce hybrid-AVIC mode")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-11-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Move APIC access page helper to common x86 code · c482f2ce
      Sean Christopherson authored
      Move the APIC access page allocation helper function to common x86 code,
      as the allocation routine is virtually identical between APICv (VMX) and
      AVIC (SVM).  Keep APICv's gfn_to_page() + put_page() sequence, which
      verifies that a backing page can be allocated, i.e. that the system isn't
      under heavy memory pressure.  Forcing the backing page to be populated
      isn't strictly necessary, but skipping the effective prefetch only delays
      the inevitable.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-10-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Handle APICv updates for APIC "mode" changes via request · 1459f5c6
      Sean Christopherson authored
      Use KVM_REQ_UPDATE_APICV to react to APIC "mode" changes, i.e. to handle
      the APIC being hardware enabled/disabled and/or x2APIC being toggled.
      There is no need to immediately update APICv state; the only requirement
      is that APICv be updated prior to the next VM-Enter.
      
      Making a request will allow piggybacking KVM_REQ_UPDATE_APICV to "inhibit"
      the APICv memslot when x2APIC is enabled.  Doing that directly from
      kvm_lapic_set_base() isn't feasible as KVM's SRCU must not be held when
      modifying memslots (to avoid deadlock), and may or may not be held when
      kvm_lapic_set_base() is called, i.e. KVM can't do the right thing without
      tracking that is rightly buried behind CONFIG_PROVE_RCU=y.
      Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-9-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
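The deferral pattern described in this commit (record that an update is needed, then act on it once before the next guest entry) can be modeled with a toy request bit. Everything below is a hypothetical standalone sketch: the struct, the `toy_` functions, and the request constant are illustrative stand-ins, not KVM's actual request machinery.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_REQ_UPDATE_APICV (1u << 0)	/* hypothetical request bit */

struct toy_vcpu {
	uint32_t requests;		/* pending request bits */
	bool x2apic;			/* current APIC "mode" */
	unsigned int apicv_updates;	/* times APICv state was recomputed */
};

/* On a mode change, only record that an APICv update is needed. */
static void toy_set_apic_mode(struct toy_vcpu *vcpu, bool x2apic)
{
	if (vcpu->x2apic == x2apic)
		return;

	vcpu->x2apic = x2apic;
	vcpu->requests |= TOY_REQ_UPDATE_APICV;
}

/* Before entering the guest, service the pending request exactly once. */
static void toy_prepare_guest_entry(struct toy_vcpu *vcpu)
{
	if (vcpu->requests & TOY_REQ_UPDATE_APICV) {
		vcpu->requests &= ~TOY_REQ_UPDATE_APICV;
		vcpu->apicv_updates++;
	}
}
```

In this model, several mode changes between guest entries collapse into a single update, and the update runs from one well-defined context rather than from whatever context happened to change the mode.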
    • KVM: SVM: Don't put/load AVIC when setting virtual APIC mode · e0bead97
      Sean Christopherson authored
      Move the VMCB updates from avic_refresh_apicv_exec_ctrl() into
      avic_set_virtual_apic_mode() and invert the dependency between said
      functions to avoid calling avic_vcpu_{load,put}() and
      avic_set_pi_irte_mode() when "only" setting the virtual APIC mode.
      
      avic_set_virtual_apic_mode() is invoked from common x86 with preemption
      enabled, which makes avic_vcpu_{load,put}() unhappy.  Luckily, calling
      those and updating IRTE stuff is unnecessary as the only reason
      avic_set_virtual_apic_mode() is called is to handle transitions between
      xAPIC and x2APIC that don't also toggle APICv activation.  And if
      activation doesn't change, there's no need to fiddle with the physical
      APIC ID table or update IRTE.
      
      The "full" refresh is guaranteed to be called if activation changes in
      this case as the only call to the "set" path is:
      
      	kvm_vcpu_update_apicv(vcpu);
      	static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
      
      and kvm_vcpu_update_apicv() invokes the refresh if activation changes:
      
      	if (apic->apicv_active == activate)
      		goto out;
      
      	apic->apicv_active = activate;
      	kvm_apic_update_apicv(vcpu);
      	static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu);
      
      Rename the helper to reflect that it is also called during "refresh".
      
        WARNING: CPU: 183 PID: 49186 at arch/x86/kvm/svm/avic.c:1081 avic_vcpu_put+0xde/0xf0 [kvm_amd]
        CPU: 183 PID: 49186 Comm: stable Tainted: G           O       6.0.0-smp--fcddbca45f0a-sink #34
        Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022
        RIP: 0010:avic_vcpu_put+0xde/0xf0 [kvm_amd]
         avic_refresh_apicv_exec_ctrl+0x142/0x1c0 [kvm_amd]
         avic_set_virtual_apic_mode+0x5a/0x70 [kvm_amd]
         kvm_lapic_set_base+0x149/0x1a0 [kvm]
         kvm_set_apic_base+0x8f/0xd0 [kvm]
         kvm_set_msr_common+0xa3a/0xdc0 [kvm]
         svm_set_msr+0x364/0x6b0 [kvm_amd]
         __kvm_set_msr+0xb8/0x1c0 [kvm]
         kvm_emulate_wrmsr+0x58/0x1d0 [kvm]
         msr_interception+0x1c/0x30 [kvm_amd]
         svm_invoke_exit_handler+0x31/0x100 [kvm_amd]
         svm_handle_exit+0xfc/0x160 [kvm_amd]
         vcpu_enter_guest+0x21bb/0x23e0 [kvm]
         vcpu_run+0x92/0x450 [kvm]
         kvm_arch_vcpu_ioctl_run+0x43e/0x6e0 [kvm]
         kvm_vcpu_ioctl+0x559/0x620 [kvm]
      
      Fixes: 05c4fe8c ("KVM: SVM: Refresh AVIC configuration when changing APIC mode")
      Cc: stable@vger.kernel.org
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-8-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID · f651a008
      Sean Christopherson authored
      Truncate the vcpu_id, a.k.a. x2APIC ID, to an 8-bit value when comparing
      it against the xAPIC ID to avoid false positives (sort of) on systems
      with >255 CPUs, i.e. with IDs that don't fit into a u8.  The intent of
      APIC_ID_MODIFIED is to inhibit APICv/AVIC when the xAPIC ID is changed from
      its original value.
      
      The mismatch isn't technically a false positive, as architecturally the
      xAPIC IDs do end up being aliased in this scenario, and neither APICv
      nor AVIC correctly handles IPI virtualization when there is aliasing.
      However, KVM already deliberately does not honor the aliasing behavior
      that results when an x2APIC ID gets truncated to an xAPIC ID.  I.e. the
      resulting APICv/AVIC behavior is aligned with KVM's existing behavior
      when KVM's x2APIC hotplug hack is effectively enabled.
      
      If/when KVM provides a way to disable the hotplug hack, APICv/AVIC can
      piggyback whatever logic disables the optimized APIC map (which is what
      provides the hotplug hack), i.e. so that KVM's optimized map and APIC
      virtualization yield the same behavior.
      
      For now, fix the immediate problem of APIC virtualization being disabled
      for large VMs, which is a much more pressing issue than ensuring KVM
      honors architectural behavior for APIC ID aliasing.
      
      Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
      Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-7-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
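The truncated comparison described in this commit can be illustrated with a small standalone C sketch. The function name is hypothetical and the code is not the KVM implementation; it only shows why a vCPU ID above 255 no longer counts as a mismatch once it is truncated to 8 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check, not KVM code: report a mismatch only if the 8-bit
 * xAPIC ID differs from the vCPU ID truncated to 8 bits.
 */
static bool toy_xapic_id_mismatch(uint32_t vcpu_id, uint8_t xapic_id)
{
	return xapic_id != (uint8_t)vcpu_id;
}
```

With this check, a vCPU ID of 261 (0x105) paired with an xAPIC ID of 5 is not treated as modified, whereas a vCPU ID of 5 paired with an xAPIC ID of 6 still is.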
    • KVM: x86: Don't inhibit APICv/AVIC on xAPIC ID "change" if APIC is disabled · a58a66af
      Sean Christopherson authored
      Don't inhibit APICv/AVIC due to an xAPIC ID mismatch if the APIC is
      hardware disabled.  The ID cannot be consumed while the APIC is disabled,
      and the ID is guaranteed to be set back to the vcpu_id when the APIC is
      hardware enabled (architectural behavior correctly emulated by KVM).
      
      Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
      Cc: stable@vger.kernel.org
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-6-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: SVM: Process ICR on AVIC IPI delivery failure due to invalid target · 5aede752
      Sean Christopherson authored
      Emulate ICR writes on AVIC IPI failures due to invalid targets using the
      same logic as failures due to invalid types.  AVIC acceleration fails if
      _any_ of the targets are invalid, and crucially VM-Exits before sending
      IPIs to targets that _are_ valid.  In logical mode, the destination is a
      bitmap, i.e. a single IPI can target multiple logical IDs.  Doing nothing
      causes KVM to drop IPIs if at least one target is valid and at least one
      target is invalid.
      
      Fixes: 18f40c53 ("svm: Add VMEXIT handlers for AVIC")
      Cc: stable@vger.kernel.org
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-5-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
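To make the mixed valid/invalid case concrete, here is a toy standalone model of emulating a logical-mode IPI whose destination is an 8-bit bitmap (as in xAPIC flat logical mode). The function and parameter names are hypothetical; the sketch only demonstrates that valid targets should still receive the IPI when other targeted logical IDs do not exist.

```c
#include <stdint.h>

/*
 * Hypothetical model, not KVM code: walk every bit of the logical
 * destination bitmap and deliver to each targeted ID that is backed by a
 * vCPU. Returns the bitmap of logical IDs that received the IPI.
 */
static uint8_t toy_deliver_logical_ipi(uint8_t dest_bitmap, uint8_t present_ids)
{
	uint8_t delivered = 0;

	for (int bit = 0; bit < 8; bit++) {
		uint8_t mask = (uint8_t)(1u << bit);

		if (!(dest_bitmap & mask))
			continue;	/* logical ID not targeted */
		if (!(present_ids & mask))
			continue;	/* targeted, but no such vCPU */

		delivered |= mask;
	}

	return delivered;
}
```

A destination of 0x05 with only logical ID 0 present still delivers to ID 0; dropping the whole IPI because ID 2 is invalid is the behavior the commit fixes.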
    • KVM: SVM: Flush the "current" TLB when activating AVIC · 0ccf3e7c
      Sean Christopherson authored
      Flush the TLB when activating AVIC as the CPU can insert into the TLB
      while AVIC is "locally" disabled.  KVM doesn't treat "APIC hardware
      disabled" as VM-wide AVIC inhibition, and so when a vCPU has its APIC
      hardware disabled, AVIC is not guaranteed to be inhibited.  As a result,
      KVM may create a valid NPT mapping for the APIC base, which the CPU can
      cache as a non-AVIC translation.
      
      Note, Intel handles this in vmx_set_virtual_apic_mode().
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Purge "highest ISR" cache when updating APICv state · 97a71c44
      Sean Christopherson authored
      Purge the "highest ISR" cache when updating APICv state on a vCPU.  The
      cache must not be used when APICv is active as hardware may emulate EOIs
      (and other operations) without exiting to KVM.
      
      This fixes a bug where KVM will effectively block IRQs in perpetuity due
      to the "highest ISR" never getting reset if APICv is activated on a vCPU
      while an IRQ is in-service.  Hardware emulates the EOI and KVM never gets
      a chance to update its cache.
      
      Fixes: b26a695a ("kvm: lapic: Introduce APICv update helper function")
      Cc: stable@vger.kernel.org
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
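The cache invalidation described here can be pictured with a minimal toy model. The struct and function below are hypothetical and far simpler than KVM's local APIC state; the sketch only shows the rule that a software cache of the highest in-service vector is dropped whenever APICv state is updated, because hardware may process EOIs without software seeing them.

```c
#include <stdbool.h>

/* Hypothetical toy state, not KVM's struct kvm_lapic. */
struct toy_apic {
	bool apicv_active;
	int highest_isr_cache;	/* cached highest in-service vector, -1 = invalid */
};

/* Update APICv activation and purge the cache so it cannot go stale. */
static void toy_update_apicv(struct toy_apic *apic, bool activate)
{
	apic->apicv_active = activate;
	apic->highest_isr_cache = -1;
}
```

If the cache kept a stale vector across activation, later priority checks in this model would keep treating that vector as in service, which matches the "IRQs blocked in perpetuity" symptom described in the commit message.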
    • KVM: x86: Blindly get current x2APIC reg value on "nodecode write" traps · 0a19807b
      Sean Christopherson authored
      When emulating an x2APIC write in response to an APICv/AVIC trap, get the
      written value from the vAPIC page without checking that reads are
      allowed for the target register.  AVIC can generate trap-like VM-Exits on
      writes to EOI, and so KVM needs to get the written value from the backing
      page without running afoul of EOI's write-only behavior.
      
      Alternatively, EOI could be special cased to always write '0', e.g. so
      that the sanity check could be preserved, but x2APIC on AMD is actually
      supposed to disallow non-zero writes (not emulated by KVM), and the
      sanity check was a byproduct of how the KVM code was written, i.e. wasn't
      added to guard against anything in particular.
      
      Fixes: 70c8327c ("KVM: x86: Bug the VM if an accelerated x2APIC trap occurs on a "bad" reg")
      Fixes: 1bd9dfec ("KVM: x86: Do not block APIC write for non ICR registers")
      Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
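A "blind" read of the written value might look like the standalone sketch below. It assumes only the architectural mapping between x2APIC MSRs and xAPIC register offsets (MSR 0x800 + (offset >> 4), so EOI at offset 0xB0 is MSR 0x80B); the helper names are hypothetical and the backing page is modeled as a plain byte array rather than KVM's vAPIC page.

```c
#include <stdint.h>
#include <string.h>

#define X2APIC_MSR_BASE 0x800u

/* Hypothetical helper: map an x2APIC MSR index to its register offset. */
static inline uint32_t x2apic_msr_to_reg_offset(uint32_t msr)
{
	return (msr - X2APIC_MSR_BASE) << 4;
}

/*
 * Hypothetical helper: fetch the raw 32-bit value from the backing page
 * without consulting any "is this register readable?" policy, so that
 * write-only registers such as EOI can still be retrieved after a trap.
 */
static inline uint32_t vapic_page_read32(const uint8_t *page, uint32_t offset)
{
	uint32_t val;

	memcpy(&val, page + offset, sizeof(val));
	return val;
}
```

The point of the sketch is that the trap handler retrieves what hardware already stored in the page; whether the guest could architecturally read that register is irrelevant at that point.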
  2. 28 Dec, 2022 6 commits
  3. 27 Dec, 2022 19 commits