1. 24 Jan, 2023 11 commits
  2. 13 Jan, 2023 29 commits
    • KVM: x86: Add helpers to recalc physical vs. logical optimized APIC maps · 72c70cee
      Sean Christopherson authored
      Move the guts of kvm_recalculate_apic_map()'s main loop to two separate
      helpers to handle recalculating the physical and logical pieces of the
      optimized map.  Having 100+ lines of code in the for-loop makes it hard
      to understand what is being calculated where.
      
      No functional change intended.
      Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-34-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      72c70cee
    • KVM: x86: Allow APICv APIC ID inhibit to be cleared · d471bd85
      Greg Edwards authored
      Legacy kernels prior to commit 4399c03c ("x86/apic: Remove
      verify_local_APIC()") write the APIC ID of the boot CPU twice to verify
      a functioning local APIC.  This results in APIC acceleration inhibited
      on these kernels for reason APICV_INHIBIT_REASON_APIC_ID_MODIFIED.
      
      Allow the APICV_INHIBIT_REASON_APIC_ID_MODIFIED inhibit reason to be
      cleared if/when all APICs in xAPIC mode set their APIC ID back to the
      expected vcpu_id value.
      
      Fold the functionality previously in kvm_lapic_xapic_id_updated() into
      kvm_recalculate_apic_map(), as this allows examining all APICs in one
      pass.
      
      Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
      Signed-off-by: Greg Edwards <gedwards@ddn.com>
      Link: https://lore.kernel.org/r/20221117183247.94314-1-gedwards@ddn.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-33-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d471bd85
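The set/clear behavior described in the entry above can be sketched as a single pass over all vCPUs. This is a minimal userspace model, not KVM's code: `struct vcpu_model` and `any_xapic_id_modified()` are hypothetical names, and the sketch assumes a small VM whose vCPU IDs fit in 8 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one vCPU's local APIC state. */
struct vcpu_model {
	uint8_t vcpu_id;   /* ID assigned at vCPU creation */
	uint8_t xapic_id;  /* ID currently programmed by the guest */
	bool    x2apic;    /* x2APIC IDs are read-only, so they are skipped */
};

/*
 * Examine every APIC in one pass.  The inhibit is wanted while any vCPU
 * in xAPIC mode has an ID that differs from its vcpu_id, and it can be
 * cleared as soon as no such vCPU remains.
 */
static bool any_xapic_id_modified(const struct vcpu_model *vcpus, size_t n)
{
	assert(vcpus || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (vcpus[i].x2apic)
			continue;
		if (vcpus[i].xapic_id != vcpus[i].vcpu_id)
			return true;
	}
	return false;
}
```

A caller would set the inhibit when the function returns true and clear it when a later pass returns false, which is what lets a legacy guest that rewrites the boot CPU's ID recover acceleration.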
    • KVM: x86: Track required APICv inhibits with variable, not callback · b3f257a8
      Sean Christopherson authored
      Track the per-vendor required APICv inhibits with a variable instead of
      calling into vendor code every time KVM wants to query the set of
      required inhibits.  The required inhibits are a property of the vendor's
      virtualization architecture, i.e. are 100% static.
      
      Using a variable allows the compiler to inline the check, e.g. generate
      a single-uop TEST+Jcc, and thus eliminates any desire to avoid checking
      inhibits for performance reasons.
      
      No functional change intended.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-32-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b3f257a8
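The variable-versus-callback point in the entry above can be illustrated with a static bitmask. This is only a sketch: the enum values, `required_inhibits` and `inhibit_is_required()` are hypothetical stand-ins, not KVM's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inhibit reasons; the names only mirror KVM's style. */
enum inhibit_reason {
	INHIBIT_DISABLE,
	INHIBIT_NESTED,
	INHIBIT_APIC_ID_MODIFIED,
	NR_INHIBIT_REASONS,
};

#define INHIBIT_BIT(r) (1UL << (r))

/* Static per-vendor mask, filled in once instead of queried via a callback. */
static const unsigned long required_inhibits =
	INHIBIT_BIT(INHIBIT_DISABLE) | INHIBIT_BIT(INHIBIT_APIC_ID_MODIFIED);

/* With a plain variable the check reduces to a bit test and a branch. */
static bool inhibit_is_required(enum inhibit_reason reason)
{
	assert(reason < NR_INHIBIT_REASONS);
	return (required_inhibits & INHIBIT_BIT(reason)) != 0;
}
```

Because the mask never changes after setup, the compiler is free to inline the test at every call site, which is the property the commit message relies on.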
    • Revert "KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu" · e2ed3e64
      Sean Christopherson authored
      Turns out that some warnings exist for good reasons.  Restore the warning
      in avic_vcpu_load() that guards against calling avic_vcpu_load() on a
      running vCPU now that KVM avoids doing so when switching between x2APIC
      and xAPIC.  The entire point of the WARN is to highlight that KVM should
      not be reloading an AVIC.
      
      Opportunistically convert the WARN_ON() to WARN_ON_ONCE() to avoid
      spamming the kernel if it does fire.
      
      This reverts commit c0caeee6.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-31-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e2ed3e64
    • KVM: SVM: Ignore writes to Remote Read Data on AVIC write traps · a790e338
      Sean Christopherson authored
      Drop writes to APIC_RRR, a.k.a. Remote Read Data Register, on AVIC
      unaccelerated write traps.  The register is read-only and isn't emulated
      by KVM.  Sending the register through kvm_apic_write_nodecode() will
      result in screaming when x2APIC is enabled due to the unexpected failure
      to retrieve the MSR (KVM expects that only "legal" accesses will trap).
      
      Fixes: 4d1d7942 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-30-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a790e338
    • KVM: SVM: Handle multiple logical targets in AVIC kick fastpath · bbfc7aa6
      Sean Christopherson authored
      Iterate over all target logical IDs in the AVIC kick fastpath instead of
      bailing if there is more than one target.  Now that KVM inhibits AVIC if
      vCPUs aren't mapped 1:1 with logical IDs, each bit in the destination is
      guaranteed to match to at most one vCPU, i.e. iterating over the bitmap
      is guaranteed to kick each valid target exactly once.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-29-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      bbfc7aa6
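The iteration described in the entry above can be sketched as a walk over the set bits of the destination bitmap. This is an illustrative model only: `kick_logical_targets()`, the `table` layout and the 8-entry flat-mode size are assumptions, and "kicking" is modeled by setting a bit in the return value.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LOGICAL_IDS 8  /* xAPIC flat mode: one bit per logical ID */

/*
 * Walk every set bit in the destination bitmap.  `table` maps a logical
 * ID bit to a vCPU index, or -1 if no vCPU claims that bit.  Returns a
 * bitmask of the vCPU indices that were "kicked".
 */
static uint32_t kick_logical_targets(uint8_t dest_bitmap,
				     const int table[MAX_LOGICAL_IDS])
{
	uint32_t kicked = 0;

	assert(table);
	for (int bit = 0; bit < MAX_LOGICAL_IDS; bit++) {
		if (!(dest_bitmap & (1u << bit)))
			continue;
		if (table[bit] < 0)
			continue;  /* no vCPU owns this logical ID */
		assert(table[bit] < 32);
		kicked |= 1u << table[bit];
	}
	return kicked;
}
```

Since each bit maps to at most one vCPU, no target can be kicked twice, which is the guarantee the 1:1 inhibit provides.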
    • KVM: SVM: Require logical ID to be power-of-2 for AVIC entry · 1808c950
      Sean Christopherson authored
      Do not modify AVIC's logical ID table if the logical ID portion of the
      LDR is not a power-of-2, i.e. if the LDR has multiple bits set.  Taking
      only the first bit means that KVM will fail to match MDAs that intersect
      with "higher" bits in the "ID".
      
      The "ID" acts as a bitmap, but is referred to as an ID because there's an
      implicit, unenforced "requirement" that software only set one bit.  This
      edge case is arguably out-of-spec behavior, but KVM cleanly handles it
      in all other cases, e.g. the optimized logical map (and AVIC!) is also
      disabled in this scenario.
      
      Refactor the code to consolidate the checks, and so that the code looks
      more like avic_kick_target_vcpus_fast().
      
      Fixes: 18f40c53 ("svm: Add VMEXIT handlers for AVIC")
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-28-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1808c950
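The power-of-2 requirement in the entry above amounts to checking that exactly one bit is set before deriving a table index. A minimal sketch, assuming an 8-bit flat-mode logical ID; `is_single_bit()` and `logical_id_to_index()` are hypothetical helpers, not the functions used in avic.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if exactly one bit is set, i.e. the "ID" names a single target. */
static bool is_single_bit(uint8_t logical_id)
{
	return logical_id != 0 && (logical_id & (logical_id - 1)) == 0;
}

/*
 * Return the table index for an 8-bit flat-mode logical ID, or -1 if the
 * ID must not be given a table entry (zero, or multiple bits set).
 */
static int logical_id_to_index(uint8_t logical_id)
{
	int index = 0;

	if (!is_single_bit(logical_id))
		return -1;
	while (!(logical_id & 1)) {
		logical_id >>= 1;
		index++;
	}
	assert(index < 8);
	return index;
}
```

For example, a logical ID of 0x0c has two bits set, so the sketch refuses to assign it an index rather than silently using bit 2 and ignoring bit 3.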
    • KVM: SVM: Update svm->ldr_reg cache even if LDR is "bad" · 4f160b7b
      Sean Christopherson authored
      Update SVM's cache of the LDR even if the new value is "bad".  Leaving
      stale information in the cache can result in KVM missing updates and/or
      invalidating the wrong entry, e.g. if avic_invalidate_logical_id_entry()
      is triggered after a different vCPU has "claimed" the old LDR.
      
      Fixes: 18f40c53 ("svm: Add VMEXIT handlers for AVIC")
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-27-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4f160b7b
    • KVM: SVM: Always update local APIC on writes to logical dest register · 1ba59a44
      Sean Christopherson authored
      Update the vCPU's local (virtual) APIC on LDR writes even if the write
      "fails".  The APIC needs to recalc the optimized logical map even if the
      LDR is invalid or zero, e.g. if the guest clears its LDR, the optimized
      map will be left as is and the vCPU will receive interrupts using its
      old LDR.
      
      Fixes: 18f40c53 ("svm: Add VMEXIT handlers for AVIC")
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-26-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1ba59a44
    • KVM: SVM: Inhibit AVIC if vCPUs are aliased in logical mode · 9a364857
      Sean Christopherson authored
      Inhibit SVM's AVIC if multiple vCPUs are aliased to the same logical ID.
      Architecturally, all CPUs whose logical ID matches the MDA are supposed
      to receive the interrupt; overwriting existing entries in AVIC's
      logical=>physical map can result in missed IPIs.
      
      Fixes: 18f40c53 ("svm: Add VMEXIT handlers for AVIC")
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-25-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9a364857
    • KVM: x86: Inhibit APICv/AVIC if the optimized physical map is disabled · 5063c41b
      Sean Christopherson authored
      Inhibit APICv/AVIC if the optimized physical map is disabled so that KVM
      provides consistent APIC behavior if xAPIC IDs are aliased due to
      vcpu_id being truncated and the x2APIC hotplug hack isn't enabled.  If
      the hotplug hack is disabled, events that are emulated by KVM will follow
      architectural behavior (all matching vCPUs receive events, even if the
      "match" is due to truncation), whereas APICv and AVIC will deliver events
      only to the first matching vCPU, i.e. the vCPU that matches without
      truncation.
      
      Note, the "extra" inhibit is needed because KVM deliberately ignores
      mismatches due to truncation when applying the APIC_ID_MODIFIED inhibit
      so that large VMs (>255 vCPUs) can run with APICv/AVIC.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-24-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5063c41b
    • KVM: x86: Honor architectural behavior for aliased 8-bit APIC IDs · 5b84b029
      Sean Christopherson authored
      Apply KVM's hotplug hack if and only if userspace has enabled 32-bit IDs
      for x2APIC.  If 32-bit IDs are not enabled, disable the optimized map to
      honor x86 architectural behavior if multiple vCPUs share a physical APIC
      ID.  As called out in the changelog that added the hack, all CPUs whose
      (possibly truncated) APIC ID matches the target are supposed to receive
      the IPI.
      
        KVM intentionally differs from real hardware, because real hardware
        (Knights Landing) does just "x2apic_id & 0xff" to decide whether to
        accept the interrupt in xAPIC mode and it can deliver one interrupt to
        more than one physical destination, e.g. 0x123 to 0x123 and 0x23.
      
      Applying the hack even when x2APIC is not fully enabled means KVM doesn't
      correctly handle scenarios where the guest has aliased xAPIC IDs across
      multiple vCPUs, as only the vCPU with the lowest vCPU ID will receive any
      interrupts.  It's extremely unlikely any real world guest aliases APIC
      IDs, or even modifies APIC IDs, but KVM's behavior is arbitrary, e.g. the
      lowest vCPU ID "wins" regardless of which vCPU is "aliasing" and which
      vCPU is "normal".
      
      Furthermore, the hack is _not_ guaranteed to work!  The hack works if and
      only if the optimized APIC map is successfully allocated.  If the map
      allocation fails (unlikely), KVM will fall back to its unoptimized
      behavior, which _does_ honor the architectural behavior.
      
      Pivot on 32-bit x2APIC IDs being enabled as that is required to take
      advantage of the hotplug hack (see kvm_apic_state_fixup()), i.e. won't
      break existing setups unless they are way, way off in the weeds.
      
      Add an entry in KVM's errata to document the hack.  Alternatively, KVM
      could provide an actual x2APIC quirk and document the hack that way, but
      there's unlikely to ever be a use case for disabling the quirk.  Go the
      errata route to avoid having to validate a quirk no one cares about.
      
      Fixes: 5bd5db38 ("KVM: x86: allow hotplug of VCPU with APIC ID over 0xff")
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-23-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      5b84b029
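The aliasing quoted in the entry above ("x2apic_id & 0xff") can be made concrete with a small model that counts how many APICs accept an interrupt when only the low 8 bits are compared. `count_xapic_matches()` is a hypothetical helper written for illustration, not a KVM function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Architectural matching as described in the quoted changelog: in xAPIC
 * mode only the low 8 bits of the ID take part in the comparison, so
 * IDs such as 0x123 and 0x23 alias each other.
 */
static size_t count_xapic_matches(const uint32_t *apic_ids, size_t n,
				  uint32_t dest)
{
	size_t matches = 0;

	assert(apic_ids || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((apic_ids[i] & 0xff) == (dest & 0xff))
			matches++;
	}
	return matches;
}
```

With IDs 0x23 and 0x123 present, a destination of 0x23 matches both, whereas a map that keeps one entry per 8-bit ID can only ever deliver to one of them.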
    • KVM: x86: Disable APIC logical map if vCPUs are aliased in logical mode · 29700524
      Sean Christopherson authored
      Disable the optimized APIC logical map if multiple vCPUs are aliased to
      the same logical ID.  Architecturally, all CPUs whose logical ID matches
      the MDA are supposed to receive the interrupt; overwriting existing map
      entries can result in missed IPIs.
      
      Fixes: 1e08ec4a ("KVM: optimize apic interrupt delivery")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-22-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      29700524
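The "disable rather than overwrite" rule in the entry above can be sketched with a toy map. `struct logical_map` and its helpers are hypothetical and far simpler than KVM's `struct kvm_apic_map`; the sketch only shows the decision made when a slot is already occupied.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LOGICAL_SLOTS 8

/* Hypothetical optimized map: one slot per flat-mode logical ID bit. */
struct logical_map {
	bool enabled;
	int  slot[NR_LOGICAL_SLOTS];  /* vCPU index, or -1 if empty */
};

static void logical_map_init(struct logical_map *map)
{
	map->enabled = true;
	for (int i = 0; i < NR_LOGICAL_SLOTS; i++)
		map->slot[i] = -1;
}

/*
 * Insert a vCPU under a single-bit logical ID.  If another vCPU already
 * owns the slot, disable the whole map instead of overwriting the entry,
 * so that delivery falls back to checking every vCPU.
 */
static void logical_map_insert(struct logical_map *map, int bit, int vcpu)
{
	assert(bit >= 0 && bit < NR_LOGICAL_SLOTS);
	if (!map->enabled)
		return;
	if (map->slot[bit] >= 0) {
		map->enabled = false;
		return;
	}
	map->slot[bit] = vcpu;
}
```

Overwriting would leave the first vCPU unreachable through the map; disabling the map costs performance but keeps every aliased vCPU reachable through the slow path.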
    • KVM: x86: Disable APIC logical map if logical ID covers multiple MDAs · 2bf934aa
      Sean Christopherson authored
      Disable the optimized APIC logical map if a logical ID covers multiple
      MDAs, i.e. if a vCPU has multiple bits set in its ID.  In logical mode,
      events match if "ID & MDA != 0", i.e. creating an entry for only the
      first bit can cause interrupts to be missed.
      
      Note, creating an entry for every bit is also wrong as KVM would generate
      IPIs for every matching bit.  It would be possible to teach KVM to play
      nice with this edge case, but it is very much an edge case and probably
      not used in any real world OS, i.e. it's not worth optimizing.
      
      Fixes: 1e08ec4a ("KVM: optimize apic interrupt delivery")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-21-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2bf934aa
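The matching rule in the entry above ("ID & MDA != 0") and the failure mode of keeping only the first bit can be shown in a few lines. Both helpers are hypothetical and assume 8-bit flat-mode values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flat logical mode: the event matches if the ID and the MDA share a bit. */
static bool logical_flat_match(uint8_t logical_id, uint8_t mda)
{
	return (logical_id & mda) != 0;
}

/* Lowest set bit only, i.e. what a single-entry map would be keyed by. */
static uint8_t first_bit_only(uint8_t logical_id)
{
	assert(logical_id != 0);
	return logical_id & (uint8_t)-logical_id;
}
```

A vCPU with logical ID 0x06 architecturally matches an MDA of 0x04, but a map entry created only for its first bit (0x02) does not, so the interrupt would be missed.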
    • KVM: x86: Skip redundant x2APIC logical mode optimized cluster setup · 76e52750
      Sean Christopherson authored
      Skip the optimized cluster[] setup for x2APIC logical mode, as KVM reuses
      the optimized map's phys_map[] and doesn't actually need to insert the
      target apic into the cluster[].  The LDR is derived from the x2APIC ID,
      and both are read-only in KVM, thus the vCPU's cluster[ldr] is guaranteed
      to be the same entry as the vCPU's phys_map[x2apic_id] entry.
      
      Skipping the unnecessary setup will allow a future fix for aliased xAPIC
      logical IDs to simply require that cluster[ldr] is non-NULL, i.e. won't
      have to special case x2APIC.
      
      Alternatively, the future check could allow "cluster[ldr] == apic", but
      that ends up being terribly confusing because cluster[ldr] is only set
      at the very end, i.e. it's only possible due to x2APIC's shenanigans.
      
      Another alternative would be to send x2APIC down a separate path _after_
      the calculation and then assert all of the above, but the resulting
      code is rather messy, and it's arguably unnecessary since asserting that
      the actual LDR matches the expected LDR means that simply testing that
      interrupts are delivered correctly provides the same guarantees.
      Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-20-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      76e52750
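The statement in the entry above that the LDR is derived from the x2APIC ID follows the architectural layout: the cluster ID is ID[19:4] and the logical ID is a single bit selected by ID[3:0]. A minimal sketch; `x2apic_ldr_from_id()` is a hypothetical name for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Derive the x2APIC LDR from the x2APIC ID: bits [31:16] hold the
 * cluster (ID >> 4) and bits [15:0] hold one bit, 1 << (ID & 0xf).
 */
static uint32_t x2apic_ldr_from_id(uint32_t id)
{
	assert((id >> 4) <= 0xffff);  /* the cluster must fit in 16 bits */
	return ((id >> 4) << 16) | (1u << (id & 0xf));
}
```

Because the mapping is a pure function of a read-only ID, the logical lookup always resolves to the same vCPU as the physical lookup, which is why the separate cluster[] setup is redundant.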
    • KVM: x86: Explicitly track all possibilities for APIC map's logical modes · 35366901
      Sean Christopherson authored
      Track all possibilities for the optimized APIC map's logical modes
      instead of overloading the pseudo-bitmap and treating any "unknown" value
      as "invalid".
      
      As documented by the now-stale comment above the mode values, the values
      did have meaning when the optimized map was originally added.  That
      dependent logic was removed by commit e45115b6 ("KVM: x86: use
      physical LAPIC array for logical x2APIC"), but the obfuscated behavior
      and its comment were left behind.
      
      Opportunistically rename "mode" to "logical_mode", partly to make it
      clear that the "disabled" case applies only to the logical map, but also
      to prove that there is no lurking code that expects "mode" to be a bitmap.
      
      Functionally, this is a glorified nop.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-19-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      35366901
    • KVM: x86: Explicitly skip optimized logical map setup if vCPU's LDR==0 · 6ea567ca
      Sean Christopherson authored
      Explicitly skip the optimized map setup if the vCPU's LDR is '0', i.e. if
      the vCPU will never respond to logical mode interrupts.  KVM already
      skips setup in this case, but relies on kvm_apic_map_get_logical_dest()
      to generate mask==0.  KVM still needs the mask==0 check as a non-zero LDR
      can yield mask==0 depending on the mode, but explicitly handling the LDR
      will make it simpler to clean up the logical mode tracking in the future.
      
      No functional change intended.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-18-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6ea567ca
    • KVM: SVM: Add helper to perform final AVIC "kick" of single vCPU · 1d22a597
      Sean Christopherson authored
      Add a helper to perform the final kick; two instances of the ICR decoding
      is one too many.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-17-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1d22a597
    • KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch · 8578e451
      Sean Christopherson authored
      Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
      vCPU ID, i.e. that there's no need to check for a destination match in
      the AVIC kick fast path.
      
      Opportunistically tweak comments to remove "guest bug", as that suggests
      KVM is punting on error handling, which is not the case.  Targeting a
      non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
      or not it's a guest bug is irrelevant.  Such behavior is architecturally
      legal and thus needs to be faithfully emulated by KVM (and it is).
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-16-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      8578e451
    • Revert "KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible" · f9829c90
      Sean Christopherson authored
      Due to a likely mismerge of patches, KVM ended up with a superfluous
      commit to "enable" AVIC's fast path for x2AVIC mode.  Even worse, the
      superfluous commit has several bugs and creates a nasty local shadow
      variable.
      
      Rather than fix the bugs piece-by-piece[*] to achieve the same end
      result, revert the patch wholesale.
      
      Opportunistically add a comment documenting the x2AVIC dependencies.
      
      This reverts commit 8c9e639d.
      
      [*] https://lore.kernel.org/all/YxEP7ZBRIuFWhnYJ@google.com
      
      Fixes: 8c9e639d ("KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible")
      Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-15-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f9829c90
    • KVM: SVM: Fix x2APIC Logical ID calculation for avic_kick_target_vcpus_fast · da3fb46d
      Suravee Suthikulpanit authored
      For x2APIC IDs in cluster mode, the logical ID is bits [15:0].
      
      Fixes: 603ccef4 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast")
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-14-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      da3fb46d
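The bit layout named in the entry above can be sketched as a decode of an x2APIC cluster-mode logical destination: bits [31:16] select the cluster and bits [15:0] are a bitmap of CPUs in that cluster. The helper names below are hypothetical and are not taken from avic.c.

```c
#include <assert.h>
#include <stdint.h>

/* Bits [31:16] of an x2APIC logical destination select the cluster. */
static uint32_t x2apic_dest_cluster(uint32_t dest)
{
	return dest >> 16;
}

/* Bits [15:0] are the logical ID: a bitmap of CPUs within that cluster. */
static uint32_t x2apic_dest_bitmap(uint32_t dest)
{
	return dest & 0xffff;
}

/* x2APIC ID of the CPU at position `bit` within `cluster`. */
static uint32_t x2apic_id_of(uint32_t cluster, unsigned int bit)
{
	assert(bit < 16);
	return (cluster << 4) | bit;
}
```

For a destination of 0x00020005, the cluster is 2 and the bitmap is 0x5, so the targets are x2APIC IDs 0x20 and 0x22. Masking with fewer than 16 bits would silently drop the upper CPUs of each cluster.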
    • KVM: SVM: Compute dest based on sender's x2APIC status for AVIC kick · a879a88e
      Sean Christopherson authored
      Compute the destination from ICRH using the sender's x2APIC status, not
      each (potential) target's x2APIC status.
      
      Fixes: c514d3a3 ("KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID")
      Cc: Li RongQing <lirongqing@baidu.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Li RongQing <lirongqing@baidu.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-13-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a879a88e
    • KVM: SVM: Replace "avic_mode" enum with "x2avic_enabled" boolean · f628a34a
      Sean Christopherson authored
      Replace the "avic_mode" enum with a single bool to track whether or not
      x2AVIC is enabled.  KVM already has "apicv_enabled" that tracks if any
      flavor of AVIC is enabled, i.e. AVIC_MODE_NONE and AVIC_MODE_X1 are
      redundant and unnecessary noise.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-12-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f628a34a
    • KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled · 2008fab3
      Sean Christopherson authored
      Free the APIC access page memslot if any vCPU enables x2APIC and SVM's
      AVIC is enabled to prevent accesses to the virtual APIC on vCPUs with
      x2APIC enabled.  On AMD, if its "hybrid" mode is enabled (AVIC is enabled
      when x2APIC is enabled even without x2AVIC support), keeping the APIC
      access page memslot results in the guest being able to access the virtual
      APIC page as x2APIC is fully emulated by KVM.  I.e. hardware isn't aware
      that the guest is operating in x2APIC mode.
      
      Exempt nested SVM's update of APICv state from the new logic as x2APIC
      can't be toggled on VM-Exit.  In practice, invoking the x2APIC logic
      should be harmless precisely because it should be a glorified nop, but
      play it safe to avoid latent bugs, e.g. with dropping the vCPU's SRCU
      lock.
      
      Intel doesn't suffer from the same issue as APICv has fully independent
      VMCS controls for xAPIC vs. x2APIC virtualization.  Technically, KVM
      should provide bus error semantics and not memory semantics for the APIC
      page when x2APIC is enabled, but KVM already provides memory semantics in
      other scenarios, e.g. if APICv/AVIC is enabled and the APIC is hardware
      disabled (via APIC_BASE MSR).
      
      Note, checking apic_access_memslot_enabled without taking locks relies on
      it being set during vCPU creation (before kvm_vcpu_reset()).  vCPUs can
      race to set the inhibit and delete the memslot, i.e. can get false
      positives, but can't get false negatives as apic_access_memslot_enabled
      can't be toggled "on" once any vCPU reaches KVM_RUN.
      
      Opportunistically drop the "can" while updating avic_activate_vmcb()'s
      comment, i.e. to state that KVM _does_ support the hybrid mode.  Move
      the "Note:" down a line to conform to preferred kernel/KVM multi-line
      comment style.
      
      Opportunistically update the apicv_update_lock comment, as it isn't
      actually used to protect apic_access_memslot_enabled (which is protected
      by slots_lock).
      
      Fixes: 0e311d33 ("KVM: SVM: Introduce hybrid-AVIC mode")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20230106011306.85230-11-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2008fab3
    • KVM: x86: Move APIC access page helper to common x86 code · c482f2ce
      Sean Christopherson authored
      Move the APIC access page allocation helper function to common x86 code,
      the allocation routine is virtually identical between APICv (VMX) and
      AVIC (SVM).  Keep APICv's gfn_to_page() + put_page() sequence, which
      verifies that a backing page can be allocated, i.e. that the system isn't
      under heavy memory pressure.  Forcing the backing page to be populated
      isn't strictly necessary, but skipping the effective prefetch only delays
      the inevitable.
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-10-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c482f2ce
    • KVM: x86: Handle APICv updates for APIC "mode" changes via request · 1459f5c6
      Sean Christopherson authored
      Use KVM_REQ_UPDATE_APICV to react to APIC "mode" changes, i.e. to handle
      the APIC being hardware enabled/disabled and/or x2APIC being toggled.
      There is no need to immediately update APICv state, the only requirement
      is that APICv be updated prior to the next VM-Enter.
      
      Making a request will allow piggybacking KVM_REQ_UPDATE_APICV to "inhibit"
      the APICv memslot when x2APIC is enabled.  Doing that directly from
      kvm_lapic_set_base() isn't feasible as KVM's SRCU must not be held when
      modifying memslots (to avoid deadlock), and may or may not be held when
      kvm_lapic_set_base() is called, i.e. KVM can't do the right thing without
      tracking that is rightly buried behind CONFIG_PROVE_RCU=y.
      Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-9-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1459f5c6
    • KVM: SVM: Don't put/load AVIC when setting virtual APIC mode · e0bead97
      Sean Christopherson authored
      Move the VMCB updates from avic_refresh_apicv_exec_ctrl() into
      avic_set_virtual_apic_mode() and invert the dependency between said
      functions to avoid calling avic_vcpu_{load,put}() and
      avic_set_pi_irte_mode() when "only" setting the virtual APIC mode.
      
      avic_set_virtual_apic_mode() is invoked from common x86 with preemption
      enabled, which makes avic_vcpu_{load,put}() unhappy.  Luckily, calling
      those and updating IRTE stuff is unnecessary as the only reason
      avic_set_virtual_apic_mode() is called is to handle transitions between
      xAPIC and x2APIC that don't also toggle APICv activation.  And if
      activation doesn't change, there's no need to fiddle with the physical
      APIC ID table or update IRTE.
      
      The "full" refresh is guaranteed to be called if activation changes in
      this case as the only call to the "set" path is:
      
      	kvm_vcpu_update_apicv(vcpu);
      	static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
      
      and kvm_vcpu_update_apicv() invokes the refresh if activation changes:
      
      	if (apic->apicv_active == activate)
      		goto out;
      
      	apic->apicv_active = activate;
      	kvm_apic_update_apicv(vcpu);
      	static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu);
      
      Rename the helper to reflect that it is also called during "refresh".
      
        WARNING: CPU: 183 PID: 49186 at arch/x86/kvm/svm/avic.c:1081 avic_vcpu_put+0xde/0xf0 [kvm_amd]
        CPU: 183 PID: 49186 Comm: stable Tainted: G           O       6.0.0-smp--fcddbca45f0a-sink #34
        Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 10.48.0 01/27/2022
        RIP: 0010:avic_vcpu_put+0xde/0xf0 [kvm_amd]
         avic_refresh_apicv_exec_ctrl+0x142/0x1c0 [kvm_amd]
         avic_set_virtual_apic_mode+0x5a/0x70 [kvm_amd]
         kvm_lapic_set_base+0x149/0x1a0 [kvm]
         kvm_set_apic_base+0x8f/0xd0 [kvm]
         kvm_set_msr_common+0xa3a/0xdc0 [kvm]
         svm_set_msr+0x364/0x6b0 [kvm_amd]
         __kvm_set_msr+0xb8/0x1c0 [kvm]
         kvm_emulate_wrmsr+0x58/0x1d0 [kvm]
         msr_interception+0x1c/0x30 [kvm_amd]
         svm_invoke_exit_handler+0x31/0x100 [kvm_amd]
         svm_handle_exit+0xfc/0x160 [kvm_amd]
         vcpu_enter_guest+0x21bb/0x23e0 [kvm]
         vcpu_run+0x92/0x450 [kvm]
         kvm_arch_vcpu_ioctl_run+0x43e/0x6e0 [kvm]
         kvm_vcpu_ioctl+0x559/0x620 [kvm]
      
      Fixes: 05c4fe8c ("KVM: SVM: Refresh AVIC configuration when changing APIC mode")
      Cc: stable@vger.kernel.org
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-8-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e0bead97
    • KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID · f651a008
      Sean Christopherson authored
      Truncate the vcpu_id, a.k.a. x2APIC ID, to an 8-bit value when comparing
      it against the xAPIC ID to avoid false positives (sort of) on systems
      with >255 CPUs, i.e. with IDs that don't fit into a u8.  The intent of
      APIC_ID_MODIFIED is to inhibit APICv/AVIC when the xAPIC ID is changed
      from its original value.
      
      The mismatch isn't technically a false positive, as architecturally the
      xAPIC IDs do end up being aliased in this scenario, and neither APICv
      nor AVIC correctly handles IPI virtualization when there is aliasing.
      However, KVM already deliberately does not honor the aliasing behavior
      that results when an x2APIC ID gets truncated to an xAPIC ID.  I.e. the
      resulting APICv/AVIC behavior is aligned with KVM's existing behavior
      when KVM's x2APIC hotplug hack is effectively enabled.
      
      If/when KVM provides a way to disable the hotplug hack, APICv/AVIC can
      piggyback whatever logic disables the optimized APIC map (which is what
      provides the hotplug hack), i.e. so that KVM's optimized map and APIC
      virtualization yield the same behavior.
      
      For now, fix the immediate problem of APIC virtualization being disabled
      for large VMs, which is a much more pressing issue than ensuring KVM
      honors architectural behavior for APIC ID aliasing.
      
      Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
      Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-7-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f651a008
    • KVM: x86: Don't inhibit APICv/AVIC on xAPIC ID "change" if APIC is disabled · a58a66af
      Sean Christopherson authored
      Don't inhibit APICv/AVIC due to an xAPIC ID mismatch if the APIC is
      hardware disabled.  The ID cannot be consumed while the APIC is disabled,
      and the ID is guaranteed to be set back to the vcpu_id when the APIC is
      hardware enabled (architectural behavior correctly emulated by KVM).
      
      Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
      Cc: stable@vger.kernel.org
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20230106011306.85230-6-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a58a66af
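Taken together, the two entries above narrow when an xAPIC ID counts as "modified": the APIC must be hardware enabled and in xAPIC mode, and the comparison uses the vcpu_id truncated to 8 bits. A minimal sketch under those assumptions; `struct apic_model` and `xapic_id_mismatch()` are hypothetical names, not KVM's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the state the check depends on. */
struct apic_model {
	bool     hw_enabled;  /* APIC enabled via the APIC_BASE MSR */
	bool     x2apic;      /* x2APIC mode: the ID is read-only */
	uint32_t vcpu_id;     /* may exceed 255 in large VMs */
	uint8_t  xapic_id;    /* 8-bit ID as programmed by the guest */
};

/* True if this APIC should count toward the APIC_ID_MODIFIED inhibit. */
static bool xapic_id_mismatch(const struct apic_model *apic)
{
	assert(apic);
	if (!apic->hw_enabled || apic->x2apic)
		return false;
	/* Compare against the truncated vcpu_id so 0x123 vs 0x23 is no mismatch. */
	return apic->xapic_id != (uint8_t)apic->vcpu_id;
}
```

Under this model a vCPU with vcpu_id 0x123 and xAPIC ID 0x23 does not trigger the inhibit, while a guest that rewrites the ID to 0x24 does, but only while the APIC is hardware enabled.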