1. 27 Mar, 2018 36 commits
  2. 23 Mar, 2018 4 commits
    • Michael Ellerman's avatar
      Merge branch 'topic/ppc-kvm' into next · a26cf1c9
      Michael Ellerman authored
      This brings in two series from Paul, one of which touches KVM code and
      may need to be merged into the kvm-ppc tree to resolve conflicts.
      a26cf1c9
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Work around TEXASR bug in fake suspend state · 681c617b
      Paul Mackerras authored
      This works around a hardware bug in "Nimbus" POWER9 DD2.2 processors,
      where the contents of the TEXASR can get corrupted while a thread is
      in fake suspend state.  The workaround is for the instruction emulation
      code to use the value saved at the most recent guest exit in real
      suspend mode.  We achieve this by simply not saving the TEXASR into
      the vcpu struct on an exit in fake suspend state.  We also have to
      take care to set the orig_texasr field only on guest exit in real
      suspend state.
      
      This also means that on guest entry in fake suspend state, TEXASR
      will be restored to the value it had on the last exit in real suspend
      state, effectively counteracting any hardware-caused corruption.  This
      works because TEXASR may not be written in suspend state.
      
      With this, the guest might see the wrong values in TEXASR if it reads
      it while in suspend state, but will see the correct value in
      non-transactional state (e.g. after a treclaim), and treclaim will
      work correctly.
      
      With this workaround, the code will actually run slightly faster, and
      will operate correctly on systems without the TEXASR bug (since TEXASR
      may not be written in suspend state, and is only changed by failure
      recording, which will have already been done before we get into fake
      suspend state).  Therefore these changes are not made subject to a CPU
      feature bit.
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      681c617b
    • Suraj Jitindar Singh's avatar
      KVM: PPC: Book3S HV: Work around XER[SO] bug in fake suspend mode · 87a11bb6
      Suraj Jitindar Singh authored
      This works around a hardware bug in "Nimbus" POWER9 DD2.2 processors,
      where a treclaim performed in fake suspend mode can cause subsequent
      reads from the XER register to return inconsistent values for the SO
      (summary overflow) bit.  The inconsistent SO bit state can potentially
      be observed on any thread in the core.  We have to do the treclaim
      because that is the only way to get the thread out of suspend state
      (fake or real) and into non-transactional state.
      
      The workaround for the bug is to force the core into SMT4 mode before
      doing the treclaim.  This patch adds the code to do that, conditional
      on the CPU_FTR_P9_TM_XER_SO_BUG feature bit.
      Signed-off-by: default avatarSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      87a11bb6
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 · 4bb3c7a0
      Paul Mackerras authored
      POWER9 has hardware bugs relating to transactional memory and thread
      reconfiguration (changes to hardware SMT mode).  Specifically, the core
      does not have enough storage to store a complete checkpoint of all the
      architected state for all four threads.  The DD2.2 version of POWER9
      includes hardware modifications designed to allow hypervisor software
      to implement workarounds for these problems.  This patch implements
      those workarounds in KVM code so that KVM guests see a full, working
      transactional memory implementation.
      
      The problems center around the use of TM suspended state, where the
      CPU has a checkpointed state but execution is not transactional.  The
      workaround is to implement a "fake suspend" state, which looks to the
      guest like suspended state but the CPU does not store a checkpoint.
      In this state, any instruction that would cause a transition to
      transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
      checkpointed state (treclaim) causes a "soft patch" interrupt (vector
      0x1500) to the hypervisor so that it can be emulated.  The trechkpt
      instruction also causes a soft patch interrupt.
      
      On POWER9 DD2.2, we avoid returning to the guest in any state which
      would require a checkpoint to be present.  The trechkpt in the guest
      entry path which would normally create that checkpoint is replaced by
      either a transition to fake suspend state, if the guest is in suspend
      state, or a rollback to the pre-transactional state if the guest is in
      transactional state.  Fake suspend state is indicated by a flag in the
      PACA plus a new bit in the PSSCR.  The new PSSCR bit is write-only and
      reads back as 0.
      
      On exit from the guest, if the guest is in fake suspend state, we still
      do the treclaim instruction as we would in real suspend state, in order
      to get into non-transactional state, but we do not save the resulting
      register state since there was no checkpoint.
      
      Emulation of the instructions that cause a softpatch interrupt is
      handled in two paths.  If the guest is in real suspend mode, we call
      kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
      transitioning to transactional state.  This is called before we do the
      treclaim in the guest exit path; because we haven't done treclaim, we
      can get back to the guest with the transaction still active.  If the
      instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
      handle, or if the guest is in fake suspend state, then we proceed to
      do the complete guest exit path and subsequently call
      kvmhv_p9_tm_emulation() in host context with the MMU on.  This handles
      all the cases including the cases that generate program interrupts
      (illegal instruction or TM Bad Thing) and facility unavailable
      interrupts.
      
      The emulation is reasonably straightforward and is mostly concerned
      with checking for exception conditions and updating the state of
      registers such as MSR and CR0.  The treclaim emulation takes care to
      ensure that the TEXASR register gets updated as if it were the guest
      treclaim instruction that had done failure recording, not the treclaim
      done in hypervisor state in the guest exit path.
      
      With this, the KVM_CAP_PPC_HTM capability returns true (1) even if
      transactional memory is not available to host userspace.
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      4bb3c7a0