1. 22 Aug, 2015 9 commits
    • KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD · cdeee518
      Paul Mackerras authored
      This adds implementations for the H_CLEAR_REF (test and clear reference
      bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.
      
      When clearing the reference or change bit in the guest view of the HPTE,
      we also have to clear it in the real HPTE so that we can detect future
      references or changes.  When we do so, we transfer the R or C bit value
      to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
      kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
      has been referenced and/or changed.
      
      These hypercalls are not used by Linux guests.  These implementations
      have been tested using a FreeBSD guest.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
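The test-and-clear behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the kernel's code: the struct, the function name and all bit positions (`SKETCH_*`) are hypothetical, and the assumption that the guest observes the OR of both views is ours.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; not the real HPTE or rmap layout. */
#define SKETCH_HPTE_R   (1ULL << 8)   /* referenced */
#define SKETCH_HPTE_C   (1ULL << 7)   /* changed */
#define SKETCH_RMAP_REF (1ULL << 1)
#define SKETCH_RMAP_CHG (1ULL << 0)

/* Hypothetical pairing of the guest view and the real (hardware) entry. */
struct sketch_hpte {
    uint64_t guest_view;  /* what the guest reads back */
    uint64_t real;        /* what the hardware updates */
};

/*
 * Test and clear `bit` (R or C) in both views of the HPTE. If the bit was
 * set in the real entry, record it in the rmap word first so that
 * host-side aging and dirty logging do not lose the information.
 * Returns whether the bit was set in either view (an assumption).
 */
static bool sketch_test_and_clear(struct sketch_hpte *h, uint64_t *rmap,
                                  uint64_t bit, uint64_t rmap_bit)
{
    bool was_set = ((h->guest_view | h->real) & bit) != 0;

    if (h->real & bit)
        *rmap |= rmap_bit;      /* transfer R or C to the rmap entry */
    h->guest_view &= ~bit;
    h->real &= ~bit;            /* so future accesses are detected again */
    return was_set;
}
```

Calling `sketch_test_and_clear(&h, &rmap, SKETCH_HPTE_C, SKETCH_RMAP_CHG)` would model H_CLEAR_MOD, and the same call with the R bits would model H_CLEAR_REF.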
    • KVM: PPC: Book3S HV: Fix bug in dirty page tracking · 08fe1e7b
      Paul Mackerras authored
      This fixes a bug in the tracking of pages that get modified by the
      guest.  If the guest creates a large-page HPTE, writes to memory
      somewhere within the large page, and then removes the HPTE, we only
      record the modified state for the first normal page within the large
      page, when in fact the guest might have modified some other normal
      page within the large page.
      
      To fix this we use some unused bits in the rmap entry to record the
      order (log base 2) of the size of the page that was modified, when
      removing an HPTE.  Then in kvm_test_clear_dirty_npages() we use that
      order to return the correct number of modified pages.
      
      The same thing could in principle happen when removing an HPTE at the
      host's request, i.e. when paging out a page, except that we never
      page out large pages, and the guest can only create large-page HPTEs
      if the guest RAM is backed by large pages.  However, we also fix
      this case for the sake of future-proofing.
      
      The reference bit is also subject to the same loss of information.  We
      don't make the same fix here for the reference bit because there isn't
      an interface for userspace to find out which pages the guest has
      referenced, whereas there is one for userspace to find out which pages
      the guest has modified.  Because of this loss of information, the
      kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly
      say that a page has not been referenced when it has, but that doesn't
      matter greatly because we never page or swap out large pages.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
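The idea of recording the page-size order in spare rmap bits can be sketched as below. The bit positions, the 4 KiB normal page size and the function names are assumptions made for illustration, and keeping the largest order seen is our choice, not something the commit message states.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT       12            /* assume 4 KiB normal pages */
#define SKETCH_RMAP_CHANGED     (1ULL << 55)  /* hypothetical bit position */
#define SKETCH_RMAP_ORDER_SHIFT 48            /* hypothetical field position */
#define SKETCH_RMAP_ORDER_MASK  (0x3fULL << SKETCH_RMAP_ORDER_SHIFT)

/* On HPTE removal: note that a page of 2^order bytes was modified. */
static void sketch_note_modified(uint64_t *rmap, unsigned int order)
{
    uint64_t old = (*rmap & SKETCH_RMAP_ORDER_MASK) >> SKETCH_RMAP_ORDER_SHIFT;

    if (order > old) {
        /* Keep the largest order seen so no modified page is missed. */
        *rmap &= ~SKETCH_RMAP_ORDER_MASK;
        *rmap |= ((uint64_t)order << SKETCH_RMAP_ORDER_SHIFT) &
                 SKETCH_RMAP_ORDER_MASK;
    }
    *rmap |= SKETCH_RMAP_CHANGED;
}

/*
 * Dirty-log side: return how many normal pages should be reported as
 * modified, and clear the recorded state.
 */
static unsigned long sketch_test_clear_dirty_npages(uint64_t *rmap)
{
    unsigned int order;

    if (!(*rmap & SKETCH_RMAP_CHANGED))
        return 0;
    order = (unsigned int)((*rmap & SKETCH_RMAP_ORDER_MASK) >>
                           SKETCH_RMAP_ORDER_SHIFT);
    *rmap &= ~(SKETCH_RMAP_CHANGED | SKETCH_RMAP_ORDER_MASK);
    if (order <= SKETCH_PAGE_SHIFT)
        return 1;
    return 1UL << (order - SKETCH_PAGE_SHIFT);
}
```

Under these assumptions, a removed 16 MiB HPTE (order 24) would be reported as 4096 modified normal pages rather than one.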
    • KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE · 1e5bf454
      Paul Mackerras authored
      The reference (R) and change (C) bits in an HPT entry can be set by
      hardware at any time up until the HPTE is invalidated and the TLB
      invalidation sequence has completed.  This means that when removing
      an HPTE, we need to read the HPTE after the invalidation sequence has
      completed in order to obtain reliable values of R and C.  The code
      in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd32
      ("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
      read after invalidation as a side effect of other changes.  This
      restores the read of the HPTE after invalidation.
      
      The user-visible effect of this bug would be that when migrating a
      guest, there is a small probability that a page modified by the guest
      and then unmapped by the guest might not get re-transmitted and thus
      the destination might end up with a stale copy of the page.
      
      Fixes: 6f22bd32
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8 · b4deba5c
      Paul Mackerras authored
      This builds on the ability to run more than one vcore on a physical
      core by using the micro-threading (split-core) modes of the POWER8
      chip.  Previously, only vcores from the same VM could be run together,
      and (on POWER8) only if they had just one thread per core.  With the
      ability to split the core on guest entry and unsplit it on guest exit,
      we can run up to 8 vcpu threads from up to 4 different VMs, and we can
      run multiple vcores with 2 or 4 vcpus per vcore.
      
      Dynamic micro-threading is only available if the static configuration
      of the cores is whole-core mode (unsplit), and only on POWER8.
      
      To manage this, we introduce a new kvm_split_mode struct which is
      shared across all of the subcores in the core, with a pointer in the
      paca on each thread.  In addition we extend the core_info struct to
      have information on each subcore.  When deciding whether to add a
      vcore to the set already on the core, we now have two possibilities:
      (a) piggyback the vcore onto an existing subcore, or (b) start a new
      subcore.
      
      Currently, when any vcpu needs to exit the guest and switch to host
      virtual mode, we interrupt all the threads in all subcores and switch
      the core back to whole-core mode.  It may be possible in future to
      allow some of the subcores to keep executing in the guest while
      subcore 0 switches to the host, but that is not implemented in this
      patch.
      
      This adds a module parameter called dynamic_mt_modes which controls
      which micro-threading (split-core) modes the code will consider, as a
      bitmap.  In other words, if it is 0, no micro-threading mode is
      considered; if it is 2, only 2-way micro-threading is considered; if
      it is 4, only 4-way, and if it is 6, both 2-way and 4-way
      micro-threading mode will be considered.  The default is 6.
      
      With this, we now have secondary threads which are the primary thread
      for their subcore and therefore need to do the MMU switch.  These
      threads will need to be started even if they have no vcpu to run, so
      we use the vcore pointer in the PACA rather than the vcpu pointer to
      trigger them.
      
      It is now possible for thread 0 to find that an exit has been
      requested before it gets to switch the subcore state to the guest.  In
      that case we haven't added the guest's timebase offset to the
      timebase, so we need to be careful not to subtract the offset in the
      guest exit path.  In fact we just skip the whole path that switches
      back to host context, since we haven't switched to the guest context.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
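The bitmap semantics of the dynamic_mt_modes parameter described in this commit message (0 = none, 2 = 2-way only, 4 = 4-way only, 6 = both) can be illustrated with a small helper. The function name is hypothetical and the helper is only a sketch of how such a bitmap might be tested, not the kernel's implementation.

```c
#include <stdbool.h>

/*
 * Return true if n-way micro-threading (n_way is 2 or 4) is enabled in
 * the dynamic_mt_modes bitmap. The bit for each mode has the same value
 * as the number of subcores, so 6 (2 | 4) enables both modes.
 */
static bool sketch_mt_mode_allowed(unsigned int dynamic_mt_modes,
                                   unsigned int n_way)
{
    if (n_way != 2 && n_way != 4)
        return false;           /* only 2-way and 4-way modes exist here */
    return (dynamic_mt_modes & n_way) != 0;
}
```

With the default value of 6, both `sketch_mt_mode_allowed(6, 2)` and `sketch_mt_mode_allowed(6, 4)` hold; with 0, neither does.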
    • KVM: PPC: Book3S HV: Make use of unused threads when running guests · ec257165
      Paul Mackerras authored
      When running a virtual core of a guest that is configured with fewer
      threads per core than the physical cores have, the extra physical
      threads are currently unused.  This makes it possible to use them to
      run one or more other virtual cores from the same guest when certain
      conditions are met.  This applies on POWER7, and on POWER8 to guests
      with one thread per virtual core.  (It doesn't apply to POWER8 guests
      with multiple threads per vcore because they require a 1-1 virtual to
      physical thread mapping in order to be able to use msgsndp and the
      TIR.)
      
      The idea is that we maintain a list of preempted vcores for each
      physical cpu (i.e. each core, since the host runs single-threaded).
      Then, when a vcore is about to run, it checks to see if there are
      any vcores on the list for its physical cpu that could be
      piggybacked onto this vcore's execution.  If so, those additional
      vcores are put into state VCORE_PIGGYBACK and their runnable VCPU
      threads are started as well as the original vcore, which is called
      the master vcore.
      
      After the vcores have exited the guest, the extra ones are put back
      onto the preempted list if any of their VCPUs are still runnable and
      not idle.
      
      This means that vcpu->arch.ptid is no longer necessarily the same as
      the physical thread that the vcpu runs on.  In order to make it easier
      for code that wants to send an IPI to know which CPU to target, we
      now store that in a new field in struct vcpu_arch, called thread_cpu.
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Tested-by: Laurent Vivier <lvivier@redhat.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: add missing pt_regs initialization · 845ac985
      Tudor Laurentiu authored
      On this switch branch the regs initialization
      doesn't happen so add it.
      This was found with the help of a static
      code analysis tool.
      Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Fix warnings from sparse · 5358a963
      Thomas Huth authored
      When compiling the KVM code for POWER with "make C=1", sparse
      complains about functions missing proper prototypes and a 64-bit
      constant missing the ULL suffix. Let's fix this by making the
      functions static or by including the proper header with the
      prototypes, and by appending a ULL suffix to the constant
      PPC_MPPE_ADDRESS_MASK.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
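The two kinds of fix mentioned in this commit message can be illustrated in miniature. The constant name, its value and the helper below are hypothetical stand-ins, not the definitions touched by the patch.

```c
#include <stdint.h>

/*
 * A 64-bit constant written with an explicit ULL marker, so its type does
 * not depend on how the compiler sizes a bare hexadecimal literal.
 * The name and value are illustrative only.
 */
#define SKETCH_ADDRESS_MASK 0xffffffffc000ULL

/*
 * A helper used in only one file is declared static, so no external
 * prototype is needed and sparse has nothing to complain about.
 */
static uint64_t sketch_mask_address(uint64_t addr)
{
    return addr & SKETCH_ADDRESS_MASK;
}
```

A function that must stay visible to other files would instead keep external linkage and include the header that declares its prototype.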
    • KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig · 129fd423
      Thomas Huth authored
      Since the PPC970 support has been removed from the kvm-hv kernel
      module recently, we should also reflect this change in the help
      text of the corresponding Kconfig option.
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: fix suspicious use of conditional operator · f5ffe330
      Tudor Laurentiu authored
      This was signaled by a static code analysis tool.
      Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
      Reviewed-by: Scott Wood <scottwood@freescale.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
  2. 14 Aug, 2015 1 commit
  3. 13 Aug, 2015 1 commit
    • Merge tag 'kvm-s390-next-20150812' of... · ae6c0aa6
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-20150812' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: fix and feature for kvm/next (4.3)
      
      1. error handling for irq routes
      2. Gracefully handle STP time changes
         s390 supports STP (Server Time Protocol), which steers the TOD
         clocks of the participating systems so that the difference
         between them stays below the round trip time between the
         systems. For specific out-of-sync events Linux can opt in to
         accept sync checks. This results in non-monotonic jumps of the
         TOD clock, which Linux corrects via time offsets to keep the
         wall clock time monotonic. KVM guests also base their time on
         the host TOD, so the offset needs to be fixed up for them as well.
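The guest offset fixup described for STP time changes can be sketched with simple modular arithmetic. This assumes a model in which the guest TOD is the host TOD plus a per-guest epoch value. The struct and function names are hypothetical and do not reflect the actual s390 KVM code.

```c
#include <stdint.h>

/* Hypothetical per-guest clock state: guest TOD = host TOD + epoch. */
struct sketch_guest_clock {
    uint64_t epoch;   /* added to the host TOD, modulo 2^64 */
};

/* Compute the TOD value the guest would observe for a given host TOD. */
static uint64_t sketch_guest_tod(const struct sketch_guest_clock *g,
                                 uint64_t host_tod)
{
    return host_tod + g->epoch;
}

/*
 * After an STP sync moved the host TOD by `delta` (positive or negative),
 * shift the epoch the other way so the guest's view stays continuous.
 */
static void sketch_stp_fixup(struct sketch_guest_clock *g, int64_t delta)
{
    g->epoch -= (uint64_t)delta;
}
```

For example, if the host TOD jumps forward by 300 units and `sketch_stp_fixup(&g, 300)` is applied, the guest reads the same TOD value immediately after the jump as immediately before it.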
  4. 11 Aug, 2015 2 commits
  5. 07 Aug, 2015 1 commit
  6. 05 Aug, 2015 10 commits
  7. 04 Aug, 2015 1 commit
  8. 30 Jul, 2015 2 commits
  9. 29 Jul, 2015 13 commits