An error occurred fetching the project authors.
  1. 23 Mar, 2018 1 commit
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9 · 4bb3c7a0
      Paul Mackerras authored
      POWER9 has hardware bugs relating to transactional memory and thread
      reconfiguration (changes to hardware SMT mode).  Specifically, the core
      does not have enough storage to store a complete checkpoint of all the
      architected state for all four threads.  The DD2.2 version of POWER9
      includes hardware modifications designed to allow hypervisor software
      to implement workarounds for these problems.  This patch implements
      those workarounds in KVM code so that KVM guests see a full, working
      transactional memory implementation.
      
      The problems center around the use of TM suspended state, where the
      CPU has a checkpointed state but execution is not transactional.  The
      workaround is to implement a "fake suspend" state, which looks to the
      guest like suspended state but the CPU does not store a checkpoint.
      In this state, any instruction that would cause a transition to
      transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
      checkpointed state (treclaim) causes a "soft patch" interrupt (vector
      0x1500) to the hypervisor so that it can be emulated.  The trechkpt
      instruction also causes a soft patch interrupt.
      
      On POWER9 DD2.2, we avoid returning to the guest in any state which
      would require a checkpoint to be present.  The trechkpt in the guest
      entry path which would normally create that checkpoint is replaced by
      either a transition to fake suspend state, if the guest is in suspend
      state, or a rollback to the pre-transactional state if the guest is in
      transactional state.  Fake suspend state is indicated by a flag in the
      PACA plus a new bit in the PSSCR.  The new PSSCR bit is write-only and
      reads back as 0.
      
      On exit from the guest, if the guest is in fake suspend state, we still
      do the treclaim instruction as we would in real suspend state, in order
      to get into non-transactional state, but we do not save the resulting
      register state since there was no checkpoint.
      
      Emulation of the instructions that cause a softpatch interrupt is
      handled in two paths.  If the guest is in real suspend mode, we call
      kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
      transitioning to transactional state.  This is called before we do the
      treclaim in the guest exit path; because we haven't done treclaim, we
      can get back to the guest with the transaction still active.  If the
      instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
      handle, or if the guest is in fake suspend state, then we proceed to
      do the complete guest exit path and subsequently call
      kvmhv_p9_tm_emulation() in host context with the MMU on.  This handles
      all the cases including the cases that generate program interrupts
      (illegal instruction or TM Bad Thing) and facility unavailable
      interrupts.
      
      The emulation is reasonably straightforward and is mostly concerned
      with checking for exception conditions and updating the state of
      registers such as MSR and CR0.  The treclaim emulation takes care to
      ensure that the TEXASR register gets updated as if it were the guest
      treclaim instruction that had done failure recording, not the treclaim
      done in hypervisor state in the guest exit path.
      
      With this, the KVM_CAP_PPC_HTM capability returns true (1) even if
      transactional memory is not available to host userspace.
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      4bb3c7a0
  2. 02 Nov, 2017 1 commit
    • Greg Kroah-Hartman's avatar
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: default avatarKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: default avatarPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  3. 12 May, 2017 1 commit
    • Paul Mackerras's avatar
      KVM: PPC: Book3S PR: Don't include SPAPR TCE code on non-pseries platforms · 76d837a4
      Paul Mackerras authored
      Commit e91aa8e6 ("KVM: PPC: Enable IOMMU_API for KVM_BOOK3S_64
      permanently", 2017-03-22) enabled the SPAPR TCE code for all 64-bit
      Book 3S kernel configurations in order to simplify the code and
      reduce #ifdefs.  However, 64-bit Book 3S PPC platforms other than
      pseries and powernv don't implement the necessary IOMMU callbacks,
      leading to build failures like the following (for a pasemi config):
      
      scripts/kconfig/conf  --silentoldconfig Kconfig
      warning: (KVM_BOOK3S_64) selects SPAPR_TCE_IOMMU which has unmet direct dependencies (IOMMU_SUPPORT && (PPC_POWERNV || PPC_PSERIES))
      
      ...
      
        CC [M]  arch/powerpc/kvm/book3s_64_vio.o
      /home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_64_vio.c: In function ‘kvmppc_clear_tce’:
      /home/paulus/kernel/kvm/arch/powerpc/kvm/book3s_64_vio.c:363:2: error: implicit declaration of function ‘iommu_tce_xchg’ [-Werror=implicit-function-declaration]
        iommu_tce_xchg(tbl, entry, &hpa, &dir);
        ^
      
      To fix this, we make the inclusion of the SPAPR TCE support, and the
      code that uses it in book3s_vio.c and book3s_vio_hv.c, depend on
      the inclusion of support for the pseries and/or powernv platforms.
      This means that when running a 'pseries' guest on those platforms,
      the guest won't have in-kernel acceleration of the PAPR TCE hypercalls,
      but at least now they compile.
      Reviewed-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      76d837a4
  4. 27 Apr, 2017 1 commit
  5. 31 Jan, 2017 1 commit
  6. 09 Sep, 2016 1 commit
    • Paolo Bonzini's avatar
      powerpc: move hmi.c to arch/powerpc/kvm/ · 3f257774
      Paolo Bonzini authored
      hmi.c functions are unused unless sibling_subcore_state is nonzero, and
      that in turn happens only if KVM is in use.  So move the code to
      arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
      rather than CONFIG_PPC_BOOK3S_64.  The sibling_subcore_state is also
      included in struct paca_struct only if KVM is supported by the kernel.
      
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: kvm-ppc@vger.kernel.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      3f257774
  7. 25 Aug, 2016 1 commit
    • Paul Mackerras's avatar
      KVM: PPC: Always select KVM_VFIO, plus Makefile cleanup · 4b3d173d
      Paul Mackerras authored
      As discussed recently on the kvm mailing list, David Gibson's
      intention in commit 178a7875 ("vfio: Enable VFIO device for
      powerpc", 2016-02-01) was to have the KVM VFIO device built in
      on all powerpc platforms.  This patch adds the "select KVM_VFIO"
      statement that makes this happen.
      
      Currently, arch/powerpc/kvm/Makefile doesn't include vfio.o for
      the 64-bit kvm module, because the list of objects doesn't use
      the $(common-objs-y) list.  The reason it doesn't is because we
      don't necessarily want coalesced_mmio.o or emulate.o (for example
      if HV KVM is the only target), and common-objs-y includes both.
      
      Since this is confusing, this patch adjusts the definitions so that
      we now use $(common-objs-y) in the list for the 64-bit kvm.ko
      module, emulate.o is removed from common-objs-y and added in the
      places that need it, and the inclusion of coalesced_mmio.o now
      depends on CONFIG_KVM_MMIO.
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      4b3d173d
  8. 22 Aug, 2016 1 commit
    • Paolo Bonzini's avatar
      powerpc: move hmi.c to arch/powerpc/kvm/ · 7c379526
      Paolo Bonzini authored
      hmi.c functions are unused unless sibling_subcore_state is nonzero, and
      that in turn happens only if KVM is in use.  So move the code to
      arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
      rather than CONFIG_PPC_BOOK3S_64.  The sibling_subcore_state is also
      included in struct paca_struct only if KVM is supported by the kernel.
      
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: kvm-ppc@vger.kernel.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7c379526
  9. 18 Jul, 2016 1 commit
    • Arnd Bergmann's avatar
      Kbuild: arch: look for generated headers in obtree · 58ab5e0c
      Arnd Bergmann authored
      There are very few files that need add an -I$(obj) gcc for the preprocessor
      or the assembler. For C files, we add always these for both the objtree and
      srctree, but for the other ones we require the Makefile to add them, and
      Kbuild then adds it for both trees.
      
      As a preparation for changing the meaning of the -I$(obj) directive to
      only refer to the srctree, this changes the two instances in arch/x86 to use
      an explictit $(objtree) prefix where needed, otherwise we won't find the
      headers any more, as reported by the kbuild 0day builder.
      
      arch/x86/realmode/rm/realmode.lds.S:75:20: fatal error: pasyms.h: No such file or directory
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      58ab5e0c
  10. 22 Mar, 2016 1 commit
    • Paolo Bonzini's avatar
      KVM: PPC: do not compile in vfio.o unconditionally · 0af574be
      Paolo Bonzini authored
      Build on 32-bit PPC fails with the following error:
      
       int kvm_vfio_ops_init(void)
            ^
       In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0:
       arch/powerpc/kvm/../../../virt/kvm/vfio.h:8:90: note: previous definition of ‘kvm_vfio_ops_init’ was here
       arch/powerpc/kvm/../../../virt/kvm/vfio.c:292:6: error: redefinition of ‘kvm_vfio_ops_exit’
       void kvm_vfio_ops_exit(void)
                   ^
       In file included from arch/powerpc/kvm/../../../virt/kvm/vfio.c:21:0:
       arch/powerpc/kvm/../../../virt/kvm/vfio.h:12:91: note: previous definition of ‘kvm_vfio_ops_exit’ was here
       scripts/Makefile.build:258: recipe for target arch/powerpc/kvm/../../../virt/kvm/vfio.o failed
       make[3]: *** [arch/powerpc/kvm/../../../virt/kvm/vfio.o] Error 1
      
      Check whether CONFIG_KVM_VFIO is set before including vfio.o
      in the build.
      Reported-by: default avatarPranith Kumar <bobby.prani@gmail.com>
      Tested-by: default avatarPranith Kumar <bobby.prani@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      0af574be
  11. 15 Feb, 2016 1 commit
    • David Gibson's avatar
      vfio: Enable VFIO device for powerpc · 178a7875
      David Gibson authored
      ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is
      used to handle any necessary interactions between KVM and VFIO.
      
      Currently that device is built on x86 and ARM, but not powerpc, although
      powerpc does support both KVM and VFIO.  This makes things awkward in
      userspace
      
      Currently qemu prints an alarming error message if you attempt to use VFIO
      and it can't initialize the KVM VFIO device.  We don't want to remove the
      warning, because lack of the KVM VFIO device could mean coherency problems
      on x86.  On powerpc, however, the error is harmless but looks disturbing,
      and a test based on host architecture in qemu would be ugly, and break if
      we do need the KVM VFIO device for something important in future.
      
      There's nothing preventing the KVM VFIO device from being built for
      powerpc, so this patch turns it on.  It won't actually do anything, since
      we don't define any of the arch_*() hooks, but it will make qemu happy and
      we can extend it in future if we need to.
      Signed-off-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      Reviewed-by: default avatarEric Auger <eric.auger@linaro.org>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      178a7875
  12. 07 Aug, 2014 1 commit
  13. 30 Jul, 2014 1 commit
  14. 28 Jul, 2014 2 commits
    • Alexander Graf's avatar
      KVM: PPC: Separate loadstore emulation from priv emulation · d69614a2
      Alexander Graf authored
      Today the instruction emulator can get called via 2 separate code paths. It
      can either be called by MMIO emulation detection code or by privileged
      instruction traps.
      
      This is bad, as both code paths prepare the environment differently. For MMIO
      emulation we already know the virtual address we faulted on, so instructions
      there don't have to actually fetch that information.
      
      Split out the two separate use cases into separate files.
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      d69614a2
    • Alexander Graf's avatar
      KVM: PPC: Remove 440 support · b2677b8d
      Alexander Graf authored
      The 440 target hasn't been properly functioning for a few releases and
      before I was the only one who fixes a very serious bug that indicates to
      me that nobody used it before either.
      
      Furthermore KVM on 440 is slow to the extent of unusable.
      
      We don't have to carry along completely unused code. Remove 440 and give
      us one less thing to worry about.
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      b2677b8d
  15. 17 Oct, 2013 3 commits
  16. 08 Jul, 2013 1 commit
    • Aneesh Kumar K.V's avatar
      powerpc/kvm: Contiguous memory allocator based hash page table allocation · fa61a4e3
      Aneesh Kumar K.V authored
      Powerpc architecture uses a hash based page table mechanism for mapping virtual
      addresses to physical address. The architecture require this hash page table to
      be physically contiguous. With KVM on Powerpc currently we use early reservation
      mechanism for allocating guest hash page table. This implies that we need to
      reserve a big memory region to ensure we can create large number of guest
      simultaneously with KVM on Power. Another disadvantage is that the reserved memory
      is not available to rest of the subsystems and and that implies we limit the total
      available memory in the host.
      
      This patch series switch the guest hash page table allocation to use
      contiguous memory allocator.
      Signed-off-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      fa61a4e3
  17. 19 May, 2013 1 commit
  18. 26 Apr, 2013 5 commits
  19. 24 Jan, 2013 1 commit
  20. 06 Dec, 2012 2 commits
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking · b4072df4
      Paul Mackerras authored
      Currently, if a machine check interrupt happens while we are in the
      guest, we exit the guest and call the host's machine check handler,
      which tends to cause the host to panic.  Some machine checks can be
      triggered by the guest; for example, if the guest creates two entries
      in the SLB that map the same effective address, and then accesses that
      effective address, the CPU will take a machine check interrupt.
      
      To handle this better, when a machine check happens inside the guest,
      we call a new function, kvmppc_realmode_machine_check(), while still in
      real mode before exiting the guest.  On POWER7, it handles the cases
      that the guest can trigger, either by flushing and reloading the SLB,
      or by flushing the TLB, and then it delivers the machine check interrupt
      directly to the guest without going back to the host.  On POWER7, the
      OPAL firmware patches the machine check interrupt vector so that it
      gets control first, and it leaves behind its analysis of the situation
      in a structure pointed to by the opal_mc_evt field of the paca.  The
      kvmppc_realmode_machine_check() function looks at this, and if OPAL
      reports that there was no error, or that it has handled the error, we
      also go straight back to the guest with a machine check.  We have to
      deliver a machine check to the guest since the machine check interrupt
      might have trashed valid values in SRR0/1.
      
      If the machine check is one we can't handle in real mode, and one that
      OPAL hasn't already handled, or on PPC970, we exit the guest and call
      the host's machine check handler.  We do this by jumping to the
      machine_check_fwnmi label, rather than absolute address 0x200, because
      we don't want to re-execute OPAL's handler on POWER7.  On PPC970, the
      two are equivalent because address 0x200 just contains a branch.
      
      Then, if the host machine check handler decides that the system can
      continue executing, kvmppc_handle_exit() delivers a machine check
      interrupt to the guest -- once again to let the guest know that SRR0/1
      have been modified.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      [agraf: fix checkpatch warnings]
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      b4072df4
    • Alexander Graf's avatar
      KVM: PPC: Support eventfd · 0e673fb6
      Alexander Graf authored
      In order to support the generic eventfd infrastructure on PPC, we need
      to call into the generic KVM in-kernel device mmio code.
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      0e673fb6
  21. 06 May, 2012 1 commit
  22. 08 Apr, 2012 2 commits
  23. 25 Sep, 2011 2 commits
    • Paul Mackerras's avatar
      KVM: PPC: Assemble book3s{,_hv}_rmhandlers.S separately · 177339d7
      Paul Mackerras authored
      This makes arch/powerpc/kvm/book3s_rmhandlers.S and
      arch/powerpc/kvm/book3s_hv_rmhandlers.S be assembled as
      separate compilation units rather than having them #included in
      arch/powerpc/kernel/exceptions-64s.S.  We no longer have any
      conditional branches between the exception prologs in
      exceptions-64s.S and the KVM handlers, so there is no need to
      keep their contents close together in the vmlinux image.
      
      In their current location, they are using up part of the limited
      space between the first-level interrupt handlers and the firmware
      NMI data area at offset 0x7000, and with some kernel configurations
      this area will overflow (e.g. allyesconfig), leading to an
      "attempt to .org backwards" error when compiling exceptions-64s.S.
      
      Moving them out requires that we add some #includes that the
      book3s_{,hv_}rmhandlers.S code was previously getting implicitly
      via exceptions-64s.S.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      177339d7
    • Alexander Graf's avatar
      KVM: PPC: Add PAPR hypercall code for PR mode · 0254f074
      Alexander Graf authored
      When running a PAPR guest, we need to handle a few hypercalls in kernel space,
      most prominently the page table invalidation (to sync the shadows).
      
      So this patch adds handling for a few PAPR hypercalls to PR mode KVM. I tried
      to share the code with HV mode, but it ended up being a lot easier this way
      around, as the two differ too much in those details.
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      
      ---
      
      v1 -> v2:
      
        - whitespace fix
      0254f074
  24. 12 Jul, 2011 5 commits
    • Paul Mackerras's avatar
      KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests · aa04b4cc
      Paul Mackerras authored
      This adds infrastructure which will be needed to allow book3s_hv KVM to
      run on older POWER processors, including PPC970, which don't support
      the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
      Offset (RMO) facility.  These processors require a physically
      contiguous, aligned area of memory for each guest.  When the guest does
      an access in real mode (MMU off), the address is compared against a
      limit value, and if it is lower, the address is ORed with an offset
      value (from the Real Mode Offset Register (RMOR)) and the result becomes
      the real address for the access.  The size of the RMA has to be one of
      a set of supported values, which usually includes 64MB, 128MB, 256MB
      and some larger powers of 2.
      
      Since we are unlikely to be able to allocate 64MB or more of physically
      contiguous memory after the kernel has been running for a while, we
      allocate a pool of RMAs at boot time using the bootmem allocator.  The
      size and number of the RMAs can be set using the kvm_rma_size=xx and
      kvm_rma_count=xx kernel command line options.
      
      KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
      of the pool of preallocated RMAs.  The capability value is 1 if the
      processor can use an RMA but doesn't require one (because it supports
      the VRMA facility), or 2 if the processor requires an RMA for each guest.
      
      This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
      pool and returns a file descriptor which can be used to map the RMA.  It
      also returns the size of the RMA in the argument structure.
      
      Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
      ioctl calls from userspace.  To cope with this, we now preallocate the
      kvm->arch.ram_pginfo array when the VM is created with a size sufficient
      for up to 64GB of guest memory.  Subsequently we will get rid of this
      array and use memory associated with each memslot instead.
      
      This moves most of the code that translates the user addresses into
      host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
      to kvmppc_core_prepare_memory_region.  Also, instead of having to look
      up the VMA for each page in order to check the page size, we now check
      that the pages we get are compound pages of 16MB.  However, if we are
      adding memory that is mapped to an RMA, we don't bother with calling
      get_user_pages_fast and instead just offset from the base pfn for the
      RMA.
      
      Typically the RMA gets added after vcpus are created, which makes it
      inconvenient to have the LPCR (logical partition control register) value
      in the vcpu->arch struct, since the LPCR controls whether the processor
      uses RMA or VRMA for the guest.  This moves the LPCR value into the
      kvm->arch struct and arranges for the MER (mediated external request)
      bit, which is the only bit that varies between vcpus, to be set in
      assembly code when going into the guest if there is a pending external
      interrupt request.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      aa04b4cc
    • David Gibson's avatar
      KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode · 54738c09
      David Gibson authored
      This improves I/O performance for guests using the PAPR
      paravirtualization interface by making the H_PUT_TCE hcall faster, by
      implementing it in real mode.  H_PUT_TCE is used for updating virtual
      IOMMU tables, and is used both for virtual I/O and for real I/O in the
      PAPR interface.
      
      Since this moves the IOMMU tables into the kernel, we define a new
      KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
      ioctl returns a file descriptor which can be used to mmap the newly
      created table.  The qemu driver models use them in the same way as
      userspace managed tables, but they can be updated directly by the
      guest with a real-mode H_PUT_TCE implementation, reducing the number
      of host/guest context switches during guest IO.
      
      There are certain circumstances where it is useful for userland qemu
      to write to the TCE table even if the kernel H_PUT_TCE path is used
      most of the time.  Specifically, allowing this will avoid awkwardness
      when we need to reset the table.  More importantly, we will in the
      future need to write the table in order to restore its state after a
      checkpoint resume or migration.
      Signed-off-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      54738c09
    • Paul Mackerras's avatar
      KVM: PPC: Handle some PAPR hcalls in the kernel · a8606e20
      Paul Mackerras authored
      This adds the infrastructure for handling PAPR hcalls in the kernel,
      either early in the guest exit path while we are still in real mode,
      or later once the MMU has been turned back on and we are in the full
      kernel context.  The advantage of handling hcalls in real mode if
      possible is that we avoid two partition switches -- and this will
      become more important when we support SMT4 guests, since a partition
      switch means we have to pull all of the threads in the core out of
      the guest.  The disadvantage is that we can only access the kernel
      linear mapping, not anything vmalloced or ioremapped, since the MMU
      is off.
      
      This also adds code to handle the following hcalls in real mode:
      
      H_ENTER       Add an HPTE to the hashed page table
      H_REMOVE      Remove an HPTE from the hashed page table
      H_READ        Read HPTEs from the hashed page table
      H_PROTECT     Change the protection bits in an HPTE
      H_BULK_REMOVE Remove up to 4 HPTEs from the hashed page table
      H_SET_DABR    Set the data address breakpoint register
      
      Plus code to handle the following hcalls in the kernel:
      
      H_CEDE        Idle the vcpu until an interrupt or H_PROD hcall arrives
      H_PROD        Wake up a ceded vcpu
      H_REGISTER_VPA Register a virtual processor area (VPA)
      
      The code that runs in real mode has to be in the base kernel, not in
      the module, if KVM is compiled as a module.  The real-mode code can
      only access the kernel linear mapping, not vmalloc or ioremap space.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      a8606e20
    • Paul Mackerras's avatar
      KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948
      Paul Mackerras authored
      This adds support for KVM running on 64-bit Book 3S processors,
      specifically POWER7, in hypervisor mode.  Using hypervisor mode means
      that the guest can use the processor's supervisor mode.  That means
      that the guest can execute privileged instructions and access privileged
      registers itself without trapping to the host.  This gives excellent
      performance, but does mean that KVM cannot emulate a processor
      architecture other than the one that the hardware implements.
      
      This code assumes that the guest is running paravirtualized using the
      PAPR (Power Architecture Platform Requirements) interface, which is the
      interface that IBM's PowerVM hypervisor uses.  That means that existing
      Linux distributions that run on IBM pSeries machines will also run
      under KVM without modification.  In order to communicate the PAPR
      hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
      to include/linux/kvm.h.
      
      Currently the choice between book3s_hv support and book3s_pr support
      (i.e. the existing code, which runs the guest in user mode) has to be
      made at kernel configuration time, so a given kernel binary can only
      do one or the other.
      
      This new book3s_hv code doesn't support MMIO emulation at present.
      Since we are running paravirtualized guests, this isn't a serious
      restriction.
      
      With the guest running in supervisor mode, most exceptions go straight
      to the guest.  We will never get data or instruction storage or segment
      interrupts, alignment interrupts, decrementer interrupts, program
      interrupts, single-step interrupts, etc., coming to the hypervisor from
      the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
      exception entry path so that we don't have to do the KVM test on entry
      to those exception handlers.
      
      We do however get hypervisor decrementer, hypervisor data storage,
      hypervisor instruction storage, and hypervisor emulation assist
      interrupts, so we have to handle those.
      
      In hypervisor mode, real-mode accesses can access all of RAM, not just
      a limited amount.  Therefore we put all the guest state in the vcpu.arch
      and use the shadow_vcpu in the PACA only for temporary scratch space.
      We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
      anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
      We don't have a shared page with the guest, but we still need a
      kvm_vcpu_arch_shared struct to store the values of various registers,
      so we include one in the vcpu_arch struct.
      
      The POWER7 processor has a restriction that all threads in a core have
      to be in the same partition.  MMU-on kernel code counts as a partition
      (partition 0), so we have to do a partition switch on every entry to and
      exit from the guest.  At present we require the host and guest to run
      in single-thread mode because of this hardware restriction.
      
      This code allocates a hashed page table for the guest and initializes
      it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
      require that the guest memory is allocated using 16MB huge pages, in
      order to simplify the low-level memory management.  This also means that
      we can get away without tracking paging activity in the host for now,
      since huge pages can't be paged or swapped.
      
      This also adds a few new exports needed by the book3s_hv code.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      de56a948
    • Paul Mackerras's avatar
      KVM: PPC: Split out code from book3s.c into book3s_pr.c · f05ed4d5
      Paul Mackerras authored
      In preparation for adding code to enable KVM to use hypervisor mode
      on 64-bit Book 3S processors, this splits book3s.c into two files,
      book3s.c and book3s_pr.c, where book3s_pr.c contains the code that is
      specific to running the guest in problem state (user mode) and book3s.c
      contains code which should apply to all Book 3S processors.
      
      In doing this, we abstract some details, namely the interrupt offset,
      updating the interrupt pending flag, and detecting if the guest is
      in a critical section.  These are all things that will be different
      when we use hypervisor mode.
      Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
      Signed-off-by: default avatarAlexander Graf <agraf@suse.de>
      f05ed4d5
  25. 13 Oct, 2010 1 commit
  26. 01 Aug, 2010 1 commit