1. 08 Aug, 2017 6 commits
    • powerpc/mm/hash64: Make vmalloc 56T on hash · 21a0e8c1
      Michael Ellerman authored
      On 64-bit book3s, with the hash MMU, we currently define the kernel
      virtual space (vmalloc, ioremap etc.), to be 16T in size. This is a
      leftover from pre v3.7 when our user VM was also 16T.
      
      Of that 16T we split it 50/50, with half used for PCI IO and ioremap
      and the other 8T for vmalloc.
      
      We never bothered to make it any bigger because 8T of vmalloc ought to
      be enough for anybody. But it turns out that's not true: the per-cpu
      allocator wants large amounts of vmalloc space, not to make large
      allocations, but to allow a large stride between allocations, because
      we use pcpu_embed_first_chunk().
      
      With a bit of juggling we can increase the entire kernel virtual space
      to 64T. The only real complication is the check of the address in the
      SLB miss handler, see the comment in the code.
      
      Although we could continue to split virtual space 50/50 as we do now,
      no one seems to be running out of PCI IO or ioremap space. So instead
      keep that as 8T, and use the remaining 56T for vmalloc.
      
      In future we should be able to increase the kernel virtual space to
      512T. The code already supports that; it just needs testing on older
      hardware.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
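The size arithmetic in the commit message above can be checked with a small userspace sketch. The struct and helper names here are hypothetical and are not kernel symbols; only the 16T, 64T, 8T and 56T figures come from the commit message.

```c
#include <stdint.h>

/* Sizes in bytes; 1T = 2^40. */
#define TB(n) ((uint64_t)(n) << 40)

/* Hypothetical description of the kernel virtual space split. */
struct kern_virt_layout {
    uint64_t total;    /* whole kernel virtual space */
    uint64_t io;       /* PCI IO and ioremap */
    uint64_t vmalloc;  /* whatever is left over for vmalloc */
};

/* Give a fixed amount to IO and the remainder to vmalloc. */
static struct kern_virt_layout split_layout(uint64_t total, uint64_t io)
{
    struct kern_virt_layout l = { total, io, total - io };
    return l;
}
```

With these helpers, `split_layout(TB(16), TB(8))` models the old 50/50 split and `split_layout(TB(64), TB(8))` models the new one, where IO stays at 8T and vmalloc receives the rest.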
    • powerpc/mm/slb: Move comment next to the code it's referring to · b5048de0
      Michael Ellerman authored
      There is a comment in slb_allocate() referring to the load of
      paca->vmalloc_sllp, but it's several lines prior in the assembly.
      We're about to change this code, and we want to add another comment,
      so move the comment immediately prior to the instruction it's talking
      about.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/book3s64: Make KERN_IO_START a variable · 63ee9b2f
      Michael Ellerman authored
      Currently KERN_IO_START is defined as:
      
       #define KERN_IO_START  (KERN_VIRT_START + (KERN_VIRT_SIZE >> 1))
      
      Although it looks like a constant, both the components are actually
      variables, to allow us to have a different value between Radix and
      Hash with a single kernel.
      
      However that still requires both Radix and Hash to place the kernel IO
      region at the same location relative to the start and end of the
      kernel virtual region (namely 1/2 way through it), and we'd like to
      change that.
      
      So split KERN_IO_START out into its own variable, and initialise it
      for Radix and Hash. In the medium term we should be able to
      reconsolidate this, by doing a more involved rearrangement of the
      location of the regions.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
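As a rough userspace illustration of the change described above, the sketch below contrasts an IO start derived from the midpoint formula with one stored as its own field. The struct, the function names, the example base address and the chosen offset are all assumptions made for illustration; they are not the kernel's actual variables or values.

```c
#include <stdint.h>

#define TB(n) ((uint64_t)(n) << 40)

/* Hypothetical stand-in for the per-MMU layout variables. */
struct kern_region {
    uint64_t virt_start;  /* start of kernel virtual space */
    uint64_t virt_size;   /* size of kernel virtual space */
    uint64_t io_start;    /* start of the IO region, now stored separately */
};

/* Old behaviour: IO start is always halfway through the region. */
static uint64_t io_start_midpoint(const struct kern_region *r)
{
    return r->virt_start + (r->virt_size >> 1);
}

/* New behaviour: each MMU init path picks its own IO offset. */
static void region_init(struct kern_region *r, uint64_t start,
                        uint64_t size, uint64_t io_offset)
{
    r->virt_start = start;
    r->virt_size = size;
    r->io_start = start + io_offset;
}
```

Once `io_start` is a separate field, one MMU mode can keep the midpoint while another places the IO region elsewhere, which the single macro could not express.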
    • powerpc/powernv: Use darn instruction for get_random_seed() on Power9 · e66ca3db
      Matt Brown authored
      This adds powernv_get_random_darn() which utilises the darn instruction,
      introduced in ISA v3.0/POWER9.
      
      The darn instruction can potentially return an error, which is supported
      by the get_random_seed() API. In normal usage, if we see an error we just
      return that to the caller.
      
      However when detecting whether darn is functional at boot we try up to
      10 times, before deciding that darn doesn't work and failing the
      registration of get_random_seed(). That way an intermittent failure
      at boot doesn't deprive the system of randomness until the next reboot.
      Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
      [mpe: Move init into a function, tweak change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
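The boot-time retry policy described above can be sketched in plain C with the instruction replaced by a function pointer. Everything here is an assumption for illustration: the all-ones error value, the names `darn_fn`, `get_random_once`, `darn_works_at_boot` and `flaky_darn`, and the use of a callback in place of the real instruction. Only the limit of 10 attempts comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define DARN_ERR        UINT64_MAX  /* assumption: all-ones signals an error */
#define DARN_BOOT_TRIES 10          /* from the commit message */

/* Stand-in for executing the instruction once. */
typedef uint64_t (*darn_fn)(void *ctx);

/* Normal usage: a single attempt, with any error passed to the caller. */
static bool get_random_once(darn_fn darn, void *ctx, uint64_t *out)
{
    uint64_t val = darn(ctx);

    if (val == DARN_ERR)
        return false;
    *out = val;
    return true;
}

/* Boot-time probe: tolerate intermittent failures before giving up. */
static bool darn_works_at_boot(darn_fn darn, void *ctx)
{
    uint64_t val;

    for (int i = 0; i < DARN_BOOT_TRIES; i++) {
        if (get_random_once(darn, ctx, &val))
            return true;
    }
    return false;
}

/* Test double: fails as many times as *ctx says, then succeeds. */
static uint64_t flaky_darn(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return DARN_ERR;
    }
    return 0x1234;
}
```

Keeping the retry loop only in the probe means a transient failure at boot does not disable the random source, while runtime callers still see every error.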
    • powerpc/32: Fix boot failure on non 6xx platforms · 64d0a506
      Christophe Leroy authored
      Commit d300627c ("powerpc/6xx: Handle DABR match before calling
      do_page_fault") breaks non 6xx platforms.
      
        Failed to execute /init (error -14)
        Starting init: /bin/sh exists but couldn't execute it (error -14)
        Kernel panic - not syncing: No working init found.  Try passing init= ...
        CPU: 0 PID: 1 Comm: init Not tainted 4.13.0-rc3-s3k-dev-00143-g7aa62e972a56 #56
        Call Trace:
          panic+0x108/0x250 (unreliable)
          rootfs_mount+0x0/0x58
          ret_from_kernel_thread+0x5c/0x64
        Rebooting in 180 seconds..
      
      This is because in handle_page_fault(), the call to do_page_fault() has been
      mistakenly enclosed inside an #ifdef CONFIG_6xx.
      
      Fixes: d300627c ("powerpc/6xx: Handle DABR match before calling do_page_fault")
      Brown-paper-bag-to-be-worn-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Enable PCI peer-to-peer · 25529100
      Frederic Barrat authored
      P9 has support for PCI peer-to-peer, enabling a device to write in the
      MMIO space of another device directly, without interrupting the CPU.
      
      This patch adds support for it on powernv, by adding a new API to be
      called by drivers. The pnv_pci_set_p2p(...) call configures an
      'initiator', i.e. the device which will issue the MMIO operation, and a
      'target', i.e. the device on the receiving side.
      
      P9 really only supports MMIO stores for the time being but that's
      expected to change in the future, so the API allows both load and
      store operations to be defined.
      
        /* PCI p2p descriptor */
        #define OPAL_PCI_P2P_ENABLE           0x1
        #define OPAL_PCI_P2P_LOAD             0x2
        #define OPAL_PCI_P2P_STORE            0x4
      
        int pnv_pci_set_p2p(struct pci_dev *initiator, struct pci_dev *target,
                            u64 desc)
      
      It uses a new OPAL call, as the configuration magic is done on the
      PHBs by skiboot.
      Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Reviewed-by: Russell Currey <ruscur@russell.cc>
      [mpe: Drop unrelated OPAL calls, s/uint64_t/u64/, minor formatting]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
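To show how the descriptor flags above combine, here is a small userspace sketch that builds a `desc` value. The three flag values are copied from the commit message; the helper `p2p_desc()` is hypothetical and is not part of the kernel API, and the sketch does not call pnv_pci_set_p2p() itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCI p2p descriptor flags, as listed in the commit message. */
#define OPAL_PCI_P2P_ENABLE 0x1
#define OPAL_PCI_P2P_LOAD   0x2
#define OPAL_PCI_P2P_STORE  0x4

/* Hypothetical helper: build the desc bitmask from three booleans. */
static uint64_t p2p_desc(bool enable, bool load, bool store)
{
    uint64_t desc = 0;

    if (enable)
        desc |= OPAL_PCI_P2P_ENABLE;
    if (load)
        desc |= OPAL_PCI_P2P_LOAD;
    if (store)
        desc |= OPAL_PCI_P2P_STORE;
    return desc;
}
```

A driver that wants store-only peer-to-peer, which is what the message says P9 supports for now, would presumably pass something like `p2p_desc(true, false, true)` as the third argument.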
  2. 03 Aug, 2017 22 commits
  3. 02 Aug, 2017 5 commits
  4. 01 Aug, 2017 3 commits
    • powerpc/kernel: Avoid preemption check in iommu_range_alloc() · 75f327c6
      Victor Aoqui authored
      Replace the __this_cpu_read() with raw_cpu_read() in
      iommu_range_alloc(). Otherwise we get a warning about using
      __this_cpu_read() in preemptible code:
      
        BUG: using __this_cpu_read() in preemptible
        caller is iommu_range_alloc+0xa8/0x3d0
      
      Preemption doesn't need to be disabled since according to the comment
      any CPU can safely use any IOMMU pool.
      Signed-off-by: Victor Aoqui <victora@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug · 24be85a2
      Gautham R. Shenoy authored
      Currently we use the stop-api provided by the firmware to program the
      SLW engine to restore the values of hypervisor resources that get lost
      on deeper idle states (such as winkle). Since the deep states were
      only used for CPU-Hotplug on POWER8 systems, we would program the LPCR
      to have the PECE1 bit cleared, since Hotplugged CPUs shouldn't be
      spuriously woken up by the decrementer.
      
      On POWER9, some of the deep platform idle states such as stop4 can be
      used in cpuidle as well. In this case, we want the CPU in stop4 to be
      woken up by the decrementer when some timer on the CPU expires.
      
      In this patch, we program the stop-api for LPCR with PECE1
      bit cleared only when we are offlining the CPU and set it
      back once the CPU is online.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
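The bit handling described above amounts to clearing one LPCR bit when a CPU goes offline and setting it again when the CPU comes back. The sketch below shows only that arithmetic. The numeric value used for `LPCR_PECE1` is an assumption, and the helper names are hypothetical; the real code programs the value through the firmware stop-api rather than returning it.

```c
#include <stdint.h>

/* Assumed bit position for "decrementer can cause exit from power-save". */
#define LPCR_PECE1 0x2000ULL

/* Offlining: the decrementer must not wake a hotplugged CPU. */
static uint64_t lpcr_for_offline(uint64_t lpcr)
{
    return lpcr & ~LPCR_PECE1;
}

/* Onlining: restore decrementer wakeup so cpuidle states such as stop4 work. */
static uint64_t lpcr_for_online(uint64_t lpcr)
{
    return lpcr | LPCR_PECE1;
}
```

Both helpers leave every other LPCR bit untouched, so applying them in sequence round-trips a value that already had PECE1 set.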
    • powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle · e1c1cfed
      Gautham R. Shenoy authored
      The stop4 idle state on POWER9 is a deep idle state which loses
      hypervisor resources, but whose latency is low enough that it can be
      exposed via cpuidle.
      
      Until now, the deep idle states which lose hypervisor resources (e.g.
      winkle) were only exposed via CPU-Hotplug. Hence currently, on wakeup
      from such states, barring a few SPRs which need to be restored to
      their older value, the rest of the SPRs are reinitialized to their
      boot-time values.
      
      When stop4 is used in the context of cpuidle, we want these additional
      SPRs to be restored to their older value, to ensure that the context
      on the CPU coming back from idle is the same as it was before going idle.
      
      In this patch, we define a SPR save area in PACA (since we have used
      up the volatile register space in the stack) and on POWER9, we restore
      SPRN_PID, SPRN_LDBAR, SPRN_FSCR, SPRN_HFSCR, SPRN_MMCRA, SPRN_MMCR1,
      SPRN_MMCR2 to the values they had before entering stop.
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  5. 31 Jul, 2017 3 commits
    • powerpc/mm: Fix check of multiple 16G pages from device tree · 23493c12
      Rui Teng authored
      The offset of the hugepage block will not be 16G if more than one
      page is expected. Calculate the total size instead of using the
      hardcoded value.
      
      Fixes: 4792adba ("powerpc: Don't use a 16G page if beyond mem= limits")
      Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com>
      Tested-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
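The difference between checking a single 16G page and checking the whole block can be sketched as below. This is a userspace illustration under stated assumptions: the function and parameter names are hypothetical, and `mem_limit` stands in for whatever upper bound the kernel compares against.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_16G (16ULL << 30)

/* Old-style check: assumes the block is exactly one 16G page long. */
static bool block_fits_hardcoded(uint64_t phys_addr, uint64_t mem_limit)
{
    return phys_addr + SZ_16G <= mem_limit;
}

/* Fixed check: use the total size of all expected pages in the block. */
static bool block_fits_total(uint64_t phys_addr, uint64_t page_size,
                             uint64_t expected_pages, uint64_t mem_limit)
{
    return phys_addr + page_size * expected_pages <= mem_limit;
}
```

For a block of three 16G pages starting at 32G with a 64G limit, the hardcoded check passes even though the block ends at 80G, while the total-size check rejects it.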
    • powerpc/udbg: Reduce the footgun potential of EARLY_DEBUG_LPAR(_HVSI) · 9227f043
      Michael Ellerman authored
      For debugging very early boot problems we have CONFIG_PPC_EARLY_DEBUG,
      which allows configuring the kernel such that it unconditionally writes
      to a particular type of console, regardless of whether that console
      exists or not. This is useful sometimes when the kernel crashes before
      it can even determine what platform it's on, and therefore what consoles
      exist.
      
      However if you boot a kernel built this way on a different platform, it
      will generally crash because it writes to a console that doesn't exist.
      
      A particularly nasty instance of this is if you enable the hypervisor
      console early debug, and then boot that kernel on bare metal. The result
      is that the kernel calls "the hypervisor" very early in boot, but the
      kernel *is* the hypervisor, so we jump to the system call handler and
      start executing all sorts of code that isn't ready to be run. This may
      lead to a machine check or check stop depending on how lucky you are.
      
      Luckily there is an easy way to avoid this particular case. We simply
      read the MSR before installing the hooks, and if we see MSR_HV is set
      then we are the hypervisor and we definitely should not use the
      hypervisor console.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
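The guard described above is a single bit test on the MSR. The sketch below models it in userspace by passing the MSR value in as a parameter. The bit position used for `MSR_HV` is an assumption, and `may_use_hv_console()` is a hypothetical name; the real code reads the register directly before installing the console hooks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the hypervisor-state bit in the 64-bit MSR. */
#define MSR_HV (1ULL << 60)

/*
 * If MSR_HV is set we are running as the hypervisor, so there is no
 * hypervisor underneath us to call and the console must not be used.
 */
static bool may_use_hv_console(uint64_t msr)
{
    return (msr & MSR_HV) == 0;
}
```

Returning early when this check fails turns a likely machine check on bare metal into a silently skipped early console.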
    • powerpc/configs: Add a powernv_be_defconfig · 3603c52f
      Michael Ellerman authored
      Although pretty much everyone using powernv is running little endian,
      we should still test we can build for big endian. So add a
      powernv_be_defconfig, which is autogenerated by flipping the endian
      symbol in powernv_defconfig.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
  6. 25 Jul, 2017 1 commit