1. 04 Dec, 2018 19 commits
    • powerpc/8xx: Use hardware assistance in TLB handlers · 6a8f911b
      Christophe Leroy authored
      Today, on the 8xx, the TLB handlers do a software tablewalk, doing all
      the calculation in assembly in order to match the Linux page
      table structure.
      
      The 8xx offers hardware assistance which allows significant size
      reduction of the TLB handlers, hence also reduces the time spent
      in the handlers.
      
      However, using this HW assistance implies some constraints on the
      page table structure:
      - Regardless of the main page size used (4k or 16k), the
      level 1 table (PGD) contains 1024 entries and each PGD entry covers
      a 4 Mbytes area which is managed by a level 2 table (PTE) also
      containing 1024 entries, each describing a 4k page.
      - 16k pages require 4 identical entries in the L2 table.
      - 512k page PTEs have to be spread every 128 bytes in the L2 table.
      - 8M page PTEs are at the address pointed to by the L1 entry, and each
      8M page requires 2 identical entries in the PGD.
      
      This patch modifies the TLB handlers to use HW assistance for 4K PAGES.
      
      Before this patch, the mean time spent in the TLB miss handlers is:
      - ITLB miss: 80 ticks
      - DTLB miss: 62 ticks
      After this patch, the mean time spent in the TLB miss handlers is:
      - ITLB miss: 72 ticks
      - DTLB miss: 54 ticks
      So the improvement is 10% for ITLB misses and 13% for DTLB misses.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/8xx: Temporarily disable 16k pages and hugepages · 5af543be
      Christophe Leroy authored
      In preparation for making use of hardware assistance in the TLB
      handlers, this patch temporarily disables 16k pages and hugepages. The
      reason is that when using HW assistance in 4k pages mode, the Linux
      model fits the HW model for 4k pages and 8M pages.
      
      However, for 16k pages and 512k pages some additional work is needed
      to make the Linux model fit the HW model.
      For 8M pages, support will naturally come back when we switch to
      HW assistance, without any additional handling.
      In order to keep the following patch smaller, the current special
      handling for 8M pages is removed here as well.
      
      Therefore the 4k pages mode will be implemented first, without
      support for 512k hugepages. Then the 512k hugepages will be brought
      back, and the 16k pages will be implemented in a following step.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/8xx: Move SW perf counters in first 32kb of memory · 8cfe4f52
      Christophe Leroy authored
      In order to simplify the handling of the 8xx-specific SW perf
      counters in time-critical exceptions, this patch moves the counters
      to the beginning of memory. This is possible because .text is readable
      and the counters are never modified outside of the handlers.
      
      By doing this, we avoid having to set a second register with
      the upper part of the address of the counters.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: remove unnecessary test in pgtable_cache_init() · 32bff4b9
      Christophe Leroy authored
      pgtable_cache_add() gracefully handles the case where a cache of that
      size already exists, by returning early with the following test:
      
      	if (PGT_CACHE(shift))
      		return; /* Already have a cache of this size */
      
      It is then not needed to test the existence of the cache before.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: fix a warning when a cache is common to PGD and hugepages · 1e03c7e2
      Christophe Leroy authored
      While implementing TLB miss HW assistance on the 8xx, the following
      warning was encountered:
      
      [  423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c
      [  423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671
      [  423.733075] NIP:  c0108f90 LR: c0109ad0 CTR: 00000004
      [  423.733121] REGS: c455bba0 TRAP: 0700   Not tainted  (4.18.0-rc8-00664-g2dfff9121c55)
      [  423.733147] MSR:  00021032 <ME,IR,DR,RI>  CR: 24224848  XER: 20000000
      [  423.733319]
      [  423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30
      [  423.733319] GPR08: 00000001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 00000040
      [  423.733319] GPR16: 73f00000 00210d00 00000000 00000001 c455a000 00000100 00000200 c455a000
      [  423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 00000000 00009032
      [  423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c
      [  423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c
      [  423.734283] Call Trace:
      [  423.734326] [c455bc50] [00000100] 0x100 (unreliable)
      [  423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c
      [  423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc
      [  423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c
      [  423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c
      [  423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc
      [  423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140
      [  423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8
      [  423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc
      [  423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38
      [  423.735271] Instruction dump:
      [  423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010
      [  423.735536] 71280004 41a2ff88 4840c571 4bffff80 <0fe00000> 4bfffeb8 81340010 712a0004
      [  423.735757] ---[ end trace e9b222919a470790 ]---
      
      This warning occurs when calling kmem_cache_zalloc() on a
      cache that has a constructor.
      
      In this case it happens because the PGD cache and the 512k hugepte
      cache have the same size (4k). While a cache with a constructor is
      created for the PGD, hugepages create their cache without a constructor
      and use kmem_cache_zalloc(). As both expect a cache of the same size,
      hugepages reuse the cache created for the PGD, hence the conflict.
      
      In order to avoid this conflict, this patch:
      - modifies pgtable_cache_add() so that a zeroising constructor is
      added for any cache size;
      - replaces calls to kmem_cache_zalloc() with kmem_cache_alloc().
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: replace hugetlb_cache by PGT_CACHE(PTE_T_ORDER) · 03566562
      Christophe Leroy authored
      Instead of open-coding cache handling for the special case
      of hugepage tables having a single pte_t element, this
      patch makes use of the common pgtable_cache helpers.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: enable the use of page table cache of order 0 · 129dd323
      Christophe Leroy authored
      Hugepages use a cache of order 0. Let's allow page tables
      of order 0 in the common part, in order to avoid open-coding
      in hugetlb.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Extend pte_fragment functionality to PPC32 · 32ea4c14
      Christophe Leroy authored
      In order to allow the 8xx to handle pte_fragments, this patch
      extends the use of pte_fragments to PPC32 platforms.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: add helpers to get/set mm.context->pte_frag · a74791dd
      Christophe Leroy authored
      In order to handle the pte_fragment functions with a single fragment
      without adding pte_frag to every mm_context_t, this patch creates
      two helpers which do nothing on platforms using a single fragment.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Move pgtable_t into platform headers · d09780f3
      Christophe Leroy authored
      This patch moves pgtable_t into platform headers.
      
      It gets rid of the CONFIG_PPC_64K_PAGES case for PPC64,
      as nohash/64 doesn't support CONFIG_PPC_64K_PAGES.
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: move platform specific mmu-xxx.h in platform directories · 994da93d
      Christophe Leroy authored
      The purpose of this patch is to move the platform specific
      mmu-xxx.h files into platform directories, like the pte-xxx.h files.
      
      In the meantime, this patch creates common nohash and
      nohash/32 + nohash/64 mmu.h files for future common parts.
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Avoid useless lock with single page fragments · 2a146533
      Christophe Leroy authored
      There is no point in taking the page table lock as pte_frag or
      pmd_frag are always NULL when we have only one fragment.
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: Move pte_fragment_alloc() to a common location · a95d133c
      Christophe Leroy authored
      In preparation for the next patch, which generalises the use of
      pte_fragment_alloc() to all subarches, this patch moves the related
      functions to a place that is common to all of them.
      
      The 8xx will need that for supporting 16k pages, as in that mode
      page tables still have a size of 4k.
      
      Since a pte_fragment with only one fragment is no different
      from what is done in the general case, we can easily migrate all
      subarches to pte fragments.
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/8xx: Remove PTE_ATOMIC_UPDATES · ddfc20a3
      Christophe Leroy authored
      commit 1bc54c03 ("powerpc: rework 4xx PTE access and TLB miss")
      introduced non-atomic PTE updates and started the work of removing
      PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the
      8xx with the following comment:
      /* Until my rework is finished, 8xx still needs atomic PTE updates */
      
      commit fe11dc3f ("powerpc/8xx: Update TLB asm so it behaves as
      linux mm expects") removed all PTE updates done in TLB miss handlers.
      
      Therefore, atomic PTE updates are not needed anymore for the 8xx.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/book3s32: Remove CONFIG_BOOKE dependent code · a43ccc4b
      Christophe Leroy authored
      BOOK3S/32 cannot be BOOKE, so remove the useless code.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: annotate implicit fall throughs · 8ad94021
      Stephen Rothwell authored
      There is a plan to build the kernel with -Wimplicit-fallthrough, and these
      places in the code produced warnings; but because we build arch/powerpc
      with -Werror, they became errors. Fix them up.
      
      This patch produces no change in behaviour, but should be reviewed in
      case these are actually bugs, not intentional fallthroughs.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm: remove unused function prototype · f91203e7
      Breno Leitao authored
      Commit f384796c ("powerpc/mm: Add support for handling > 512TB address
      in SLB miss") removed the function slb_miss_bad_addr(struct pt_regs *regs), but
      kept its declaration in the prototype file. This patch simply removes the
      leftover declaration.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pseries/cpuidle: Fix preempt warning · 2b038cbc
      Breno Leitao authored
      When booting a pseries kernel with PREEMPT enabled, it dumps the
      following warning:
      
         BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
         caller is pseries_processor_idle_init+0x5c/0x22c
         CPU: 13 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc3-00090-g12201a0128bc-dirty #828
         Call Trace:
         [c000000429437ab0] [c0000000009c8878] dump_stack+0xec/0x164 (unreliable)
         [c000000429437b00] [c0000000005f2f24] check_preemption_disabled+0x154/0x160
         [c000000429437b90] [c000000000cab8e8] pseries_processor_idle_init+0x5c/0x22c
         [c000000429437c10] [c000000000010ed4] do_one_initcall+0x64/0x300
         [c000000429437ce0] [c000000000c54500] kernel_init_freeable+0x3f0/0x500
         [c000000429437db0] [c0000000000112dc] kernel_init+0x2c/0x160
         [c000000429437e20] [c00000000000c1d0] ret_from_kernel_thread+0x5c/0x6c
      
      This happens because the code calls get_lppaca(), which calls
      get_paca(), which in turn checks whether preemption is disabled through
      check_preemption_disabled().
      
      Preemption should be disabled because the per-CPU variable may make no
      sense if there is a preemption (and a CPU switch) between reading the
      per-CPU data and using it.
      
      In this device driver specifically it is not a problem, because this
      code just needs access to one lppaca struct, and it does not
      matter if it is the current CPU's lppaca struct or not (i.e. when
      there is a preemption and a CPU migration).
      
      That said, the most appropriate fix seems to be avoiding
      the debug_smp_processor_id() call in get_paca(), instead of calling
      preempt_disable() before get_paca().
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/xmon: Fix invocation inside lock region · 8d4a8622
      Breno Leitao authored
      Currently xmon needs to take devtree_lock (through rtas_token()) during its
      invocation (at crash time). If there is a crash while devtree_lock is
      held, then xmon tries to take the lock but spins forever and never gets into
      the interactive debugger, as in the following case:
      
      	int *ptr = NULL;
      	raw_spin_lock_irqsave(&devtree_lock, flags);
      	*ptr = 0xdeadbeef;
      
      This patch avoids calling rtas_token(), and thus trying to take the same
      lock, at crash time. The new mechanism gets the token at
      initialization time (xmon_init()) and just consumes it at crash time.
      
      This allows xmon to be invoked regardless of whether devtree_lock
      is held or not.
      Signed-off-by: Breno Leitao <leitao@debian.org>
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 26 Nov, 2018 17 commits
  3. 25 Nov, 2018 4 commits