1. 04 Apr, 2018 3 commits
    • powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix · fb4e5dbd
      Aneesh Kumar K.V authored
      With split PTL (page table lock) config, we allocate the level
      4 (leaf) page table using pte fragment framework instead of slab cache
      like other levels. This was done to enable us to have split page table
      lock at the level 4 of the page table. We use the page->ptl of the page
      backing all the level 4 pte fragments for the lock.
      
      Currently with Radix, we use only 16 fragments out of the allocated
      page. In radix each fragment is 256 bytes, which means we use only 4K
      out of the allocated 64K page, wasting 60K of the allocated memory.
      This was done earlier to keep it closer to hash.
      
      This patch updates the pte fragment count to 256, thereby using the
      full 64K page and reducing the memory usage. Performance tests show
      really low impact even with THP disabled. With THP enabled we will be
      contending even less on the level 4 ptl and hence the impact should be
      lower still.
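
The fragment arithmetic above can be checked with a minimal sketch. The 64K page size and 256-byte fragment size come from the commit message; the macro and function names are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>

/* Values quoted in the commit message: 64K page, 256-byte pte fragment. */
#define PAGE_BYTES      (64u * 1024u)
#define PTE_FRAG_BYTES  256u

/* Bytes of the backing page covered by nr_frags fragments. */
static unsigned int frag_bytes_used(unsigned int nr_frags)
{
    /* A fragment count that overflows the page makes no sense here. */
    assert(nr_frags * PTE_FRAG_BYTES <= PAGE_BYTES);
    return nr_frags * PTE_FRAG_BYTES;
}

/* Bytes of the backing page left unused for a given fragment count. */
static unsigned int frag_bytes_wasted(unsigned int nr_frags)
{
    return PAGE_BYTES - frag_bytes_used(nr_frags);
}
```

With 16 fragments, `frag_bytes_used(16)` is 4096 (4K) and `frag_bytes_wasted(16)` is 61440 (60K). With 256 fragments the whole page is used and nothing is wasted.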
      
        256 threads:
          without patch (10 runs of ./ebizzy  -m -n 1000 -s 131072 -S 100)
            median = 15678.5
            stdev = 42.1209
      
          with patch:
            median = 15354
            stdev = 194.743
      
      This is with THP disabled. With THP enabled the impact of the patch
      will be less.
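
To put the two medians in proportion, the relative change can be computed as below. This is only an illustrative helper with a hypothetical name; the inputs are the medians quoted above, and the commit message does not state their unit.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double relative_change_pct(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}
```

Calling `relative_change_pct(15678.5, 15354.0)` gives roughly -2.07, i.e. the median with the patch is about 2% lower in this THP-disabled run.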
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/keys: Update documentation and remove unnecessary check · f2ed480f
      Aneesh Kumar K.V authored
      Add more code comments. Also remove a pkey check that is unnecessary
      because it comes after we have already checked for a pkey error.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead · b9ee31e1
      Nicholas Piggin authored
      When stop is executed with EC=ESL=0, it appears to execute like a
      normal instruction (resuming from NIP when woken by interrupt). So all
      the save/restore handling can be avoided completely. In particular NV
      GPRs do not have to be saved, and MSR does not have to be switched
      back to kernel MSR.
      
      So move the test for EC=ESL=0 sleep states out to power9_idle_stop,
      and return directly to the caller after stop in that case.
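
The decision described above can be modelled in plain C as a rough sketch. This is not the kernel's implementation: the function name is hypothetical, and the EC and ESL bit positions are assumed values for the PSSCR fields, used here only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the PSSCR EC and ESL fields (illustrative). */
#define PSSCR_EC   0x00100000ull
#define PSSCR_ESL  0x00200000ull

static_assert((PSSCR_EC & PSSCR_ESL) == 0, "EC and ESL must be distinct bits");

/*
 * Hypothetical predicate: with EC=ESL=0, stop behaves like a normal
 * instruction and resumes at the next instruction, so the caller can
 * skip saving NV GPRs and switching the MSR. Any other combination
 * takes the full save/restore path.
 */
static bool stop_needs_save_restore(uint64_t psscr)
{
    return (psscr & (PSSCR_EC | PSSCR_ESL)) != 0;
}
```

A PSSCR value with both bits clear returns false here, which corresponds to returning directly to the caller after stop.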
      
      This improves performance on a ping-pong benchmark with the stop0_lite
      idle state by 2.54% for 2 threads in the same core, and 2.57% for
      different cores. With HV_POSSIBLE defined, the gain can be increased
      further by avoiding the hwsync.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 03 Apr, 2018 11 commits
  3. 01 Apr, 2018 5 commits
  4. 31 Mar, 2018 21 commits