1. 28 Sep, 2022 2 commits
  2. 16 Sep, 2022 2 commits
    • s390/pai: Add support for PAI Extension 1 NNPA counters · c432fefe
      Thomas Richter authored
      PMU device driver perf_paiext supports Processor Activity
      Instrumentation Extension (PAIE1), available with IBM z16:
      - maps a 512 byte block to lowcore address 0x1508 called PAIE1 control
        block.
      - maps a 1024 byte block at PAIE1 control block entry with index 2.
      - uses control register bit 14 to enable PAIE1 control block lookup.
      - turns PAIE1 nnpa counting on and off by setting bit 63 in
        PAIE1 control block entry with index 2.
      - creates a sample with raw data on each context switch out when
        some mapped counters have a nonzero value at context switch.
      This device driver only supports CPU wide context, no task context
      is allowed.
      
      Support for counting:
      - one or more counters can be specified using
        perf stat -e pai_ext/xxx/
        where xxx stands for the counter event name. Multiple invocations
        of this command are possible. The counter names are listed in
        the /sys/devices/pai_ext/events directory.
      - one special counter can be specified using
        perf stat -e pai_ext/NNPA_ALL/
        which returns the sum of all incremented nnpa counters.
      - multiple counting events can run in parallel.
      
      Support for Sampling:
      - one event pai_ext/NNPA_ALL/ is reserved for sampling.
        The event collects data at context switch out and saves them in
        the ring buffer.
      - no multiple invocations are possible.
      
      The PAIE1 nnpa counter events are system wide. No task context is
      supported.  Therefore some restrictions documented in function
      paiext_busy() apply.
      
      Extend qpaci assembly instruction to query supported memory mapped nnpa
      counters. It returns the number of counters (no holes allowed in that
      range).
      
      PAIE1 nnpa counter events can not be created when a CPU hot plug
      add is processed. This means a CPU hot plug add does not get
      the necessary PAIE1 event to record PAIE1 nnpa counter increments
      on the newly added CPU. CPU hot plug remove removes the event and
      terminates the counting of PAIE1 counters immediately.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
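      The counting rules described in the PAIE1 commit message can be
      illustrated with a small user-space sketch in C. This is not the
      driver code: the helper names (ibm_bit64, nnpa_all_sum,
      nnpa_any_nonzero) and the constant are hypothetical. It assumes
      that z/Architecture numbers bits from the most significant end, so
      "bit 63" is the lowest-order bit of a 64-bit entry, that the
      counters are 8 bytes wide, and that NNPA_ALL is simply the sum of
      the mapped counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a 1024-byte block holds at most 128 counters,
 * assuming each counter is 8 bytes wide. */
#define PAIE1_SKETCH_MAX_CTRS (1024 / sizeof(uint64_t))

/* z/Architecture bit numbering: bit 0 is the most significant bit,
 * bit 63 the least significant bit of a 64-bit word. */
static inline uint64_t ibm_bit64(unsigned int bit)
{
	assert(bit < 64);
	return UINT64_C(1) << (63 - bit);
}

/* Sum over the first nr counters, as an NNPA_ALL-style total. */
static uint64_t nnpa_all_sum(const uint64_t *ctrs, size_t nr)
{
	uint64_t sum = 0;

	assert(nr <= PAIE1_SKETCH_MAX_CTRS);
	for (size_t i = 0; i < nr; i++)
		sum += ctrs[i];
	return sum;
}

/* A sample is only worth creating if some counter is nonzero. */
static bool nnpa_any_nonzero(const uint64_t *ctrs, size_t nr)
{
	assert(nr <= PAIE1_SKETCH_MAX_CTRS);
	for (size_t i = 0; i < nr; i++)
		if (ctrs[i])
			return true;
	return false;
}
```

      In this sketch nr plays the role of the counter count that the
      extended qpaci query returns; because no holes are allowed in
      that range, a plain loop over the first nr entries is enough.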
    • s390/mm: fix no previous prototype warnings in maccess.c · 9267bdd8
      Alexander Gordeev authored
      Fix -Wmissing-prototypes warnings caused by missing maccess.h include.
      Reported-by: kernel test robot <lkp@intel.com>
      Fixes: 2f0e8aae ("s390/mm: rework memcpy_real() to avoid DAT-off mode")
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  3. 14 Sep, 2022 9 commits
    • s390/mm: uninline copy_oldmem_kernel() function · fba07cd4
      Alexander Gordeev authored
      Uninline copy_oldmem_kernel() function and make it consistent
      with a very similar memcpy_real() implementation, by moving
      the code to crash_dump.c, where it actually belongs.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/mm,ptdump: add real memory copy page markers · c0ceb944
      Alexander Gordeev authored
      Add "Real Memory Copy Area Start" and "Real Memory Copy Area End"
      markers that fence the page used for real memory copying.
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/mm: rework memcpy_real() to avoid DAT-off mode · 2f0e8aae
      Alexander Gordeev authored
      Function memcpy_real() is a universal data mover that does not
      require DAT mode to be able to read from a physical address.
      Its advantage is an ability to read from any address, even
      those for which no kernel virtual mapping exists.
      
      Although memcpy_real() is interrupt-safe, there are no handlers
      that make use of this function. The compiler instrumentation
      has to be disabled and a separate no-DAT stack used to allow
      execution of the function once DAT mode is disabled.
      
      Rework memcpy_real() to overcome these shortcomings. As a result,
      data copying (which is primarily reading out the memory of a
      crashed system by a user process) is executed on a regular stack
      with enabled interrupts. Also, use of the memcpy_real_buf swap
      buffer becomes unnecessary and the swapping is eliminated.
      
      The above is achieved by using a fixed virtual address range
      that spans a single page and remaps that page repeatedly when
      memcpy_real() is called for a particular physical address.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
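      The single-page window idea from the memcpy_real() rework can be
      sketched in user-space C. This is a simulation under stated
      assumptions, not the kernel implementation: "physical memory" is
      a plain byte array, remapping the fixed window is modelled by
      recording which page it currently points at, and all names
      (copy_window, window_remap, sketch_memcpy_real, SKETCH_PAGE_SIZE)
      are hypothetical. The point it shows is that a copy spanning
      several pages has to be split at page boundaries, with the window
      remapped once per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL

/* Stand-in for the fixed one-page virtual window. */
struct copy_window {
	const uint8_t *phys_mem;	/* simulated physical memory */
	size_t phys_size;		/* size of the simulated memory */
	size_t mapped_page;		/* physical page the window maps now */
};

/* "Remap" the window to the given page-aligned physical address. */
static const uint8_t *window_remap(struct copy_window *w, size_t phys_page)
{
	assert(phys_page % SKETCH_PAGE_SIZE == 0);
	w->mapped_page = phys_page;
	return w->phys_mem + phys_page;
}

/* Copy count bytes starting at physical address src_phys, one page
 * (or partial page) at a time through the window. */
static size_t sketch_memcpy_real(struct copy_window *w, void *dst,
				 size_t src_phys, size_t count)
{
	uint8_t *out = dst;
	size_t done = 0;

	assert(src_phys <= w->phys_size && count <= w->phys_size - src_phys);
	while (done < count) {
		size_t phys = src_phys + done;
		size_t offset = phys % SKETCH_PAGE_SIZE;
		size_t chunk = SKETCH_PAGE_SIZE - offset;
		const uint8_t *win;

		if (chunk > count - done)
			chunk = count - done;
		win = window_remap(w, phys - offset);
		memcpy(out + done, win + offset, chunk);
		done += chunk;
	}
	return done;
}
```

      A real implementation would additionally have to serialize use
      of the shared window and update the page table entry; the sketch
      leaves both out.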
    • s390/dump: save IPL CPU registers once DAT is available · 14a3a262
      Alexander Gordeev authored
      Function smp_save_dump_cpus() collects CPU state of a crashed
      system for secondary CPUs and for the IPL CPU very differently.
      The Signal Processor stop-and-store-status orders are used for
      the former while Hardware System Area requests and memcpy_real()
      routine are called for the latter. In addition a system reset is
      triggered, which pins smp_save_dump_cpus() function call before
      CPU and device initialization.
      
      Move the collection of IPL CPU state to a later stage when DAT
      becomes available. That is needed to allow a follow-up rework of
      memcpy_real() routine.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/pci: convert high_memory to physical address · 2187582c
      Niklas Schnelle authored
      We use high_memory as a measure for amount of memory available in
      determining the required minimum size of our IOVA space with the
      assumption that one rarely maps more than the available memory for DMA.
      In special cases like mapping significant amounts of memory more than
      once this can still be tuned with the s390_iommu_aperture kernel
      parameter. In this use case high_memory is treated as a physical
      address. As high_memory is a virtual address, however, this means we
      need to convert it using virt_to_phys() before use.
      
      Note that at the moment physical and virtual addresses are identical so
      this mismatch does not currently cause trouble.
      Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
      Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/smp,ptdump: add absolute lowcore markers · 50787755
      Alexander Gordeev authored
      Add "Lowcore Area Start" and "Lowcore Area End" markers
      that fence pages where absolute lowcore resides.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/smp: rework absolute lowcore access · 4df29d2b
      Alexander Gordeev authored
      Temporary unsetting of the prefix page in memcpy_absolute() routine
      poses a risk of executing code path with unexpectedly disabled prefix
      page. This rework avoids the prefix page uninstalling and disabling
      of normal and machine check interrupts when accessing the absolute
      zero memory.
      
      Although memcpy_absolute() routine can access the whole memory, it is
      only used to update the absolute zero lowcore. This rework therefore
      introduces a new mechanism for the absolute zero lowcore access and
      scraps memcpy_absolute() routine for good.
      
      Instead, an area is reserved in the virtual memory that is used for
      the absolute lowcore access only. That area holds an array of 8KB
      virtual mappings - one per CPU. Whenever a CPU is brought online, the
      corresponding item is mapped to the real address of the previously
      installed prefix page.
      
      The absolute zero lowcore access works like this: a CPU calls the
      new primitive get_abs_lowcore() to obtain its 8KB mapping as a
      pointer to the struct lowcore. Virtual address references to that
      pointer get translated to the real addresses of the prefix page,
      which in turn gets swapped with the absolute zero memory addresses
      due to prefixing. Once the pointer is not needed it must be released
      with put_abs_lowcore() primitive:
      
      	struct lowcore *abs_lc;
      	unsigned long flags;
      
      	abs_lc = get_abs_lowcore(&flags);
      	abs_lc->... = ...;
      	put_abs_lowcore(abs_lc, flags);
      
      To ensure the described mechanism works, large segment- and
      region-table entries must be avoided for the 8KB mappings. Failure to do
      so results in usage of Region-Frame Absolute Address (RFAA) or
      Segment-Frame Absolute Address (SFAA) large page fields. In that
      case absolute addresses would be used to address the prefix page
      instead of the real ones and the prefixing would get bypassed.
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
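      The prefixing step that the absolute lowcore mechanism relies on
      can be sketched as a pure address calculation in C. This is an
      illustration only, assuming the usual z/Architecture rule that
      the first 8KB of real storage and the 8KB prefix area are swapped
      when a real address is turned into an absolute one; the names
      real_to_absolute and LOWCORE_SKETCH_SIZE are hypothetical and do
      not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define LOWCORE_SKETCH_SIZE 8192UL	/* size of the 8KB prefix area */

/* Apply prefixing: swap real 0..8KB with the prefix area, leave all
 * other addresses unchanged. */
static uint64_t real_to_absolute(uint64_t real, uint64_t prefix)
{
	assert(prefix % LOWCORE_SKETCH_SIZE == 0);
	if (real < LOWCORE_SKETCH_SIZE)
		return real + prefix;		/* lowcore -> prefix area */
	if (real >= prefix && real - prefix < LOWCORE_SKETCH_SIZE)
		return real - prefix;		/* prefix area -> absolute zero */
	return real;
}
```

      The second branch is the one the commit message depends on: a
      virtual mapping that resolves to the real address of the prefix
      page ends up at absolute zero. A large-page mapping would supply
      an absolute address directly, so this function would never be
      applied and the access would land in the prefix page itself.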
    • s390/smp: call smp_reinit_ipl_cpu() before scheduler is available · 6cbd7cc2
      Alexander Gordeev authored
      Currently smp_reinit_ipl_cpu() is a pre-SMP early initcall.
      That ensures no CPU is running in parallel, but is still not
      enough to assume the code is exclusive, since scheduling
      is already available.
      
      Move the function call to arch_call_rest_init() callback
      to ensure no thread could be preempted and allow lockless
      allocation of the kernel page tables. That is needed to
      allow a follow-up rework of the absolute lowcore access
      mechanism.
      Suggested-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • Merge branch 'fixes' into features · d61bb30e
      Vasily Gorbik authored
      * fixes:
        s390/smp: enforce lowcore protection on CPU restart
        s390/boot: fix absolute zero lowcore corruption on boot
        s390/hugetlb: fix prepare_hugepage_range() check for 2 GB hugepages
        s390: update defconfigs
        s390: fix nospec table alignments
        s390/mm: remove useless hugepage address alignment
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  4. 07 Sep, 2022 4 commits
  5. 30 Aug, 2022 7 commits
  6. 28 Aug, 2022 16 commits