1. 06 Dec, 2022 6 commits
    • s390/nmi: rework register validation handling · f9e5938a
      Heiko Carstens authored
      If a machine check happens in kernel mode, and the machine check
      interruption code indicates that e.g. vector register contents in the
      machine check area are not valid, the logic is to kill current.
      
      The idea behind this was that, if vector registers are not used within
      kernel context, it is sufficient to kill the current user space
      process so that it does not continue with potentially corrupt
      register contents. This does not necessarily work, however: the
      current code does not take into account that a machine check can also
      happen when a kernel thread is running (= no user space context), and
      in addition there is no way to distinguish between the "previous" and
      "next" user process task if the machine check happens during a task
      switch.
      
      Given that machine checks with invalid saved register contents in the
      machine check save area are extremely rare, simplify the logic: if
      register contents are invalid and the previous context was kernel
      mode, stop the whole machine. If the previous context was user mode,
      kill the corresponding task.
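
The simplified decision can be modelled as a small truth table. The sketch below is only an illustration of the rule described above; the names `Action` and `validate_registers` are hypothetical and do not appear in the kernel source.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"          # saved register contents are valid
    KILL_TASK = "kill task"        # invalid contents, previous context was user mode
    STOP_MACHINE = "stop machine"  # invalid contents, previous context was kernel mode


def validate_registers(registers_valid: bool, previous_was_user_mode: bool) -> Action:
    """Hypothetical model of the simplified rule, not the kernel implementation."""
    if registers_valid:
        return Action.CONTINUE
    if previous_was_user_mode:
        return Action.KILL_TASK
    return Action.STOP_MACHINE


for valid in (True, False):
    for user in (True, False):
        print(f"valid={valid!s:5} user_mode={user!s:5} -> {validate_registers(valid, user).value}")
```

Note that the model needs only two inputs; whether a kernel thread was running or a task switch was in progress no longer matters, which is the point of the simplification.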
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    • s390/nmi: use vector instruction macros instead of byte patterns · 5720aab2
      Heiko Carstens authored
      Use vector instruction macros instead of byte patterns to increase
      readability. The generated code is nearly identical:
      
      - 1e8:  e7 0f 10 00 00 36       vlm     %v0,%v15,0(%r1)
      - 1ee:  e7 0f 11 00 0c 36       vlm     %v16,%v31,256(%r1)
      + 1e8:  e7 0f 10 00 30 36       vlm     %v0,%v15,0(%r1),3
      + 1ee:  e7 0f 11 00 3c 36       vlm     %v16,%v31,256(%r1),3
      
      By using the VLM macro the alignment hint is automatically specified
      too. Even though from a performance perspective it doesn't matter at
      all for the machine check code, this shows yet another benefit when
      using the macros.
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    • s390/vx: add vx-insn.h wrapper include file · 706f2ada
      Heiko Carstens authored
      The vector instruction macros can also be used in inline assemblies. For
      this the magic
      
      asm(".include \"asm/vx-insn.h\"\n");
      
      must be added to C files, because with a regular include the preprocessor
      would eliminate the __ASSEMBLY__ guarded macros. This however comes with
      the problem that changes to asm/vx-insn.h do not cause a recompile of C
      files which have only this magic statement instead of a proper include
      statement. This can be observed with the arch/s390/kernel/fpu.c file.

      To fix this problem, and to avoid having to specify the include twice,
      add a wrapper include header file which performs all necessary steps.

      This way only the vx-insn.h header file needs to be included, and changes
      to the new vx-insn-asm.h header file cause a recompile of all dependent
      files, as they should.
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    • s390/ipl: use octal values instead of S_* macros · a70f7276
      Sven Schnelle authored
      Octal values are easier to read, and checkpatch also recommends
      using them, so replace all the S_* macros with their octal counterparts.
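
As a quick illustration of the equivalence, the standard POSIX permission bits can be compared with their octal literals. The combinations below are examples chosen for this sketch and are not necessarily the ones used in the ipl code.

```python
import stat

# Symbolic permission expressions paired with the octal literal that replaces them.
# These combinations are illustrative only.
PAIRS = {
    "S_IRUSR": (stat.S_IRUSR, 0o400),
    "S_IRUSR | S_IWUSR": (stat.S_IRUSR | stat.S_IWUSR, 0o600),
    "S_IRUSR | S_IRGRP | S_IROTH": (stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH, 0o444),
    "S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH": (
        stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH,
        0o644,
    ),
}

for name, (symbolic, octal) in PAIRS.items():
    assert symbolic == octal
    print(f"{name:40} -> {octal:04o}")
```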
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    • s390/ipl: add eckd dump support · e2d2a296
      Sven Schnelle authored
      Add support for using ECKD disks as dump devices in Linux. The new
      dump type is called 'eckd_dump'; its parameters are the same as for
      eckd ipl.
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
    • s390/ipl: add eckd support · 87fd22e0
      Sven Schnelle authored
      Add support for IPL from ECKD DASDs in Linux.
      This introduces a few sysfs files in /sys/firmware/reipl/eckd:
      
      bootprog: the boot program selector
      clear:    whether to issue a diag308 LOAD_NORMAL or LOAD_CLEAR
      device:   the device to ipl from
      br_chr:   Cylinder/Head/Record number to read the bootrecord from.
                Might be '0' or 'auto' if it should be read from the
                volume label.
      scpdata:  data to be passed to the ipl'd program.
      
      The new ipl type is called 'eckd'.
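
A minimal sketch of how these attributes might be used to prepare a re-IPL from an ECKD DASD. It only builds the list of writes and does not touch sysfs. The helper name `plan_eckd_reipl`, the device bus ID `0.0.1234`, and the use of `/sys/firmware/reipl/reipl_type` to select the type are assumptions made for this illustration; they are not taken from the commit message.

```python
from pathlib import PurePosixPath

REIPL_DIR = PurePosixPath("/sys/firmware/reipl")


def plan_eckd_reipl(device: str, br_chr: str = "auto") -> list[tuple[str, str]]:
    """Return (path, value) pairs to write; hypothetical helper, performs no I/O."""
    eckd = REIPL_DIR / "eckd"
    return [
        (str(eckd / "device"), device),   # the device to ipl from
        (str(eckd / "br_chr"), br_chr),   # 'auto': read location from the volume label
        # Assumption: the re-IPL type is selected via the reipl_type attribute.
        (str(REIPL_DIR / "reipl_type"), "eckd"),
    ]


# "0.0.1234" is a made-up device bus ID.
for path, value in plan_eckd_reipl("0.0.1234"):
    print(f"echo {value} > {path}")
```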
      Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
      Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
  2. 05 Dec, 2022 2 commits
  3. 02 Dec, 2022 2 commits
  4. 01 Dec, 2022 1 commit
  5. 29 Nov, 2022 3 commits
  6. 23 Nov, 2022 8 commits
  7. 16 Nov, 2022 4 commits
  8. 10 Nov, 2022 1 commit
    • s390: select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP · 00a34d5a
      Gerald Schaefer authored
      Enable HUGETLB_PAGE_OPTIMIZE_VMEMMAP for s390.
      
      With this, vmemmap pages used to back struct pages for compound tail
      pages of hugetlb pages are freed and remapped to compound head page
      frame as RO, see also Documentation/vm/vmemmap_dedup.rst.
      
      For 1M hugetlb pages, this results in freeing 3 of 4 vmemmap pages,
      saving 12K of memory for each 1M hugetlb page (~1.2%).
      /sys/kernel/debug/kernel_page_tables will show the impact:
      
      ---[ vmemmap Area Start ]---
      [...]
      0x0000037202d84000-0x0000037202d85000         4K PTE RW NX
      0x0000037202d85000-0x0000037202d88000        12K PTE RO NX
      
      For 2G hugetlb pages, this results in freeing 8191 of 8192 vmemmap
      pages, saving 32764K of memory for each 2G hugetlb page (~1.6%).
      /sys/kernel/debug/kernel_page_tables will show the impact:
      
      ---[ vmemmap Area Start ]---
      [...]
      0x000003720a000000-0x000003720a001000         4K PTE RW NX
      0x000003720a001000-0x000003720c000000     32764K PTE RO NX
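
The figures above can be reproduced with a short calculation. This is a sketch under two assumptions that the commit message does not state explicitly: a base page size of 4K and a struct page size of 64 bytes. The function name `vmemmap_savings` is made up for the example.

```python
PAGE_SIZE = 4096        # assumption: 4K base pages
STRUCT_PAGE_SIZE = 64   # assumption: sizeof(struct page) is 64 bytes


def vmemmap_savings(hugepage_bytes: int):
    """Return (vmemmap pages, freed pages, saved KiB, saved percent) per hugetlb page."""
    base_pages = hugepage_bytes // PAGE_SIZE
    vmemmap_pages = base_pages * STRUCT_PAGE_SIZE // PAGE_SIZE
    freed_pages = vmemmap_pages - 1   # one vmemmap page stays mapped read-write
    saved_bytes = freed_pages * PAGE_SIZE
    return vmemmap_pages, freed_pages, saved_bytes // 1024, 100 * saved_bytes / hugepage_bytes


for label, size in (("1M", 1 << 20), ("2G", 2 << 30)):
    total, freed, saved_kib, percent = vmemmap_savings(size)
    print(f"{label}: freeing {freed} of {total} vmemmap pages, "
          f"saving {saved_kib}K (~{percent:.1f}%)")
```

Under these assumptions the result is 3 of 4 pages (12K) for 1M hugetlb pages and 8191 of 8192 pages (32764K) for 2G hugetlb pages, matching the sizes of the RO ranges in the page table dumps.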
      
      The memory savings come with some costs:
      - The vmemmap mapping for compound hugetlb pages is no longer a PMD
        mapping, but is split into 4K PTE mappings, and it will not be
        coalesced back to a PMD mapping after freeing hugetlb pages from the
        pool. Apart from the theoretical performance impact, this will also
        (slightly) reduce the memory savings because of additional 2K PTE
        pagetable allocations.
      - Workloads using "on the fly" hugetlb allocations via
        "nr_overcommit_hugepages" instead of using the hugetlb pool via
        "nr_hugepages" will suffer from considerably increased fault handling
        time, see also description from commit 78f39084
        ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl").
      - Freeing hugetlb pages from the pool will require re-allocation of the
        freed struct pages, and therefore needs some memory available to the
        kernel. This might fail in memory constrained scenarios.
      - For the same reason, memory offline might fail even for ZONE_MOVABLE
        when hugetlb pages are present (but not for s390, since we do not
        support ARCH_ENABLE_HUGEPAGE_MIGRATION, and therefore cannot have
        hugetlb pages in ZONE_MOVABLE).
      - General increased complexity and overhead in kernel handling of
        compound (head) pages.
      
      Therefore, this feature is disabled by default, and has to be enabled
      explicitly either by adding "hugetlb_free_vmemmap=on" kernel parameter,
      or during run-time via "/proc/sys/vm/hugetlb_optimize_vmemmap" sysctl.
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
  9. 26 Oct, 2022 6 commits
    • s390/pai: rename structure member users to active_events · 58354c7d
      Thomas Richter authored
      Rename structure member users to active_events to make it consistent
      with PMU pai_ext. Also use the same prefix syntax for increment and
      decrement operators in both PMUs.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/pai: rework pai_crypto mapped buffer reference count · d3db4ac3
      Thomas Richter authored
      Rework the mapped buffer reference count in PMU pai_crypto
      to match the same technique as in PMU pai_ext.
      This simplifies the logic.
      Do not count the individual number of counter and sampling
      processes. Remember the type of access and the total number of
      references to the buffer.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/pai: move enum definition to header file · 4c787963
      Thomas Richter authored
      Move enum definition to header file. This is done in preparation
      for a follow-on patch where this enum will be used in another source
      file.
      Also change the enum name from paiext_mode to paievt_mode
      to indicate this enum is now used for several events.
      Make naming consistent and rename PAI_MODE_COUNTER to PAI_MODE_COUNTING.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/con3215: Fix white space errors · 55af33fd
      Thomas Richter authored
      Adjust white space according to coding guidelines.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/con3215: Drop console data printout when buffer full · 1f3307cf
      Thomas Richter authored
      Under z/VM, the 3270 terminal emulator also emulates an IBM 3215 console,
      which outputs line by line. When the screen is full, the console enters
      the MORE... state and waits for the operator to confirm the data
      on the screen by pressing a clear key. If this does not happen within the
      default time frame (currently 50 seconds), the console enters the HOLDING
      state.
      It then waits another time frame (currently 10 seconds) before the output
      continues on the next screen. When the operator presses the clear key
      during these wait times, the output continues immediately.

      This may lead to a very long boot time when the console has to print
      many messages. The system may also hang, because the console's buffer
      space is limited and the system waits for the console output to drain
      and finally to finish. This problem can only occur when a terminal
      emulator is actually connected to the 3215 console driver. If none is
      connected, z/VM simply drops console output.
      
      Remedy this rare situation by adding a kernel boot command line parameter
      con3215_drop. It can be set to 0 (do not drop) or 1 (do drop), which is
      the default. This instructs the kernel to drop console data when the
      console buffer is full. This speeds up the boot time considerably and
      also no longer hangs the system.
      
      Add a sysfs attribute file for console IBM 3215 named con_drop.
      This allows for changing the behavior after the boot, for example when
      during interactive debugging a panic/crash is expected.
      
      Here is a test of the new behavior using the following test program:
       #!/bin/bash
       # Default to 4 iterations for the case that console data is not dropped.
       declare -i cnt=4

       # Run more iterations when console data is dropped on a full buffer.
       mode=$(cat /sys/bus/ccw/drivers/3215/con_drop)
       [ "$mode" = yes ] && cnt=25

       echo "cons_drop $(cat /sys/bus/ccw/drivers/3215/con_drop)"
       echo "vmcp term more 5 2"
       vmcp term more 5 2
       echo "Run $cnt iterations of "'echo t > /proc/sysrq-trigger'

       # Each iteration dumps the task list to the console via sysrq.
       for i in $(seq $cnt)
       do
               echo "$i. command 'echo t > /proc/sysrq-trigger' at $(date +%F,%T)"
               echo t > /proc/sysrq-trigger
               sleep 1
       done
       echo "droptest done" > /dev/kmsg
       #
      
      Output with sysfs attribute con_drop set to 1:
       # ./droptest.sh
       cons_drop yes
       vmcp term more 5 2
       Run 25 iterations of echo t > /proc/sysrq-trigger
       1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:09
       2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:10
       3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:11
       4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:12
       5. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:13
       6. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:14
       7. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:15
       8. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:16
       9. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:17
       10. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:18
       11. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:19
       12. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:20
       13. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:21
       14. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:22
       15. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:23
       16. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:24
       17. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:25
       18. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:26
       19. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:27
       20. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:28
       21. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:29
       22. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:30
       23. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:31
       24. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:32
       25. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:15:33
       #
      
      There are no hangs anymore.
      
      Output with sysfs attribute con_drop set to 0 and identical
      setting for z/VM console 'term more 5 2'. The clear key was pressed
      occasionally at the x3270 console to let the output progress.
      
       # ./droptest.sh
       cons_drop no
       vmcp term more 5 2
       Run 4 iterations of echo t > /proc/sysrq-trigger
       1. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:20:58
       2. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:24:32
       3. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:28:04
       4. command 'echo t > /proc/sysrq-trigger' at 2022-09-02,10:31:37
       #
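
The difference between the two runs becomes clearer when the gaps between consecutive iterations are computed from the timestamps printed above. The helper name `gaps_in_seconds` is made up for this sketch; the timestamps are copied from the two outputs (only the first four of the con_drop=1 run).

```python
from datetime import datetime


def gaps_in_seconds(stamps):
    """Seconds elapsed between consecutive 'YYYY-MM-DD,HH:MM:SS' timestamps."""
    times = [datetime.strptime(s, "%Y-%m-%d,%H:%M:%S") for s in stamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


DROP = ["2022-09-02,10:15:09", "2022-09-02,10:15:10",
        "2022-09-02,10:15:11", "2022-09-02,10:15:12"]
NO_DROP = ["2022-09-02,10:20:58", "2022-09-02,10:24:32",
           "2022-09-02,10:28:04", "2022-09-02,10:31:37"]

print("con_drop=1:", gaps_in_seconds(DROP))
print("con_drop=0:", gaps_in_seconds(NO_DROP))
```

With dropping enabled each iteration takes the one second of the `sleep 1`; without it, each iteration takes about three and a half minutes.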
      
      Details:
      Enable function raw3215_write() to handle tab expansion and newlines
      and feed it with input not larger than the console buffer of 65536
      bytes. Function raw3215_putchar() just forwards its character for
      output to raw3215_write().

      This moves tab to blank conversion into the single function
      raw3215_write(), which also calls raw3215_make_room() to wait for
      enough free buffer space.
      
      Function handle_write() loops over all its input and segments input
      into chunks of console buffer size (should the input be larger).
      
      Rework tab expansion handling logic to avoid code duplication.
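
The two ideas in these details, segmenting input into buffer-sized chunks and converting tabs to blanks, can be sketched as follows. This is an illustration only and not the driver code: the helper names are hypothetical, and the tab stop width of 8 columns is an assumption. Only the buffer size of 65536 bytes comes from the text above.

```python
CONSOLE_BUFFER_SIZE = 65536  # console buffer size named in the commit message
TAB_STOP = 8                 # assumption: tab stops every 8 columns


def split_into_chunks(data: str, size: int = CONSOLE_BUFFER_SIZE) -> list[str]:
    """Segment input into pieces no larger than the console buffer."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def expand_tabs(text: str, column: int = 0) -> str:
    """Replace each tab with blanks up to the next tab stop; newlines reset the column."""
    out = []
    for ch in text:
        if ch == "\t":
            pad = TAB_STOP - column % TAB_STOP
            out.append(" " * pad)
            column += pad
        elif ch == "\n":
            out.append(ch)
            column = 0
        else:
            out.append(ch)
            column += 1
    return "".join(out)


print([len(chunk) for chunk in split_into_chunks("x" * 70000)])
print(repr(expand_tabs("ab\tc\n\td")))
```

Keeping the tab handling in one place, as the commit does with raw3215_write(), means the column tracking exists only once instead of being duplicated per caller.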
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Acked-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    • s390/con3215: Simplify console write operation · 655ae931
      Thomas Richter authored
      The functions con3215_write() and tty3215_write() have nearly
      identical function bodies and a slightly different function prototype.
      Create function handle_write() to handle the common function
      body and maintain the function prototypes.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
  10. 23 Oct, 2022 7 commits