1. 22 Feb, 2024 2 commits
  2. 21 Feb, 2024 1 commit
  3. 20 Feb, 2024 12 commits
    • s390: compile relocatable kernel without -fPIE · 778666df
      Josh Poimboeuf authored
      On s390, the kernel currently uses the '-fPIE' compiler flag for
      compiling vmlinux.  This has a few problems:
      
        - It uses dynamic symbols (.dynsym), for which the linker refuses to
          allow more than 64k sections.  This can break features which use
          '-ffunction-sections' and '-fdata-sections', including kpatch-build
          [1] and Function Granular KASLR.
      
        - It unnecessarily uses GOT relocations, adding an extra layer of
          indirection for many memory accesses.
      
      Instead of using '-fPIE', resolve all the relocations at link time and
      then manually adjust any absolute relocations (R_390_64) during boot.
      
      This is done by first telling the linker to preserve all relocations
      during the vmlinux link.  (Note this is harmless: they are later
      stripped in the vmlinux.bin link.)
      
      Then use the 'relocs' tool to find all absolute relocations (R_390_64)
      which apply to allocatable sections.  The offsets of those relocations
      are saved in a special section which is then used to adjust the
      relocations during boot.
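
      The boot-time adjustment described above can be sketched roughly as
      follows. This is a minimal user-space illustration, not the kernel's
      code: the helper name apply_abs_relocs(), the 64-bit width of the
      offset entries, and the byte-buffer "image" are all assumptions made
      for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: add the load offset to every 64-bit word that an
 * absolute (R_390_64-style) relocation points at.
 *
 * image      - start of the loaded image (a plain byte buffer here)
 * image_size - size of that buffer in bytes
 * relocs     - offsets, relative to the image start, of the words to fix
 * nr_relocs  - number of entries in relocs
 * delta      - run-time address minus link-time address
 */
static void apply_abs_relocs(unsigned char *image, size_t image_size,
			     const uint64_t *relocs, size_t nr_relocs,
			     uint64_t delta)
{
	for (size_t i = 0; i < nr_relocs; i++) {
		uint64_t val;

		/* Each target word must lie completely inside the image. */
		assert(relocs[i] <= image_size - sizeof(val));

		/* memcpy avoids unaligned access and aliasing problems. */
		memcpy(&val, image + relocs[i], sizeof(val));
		val += delta;
		memcpy(image + relocs[i], &val, sizeof(val));
	}
}
```

      Only the words listed in the offset table are touched; everything else
      in the image was already resolved at link time.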
      
      (Note: For some reason, Clang occasionally creates a GOT reference, even
      without '-fPIE'.  So Clang-compiled kernels have a GOT, which needs to
      be adjusted.)
      
      On my mostly-defconfig kernel, this reduces kernel text size by ~1.3%.
      
      [1] https://github.com/dynup/kpatch/issues/1284
      [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
      [3] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
      
      Compiler consideration:
      
      Gcc recently implemented an optimization [2] for loading symbols without
      explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
      symbols to reside on a 2-byte boundary, enabling the use of the larl
      instruction. However, kernel linker scripts may still generate unaligned
      symbols. To address this, a new -munaligned-symbols option has been
      introduced [3] in recent gcc versions. This option has to be used with
      future gcc versions.
      
      Older Clang lacks support for handling unaligned symbols generated
      by kernel linker scripts when the kernel is built without -fPIE. However,
      future versions of Clang will include support for the -munaligned-symbols
      option. When the support is unavailable, compile the kernel with -fPIE
      to maintain the existing behavior.
      
      In addition, move vmlinux.relocs to a safe location:
      
      When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
      uncompressed vmlinux.bin is positioned in the bzImage decompressor
      image at the default kernel LMA of 0x100000, enabling it to be executed
      in-place. However, the size of .vmlinux.relocs could be large enough to
      cause an overlap with the uncompressed kernel at the address 0x100000.
      To address this issue, .vmlinux.relocs is positioned after the
      .rodata.compressed in the bzImage. Nevertheless, in this configuration,
      vmlinux.relocs will overlap with the .bss section of vmlinux.bin. To
      overcome that, move vmlinux.relocs to a safe location before clearing
      .bss and handling relocs.
      
      Compile warning fix from Sumanth Korikkar:
      
      When the kernel is built with CONFIG_LD_ORPHAN_WARN and -fno-PIE, there are
      several warnings:
      
      ld: warning: orphan section `.rela.iplt' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.head.text' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.init.text' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.rodata.cst8' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      
      Orphan sections are sections that exist in an object file but don't have
      a corresponding output section in the final executable. ld raises a
      warning when it identifies such sections.
      
      Eliminate the warning by placing all .rela orphan sections in .rela.dyn
      and raise an error when the size of .rela.dyn is greater than zero, i.e.
      don't just neglect orphan sections.
      
      This is similar to the adjustment performed on x86, where the kernel is
      built with -fno-PIE:
      commit 5354e845 ("x86/build: Add asserts for unwanted sections")
      
      [sumanthk@linux.ibm.com: rebased Josh Poimboeuf patches and move
       vmlinux.relocs to safe location]
      [hca@linux.ibm.com: merged compile warning fix from Sumanth]
      Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-4-sumanthk@linux.ibm.com
      Link: https://lore.kernel.org/r/20240219132734.22881-5-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390: add relocs tool · 55dc65b4
      Josh Poimboeuf authored
      This 'relocs' tool is copied from the x86 version, ported for s390, and
      greatly simplified to remove unnecessary features.
      
      It reads vmlinux and outputs assembly to create a .vmlinux.relocs_64
      section which contains the offsets of all R_390_64 relocations which
      apply to allocatable sections.
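
      The filtering step can be sketched as below. This is an illustration
      under stated assumptions, not the tool's source: collect_abs64() and
      struct rela64 are hypothetical names, the struct mirrors the standard
      Elf64_Rela layout so that <elf.h> is not required, and the section
      flags of the relocated section are passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define R_390_64  22	/* s390 ELF relocation type: direct 64 bit */
#define SHF_ALLOC 0x2	/* ELF section flag: occupies memory at run time */

/* Same layout as Elf64_Rela, redeclared to keep the sketch self-contained. */
struct rela64 {
	uint64_t r_offset;
	uint64_t r_info;
	int64_t  r_addend;
};

/* Equivalent of ELF64_R_TYPE(): the type is the low 32 bits of r_info. */
static uint32_t rela_type(const struct rela64 *r)
{
	return (uint32_t)(r->r_info & 0xffffffffu);
}

/*
 * Hypothetical helper: store the offsets of all R_390_64 entries in out[]
 * and return how many were stored. Relocations against sections that are
 * not allocatable are skipped entirely.
 */
static size_t collect_abs64(const struct rela64 *rela, size_t n,
			    uint64_t target_sh_flags,
			    uint64_t *out, size_t out_cap)
{
	size_t found = 0;

	if (!(target_sh_flags & SHF_ALLOC))
		return 0;

	for (size_t i = 0; i < n; i++) {
		if (rela_type(&rela[i]) != R_390_64)
			continue;
		assert(found < out_cap);
		out[found++] = rela[i].r_offset;
	}
	return found;
}
```

      A real tool would additionally walk the ELF section headers to find the
      relocation sections and emit the collected offsets as assembly.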
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-3-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/vdso64: filter out munaligned-symbols flag for vdso · 8192a1b3
      Sumanth Korikkar authored
      Gcc recently implemented an optimization [1] for loading symbols without
      explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
      symbols to reside on a 2-byte boundary, enabling the use of the larl
      instruction. However, kernel linker scripts may still generate unaligned
      symbols. To address this, a new -munaligned-symbols option has been
      introduced [2] in recent gcc versions.
      
      [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
      [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
      
      However, when -munaligned-symbols is used in vdso code, it leads to the
      following compilation error:
      `.data.rel.ro.local' referenced in section `.text' of
      arch/s390/kernel/vdso64/vdso64_generic.o: defined in discarded section
      `.data.rel.ro.local' of arch/s390/kernel/vdso64/vdso64_generic.o
      
      The vdso linker script discards the .data section to keep the vdso
      lightweight. However, vdso object files compiled with
      -munaligned-symbols reference the literal pool when accessing
      _vdso_data. Hence, compile vdso code without -munaligned-symbols.  This
      means that in the future, vdso code has to deal with the alignment of
      newly introduced unaligned linker symbols.
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-2-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/boot: add 'alloc' to info.bin .vmlinux.info section flags · 9ea30fd1
      Nathan Chancellor authored
      When attempting to boot a kernel compiled with OBJCOPY=llvm-objcopy,
      there is a crash right at boot:
      
        Out of memory allocating 6d7800 bytes 8 aligned in range 0:20000000
        Reserved memory ranges:
        0000000000000000 a394c3c30d90cdaf DECOMPRESSOR
        Usable online memory ranges (info source: sclp read info [3]):
        0000000000000000 0000000020000000
        Usable online memory total: 20000000 Reserved: a394c3c30d90cdaf Free: 0
        Call Trace:
        (sp:0000000000033e90 [<0000000000012fbc>] physmem_alloc_top_down+0x5c/0x104)
         sp:0000000000033f00 [<0000000000011d56>] startup_kernel+0x3a6/0x77c
         sp:0000000000033f60 [<00000000000100f4>] startup_normal+0xd4/0xd4
      
      GNU objcopy does not have any issues. Looking at differences between the
      object files in each build reveals info.bin does not get properly
      populated with llvm-objcopy, which results in an empty .vmlinux.info
      section.
      
        $ file {gnu,llvm}-objcopy/arch/s390/boot/info.bin
        gnu-objcopy/arch/s390/boot/info.bin:  data
        llvm-objcopy/arch/s390/boot/info.bin: empty
      
        $ llvm-readelf --section-headers {gnu,llvm}-objcopy/arch/s390/boot/vmlinux | rg 'File:|\.vmlinux\.info|\.decompressor\.syms'
        File: gnu-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
        File: llvm-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000000 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034000 035000 000b00 00  WA  0   0  1
      
      Ulrich points out that llvm-objcopy only copies sections marked as alloc
      with a binary output target, whereas the .vmlinux.info section is only
      marked as load. Add 'alloc' in addition to 'load', so that both objcopy
      implementations work properly:
      
        $ file {gnu,llvm}-objcopy/arch/s390/boot/info.bin
        gnu-objcopy/arch/s390/boot/info.bin:  data
        llvm-objcopy/arch/s390/boot/info.bin: data
      
        $ llvm-readelf --section-headers {gnu,llvm}-objcopy/arch/s390/boot/vmlinux | rg 'File:|\.vmlinux\.info|\.decompressor\.syms'
        File: gnu-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
        File: llvm-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
      
      Closes: https://github.com/ClangBuiltLinux/linux/issues/1996
      Link: https://github.com/llvm/llvm-project/commit/3c02cb7492fc78fb678264cebf57ff88e478e14f
      Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/r/20240216-s390-fix-boot-with-llvm-objcopy-v1-1-0ac623daf42b@kernel.org
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: fix three typos in comments · d0c8fd21
      Gerd Bayer authored
      Found and fixed these while working on synchronizing the state
      handling of zpci_dev's.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: remove hotplug slot when releasing the device · 6ee600bf
      Gerd Bayer authored
      Centralize the removal so all paths are covered and the hotplug slot
      will remain active until the device is really destroyed.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: introduce lock to synchronize state of zpci_dev's · bcb5d6c7
      Gerd Bayer authored
      There are a number of tasks that need the state of a zpci device
      to be stable. Other tasks need to be synchronized as they change the state.
      
      State changes could be generated by the system as availability or error
      events, or be requested by the user through manipulations in sysfs.
      Some other actions accessible through sysfs - like device resets - need the
      state to be stable.
      
      Unsynchronized state handling could lead to unusable devices. This has
      been observed in cases of concurrent state changes through systemd udev
      rules and DPM boot control. Some breakage can be provoked by artificial
      tests, e.g. through repetitively injecting "recover" on a PCI function
      through sysfs while running a "hotplug remove/add" in a loop through a
      PCI slot's "power" attribute in sysfs. After a few iterations this could
      result in a kernel oops.
      
      So introduce a new mutex "state_lock" to guard the state property of the
      struct zpci_dev. Acquire this lock in all tasks that modify the state:
      
      - hotplug add and remove, through the PCI hotplug slot entry,
      - availability events, as reported by the platform,
      - error events, as reported by the platform,
      - during device resets, explicit through sysfs requests or
        implicit through the common PCI layer.
      
      Break out an inner _do_recover() routine from recover_store() to
      separate the necessary synchronizations from the actual manipulations of
      the zpci_dev required for the reset.
      
      With the following changes I was able to run the inject loops for hours
      without hitting an error.
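
      The split between an outer, locking routine and an inner, lock-free
      worker can be sketched in user space as follows. This is only an
      illustration of the pattern: struct zdev, the two-value state enum and
      the function bodies are simplified stand-ins, and a pthread mutex takes
      the place of the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Simplified, hypothetical device states. */
enum zdev_state { ZDEV_STANDBY, ZDEV_CONFIGURED };

struct zdev {
	pthread_mutex_t state_lock;	/* guards 'state' */
	enum zdev_state state;
	unsigned int recoveries;	/* stand-in for the real reset work */
};

/* Inner routine: manipulates the device; caller must hold state_lock. */
static int do_recover(struct zdev *zdev)
{
	if (zdev->state != ZDEV_CONFIGURED)
		return -1;		/* nothing to recover */
	zdev->recoveries++;
	return 0;
}

/* Outer routine: only takes care of synchronization. */
static int recover_store(struct zdev *zdev)
{
	int ret;

	assert(zdev != NULL);
	pthread_mutex_lock(&zdev->state_lock);
	ret = do_recover(zdev);
	pthread_mutex_unlock(&zdev->state_lock);
	return ret;
}

/* Every state change takes the same lock, so recovery sees a stable state. */
static void set_state(struct zdev *zdev, enum zdev_state new_state)
{
	pthread_mutex_lock(&zdev->state_lock);
	zdev->state = new_state;
	pthread_mutex_unlock(&zdev->state_lock);
}
```

      Keeping the locking in the outer function lets other callers that
      already hold the lock reuse the inner routine directly.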
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: rename lock member in struct zpci_dev · 0d48566d
      Gerd Bayer authored
      Since this guards only the Function Measurement Block, rename from
      generic lock to fmb_lock in preparation to introduce another lock
      that guards the state member.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pai: adjust whitespace indentation · 29f6fe17
      Thomas Richter authored
      Adjust whitespace indentation. No functional change.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pai: simplify event start function for perf stat · 82cb9b61
      Thomas Richter authored
      When an event is started, read the current value of the
      PAI counter. This value is saved in event::hw.prev_count.
      When an event is stopped, this value is subtracted from the current
      value read out at event stop time. The difference is the delta
      of this counter.
      
      Simplify the logic and read the event value every time the event is
      started. This scheme is identical to other device drivers.
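
      The start/stop accounting described above can be sketched as follows.
      This is a minimal illustration and not the driver's code: struct
      pai_event and the functions pai_start() and pai_stop() are hypothetical,
      and the hardware counter value is simply passed in as an argument.

```c
#include <stdint.h>

/* Hypothetical, simplified event bookkeeping. */
struct pai_event {
	uint64_t prev_count;	/* counter value read at event start */
	uint64_t count;		/* accumulated delta reported to the user */
};

/* On every start, remember the current value of the counter. */
static void pai_start(struct pai_event *ev, uint64_t hw_counter)
{
	ev->prev_count = hw_counter;
}

/*
 * On stop, add the difference between the current value and the value
 * saved at start. Unsigned arithmetic keeps the delta correct even if
 * the counter wrapped around in between.
 */
static void pai_stop(struct pai_event *ev, uint64_t hw_counter)
{
	ev->count += hw_counter - ev->prev_count;
}
```

      Because the start value is re-read on every start, counter increments
      that happen while the event is stopped are never attributed to it.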
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pai: save PAI counter value page in event structure · fe861b0c
      Thomas Richter authored
      When the PAI events ALL_CRYPTO or ALL_NNPA are created
      for system wide sampling, all PAI counters are monitored.
      On each process schedule out, the values of all PAI counters
      are investigated. Non-zero values are saved in the event's ring
      buffer as raw data. This scheme expects the start value of each counter
      to be reset to zero after each read operation performed by the PAI
      PMU device driver. This allows for only one active event at any one
      time as it relies on the start value of counters to be reset to zero.
      
      Create a save area for each installed PAI XXXX_ALL event and save all
      PAI counter values in this save area. Instead of clearing the
      PAI counter lowcore area to zero after each read operation,
      copy them from the lowcore area to the event's save area at process
      schedule out time.
      The delta of each PAI counter is calculated by subtracting the
      old counter's value stored in the event's save area from the current
      value stored in the lowcore area.
      
      With this scheme, multiple events of the PAI counters XXXX_ALL
      can be handled at the same time. This will be addressed in a
      follow-on patch.
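
      The save-area scheme can be sketched as follows. This is an
      illustration under assumptions rather than the driver's code:
      pai_collect() is a hypothetical name and plain arrays stand in for the
      lowcore counter area and the per-event save area.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, called at process schedule-out time for one event.
 *
 * lowcore - current counter values (never cleared in this scheme)
 * save    - this event's private copy from the previous call
 * delta   - receives current minus saved value for each counter
 * n       - number of counters
 *
 * Returns the number of counters whose value changed.
 */
static size_t pai_collect(const uint64_t *lowcore, uint64_t *save,
			  uint64_t *delta, size_t n)
{
	size_t changed = 0;

	for (size_t i = 0; i < n; i++) {
		delta[i] = lowcore[i] - save[i];
		if (delta[i])
			changed++;
		/* Update only the private save area, not the shared counters. */
		save[i] = lowcore[i];
	}
	return changed;
}
```

      Since the shared counters are left untouched, a second event with its
      own save area can compute its deltas independently of the first.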
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/ap: explicitly include ultravisor header · d065bdb4
      Holger Dengler authored
      The ap_bus is using inline functions of the ultravisor (uv) in-kernel
      API. The related header file is implicitly included via several other
      headers. Replace this by an explicit include of the ultravisor header
      in the ap_bus file.
      Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
      Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  4. 16 Feb, 2024 25 commits