1. 07 Mar, 2024 4 commits
    • s390/ap: rearm APQNs bindings complete completion · 778412ab
      Harald Freudenberger authored
      The APQN bindings complete completion was used to reflect
      that 1st the AP bus initial scan is done and 2nd all the
      detected APQNs have been bound to a device driver.
      This was a single-shot action. However, because the AP bus
      supports hot-plug, new APQNs may appear as new AP queue and
      card devices which need to be bound to appropriate device
      drivers. So the condition that all existing AP queue devices
      are bound to device drivers may temporarily no longer hold.
      
      This patch now checks during the AP bus scan for newly appearing
      AP devices and re-initializes the internal completion variable.
      So the AP bus function ap_wait_apqn_bindings_complete() may now
      block on this completion even after the initial scan is through,
      when new APQNs appear which need to get bound.
      
      This patch also moves the check for binding complete invocation
      from the probe function to the end of the AP bus scan function.
      This change also covers some weird scenarios where during a
      card hotplug the binding of the card device was sufficient for
      binding complete but the queue devices were still in the
      process of being discovered.
      
      As of now this change has no impact on existing code. The
      behavior change in the now later bindings complete should not
      impact any code (and has been tested so far). The only
      exploiter is the zcrypt function zcrypt_wait_api_operational(),
      which calls ap_wait_apqn_bindings_complete() only initially.
      
      However, this new behavior of the AP bus wait for APQNs bindings
      complete function will be used in a later patch exploiting
      this for the zcrypt API layer.
      Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
      Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
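The re-arming behavior described in this commit can be pictured with a small single-threaded model. This is only a sketch under assumptions: `bindings_completion`, `ap_dev_model`, `scan_done()` and `would_block()` are hypothetical names, and a plain flag stands in for the kernel's completion primitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's completion variable. */
struct bindings_completion {
    bool done;
};

/* Hypothetical model of one AP queue or card device. */
struct ap_dev_model {
    bool bound;   /* true once a device driver is bound */
};

/* Returns true if every known device is bound (also true for no devices). */
static bool all_bound(const struct ap_dev_model *devs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!devs[i].bound)
            return false;
    return true;
}

/*
 * Called at the end of each bus scan: the completion is re-armed while
 * unbound devices exist and completed once everything is bound.
 */
static void scan_done(struct bindings_completion *c,
                      const struct ap_dev_model *devs, size_t n)
{
    assert(c != NULL);
    c->done = all_bound(devs, n);
}

/* A waiter would block for as long as this returns true. */
static bool would_block(const struct bindings_completion *c)
{
    return !c->done;
}
```

In this model a hot-plugged, not yet bound device makes `would_block()` return true again after a scan, which is the property the commit adds on top of the former single-shot behavior.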
    • s390/configs: increase number of LOCKDEP_BITS · bbe37e3e
      Heiko Carstens authored
      Set LOCKDEP_BITS to 16 and LOCKDEP_CHAINS_BITS to 17, since test
      systems frequently run out of lockdep entries and lockdep chains.
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/vfio-ap: handle hardware checkstop state on queue reset operation · a681226c
      Jason J. Herne authored
      Update vfio_ap_mdev_reset_queue() to handle an unexpected checkstop (hardware error) the
      same as the deconfigured case. This prevents unexpected and unhelpful warnings in the
      event of a hardware error.
      
      We also stop lying about a queue's reset response code. This was originally done so we
      could force vfio_ap_mdev_filter_matrix to pass a deconfigured device through to the guest
      for the hotplug scenario. vfio_ap_mdev_filter_matrix is instead modified to allow
      passthrough for all queues with reset state normal, deconfigured, or checkstopped. In the
      checkstopped case we choose to pass the device through and let the error state be
      reflected at the guest level.
      Signed-off-by: "Jason J. Herne" <jjherne@linux.ibm.com>
      Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240215153144.14747-1-jjherne@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
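The filter rule described in this commit (pass a queue through when its reset state is normal, deconfigured, or checkstopped) might look roughly like the sketch below. The enum and function names are hypothetical and do not reflect the actual vfio-ap response codes or helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset states; illustrative names, not the kernel's codes. */
enum reset_state {
    RESET_NORMAL,
    RESET_DECONFIGURED,
    RESET_CHECKSTOPPED,
    RESET_BUSY,
    RESET_OTHER_ERROR,
};

/* A queue is passed through for the three states named in the commit. */
static bool may_pass_through(enum reset_state rc)
{
    switch (rc) {
    case RESET_NORMAL:
    case RESET_DECONFIGURED:
    case RESET_CHECKSTOPPED:
        return true;
    default:
        return false;
    }
}

/* Count how many queues of a (hypothetical) matrix would be passed through. */
static size_t count_passthrough(const enum reset_state *states, size_t n)
{
    size_t passed = 0;

    assert(states != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (may_pass_through(states[i]))
            passed++;
    return passed;
}
```

In the checkstopped case the queue is still handed to the guest, so the error becomes visible at the guest level instead of being hidden by the filter.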
    • s390/pai: change sampling event assignment for PMU device driver · e22033fd
      Thomas Richter authored
      Currently only one PAI sampling event can be created and active
      at any one time. The PMU device drivers store a pointer to this
      event in their data structures even when the event is created
      for counting and the PMU device driver reference to this counting
      event is never needed.
      Change this and assign the pointer to the PMU device driver
      only when a sampling event is created.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  2. 26 Feb, 2024 5 commits
  3. 25 Feb, 2024 1 commit
  4. 22 Feb, 2024 3 commits
  5. 21 Feb, 2024 1 commit
  6. 20 Feb, 2024 12 commits
    • s390: compile relocatable kernel without -fPIE · 778666df
      Josh Poimboeuf authored
      On s390, the kernel currently uses the '-fPIE' compiler flag for compiling
      vmlinux.  This has a few problems:
      
        - It uses dynamic symbols (.dynsym), for which the linker refuses to
          allow more than 64k sections.  This can break features which use
          '-ffunction-sections' and '-fdata-sections', including kpatch-build
          [1] and Function Granular KASLR.
      
        - It unnecessarily uses GOT relocations, adding an extra layer of
          indirection for many memory accesses.
      
      Instead of using '-fPIE', resolve all the relocations at link time and
      then manually adjust any absolute relocations (R_390_64) during boot.
      
      This is done by first telling the linker to preserve all relocations
      during the vmlinux link.  (Note this is harmless: they are later
      stripped in the vmlinux.bin link.)
      
      Then use the 'relocs' tool to find all absolute relocations (R_390_64)
      which apply to allocatable sections.  The offsets of those relocations
      are saved in a special section which is then used to adjust the
      relocations during boot.
      
      (Note: For some reason, Clang occasionally creates a GOT reference, even
      without '-fPIE'.  So Clang-compiled kernels have a GOT, which needs to
      be adjusted.)
      
      On my mostly-defconfig kernel, this reduces kernel text size by ~1.3%.
      
      [1] https://github.com/dynup/kpatch/issues/1284
      [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
      [3] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
      
      Compiler consideration:
      
      Gcc recently implemented an optimization [2] for loading symbols without
      explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
      symbols to reside on a 2-byte boundary, enabling the use of the larl
      instruction. However, kernel linker scripts may still generate unaligned
      symbols. To address this, a new -munaligned-symbols option has been
      introduced [3] in recent gcc versions. This option has to be used with
      future gcc versions.
      
      Older Clang lacks support for handling unaligned symbols generated
      by kernel linker scripts when the kernel is built without -fPIE. However,
      future versions of Clang will include support for the -munaligned-symbols
      option. When the support is unavailable, compile the kernel with -fPIE
      to maintain the existing behavior.
      
      In addition:
      move vmlinux.relocs to a safe location
      
      When the kernel is built with CONFIG_KERNEL_UNCOMPRESSED, the entire
      uncompressed vmlinux.bin is positioned in the bzImage decompressor
      image at the default kernel LMA of 0x100000, enabling it to be executed
      in-place. However, the size of .vmlinux.relocs could be large enough to
      cause an overlap with the uncompressed kernel at the address 0x100000.
      To address this issue, .vmlinux.relocs is positioned after the
      .rodata.compressed in the bzImage. Nevertheless, in this configuration,
      vmlinux.relocs will overlap with the .bss section of vmlinux.bin. To
      overcome that, move vmlinux.relocs to a safe location before clearing
      .bss and handling relocs.
      
      Compile warning fix from Sumanth Korikkar:
      
      When the kernel is built with CONFIG_LD_ORPHAN_WARN and -fno-PIE, there are
      several warnings:
      
      ld: warning: orphan section `.rela.iplt' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.head.text' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.init.text' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      ld: warning: orphan section `.rela.rodata.cst8' from
      `arch/s390/kernel/head64.o' being placed in section `.rela.dyn'
      
      Orphan sections are sections that exist in an object file but don't have
      a corresponding output section in the final executable. ld raises a
      warning when it identifies such sections.
      
      Eliminate the warnings by placing all .rela orphan sections in .rela.dyn
      and raising an error when the size of .rela.dyn is greater than zero,
      i.e. don't just neglect orphan sections.
      
      This is similar to the adjustment performed on x86, where the kernel is
      built with -fno-PIE.
      commit 5354e845 ("x86/build: Add asserts for unwanted sections")
      
      [sumanthk@linux.ibm.com: rebased Josh Poimboeuf patches and move
       vmlinux.relocs to safe location]
      [hca@linux.ibm.com: merged compile warning fix from Sumanth]
      Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-4-sumanthk@linux.ibm.com
      Link: https://lore.kernel.org/r/20240219132734.22881-5-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
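The boot-time adjustment of absolute relocations described in this commit can be sketched as a loop over the saved offsets. This is a simplified illustration, not the kernel's boot code: `apply_abs_relocs()` and its parameters are hypothetical, and it assumes each recorded offset points at a 64-bit value that only needs the load delta added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * image:      start of the loaded kernel image (a plain buffer in this sketch)
 * image_size: size of that buffer in bytes
 * relocs:     offsets, relative to image, of each 64-bit absolute value
 * delta:      assumed difference between run-time and link-time base address
 */
static void apply_abs_relocs(unsigned char *image, size_t image_size,
                             const uint64_t *relocs, size_t nr_relocs,
                             uint64_t delta)
{
    assert(image_size >= sizeof(uint64_t));
    for (size_t i = 0; i < nr_relocs; i++) {
        uint64_t val;

        /* Every relocated slot must lie completely inside the image. */
        assert(relocs[i] <= image_size - sizeof(val));
        memcpy(&val, image + relocs[i], sizeof(val));   /* may be unaligned */
        val += delta;
        memcpy(image + relocs[i], &val, sizeof(val));
    }
}
```

Because every relocation is resolved at link time and only shifted afterwards, no GOT indirection is needed for ordinary memory accesses, which matches the motivation given in the commit message.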
    • s390: add relocs tool · 55dc65b4
      Josh Poimboeuf authored
      This 'relocs' tool is copied from the x86 version, ported for s390, and
      greatly simplified to remove unnecessary features.
      
      It reads vmlinux and outputs assembly to create a .vmlinux.relocs_64
      section which contains the offsets of all R_390_64 relocations which
      apply to allocatable sections.
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-3-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
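The selection step of such a tool (keep only R_390_64 relocations that apply to allocatable sections) could be sketched as follows. The struct and function are hypothetical simplifications; the real tool parses ELF data and emits assembly, which is omitted here. The numeric value used for R_390_64 is an assumption stated in the code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_R_390_64  22u    /* assumed numeric value of R_390_64 */
#define SKETCH_SHF_ALLOC 0x2u   /* ELF "occupies memory at run time" flag */

/* Hypothetical, flattened view of one relocation entry. */
struct sketch_reloc {
    uint64_t offset;         /* where the relocation applies */
    uint32_t type;           /* relocation type */
    uint64_t section_flags;  /* flags of the section it applies to */
};

/* Copy the offsets of matching relocations to out[]; return how many matched. */
static size_t collect_abs64_offsets(const struct sketch_reloc *relocs, size_t n,
                                    uint64_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        if (relocs[i].type != SKETCH_R_390_64)
            continue;               /* not an absolute 64-bit relocation */
        if (!(relocs[i].section_flags & SKETCH_SHF_ALLOC))
            continue;               /* section is not loaded at run time */
        assert(found < out_cap);
        out[found++] = relocs[i].offset;
    }
    return found;
}
```

The collected offsets correspond to what the commit describes as the content of the .vmlinux.relocs_64 section.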
    • s390/vdso64: filter out munaligned-symbols flag for vdso · 8192a1b3
      Sumanth Korikkar authored
      Gcc recently implemented an optimization [1] for loading symbols without
      explicit alignment, aligning with the IBM Z ELF ABI. This ABI mandates
      symbols to reside on a 2-byte boundary, enabling the use of the larl
      instruction. However, kernel linker scripts may still generate unaligned
      symbols. To address this, a new -munaligned-symbols option has been
      introduced [2] in recent gcc versions.
      
      [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622872.html
      [2] https://gcc.gnu.org/pipermail/gcc-patches/2023-August/625986.html
      
      However, when -munaligned-symbols is used in vdso code, it leads to the
      following compilation error:
      `.data.rel.ro.local' referenced in section `.text' of
      arch/s390/kernel/vdso64/vdso64_generic.o: defined in discarded section
      `.data.rel.ro.local' of arch/s390/kernel/vdso64/vdso64_generic.o
      
      The vdso linker script discards the .data section to make it lightweight.
      However, with -munaligned-symbols the vdso object files reference the
      literal pool when accessing _vdso_data. Hence, compile vdso code without
      -munaligned-symbols. This means that in the future, vdso code has to deal
      with the alignment of newly introduced unaligned linker symbols.
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240219132734.22881-2-sumanthk@linux.ibm.com
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/boot: add 'alloc' to info.bin .vmlinux.info section flags · 9ea30fd1
      Nathan Chancellor authored
      When attempting to boot a kernel compiled with OBJCOPY=llvm-objcopy,
      there is a crash right at boot:
      
        Out of memory allocating 6d7800 bytes 8 aligned in range 0:20000000
        Reserved memory ranges:
        0000000000000000 a394c3c30d90cdaf DECOMPRESSOR
        Usable online memory ranges (info source: sclp read info [3]):
        0000000000000000 0000000020000000
        Usable online memory total: 20000000 Reserved: a394c3c30d90cdaf Free: 0
        Call Trace:
        (sp:0000000000033e90 [<0000000000012fbc>] physmem_alloc_top_down+0x5c/0x104)
         sp:0000000000033f00 [<0000000000011d56>] startup_kernel+0x3a6/0x77c
         sp:0000000000033f60 [<00000000000100f4>] startup_normal+0xd4/0xd4
      
      GNU objcopy does not have any issues. Looking at differences between the
      object files in each build reveals info.bin does not get properly
      populated with llvm-objcopy, which results in an empty .vmlinux.info
      section.
      
        $ file {gnu,llvm}-objcopy/arch/s390/boot/info.bin
        gnu-objcopy/arch/s390/boot/info.bin:  data
        llvm-objcopy/arch/s390/boot/info.bin: empty
      
        $ llvm-readelf --section-headers {gnu,llvm}-objcopy/arch/s390/boot/vmlinux | rg 'File:|\.vmlinux\.info|\.decompressor\.syms'
        File: gnu-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
        File: llvm-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000000 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034000 035000 000b00 00  WA  0   0  1
      
      Ulrich points out that llvm-objcopy only copies sections marked as alloc
      with a binary output target, whereas the .vmlinux.info section is only
      marked as load. Add 'alloc' in addition to 'load', so that both objcopy
      implementations work properly:
      
        $ file {gnu,llvm}-objcopy/arch/s390/boot/info.bin
        gnu-objcopy/arch/s390/boot/info.bin:  data
        llvm-objcopy/arch/s390/boot/info.bin: data
      
        $ llvm-readelf --section-headers {gnu,llvm}-objcopy/arch/s390/boot/vmlinux | rg 'File:|\.vmlinux\.info|\.decompressor\.syms'
        File: gnu-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
        File: llvm-objcopy/arch/s390/boot/vmlinux
          [12] .vmlinux.info     PROGBITS        0000000000034000 035000 000078 00  WA  0   0  1
          [13] .decompressor.syms PROGBITS       0000000000034078 035078 000b00 00  WA  0   0  1
      
      Closes: https://github.com/ClangBuiltLinux/linux/issues/1996
      Link: https://github.com/llvm/llvm-project/commit/3c02cb7492fc78fb678264cebf57ff88e478e14f
      Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Link: https://lore.kernel.org/r/20240216-s390-fix-boot-with-llvm-objcopy-v1-1-0ac623daf42b@kernel.org
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: fix three typos in comments · d0c8fd21
      Gerd Bayer authored
      Found and fixed these while working on synchronizing the state
      handling of zpci_dev's.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: remove hotplug slot when releasing the device · 6ee600bf
      Gerd Bayer authored
      Centralize the removal so all paths are covered and the hotplug slot
      will remain active until the device is really destroyed.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pci: introduce lock to synchronize state of zpci_dev's · bcb5d6c7
      Gerd Bayer authored
      There are a number of tasks that need the state of a zpci device
      to be stable. Other tasks need to be synchronized as they change the state.
      
      State changes could be generated by the system as availability or error
      events, or be requested by the user through manipulations in sysfs.
      Some other actions accessible through sysfs - like device resets - need the
      state to be stable.
      
      Unsynchronized state handling could lead to unusable devices. This has
      been observed in cases of concurrent state changes through systemd udev
      rules and DPM boot control. Some breakage can be provoked by artificial
      tests, e.g. through repetitively injecting "recover" on a PCI function
      through sysfs while running a "hotplug remove/add" in a loop through a
      PCI slot's "power" attribute in sysfs. After a few iterations this could
      result in a kernel oops.
      
      So introduce a new mutex "state_lock" to guard the state property of the
      struct zpci_dev. Acquire this lock in all tasks that modify the state:
      
      - hotplug add and remove, through the PCI hotplug slot entry,
      - availability events, as reported by the platform,
      - error events, as reported by the platform,
      - during device resets, explicitly through sysfs requests or
        implicitly through the common PCI layer.
      
      Break an inner _do_recover() routine out of recover_store() to
      separate the necessary synchronizations from the actual manipulations of
      the zpci_dev required for the reset.
      
      With the following changes I was able to run the inject loops for hours
      without hitting an error.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
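The split between an outer entry point that handles synchronization and an inner routine that does the actual work can be sketched as below. This is a user-space illustration only: `fn_dev`, `fn_state`, `recover()` and `do_recover_locked()` are hypothetical names, and a C11 `atomic_flag` spin lock stands in for the kernel mutex "state_lock".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical device states; not the real zpci_dev state values. */
enum fn_state { FN_STANDBY, FN_CONFIGURED, FN_ERROR };

struct fn_dev {
    atomic_flag state_lock;   /* stand-in for a kernel mutex */
    enum fn_state state;      /* guarded by state_lock */
    unsigned int resets;      /* number of recoveries performed */
};

static void state_lock(struct fn_dev *d)
{
    while (atomic_flag_test_and_set_explicit(&d->state_lock,
                                             memory_order_acquire))
        ;   /* spin; a real mutex would sleep instead */
}

static void state_unlock(struct fn_dev *d)
{
    atomic_flag_clear_explicit(&d->state_lock, memory_order_release);
}

/* Inner routine: manipulates the device, caller must hold state_lock. */
static int do_recover_locked(struct fn_dev *d)
{
    if (d->state == FN_STANDBY)
        return -1;          /* nothing to recover in this sketch */
    d->resets++;
    d->state = FN_CONFIGURED;
    return 0;
}

/* Outer entry point: only takes care of the synchronization. */
static int recover(struct fn_dev *d)
{
    int rc;

    assert(d != NULL);
    state_lock(d);
    rc = do_recover_locked(d);
    state_unlock(d);
    return rc;
}
```

Any other path that changes `state` (hotplug, availability or error events) would take the same lock, so a recovery never observes a half-finished state change.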
    • s390/pci: rename lock member in struct zpci_dev · 0d48566d
      Gerd Bayer authored
      Since this guards only the Function Measurement Block, rename from
      generic lock to fmb_lock in preparation to introduce another lock
      that guards the state member.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pai: adjust whitespace indentation · 29f6fe17
      Thomas Richter authored
      Adjust whitespace indentation. No functional change.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    • s390/pai: simplify event start function for perf stat · 82cb9b61
      Thomas Richter authored
      When an event is started, read the current value of the
      PAI counter. This value is saved in event::hw.prev_count.
      When an event is stopped, this value is subtracted from the current
      value read out at event stop time. The difference is the delta
      of this counter.
      
      Simplify the logic and read the event value every time the event is
      started. This scheme is identical to other device drivers.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
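The start/stop scheme described in this commit amounts to a simple delta computation. The sketch below uses hypothetical names (`sketch_event`, `sketch_start()`, `sketch_stop()`); accumulating the delta into a running total is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bookkeeping, loosely modeled on hw.prev_count. */
struct sketch_event {
    uint64_t prev_count;  /* counter value read when the event was started */
    uint64_t count;       /* accumulated total (assumed for this sketch) */
};

/* Read the current counter value every time the event is started. */
static void sketch_start(struct sketch_event *ev, uint64_t hw_counter)
{
    assert(ev != NULL);
    ev->prev_count = hw_counter;
}

/* On stop, the delta is the current value minus the saved start value. */
static void sketch_stop(struct sketch_event *ev, uint64_t hw_counter)
{
    assert(ev != NULL);
    ev->count += hw_counter - ev->prev_count;
}
```

Re-reading the counter on every start means that whatever the counter did while the event was stopped never leaks into the reported delta.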
    • s390/pai: save PAI counter value page in event structure · fe861b0c
      Thomas Richter authored
      When the PAI events ALL_CRYPTO or ALL_NNPA are created
      for system wide sampling, all PAI counters are monitored.
      On each process schedule out, the values of all PAI counters
      are investigated. Non-zero values are saved in the event's ring
      buffer as raw data. This scheme expects the start value of each counter
      to be reset to zero after each read operation performed by the PAI
      PMU device driver. This allows for only one active event at any one
      time as it relies on the start value of counters to be reset to zero.
      
      Create a save area for each installed PAI XXXX_ALL event and save all
      PAI counter values in this save area. Instead of clearing the
      PAI counter lowcore area to zero after each read operation,
      copy them from the lowcore area to the event's save area at process
      schedule out time.
      The delta of each PAI counter is calculated by subtracting the
      old counter's value stored in the event's save area from the current
      value stored in the lowcore area.
      
      With this scheme, multiple events of the PAI counters XXXX_ALL
      can be handled at the same time. This will be addressed in a
      follow-on patch.
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
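The save-area scheme can be illustrated with a per-event copy of the counters. This is a minimal sketch under assumptions: `sketch_sched_out()` is a hypothetical helper, and plain arrays stand in for the lowcore counter area and the event's save area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * hw[]:    current counter values (stand-in for the lowcore counter area)
 * save[]:  this event's private copy from the previous schedule-out
 * delta[]: output, one delta per counter
 * Returns the number of counters whose delta is non-zero.
 */
static size_t sketch_sched_out(const uint64_t *hw, uint64_t *save,
                               uint64_t *delta, size_t n)
{
    size_t nonzero = 0;

    assert(hw && save && delta);
    for (size_t i = 0; i < n; i++) {
        delta[i] = hw[i] - save[i];  /* current value minus old saved value */
        save[i] = hw[i];             /* copy instead of clearing the source */
        if (delta[i])
            nonzero++;
    }
    return nonzero;
}
```

Since the shared counters are only read and never reset, two events with separate save areas can compute their deltas independently, which is what makes multiple concurrent XXXX_ALL events possible.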
    • s390/ap: explicitly include ultravisor header · d065bdb4
      Holger Dengler authored
      The ap_bus is using inline functions of the ultravisor (uv) in-kernel
      API. The related header file is implicitly included via several other
      headers. Replace this by an explicit include of the ultravisor header
      in the ap_bus file.
      Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
      Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
  7. 16 Feb, 2024 14 commits