1. 02 Dec, 2022 5 commits
    • 505ea330
    • powerpc/64: Add module check for ELF ABI version · de3d098d
      Nicholas Piggin authored
      Override the generic module ELF check to provide a check for the ELF ABI
      version. This becomes important if we allow big-endian ELF ABI V2 builds
      but it doesn't hurt to check now.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20221128041539.1742489-3-npiggin@gmail.com
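      A minimal sketch of what such an override can look like, assuming the
      64-bit powerpc convention that the low bits of e_flags carry the ELF
      ABI version (the mask and config test here are illustrative, not
      quoted from the commit):

      	/* hedged sketch of a powerpc module_elf_check_arch() override */
      	bool module_elf_check_arch(Elf_Ehdr *hdr)
      	{
      		unsigned long abi_level = hdr->e_flags & 0x3;

      		if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
      			return abi_level == 2;	/* only ELFv2 modules load */
      		else
      			return abi_level < 2;	/* ELFv1 objects may carry 0 or 1 */
      	}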
    • module: add module_elf_check_arch for module-specific checks · f9231a99
      Nicholas Piggin authored
      The elf_check_arch() function is also used to test compatibility of
      usermode binaries. Kernel modules may have more specific requirements,
      for example powerpc would like to test for ABI version compatibility.
      
      Add a weak module_elf_check_arch() that defaults to true, and call it
      from elf_validity_check().
      Signed-off-by: Jessica Yu <jeyu@kernel.org>
      [np: added changelog, adjust name, rebase]
      Acked-by: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20221128041539.1742489-2-npiggin@gmail.com
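      The hook's shape is simple; a hedged sketch of the weak default and
      its call site, condensed from the description above (surrounding
      checks and error handling elided):

      	#include <linux/elf.h>
      	#include <linux/module.h>

      	/* default: architectures with no extra requirements accept everything */
      	bool __weak module_elf_check_arch(Elf_Ehdr *hdr)
      	{
      		return true;
      	}

      	static int elf_validity_check(struct load_info *info)
      	{
      		/* ... generic ELF sanity checks ... */
      		if (!elf_check_arch(info->hdr))		/* shared with usermode binaries */
      			return -ENOEXEC;
      		if (!module_elf_check_arch(info->hdr))	/* module-specific, e.g. ABI version */
      			return -ENOEXEC;
      		/* ... section header checks ... */
      		return 0;
      	}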
    • powerpc/code-patching: Consolidate and cache per-cpu patching context · 2f228ee1
      Benjamin Gray authored
      With the temp mm context support, there are CPU local variables to hold
      the patch address and pte. Use these in the non-temp mm path as well
      instead of adding a level of indirection through the text_poke_area
      vm_struct and pointer chasing the pte.
      
      As both paths use these fields now, there is no longer a build in
      which some of these variables are unreferenced (and silently dropped
      by the compiler), so it is cleaner to merge them
      into a single context struct. This has the additional benefit of
      removing a redundant CPU local pointer, as only one of cpu_patching_mm /
      text_poke_area is ever used, while remaining well-typed. It also groups
      each CPU's data into a single cacheline.
      Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
      [mpe: Shorten name to 'area' as suggested by Christophe]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20221109045112.187069-10-bgray@linux.ibm.com
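      A sketch of the consolidated per-CPU state this describes (struct and
      field names are illustrative): the union encodes that exactly one of
      the temporary-mm and vm-area mechanisms is in use in any given
      kernel, which is what keeps the struct well-typed without a redundant
      CPU local pointer:

      	struct patch_context {
      		union {
      			struct vm_struct *area;	/* text_poke_area path */
      			struct mm_struct *mm;	/* temporary mm path */
      		};
      		unsigned long addr;		/* per-CPU patching address */
      		pte_t *pte;			/* cached PTE for that address */
      	};

      	static DEFINE_PER_CPU(struct patch_context, cpu_patching_context);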
    • powerpc/code-patching: Use temporary mm for Radix MMU · c28c15b6
      Christopher M. Riedl authored
      x86 supports the notion of a temporary mm which restricts access to
      temporary PTEs to a single CPU. A temporary mm is useful for situations
      where a CPU needs to perform sensitive operations (such as patching a
      STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
      said mappings to other CPUs. Another benefit is that other CPUs' TLBs do
      not need to be flushed when the temporary mm is torn down.
      
      Mappings in the temporary mm can be set in the userspace portion of the
      address-space.
      
      Interrupts must be disabled while the temporary mm is in use. HW
      breakpoints, which may have been set by userspace as watchpoints on
      addresses now within the temporary mm, are saved and disabled when
      loading the temporary mm. The HW breakpoints are restored when unloading
      the temporary mm. All HW breakpoints are indiscriminately disabled while
      the temporary mm is in use - this may include breakpoints set by perf.
      
      Use the `poking_init` init hook to prepare a temporary mm and patching
      address. Initialize the temporary mm using mm_alloc(). Choose a
      randomized patching address inside the temporary mm userspace address
      space. The patching address is randomized between PAGE_SIZE and
      DEFAULT_MAP_WINDOW-PAGE_SIZE.
      
      Bits of entropy with 64K page size on BOOK3S_64:
      
      	bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)
      
      	PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
      	bits of entropy = log2(128TB / 64K)
      	bits of entropy = 31
      
      The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
      operates - by default the space above DEFAULT_MAP_WINDOW is not
      available. Currently the Hash MMU does not use a temporary mm so
      technically this upper limit isn't necessary; however, a larger
      randomization range does not further "harden" this overall approach and
      future work may introduce patching with a temporary mm on Hash as well.
      
      Randomization occurs only once during initialization for each CPU as it
      comes online.
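      A hedged sketch of that address selection (choose_patching_addr() is
      a hypothetical helper; the range follows the message, the rounding
      detail is illustrative):

      	/* one-time, per-CPU: random page-aligned address in
      	 * [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE) */
      	static unsigned long choose_patching_addr(void)
      	{
      		unsigned long span = DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE;

      		return PAGE_SIZE + round_down(get_random_long() % span, PAGE_SIZE);
      	}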
      
      The patching page is mapped with PAGE_KERNEL to set EAA[0] for the PTE
      which ignores the AMR (so no need to unlock/lock KUAP) according to
      PowerISA v3.0b Figure 35 on Radix.
      
      Based on x86 implementation:
      
      commit 4fc19708
      ("x86/alternatives: Initialize temporary mm for patching")
      
      and:
      
      commit b3fd8e83
      ("x86/alternatives: Use temporary mm for text poking")
      
      From: Benjamin Gray <bgray@linux.ibm.com>
      
      Synchronisation is done according to ISA 3.1B Book 3 Chapter 13
      "Synchronization Requirements for Context Alterations". Switching the mm
      is a change to the PID, which requires a CSI before and after the change,
      and a hwsync between the last instruction that performs address
      translation for an associated storage access and the change itself.

      Instruction fetch is an associated storage access, but the instruction
      address mappings are not being changed, so it should not matter which
      context they use. We must still perform a hwsync to guard against
      arbitrary prior code that may have accessed a userspace address.
      
      TLB invalidation is local and VA specific. Local because only this core
      used the patching mm, and VA specific because we only care that the
      writable mapping is purged. Leaving the other mappings intact is more
      efficient, especially when performing many code patches in a row (e.g.,
      as ftrace would).
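      Putting the breakpoint, synchronisation, and invalidation requirements
      together, the patch sequence can be sketched as below. The names
      pause_breakpoints(), unpause_breakpoints(), patching_mm, and the local
      flush helper are illustrative, and switch_mm_irqs_off() is assumed to
      provide the hwsync and CSIs around the PID change:

      	/* hedged sketch: write one instruction via the temporary mm */
      	static void patch_via_temp_mm(u32 *wr_addr, u32 insn)
      	{
      		struct mm_struct *orig_mm = current->active_mm;

      		lockdep_assert_irqs_disabled();

      		pause_breakpoints();	/* all HW breakpoints off, incl. perf's */
      		switch_mm_irqs_off(orig_mm, patching_mm, current);

      		WRITE_ONCE(*wr_addr, insn);	/* store via the writable alias */

      		/* local and VA-specific: only this core used the patching mm,
      		 * and only the writable mapping needs to be purged */
      		local_flush_tlb_page_psize(patching_mm, (unsigned long)wr_addr,
      					   mmu_virtual_psize);

      		switch_mm_irqs_off(patching_mm, orig_mm, current);
      		unpause_breakpoints();
      	}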
      Signed-off-by: Christopher M. Riedl <cmr@bluescreens.de>
      Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
      [mpe: Use mm_alloc() per 107b6828a7cd ("x86/mm: Use mm_alloc() in poking_init()")]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20221109045112.187069-9-bgray@linux.ibm.com
  2. 30 Nov, 2022 16 commits
  3. 25 Nov, 2022 1 commit
  4. 24 Nov, 2022 18 commits