1. 21 Sep, 2022 1 commit
    • arm64: avoid BUILD_BUG_ON() in alternative-macros · 0072dc1b
      Mark Rutland authored
      Nathan reports that the build fails when using clang and LTO:
      
      |  In file included from kernel/bounds.c:10:
      |  In file included from ./include/linux/page-flags.h:10:
      |  In file included from ./include/linux/bug.h:5:
      |  In file included from ./arch/arm64/include/asm/bug.h:26:
      |  In file included from ./include/asm-generic/bug.h:5:
      |  In file included from ./include/linux/compiler.h:248:
      |  In file included from ./arch/arm64/include/asm/rwonce.h:11:
      |  ./arch/arm64/include/asm/alternative-macros.h:224:2: error: call to undeclared function 'BUILD_BUG_ON'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
      |          BUILD_BUG_ON(feature >= ARM64_NCAPS);
      |          ^
      |  ./arch/arm64/include/asm/alternative-macros.h:241:2: error: call to undeclared function 'BUILD_BUG_ON'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
      |          BUILD_BUG_ON(feature >= ARM64_NCAPS);
      |          ^
      |  2 errors generated.
      
      ... the problem being that when LTO is enabled, <asm/rwonce.h> includes
      <asm/alternative-macros.h>, and causes a circular include dependency
      through <linux/bug.h>. This manifests as BUILD_BUG_ON() not being
      defined when used within <asm/alternative-macros.h>.
      
      This patch avoids the problem and simplifies the include dependencies by
      using compiletime_assert() instead of BUILD_BUG_ON().
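
As a rough userspace illustration of the idea (not the kernel's implementation), a range check on a constant can be written with a construct that needs no extra header, which is what keeps the include graph simple. The sketch below uses the C11 `_Static_assert` keyword as a stand-in; `EXAMPLE_NCAPS`, `EXAMPLE_COMPILETIME_ASSERT()`, `EXAMPLE_CHECK_FEATURE()` and `example_feature_in_range()` are hypothetical names that do not exist in the kernel.

```c
/* Hypothetical stand-in for ARM64_NCAPS; the real value is not assumed here. */
#define EXAMPLE_NCAPS 64

/*
 * Userspace analogue of a compile-time assertion. _Static_assert is a C11
 * keyword, so no header has to be included to use it.
 */
#define EXAMPLE_COMPILETIME_ASSERT(cond, msg) _Static_assert(cond, msg)

/* Reject out-of-range feature numbers at build time. */
#define EXAMPLE_CHECK_FEATURE(feature) \
	EXAMPLE_COMPILETIME_ASSERT((feature) < EXAMPLE_NCAPS, \
				   "feature must be < EXAMPLE_NCAPS")

static inline int example_feature_in_range(void)
{
	/* Compiles only because 3 < EXAMPLE_NCAPS. */
	EXAMPLE_CHECK_FEATURE(3);
	return 1;
}
```

Changing the argument to a value of `EXAMPLE_NCAPS` or above should make the build fail rather than the program misbehave at run time.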
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Fixes: 21fb26bf ("arm64: alternatives: add alternative_has_feature_*()")
      Reported-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Link: http://lore.kernel.org/r/YyigTrxhE3IRPzjs@dev-arch.thelio-3990X
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220920140044.1709073-1-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 16 Sep, 2022 8 commits
    • arm64: alternatives: add shared NOP callback · d926079f
      Mark Rutland authored
      For each instance of an alternative, the compiler outputs a distinct
      copy of the alternative instructions into a subsection. As the compiler
      doesn't have special knowledge of alternatives, it cannot coalesce these
      to save space.
      
      In a defconfig kernel built with GCC 12.1.0, there are approximately
      10,000 instances of alternative_has_feature_likely(), where the
      replacement instruction is always a NOP. As NOPs are
      position-independent, we don't need a unique copy per alternative
      sequence.
      
      This patch adds a callback to patch an alternative sequence with NOPs,
      and makes use of this in alternative_has_feature_likely(). So that this
      can be used for other sites in future, this is written to patch multiple
      instructions up to the original sequence length.
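
The shape of such a callback can be sketched in plain C. This is a minimal userspace sketch, assuming the sequence is an array of 32-bit instruction words; `example_patch_nops()` and `EXAMPLE_AARCH64_NOP` are hypothetical names, and the real callback's signature and patching mechanics are not shown here. Because the replacement is the same NOP for every slot, one shared routine can serve every site instead of each site carrying its own copy of the replacement.

```c
#include <stddef.h>
#include <stdint.h>

/* AArch64 encoding of the NOP instruction. */
#define EXAMPLE_AARCH64_NOP 0xd503201fu

/*
 * Overwrite nr_inst instruction slots with NOPs. The caller passes the
 * length of the original sequence, so sequences of any length are handled.
 */
static inline void example_patch_nops(uint32_t *updptr, size_t nr_inst)
{
	for (size_t i = 0; i < nr_inst; i++)
		updptr[i] = EXAMPLE_AARCH64_NOP;
}
```

Calling `example_patch_nops(buf, 2)` on a three-word buffer rewrites only the first two words and leaves the third untouched.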
      
      For NVHE, an alias is added to image-vars.h.
      
      For modules, the callback is exported. Note that as modules are loaded
      within 2GiB of the kernel, an alt_instr entry in a module can always
      refer directly to the callback, and no special handling is necessary.
      
      When building with GCC 12.1.0, the vmlinux is ~158KiB smaller, though
      the resulting Image size is unchanged due to alignment constraints and
      padding:
      
      | % ls -al vmlinux-*
      | -rwxr-xr-x 1 mark mark 134644592 Sep  1 14:52 vmlinux-after
      | -rwxr-xr-x 1 mark mark 134486232 Sep  1 14:50 vmlinux-before
      | % ls -al Image-*
      | -rw-r--r-- 1 mark mark 37108224 Sep  1 14:52 Image-after
      | -rw-r--r-- 1 mark mark 37108224 Sep  1 14:50 Image-before
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-9-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: add alternative_has_feature_*() · 21fb26bf
      Mark Rutland authored
      Currently we use a mixture of alternative sequences and static branches
      to handle features detected at boot time. For ease of maintenance we
      generally prefer to use static branches in C code, but this has a few
      downsides:
      
      * Each static branch has metadata in the __jump_table section, which is
        not discarded after features are finalized. This wastes some space,
        and slows down the patching of other static branches.
      
      * The static branches are patched at a different point in time from the
        alternatives, so changes are not atomic. This leaves a transient
        period where there could be a mismatch between the behaviour of
        alternatives and static branches, which could be problematic for some
        features (e.g. pseudo-NMI).
      
      * More (instrumentable) kernel code is executed to patch each static
        branch, which can be risky when patching certain features (e.g.
        irqflags management for pseudo-NMI).
      
      * When CONFIG_JUMP_LABEL=n, static branches are turned into a load of a
        flag and a conditional branch. This means it isn't safe to use such
        static branches in an alternative address space (e.g. the NVHE/PKVM
        hyp code), where the generated address isn't safe to access.
      
      To deal with these issues, this patch introduces new
      alternative_has_feature_*() helpers, which work like static branches but
      are patched using alternatives. This ensures the patching is performed
      at the same time as other alternative patching, allows the metadata to
      be freed after patching, and is safe for use in alternative address
      spaces.
      
      Note that all supported toolchains have asm goto support, and since
      commit:
      
        a0a12c3e ("asm goto: eradicate CC_HAS_ASM_GOTO")
      
      ... the CC_HAS_ASM_GOTO Kconfig symbol has been removed, so no feature
      check is necessary, and we can always make use of asm goto.
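
For readers unfamiliar with asm goto, the control-flow shape of a branch that is resolved by patching rather than by loading a flag can be sketched as follows. This is only an illustration, assuming a GCC- or clang-compatible compiler: the empty asm template stands in for the patchable instruction slot, nothing is patched at run time here, and `example_has_feature_unlikely()` is a hypothetical name rather than one of the helpers added by this patch.

```c
#include <stdbool.h>

static inline bool example_has_feature_unlikely(void)
{
	/*
	 * The asm statement declares that it may jump to l_yes, so the
	 * compiler keeps both paths. The template is empty, so execution
	 * always falls through; in a patched kernel the slot would hold
	 * either a NOP or a branch, with no memory load on either path.
	 */
	__asm__ goto("" : : : : l_yes);
	return false;
l_yes:
	return true;
}
```

Since no data address is dereferenced to make the decision, this shape does not depend on a flag being mapped in the current address space.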
      
      Additionally, note that:
      
      * This has no impact on cpus_have_cap(), which is a dynamic check.
      
      * This has no functional impact on cpus_have_const_cap(). The branches
        are patched slightly later than before this patch, but these branches
        are not reachable until caps have been finalised.
      
      * It is now invalid to use cpus_have_final_cap() in the window between
        feature detection and patching. All existing uses are only expected
        after patching anyway, so this should not be a problem.
      
      * The LSE atomics will now be enabled during alternatives patching
        rather than immediately before. As the LL/SC and LSE atomics are
        functionally equivalent this should not be problematic.
      
      When building defconfig with GCC 12.1.0, the resulting Image is 64KiB
      smaller:
      
      | % ls -al Image-*
      | -rw-r--r-- 1 mark mark 37108224 Aug 23 09:56 Image-after
      | -rw-r--r-- 1 mark mark 37173760 Aug 23 09:54 Image-before
      
      According to bloat-o-meter:
      
      | add/remove: 44/34 grow/shrink: 602/1294 up/down: 39692/-61108 (-21416)
      | Function                                     old     new   delta
      | [...]
      | Total: Before=16618336, After=16596920, chg -0.13%
      | add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-1296 (-1296)
      | Data                                         old     new   delta
      | arm64_const_caps_ready                        16       -     -16
      | cpu_hwcap_keys                              1280       -   -1280
      | Total: Before=8987120, After=8985824, chg -0.01%
      | add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0)
      | RO Data                                      old     new   delta
      | Total: Before=18408, After=18408, chg +0.00%
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-8-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: have callbacks take a cap · 4c0bd995
      Mark Rutland authored
      Today, callback alternatives are special-cased within
      __apply_alternatives(), and are applied alongside patching for system
      capabilities as ARM64_NCAPS is not part of the boot_capabilities feature
      mask.
      
      This special-casing is less than ideal. Giving special meaning to
      ARM64_NCAPS for this requires some structures and loops to use
      ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS.
      It's also not immediately clear callback alternatives are only applied
      when applying alternatives for system-wide features.
      
      To make this a bit clearer, this patch changes the way that callback alternatives
      are identified to remove the special-casing of ARM64_NCAPS, and to allow
      callback alternatives to be associated with a cpucap as with all other
      alternatives.
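
One way to let a single field carry both a cpucap number and a "this entry is a callback" marker is to reserve a flag bit, so that no capability number needs a special meaning. The sketch below illustrates that general encoding only; `EXAMPLE_CB_BIT`, `example_alt_cap()` and `example_alt_has_cb()` are hypothetical names, and the bit position is an assumption made for the example rather than something stated in this commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bit marking an entry as a callback alternative. */
#define EXAMPLE_CB_BIT (1u << 15)

/* Extract the capability number, ignoring the callback flag. */
static inline unsigned int example_alt_cap(uint16_t cpufeature)
{
	return cpufeature & ~EXAMPLE_CB_BIT;
}

/* Report whether the entry is a callback alternative. */
static inline bool example_alt_has_cb(uint16_t cpufeature)
{
	return (cpufeature & EXAMPLE_CB_BIT) != 0;
}
```

With this layout, arrays and loops indexed by capability can all use the same bound, since the callback marker lives outside the capability number range.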
      
      New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM, are added, which
      are always detected alongside boot cpu capabilities and system
      capabilities respectively. All existing callback alternatives are made
      to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point
      during the boot flow as before.
      
      Subsequent patches will make more use of these new cpucaps.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: make alt_region const · b723edf3
      Mark Rutland authored
      We never alter a struct alt_region after creation, and we open-code the
      bounds of the kernel alternatives region in two functions. The
      duplication is a bit unfortunate for clarity (and in future we're likely
      to have more functions altering alternative regions), and to avoid
      accidents it would be good to make the structure const.
      
      This patch adds a shared struct `kernel_alternatives` alt_region for the
      main kernel image, and marks the alt_regions as const to prevent
      unintentional modification.
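
The pattern can be sketched in userspace as a single const object describing a region, passed by const pointer to every consumer. All names below (`example_alt_region`, `example_entries`, `example_kernel_alternatives`, `example_region_count()`) are hypothetical, and the entries are plain integers rather than real alt_instr records.

```c
#include <stddef.h>

/* Hypothetical region descriptor: a half-open range [begin, end). */
struct example_alt_region {
	const int *begin;
	const int *end;
};

/* Stand-in for the table of alternative entries. */
static const int example_entries[] = { 1, 2, 3 };

/* One shared, read-only description of the region, defined once. */
static const struct example_alt_region example_kernel_alternatives = {
	.begin = example_entries,
	.end = example_entries + 3,
};

/* Consumers take a const pointer, so they cannot modify the bounds. */
static inline size_t example_region_count(const struct example_alt_region *region)
{
	return (size_t)(region->end - region->begin);
}
```

Any attempt to assign to `region->begin` inside a consumer is then rejected by the compiler, and the bounds are written down in exactly one place.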
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-6-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: hoist print out of __apply_alternatives() · c5ba0326
      Mark Rutland authored
      Printing in the middle of __apply_alternatives() is potentially unsafe
      and not all that helpful given these days we practically always patch
      *something*.
      
      Hoist the print out of __apply_alternatives(), and add separate prints
      to __apply_alternatives() and apply_alternatives_all(), which will make
      it easier to spot if either patching call goes wrong.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-5-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: proton-pack: prepare for cap changes · 747ad8d5
      Mark Rutland authored
      The spectre patching callbacks use cpus_have_final_cap(), and subsequent
      patches will make it invalid to call cpus_have_final_cap() before
      alternatives patching has completed.
      
      In preparation for said change, this patch modifies the spectre patching
      callbacks to use cpus_have_cap(). This is not subject to patching, and will
      dynamically check the cpu_hwcaps array, which is functionally equivalent
      to the existing behaviour.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Joey Gouly <joey.gouly@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-4-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: kvm: prepare for cap changes · 34bbfdfb
      Mark Rutland authored
      The KVM patching callbacks use cpus_have_final_cap() internally within
      has_vhe(), and subsequent patches will make it invalid to call
      cpus_have_final_cap() before alternatives patching has completed, and
      will mean that cpus_have_const_cap() will always fall back to dynamic
      checks prior to alternatives patching.
      
      In preparation for said change, this patch modifies the KVM patching
      callbacks to use cpus_have_cap() directly. This is not subject to
      patching, and will dynamically check the cpu_hwcaps array, which is
      functionally equivalent to the existing behaviour.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-3-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: cpufeature: make cpus_have_cap() noinstr-safe · 92b4b561
      Mark Rutland authored
      Currently it isn't safe to use cpus_have_cap() from noinstr code as
      test_bit() is explicitly instrumented, and were cpus_have_cap() placed
      out-of-line, cpus_have_cap() itself could be instrumented.
      
      Make cpus_have_cap() noinstr safe by marking it __always_inline and
      using arch_test_bit().
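
A userspace sketch of the same structure is shown below: a forced-inline capability check built on a plain bit test that no instrumentation wrapper sits in front of. It assumes a GCC- or clang-compatible compiler for the `always_inline` attribute. The names `example_hwcaps`, `example_test_bit()`, `example_set_cap()` and `example_cpus_have_cap()` are hypothetical, and `EXAMPLE_NCAPS` is an arbitrary value chosen for the example.

```c
#include <limits.h>
#include <stdbool.h>

#define EXAMPLE_NCAPS 64
#define EXAMPLE_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define EXAMPLE_NWORDS \
	((EXAMPLE_NCAPS + EXAMPLE_BITS_PER_LONG - 1) / EXAMPLE_BITS_PER_LONG)

/* Stand-in for the capability bitmap. */
static unsigned long example_hwcaps[EXAMPLE_NWORDS];

/* Plain bit test with no instrumentation hooks. */
static inline __attribute__((always_inline)) bool
example_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / EXAMPLE_BITS_PER_LONG] >>
		(nr % EXAMPLE_BITS_PER_LONG)) & 1UL;
}

/* Helper for the example only: mark a capability as present. */
static inline void example_set_cap(unsigned int num)
{
	example_hwcaps[num / EXAMPLE_BITS_PER_LONG] |=
		1UL << (num % EXAMPLE_BITS_PER_LONG);
}

/* Forced inline, so no out-of-line copy exists to be instrumented. */
static inline __attribute__((always_inline)) bool
example_cpus_have_cap(unsigned int num)
{
	if (num >= EXAMPLE_NCAPS)
		return false;
	return example_test_bit(num, example_hwcaps);
}
```

Forcing the check inline means its body becomes part of each caller, so whether it can be instrumented is decided by the caller's own annotations.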
      
      Aside from the prevention of instrumentation, there should be no
      functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Ard Biesheuvel <ardb@kernel.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Joey Gouly <joey.gouly@arm.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
      Link: https://lore.kernel.org/r/20220912162210.3626215-2-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  3. 28 Aug, 2022 25 commits
  4. 27 Aug, 2022 6 commits