1. 20 Apr, 2023 5 commits
    • Merge branch 'for-next/kdump' into for-next/core · f8863bc8
      Will Deacon authored
      * for-next/kdump:
        arm64: kdump: defer the crashkernel reservation for platforms with no DMA memory zones
        arm64: kdump: do not map crashkernel region specifically
        arm64: kdump : take off the protection on crashkernel memory region
    • Merge branch 'for-next/ftrace' into for-next/core · ea88dc92
      Will Deacon authored
      * for-next/ftrace:
        arm64: ftrace: Simplify get_ftrace_plt
        arm64: ftrace: Add direct call support
        ftrace: selftest: remove broken trace_direct_tramp
        ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
        ftrace: Store direct called addresses in their ops
        ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
        ftrace: Remove the legacy _ftrace_direct API
        ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
        ftrace: Let unregister_ftrace_direct_multi() call ftrace_free_filter()
    • Merge branch 'for-next/cpufeature' into for-next/core · 31eb87cf
      Will Deacon authored
      * for-next/cpufeature:
        arm64/cpufeature: Use helper macro to specify ID register for capabilites
        arm64/cpufeature: Consistently use symbolic constants for min_field_value
        arm64/cpufeature: Pull out helper for CPUID register definitions
    • Merge branch 'for-next/asm' into for-next/core · 0f6563a3
      Will Deacon authored
      * for-next/asm:
        arm64: uaccess: remove unnecessary earlyclobber
        arm64: uaccess: permit put_{user,kernel} to use zero register
        arm64: uaccess: permit __smp_store_release() to use zero register
        arm64: atomics: lse: improve cmpxchg implementation
    • Merge branch 'for-next/acpi' into for-next/core · 67eacd61
      Will Deacon authored
      * for-next/acpi:
        ACPI: AGDI: Improve error reporting for problems during .remove()
  2. 17 Apr, 2023 4 commits
  3. 11 Apr, 2023 6 commits
  4. 28 Mar, 2023 4 commits
    • arm64: uaccess: remove unnecessary earlyclobber · 17242086
      Mark Rutland authored
      Currently the asm constraints for __get_mem_asm() mark the value
      register as an earlyclobber operand. This means that the compiler can't
      reuse the same register for both the address and value, even when the
      value is not subsequently used.
      
      There's no need for the value register to be marked as earlyclobber, as
      it's only written to after the address register is consumed, even when
      the access faults.
      
      Remove the unnecessary earlyclobber.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230314153700.787701-5-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
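      A minimal user-space sketch of the distinction described above (not the kernel's __get_mem_asm itself; the helper names are invented): with the "=&r" earlyclobber constraint the loaded value must land in a register distinct from the address, while plain "=r" lets the compiler reuse the address register once the address has been consumed, e.g. emitting "ldr x0, [x0]".

      	#include <stdint.h>

      	/* earlyclobber: value and address are forced into different GPRs */
      	static inline uint64_t load_earlyclobber(const uint64_t *addr)
      	{
      		uint64_t val;
      		asm ("ldr %0, [%1]" : "=&r" (val) : "r" (addr));
      		return val;
      	}

      	/* no earlyclobber: the value may be written into the address register,
      	 * which is safe here as the address is consumed before the write.
      	 */
      	static inline uint64_t load_plain(const uint64_t *addr)
      	{
      		uint64_t val;
      		asm ("ldr %0, [%1]" : "=r" (val) : "r" (addr));
      		return val;
      	}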
    • arm64: uaccess: permit put_{user,kernel} to use zero register · 4a3f806e
      Mark Rutland authored
      Currently the asm constraints for __put_mem_asm() require that the value
      is placed in a "real" GPR (i.e. one other than [XW]ZR or SP). This means
      that for cases such as:
      
      	__put_user(0, addr)
      
      ... the compiler has to move '0' into a "real" GPR, e.g.
      
      	mov	xN, #0
      	sttr	xN, [<addr>]
      
      This is unfortunate, as using the zero register would require fewer
      instructions and save a "real" GPR for other usage, allowing the
      compiler to generate:
      
      	sttr	xzr, [<addr>]
      
      Modify the asm constraints for __put_mem_asm() to permit the use of the
      zero register for the value.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230314153700.787701-4-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
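      A hedged sketch of the underlying mechanism (not the kernel's __put_mem_asm; store64() and clear_slot() are invented names): it assumes the AArch64 "Z" asm constraint, which matches the constant zero, together with the %x operand modifier, which prints xzr for such an operand, so storing zero no longer needs a scratch register.

      	#include <stdint.h>

      	/* "rZ": the value may be any GPR or the constant zero; for zero,
      	 * %x1 expands to xzr and no "mov xN, #0" is emitted.
      	 */
      	static inline void store64(uint64_t *addr, uint64_t val)
      	{
      		asm volatile ("str %x1, [%0]" : : "r" (addr), "rZ" (val) : "memory");
      	}

      	void clear_slot(uint64_t *p)
      	{
      		store64(p, 0);	/* typically compiles to: str xzr, [x0] */
      	}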
    • arm64: uaccess: permit __smp_store_release() to use zero register · 39c8275d
      Mark Rutland authored
      Currently the asm constraints for __smp_store_release() require that the
      value is placed in a "real" GPR (i.e. one other than [XW]ZR or SP).
      This means that for cases such as:
      
          __smp_store_release(ptr, 0)
      
          ... the compiler has to move '0' into a "real" GPR, e.g.
      
          mov     xN, #0
          stlr    xN, [<addr>]
      
      This is unfortunate, as using the zero register would require fewer
      instructions and save a "real" GPR for other usage, allowing the
      compiler to generate:
      
          stlr    xzr, [<addr>]
      
      Modify the asm constraints for __smp_store_release() to permit the use of
      the zero register for the value.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230314153700.787701-3-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
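      For comparison, a small C11 sketch of the same pattern outside the kernel (clear_flag() is an invented name): compilers targeting arm64 typically pick the zero register for a release store of zero, which is the code the relaxed constraint now allows the hand-written __smp_store_release() asm to match.

      	#include <stdatomic.h>

      	/* A release store of zero; on arm64 this typically compiles to a
      	 * single "stlr wzr, [x0]" with no preceding mov.
      	 */
      	void clear_flag(atomic_int *flag)
      	{
      		atomic_store_explicit(flag, 0, memory_order_release);
      	}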
    • arm64: atomics: lse: improve cmpxchg implementation · e5cacb54
      Mark Rutland authored
      For historical reasons, the LSE implementation of cmpxchg*() hard-codes
      the GPRs to use, and shuffles registers around with MOVs. This is no
      longer necessary, and can be simplified.
      
      When the LSE cmpxchg implementation was added in commit:
      
        c342f782 ("arm64: cmpxchg: patch in lse instructions when supported by the CPU")
      
      ... the LL/SC implementation of cmpxchg() would be placed out-of-line,
      and the in-line assembly for cmpxchg would default to:
      
      	NOP
      	BL	<ll_sc_cmpxchg*_implementation>
      	NOP
      
      The LL/SC implementation of each cmpxchg() function accepted arguments
      as per AAPCS64 rules, so it was necessary to place the pointer in x0,
      the old value in x1, and the new value in x2, and acquire the return
      value from x0. The LL/SC implementation required a temporary register
      (e.g. for the STXR status value). As the LL/SC implementation preserved
      the old value, the LSE implementation does likewise.
      
      Since commit:
      
        addfc386 ("arm64: atomics: avoid out-of-line ll/sc atomics")
      
      ... the LSE and LL/SC implementations of cmpxchg are inlined as separate
      asm blocks, with another branch choosing between the two. Due to this,
      it is no longer necessary for the LSE implementation to match the
      register constraints of the LL/SC implementation. This was partially
      dealt with by removing the hard-coded use of x30 in commit:
      
        3337cb5a ("arm64: avoid using hard-coded registers for LSE atomics")
      
      ... but we didn't clean up the hard-coding of x0, x1, and x2.
      
      This patch simplifies the LSE implementation of cmpxchg, removing the
      register shuffling and directly clobbering the 'old' argument. This
      gives the compiler greater freedom for register allocation, and avoids
      redundant work.
      
      The new constraints permit 'old' (Rs) and 'new' (Rt) to be allocated to
      the same register when the initial values of the two are the same, e.g.
      resulting in:
      
      	CAS	X0, X0, [X1]
      
      This is safe as Rs is only written back after the initial values of Rs
      and Rt are consumed, and there are no UNPREDICTABLE behaviours to avoid
      when Rs == Rt.
      
      The new constraints also permit 'new' to be allocated to the zero
      register, avoiding a MOV in a few cases. The same cannot be done for
      'old' as it is both an input and output, and any caller of cmpxchg()
      should care about the output value. Note that for CAS* the use of the
      zero register never affects the ordering (while for SWP* the use of the
      zero register for the 'old' value drops any ACQUIRE semantic).
      
      Compared to v6.2-rc4, a defconfig vmlinux is ~116KiB smaller, though the
      resulting Image is the same size due to internal alignment and padding:
      
        [mark@lakrids:~/src/linux]% ls -al vmlinux-*
        -rwxr-xr-x 1 mark mark 137269304 Jan 16 11:59 vmlinux-after
        -rwxr-xr-x 1 mark mark 137387936 Jan 16 10:54 vmlinux-before
        [mark@lakrids:~/src/linux]% ls -al Image-*
        -rw-r--r-- 1 mark mark 38711808 Jan 16 11:59 Image-after
        -rw-r--r-- 1 mark mark 38711808 Jan 16 10:54 Image-before
      
      This patch does not touch cmpxchg_double*() as that requires contiguous
      register pairs, and separate patches will replace it with cmpxchg128*().
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20230314153700.787701-2-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
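      A hedged user-space sketch of the constraint scheme this describes (not the kernel's cmpxchg macro; cas64_relaxed() is an invented name, and it assumes a toolchain with LSE support, e.g. -march=armv8.1-a): 'old' is an input/output operand that CAS overwrites with the value observed in memory, and 'new' may be the zero register via "rZ".

      	#include <stdint.h>

      	/* Relaxed 64-bit compare-and-swap using the LSE CAS instruction.
      	 * Rs ('old') and Rt ('new') may share a register, and 'new' may be xzr.
      	 */
      	static inline uint64_t cas64_relaxed(uint64_t *ptr, uint64_t old,
      					     uint64_t new_val)
      	{
      		asm volatile ("cas %x[old], %x[new], %[v]"
      			      : [old] "+r" (old), [v] "+Q" (*ptr)
      			      : [new] "rZ" (new_val));
      		return old;	/* value found in memory; equals 'old' on success */
      	}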
  5. 21 Mar, 2023 7 commits
  6. 19 Mar, 2023 14 commits