1. 12 Jul, 2018 6 commits
  2. 11 Jul, 2018 2 commits
  3. 10 Jul, 2018 1 commit
    • Arnd Bergmann's avatar
      arm64: make flatmem depend on !NUMA · 54501ac1
      Arnd Bergmann authored
      Building without NUMA but with FLATMEM results in a link error
      because mem_map[] is not available:
      
      aarch64-linux-ld -EB -maarch64elfb --no-undefined -X -pie -shared -Bsymbolic --no-apply-dynamic-relocs --build-id -o .tmp_vmlinux1 -T ./arch/arm64/kernel/vmlinux.lds --whole-archive built-in.a --no-whole-archive --start-group arch/arm64/lib/lib.a lib/lib.a --end-group
      init/do_mounts.o: In function `mount_block_root':
      do_mounts.c:(.init.text+0x1e8): undefined reference to `mem_map'
      arch/arm64/kernel/vdso.o: In function `vdso_init':
      vdso.c:(.init.text+0xb4): undefined reference to `mem_map'
      
      This uses the same trick as the other architectures, making flatmem
      depend on !NUMA to avoid the broken configuration.
      
      Fixes: e7d4bac4 ("arm64: add ARM64-specific support for flatmem")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      54501ac1
  4. 09 Jul, 2018 3 commits
    • Lorenzo Pieralisi's avatar
      arm64: numa: rework ACPI NUMA initialization · e1896249
      Lorenzo Pieralisi authored
      Current ACPI ARM64 NUMA initialization code in
      
      acpi_numa_gicc_affinity_init()
      
      carries out NUMA nodes creation and cpu<->node mappings at the same time
      in the arch backend so that a single SRAT walk is needed to parse both
      pieces of information.  This implies that the cpu<->node mappings must
      be stashed in an array (sized NR_CPUS) so that SMP code can later use
      the stashed values to avoid another SRAT table walk to set-up the early
      cpu<->node mappings.
      
      If the kernel is configured with a NR_CPUS value less than the actual
      processor entries in the SRAT (and MADT), the logic in
      acpi_numa_gicc_affinity_init() is broken in that the cpu<->node mapping
      is only carried out (and stashed for future use) only for a number of
      SRAT entries up to NR_CPUS, which do not necessarily correspond to the
      possible cpus detected at SMP initialization in
      acpi_map_gic_cpu_interface() (ie MADT and SRAT processor entries order
      is not enforced), which leaves the kernel with broken cpu<->node
      mappings.
      
      Furthermore, given the current ACPI NUMA code parsing logic in
      acpi_numa_gicc_affinity_init(), PXM domains for CPUs that are not parsed
      because they exceed NR_CPUS entries are not mapped to NUMA nodes (ie the
      PXM corresponding node is not created in the kernel) leaving the system
      with a broken NUMA topology.
      
      Rework the ACPI ARM64 NUMA initialization process so that the NUMA
      nodes creation and cpu<->node mappings are decoupled. cpu<->node
      mappings are moved to SMP initialization code (where they are needed),
      at the cost of an extra SRAT walk so that ACPI NUMA mappings can be
      batched before being applied, fixing current parsing pitfalls.
      Acked-by: default avatarHanjun Guo <hanjun.guo@linaro.org>
      Tested-by: default avatarJohn Garry <john.garry@huawei.com>
      Fixes: d8b47fca ("arm64, ACPI, NUMA: NUMA support based on SRAT and
      SLIT")
      Link: http://lkml.kernel.org/r/1527768879-88161-2-git-send-email-xiexiuqi@huawei.comReported-by: default avatarXie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Punit Agrawal <punit.agrawal@arm.com>
      Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
      Cc: Jeremy Linton <jeremy.linton@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      e1896249
    • Nikunj Kela's avatar
      arm64: add ARM64-specific support for flatmem · e7d4bac4
      Nikunj Kela authored
      Flatmem is useful in reducing kernel memory usage.
      One usecase is in kdump kernel. We are able to save
      ~14M by moving to flatmem scheme.
      
      Cc: xe-kernel@external.cisco.com
      Cc: Nikunj Kela <nkela@cisco.com>
      Signed-off-by: default avatarNikunj Kela <nkela@cisco.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      e7d4bac4
    • Will Deacon's avatar
      MAINTAINERS: arm64: Remove boot/dts/ directory from arm64 entry · d7c7118c
      Will Deacon authored
      The arm-soc tree does a good job handling .dts files, so exclude them
      from the ARM64 entry in MAINTAINERS.
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: default avatarOlof Johansson <olof@lixom.net>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      d7c7118c
  5. 06 Jul, 2018 12 commits
  6. 05 Jul, 2018 14 commits
    • Will Deacon's avatar
      arm64: insn: Don't fallback on nosync path for general insn patching · 693350a7
      Will Deacon authored
      Patching kernel instructions at runtime requires other CPUs to undergo
      a context synchronisation event via an explicit ISB or an IPI in order
      to ensure that the new instructions are visible. This is required even
      for "hotpatch" instructions such as NOP and BL, so avoid optimising in
      this case and always go via stop_machine() when performing general
      patching.
      
      ftrace isn't quite as strict, so it can continue to call the nosync
      code directly.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      693350a7
    • Will Deacon's avatar
      arm64: IPI each CPU after invalidating the I-cache for kernel mappings · 3b8c9f1c
      Will Deacon authored
      When invalidating the instruction cache for a kernel mapping via
      flush_icache_range(), it is also necessary to flush the pipeline for
      other CPUs so that instructions fetched into the pipeline before the
      I-cache invalidation are discarded. For example, if module 'foo' is
      unloaded and then module 'bar' is loaded into the same area of memory,
      a CPU could end up executing instructions from 'foo' when branching into
      'bar' if these instructions were fetched into the pipeline before 'foo'
      was unloaded.
      
      Whilst this is highly unlikely to occur in practice, particularly as
      any exception acts as a context-synchronizing operation, following the
      letter of the architecture requires us to execute an ISB on each CPU
      in order for the new instruction stream to be visible.
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      3b8c9f1c
    • Mark Rutland's avatar
      arm64: remove unused COMPAT_PSR definitions · 7373fed2
      Mark Rutland authored
      Now that users have been migrated to PSR_AA32, kill the unused
      COMPAT_PSR definitions.
      
      The only difference we need a definition for is COMPAT_PSR_DIT_BIT,
      which differs from PSR_AA32_DIT_BIT.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      7373fed2
    • Mark Rutland's avatar
      kvm/arm: use PSR_AA32 definitions · 256c0960
      Mark Rutland authored
      Some code cares about the SPSR_ELx format for exceptions taken from
      AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
      the SPSR_ELx format, and not in the AArch32 PSR format.
      
      To separate these from cases where we care about the AArch32 PSR format,
      migrate these cases to use the PSR_AA32_* definitions rather than
      COMPAT_PSR_*.
      
      There should be no functional change as a result of this patch.
      
      Note that arm64 KVM does not support a compat KVM API, and always uses
      the SPSR_ELx format, even for AArch32 guests.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Acked-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Acked-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      256c0960
    • Mark Rutland's avatar
      arm64: use PSR_AA32 definitions · d64567f6
      Mark Rutland authored
      Some code cares about the SPSR_ELx format for exceptions taken from
      AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
      the SPSR_ELx format, and not in the AArch32 PSR format.
      
      To separate these from cases where we care about the AArch32 PSR format,
      migrate these cases to use the PSR_AA32_* definitions rather than
      COMPAT_PSR_*.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      d64567f6
    • Mark Rutland's avatar
      arm64: ptrace: map SPSR_ELx<->PSR for compat tasks · 76fc52bd
      Mark Rutland authored
      The SPSR_ELx format for exceptions taken from AArch32 is slightly
      different to the AArch32 PSR format.
      
      Map between the two in the compat ptrace code.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Fixes: 7206dc93 ("arm64: Expose Arm v8.4 features")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Suzuki Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      76fc52bd
    • Mark Rutland's avatar
      arm64: compat: map SPSR_ELx<->PSR for signals · 25dc2c80
      Mark Rutland authored
      The SPSR_ELx format for exceptions taken from AArch32 differs from the
      AArch32 PSR format. Thus, we must translate between the two when setting
      up a compat sigframe, or restoring context from a compat sigframe.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Fixes: 7206dc93 ("arm64: Expose Arm v8.4 features")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Suzuki Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      25dc2c80
    • Mark Rutland's avatar
      arm64: don't zero DIT on signal return · 12651321
      Mark Rutland authored
      Currently valid_user_regs() treats SPSR_ELx.DIT as a RES0 bit, causing
      it to be zeroed upon exception return, rather than preserved. Thus, code
      relying on DIT will not function as expected, and may expose an
      unexpected timing sidechannel.
      
      Let's remove DIT from the set of RES0 bits, such that it is preserved.
      At the same time, the related comment is updated to better describe the
      situation, and to take into account the most recent documentation of
      SPSR_ELx, in ARM DDI 0487C.a.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Fixes: 7206dc93 ("arm64: Expose Arm v8.4 features")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      12651321
    • Mark Rutland's avatar
      arm64: add PSR_AA32_* definitions · 25086263
      Mark Rutland authored
      The AArch32 CPSR/SPSR format is *almost* identical to the AArch64
      SPSR_ELx format for exceptions taken from AArch32, but the two have
      diverged with the addition of DIT, and we need to treat the two as
      logically distinct.
      
      This patch adds new definitions for the SPSR_ELx format for exceptions
      taken from AArch32, with a consistent PSR_AA32_ prefix. The existing
      COMPAT_PSR_ definitions will be used for the PSR format as seen from
      AArch32.
      
      Definitions of DIT are provided for both, and inline functions are
      provided to map between the two formats. Note that for SPSR_ELx, the
      (RES0) J bit has been re-allocated as the DIT bit.
      
      Once users of the COMPAT_PSR definitions have been migrated over to the
      PSR_AA32 definitions, the (majority of) the former will be removed, so
      no efforts is made to avoid duplication until then.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoffer Dall <christoffer.dall@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Suzuki Poulose <suzuki.poulose@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      25086263
    • Suzuki K Poulose's avatar
      arm64: Handle mismatched cache type · 314d53d2
      Suzuki K Poulose authored
      Track mismatches in the cache type register (CTR_EL0), other
      than the D/I min line sizes and trap user accesses if there are any.
      
      Fixes: be68a8aa ("arm64: cpufeature: Fix CTR_EL0 field definitions")
      Cc: <stable@vger.kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      314d53d2
    • Suzuki K Poulose's avatar
      arm64: Fix mismatched cache line size detection · 4c4a39dd
      Suzuki K Poulose authored
      If there is a mismatch in the I/D min line size, we must
      always use the system wide safe value both in applications
      and in the kernel, while performing cache operations. However,
      we have been checking more bits than just the min line sizes,
      which triggers false negatives. We may need to trap the user
      accesses in such cases, but not necessarily patch the kernel.
      
      This patch fixes the check to do the right thing as advertised.
      A new capability will be added to check mismatches in other
      fields and ensure we trap the CTR accesses.
      
      Fixes: be68a8aa ("arm64: cpufeature: Fix CTR_EL0 field definitions")
      Cc: <stable@vger.kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarSuzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      4c4a39dd
    • Will Deacon's avatar
      arm64: kconfig: Ensure spinlock fastpaths are inlined if !PREEMPT · 5d168964
      Will Deacon authored
      When running with CONFIG_PREEMPT=n, the spinlock fastpaths fit inside
      64 bytes, which typically coincides with the L1 I-cache line size.
      
      Inline the spinlock fastpaths, like we do already for rwlocks.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      5d168964
    • Will Deacon's avatar
      arm64: locking: Replace ticket lock implementation with qspinlock · c1109047
      Will Deacon authored
      It's fair to say that our ticket lock has served us well over time, but
      it's time to bite the bullet and start using the generic qspinlock code
      so we can make use of explicit MCS queuing and potentially better PV
      performance in future.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      c1109047
    • Will Deacon's avatar
      arm64: barrier: Implement smp_cond_load_relaxed · 598865c5
      Will Deacon authored
      We can provide an implementation of smp_cond_load_relaxed using READ_ONCE
      and __cmpwait_relaxed.
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      598865c5
  7. 04 Jul, 2018 2 commits
    • Toshi Kani's avatar
      x86/mm: Add TLB purge to free pmd/pte page interfaces · 5e0fb5df
      Toshi Kani authored
      ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates
      a pud / pmd map.  The following preconditions are met at their entry.
       - All pte entries for a target pud/pmd address range have been cleared.
       - System-wide TLB purges have been peformed for a target pud/pmd address
         range.
      
      The preconditions assure that there is no stale TLB entry for the range.
      Speculation may not cache TLB entries since it requires all levels of page
      entries, including ptes, to have P & A-bits set for an associated address.
      However, speculation may cache pud/pmd entries (paging-structure caches)
      when they have P-bit set.
      
      Add a system-wide TLB purge (INVLPG) to a single page after clearing
      pud/pmd entry's P-bit.
      
      SDM 4.10.4.1, Operation that Invalidate TLBs and Paging-Structure Caches,
      states that:
        INVLPG invalidates all paging-structure caches associated with the
        current PCID regardless of the liner addresses to which they correspond.
      
      Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces")
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: mhocko@suse.com
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: cpandya@codeaurora.org
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20180627141348.21777-4-toshi.kani@hpe.com
      5e0fb5df
    • Chintan Pandya's avatar
      ioremap: Update pgtable free interfaces with addr · 785a19f9
      Chintan Pandya authored
      The following kernel panic was observed on ARM64 platform due to a stale
      TLB entry.
      
       1. ioremap with 4K size, a valid pte page table is set.
       2. iounmap it, its pte entry is set to 0.
       3. ioremap the same address with 2M size, update its pmd entry with
          a new value.
       4. CPU may hit an exception because the old pmd entry is still in TLB,
          which leads to a kernel panic.
      
      Commit b6bdb751 ("mm/vmalloc: add interfaces to free unmapped page
      table") has addressed this panic by falling to pte mappings in the above
      case on ARM64.
      
      To support pmd mappings in all cases, TLB purge needs to be performed
      in this case on ARM64.
      
      Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
      so that TLB purge can be added later in seprate patches.
      
      [toshi.kani@hpe.com: merge changes, rewrite patch description]
      Fixes: 28ee90fe ("x86/mm: implement free pmd/pte page interfaces")
      Signed-off-by: default avatarChintan Pandya <cpandya@codeaurora.org>
      Signed-off-by: default avatarToshi Kani <toshi.kani@hpe.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: mhocko@suse.com
      Cc: akpm@linux-foundation.org
      Cc: hpa@zytor.com
      Cc: linux-mm@kvack.org
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
      785a19f9