Commit fa5fd7c6 authored by Linus Torvalds

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Here is the core arm64 queue for 4.5.  As you might expect, the
  Christmas break resulted in a number of patches not making the final
  cut, so 4.6 is likely to be larger than usual.  There's still some
  useful stuff here, however, and it's detailed below.

  The EFI changes have been Reviewed-by Matt and the memblock change got
  an "OK" from akpm.

  Summary:

   - Support for a separate IRQ stack, although we haven't reduced the
     size of our thread stack just yet since we don't have enough data
     to determine a safe value

   - Refactoring of our EFI initialisation and runtime code into
     drivers/firmware/efi/ so that it can be reused by arch/arm/.

   - Ftrace improvements when unwinding in the function graph tracer

   - Document our silicon errata handling process

   - Cache flushing optimisation when mapping executable pages

   - Support for hugetlb mappings using the contiguous hint in the pte"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (45 commits)
  arm64: head.S: use memset to clear BSS
  efi: stub: define DISABLE_BRANCH_PROFILING for all architectures
  arm64: entry: remove pointless SPSR mode check
  arm64: mm: move pgd_cache initialisation to pgtable_cache_init
  arm64: module: avoid undefined shift behavior in reloc_data()
  arm64: module: fix relocation of movz instruction with negative immediate
  arm64: traps: address fallout from printk -> pr_* conversion
  arm64: ftrace: fix a stack tracer's output under function graph tracer
  arm64: pass a task parameter to unwind_frame()
  arm64: ftrace: modify a stack frame in a safe way
  arm64: remove irq_count and do_softirq_own_stack()
  arm64: hugetlb: add support for PTE contiguous bit
  arm64: Use PoU cache instr for I/D coherency
  arm64: Defer dcache flush in __cpu_copy_user_page
  arm64: reduce stack use in irq_handler
  arm64: mm: ensure that the zero page is visible to the page table walker
  arm64: Documentation: add list of software workarounds for errata
  arm64: mm: place __cpu_setup in .text
  arm64: cmpxchg: Don't incldue linux/mmdebug.h
  arm64: mm: fold alternatives into .init
  ...
parents 19e2fc40 2a803c4d
Silicon Errata and Software Workarounds
=======================================

Author: Will Deacon <will.deacon@arm.com>
Date  : 27 November 2015

It is an unfortunate fact of life that hardware is often produced with
so-called "errata", which can cause it to deviate from the architecture
under specific circumstances. For hardware produced by ARM, these
errata are broadly classified into the following categories:

Category A: A critical error without a viable workaround.
Category B: A significant or critical error with an acceptable
            workaround.
Category C: A minor error that is not expected to occur under normal
            operation.

For more information, consult one of the "Software Developers Errata
Notice" documents available on infocenter.arm.com (registration
required).

As far as Linux is concerned, Category B errata may require some special
treatment in the operating system. For example, avoiding a particular
sequence of code, or configuring the processor in a particular way. A
less common situation may require similar actions in order to declassify
a Category A erratum into a Category C erratum. These are collectively
known as "software workarounds" and are only required in the minority of
cases (e.g. those cases that both require a non-secure workaround *and*
can be triggered by Linux).

For software workarounds that may adversely impact systems unaffected by
the erratum in question, a Kconfig entry is added under "Kernel
Features" -> "ARM errata workarounds via the alternatives framework".
These are enabled by default and patched in at runtime when an affected
CPU is detected. For less-intrusive workarounds, a Kconfig option is not
available and the code is structured (preferably with a comment) in such
a way that the erratum will not be hit.
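
For illustration, a hedged sketch of how such a capability is typically
consumed from C code (asm fast paths are patched via the alternatives
framework instead; ARM64_WORKAROUND_CLEAN_CACHE is the capability bit
backing the Cortex-A53 cache errata in the table below, and the helper
name here is purely illustrative):

	#include <asm/cpufeature.h>

	static bool need_clean_cache_workaround(void)
	{
		/* True only if an affected CPU model was detected at boot. */
		return cpus_have_cap(ARM64_WORKAROUND_CLEAN_CACHE);
	}
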
This approach can make it slightly onerous to determine exactly which
errata are worked around in an arbitrary kernel source tree, so this
file acts as a registry of software workarounds in the Linux Kernel and
will be updated when new workarounds are committed and backported to
stable kernels.

| Implementor    | Component       | Erratum ID      | Kconfig                 |
+----------------+-----------------+-----------------+-------------------------+
| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319    |
| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719    |
| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419    |
| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
| ARM            | Cortex-A57      | #852523         | N/A                     |
| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
|                |                 |                 |                         |
| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    |
| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
@@ -9,7 +9,7 @@
    |       alpha: |  ..  |
    |         arc: | TODO |
    |         arm: |  ok  |
    |       arm64: |  ..  |
    |       arm64: |  ok  |
    |       avr32: | TODO |
    |    blackfin: | TODO |
    |         c6x: | TODO |
......
@@ -70,6 +70,7 @@ config ARM64
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_GENERIC_DMA_COHERENT
select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_IRQ_TIME_ACCOUNTING
select HAVE_MEMBLOCK
select HAVE_PATA_PLATFORM
select HAVE_PERF_EVENTS
@@ -529,9 +530,6 @@ config HW_PERF_EVENTS
config SYS_SUPPORTS_HUGETLBFS
def_bool y
config ARCH_WANT_GENERAL_HUGETLB
def_bool y
config ARCH_WANT_HUGE_PMD_SHARE
def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
......
@@ -19,7 +19,6 @@ struct alt_instr {
void __init apply_alternatives_all(void);
void apply_alternatives(void *start, size_t length);
void free_alternatives_memory(void);
#define ALTINSTR_ENTRY(feature) \
" .word 661b - .\n" /* label */ \
......
@@ -193,6 +193,17 @@ lr .req x30 // link register
str \src, [\tmp, :lo12:\sym]
.endm
/*
* @sym: The name of the per-cpu variable
* @reg: Result of per_cpu(sym, smp_processor_id())
* @tmp: scratch register
*/
.macro this_cpu_ptr, sym, reg, tmp
adr_l \reg, \sym
mrs \tmp, tpidr_el1
add \reg, \reg, \tmp
.endm
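
For example, the irq_stack_entry macro added to entry.S later in this
diff uses it as:

	this_cpu_ptr irq_stack, x25, x26	// x25 = base of this CPU's irq_stack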
/*
* Annotate a function as position independent, i.e., safe to be called before
* the kernel virtual mapping is activated.
......
@@ -68,6 +68,7 @@
extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
extern void flush_icache_range(unsigned long start, unsigned long end);
extern void __flush_dcache_area(void *addr, size_t len);
extern void __clean_dcache_area_pou(void *addr, size_t len);
extern long __flush_cache_user_range(unsigned long start, unsigned long end);
static inline void flush_cache_mm(struct mm_struct *mm)
......
@@ -19,7 +19,6 @@
#define __ASM_CMPXCHG_H
#include <linux/bug.h>
#include <linux/mmdebug.h>
#include <asm/atomic.h>
#include <asm/barrier.h>
......
@@ -2,7 +2,9 @@
#define _ASM_EFI_H
#include <asm/io.h>
#include <asm/mmu_context.h>
#include <asm/neon.h>
#include <asm/tlbflush.h>
#ifdef CONFIG_EFI
extern void efi_init(void);
@@ -10,6 +12,8 @@ extern void efi_init(void);
#define efi_init()
#endif
int efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md);
#define efi_call_virt(f, ...) \
({ \
efi_##f##_t *__f; \
@@ -63,6 +67,11 @@ extern void efi_init(void);
* Services are enabled and the EFI_RUNTIME_SERVICES bit set.
*/
static inline void efi_set_pgd(struct mm_struct *mm)
{
switch_mm(NULL, mm, NULL);
}
void efi_virtmap_load(void);
void efi_virtmap_unload(void);
......
@@ -28,6 +28,8 @@ struct dyn_arch_ftrace {
extern unsigned long ftrace_graph_call;
extern void return_to_handler(void);
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/*
......
@@ -26,36 +26,7 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
return *ptep;
}
static inline void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
set_pte_at(mm, addr, ptep, pte);
}
static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
{
ptep_clear_flush(vma, addr, ptep);
}
static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
ptep_set_wrprotect(mm, addr, ptep);
}
static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
return ptep_get_and_clear(mm, addr, ptep);
}
static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t pte, int dirty)
{
return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
}
static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long addr, unsigned long end,
@@ -97,4 +68,19 @@ static inline void arch_clear_hugepage_flags(struct page *page)
clear_bit(PG_dcache_clean, &page->flags);
}
extern pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
struct page *page, int writable);
#define arch_make_huge_pte arch_make_huge_pte
extern void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte);
extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t pte, int dirty);
extern pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep);
extern void huge_ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep);
extern void huge_ptep_clear_flush(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep);
#endif /* __ASM_HUGETLB_H */
#ifndef __ASM_IRQ_H
#define __ASM_IRQ_H
#define IRQ_STACK_SIZE THREAD_SIZE
#define IRQ_STACK_START_SP THREAD_START_SP
#ifndef __ASSEMBLER__
#include <linux/percpu.h>
#include <asm-generic/irq.h>
#include <asm/thread_info.h>
struct pt_regs;
DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
/*
* The highest address on the stack, and the first to be used. Used to
* find the dummy-stack frame put down by el?_irq() in entry.S, which
* is structured as follows:
*
* ------------
* | | <- irq_stack_ptr
* top ------------
* | x19 | <- irq_stack_ptr - 0x08
* ------------
* | x29 | <- irq_stack_ptr - 0x10
* ------------
*
* where x19 holds a copy of the task stack pointer where the struct pt_regs
* from kernel_entry can be found.
*
*/
#define IRQ_STACK_PTR(cpu) ((unsigned long)per_cpu(irq_stack, cpu) + IRQ_STACK_START_SP)
/*
* The offset from irq_stack_ptr where entry.S will store the original
* stack pointer. Used by unwind_frame() and dump_backtrace().
*/
#define IRQ_STACK_TO_TASK_STACK(ptr) (*((unsigned long *)((ptr) - 0x08)))
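
A hedged sketch of how a consumer such as the unwinder can hop from the
IRQ stack back to the task stack using the dummy frame described above
(unwind_frame() later in this series does exactly this; the local
variable names are illustrative):

	int cpu = raw_smp_processor_id();	/* assumption: caller handles preemption */
	unsigned long irq_sp  = IRQ_STACK_PTR(cpu);              /* top of the IRQ stack */
	unsigned long task_sp = IRQ_STACK_TO_TASK_STACK(irq_sp); /* saved x19 at irq_sp - 0x08 */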
extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
static inline int nr_legacy_irqs(void)
@@ -12,4 +47,14 @@ static inline int nr_legacy_irqs(void)
return 0;
}
static inline bool on_irq_stack(unsigned long sp, int cpu)
{
/* variable names the same as kernel/stacktrace.c */
unsigned long low = (unsigned long)per_cpu(irq_stack, cpu);
unsigned long high = low + IRQ_STACK_START_SP;
return (low <= sp && sp <= high);
}
#endif /* !__ASSEMBLER__ */
#endif
@@ -90,7 +90,23 @@
/*
* Contiguous page definitions.
*/
#define CONT_PTES (_AC(1, UL) << CONT_SHIFT)
#ifdef CONFIG_ARM64_64K_PAGES
#define CONT_PTE_SHIFT 5
#define CONT_PMD_SHIFT 5
#elif defined(CONFIG_ARM64_16K_PAGES)
#define CONT_PTE_SHIFT 7
#define CONT_PMD_SHIFT 5
#else
#define CONT_PTE_SHIFT 4
#define CONT_PMD_SHIFT 4
#endif
#define CONT_PTES (1 << CONT_PTE_SHIFT)
#define CONT_PTE_SIZE (CONT_PTES * PAGE_SIZE)
#define CONT_PTE_MASK (~(CONT_PTE_SIZE - 1))
#define CONT_PMDS (1 << CONT_PMD_SHIFT)
#define CONT_PMD_SIZE (CONT_PMDS * PMD_SIZE)
#define CONT_PMD_MASK (~(CONT_PMD_SIZE - 1))
/* the numerical offset of the PTE within a range of CONT_PTES */
#define CONT_RANGE_OFFSET(addr) (((addr)>>PAGE_SHIFT)&(CONT_PTES-1))
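
To make the arithmetic concrete, a worked example for a 4K granule
(CONT_PTE_SHIFT = 4, CONT_PMD_SHIFT = 4, PMD_SIZE = 2MB):

	/*
	 * CONT_PTES     = 1 << 4   = 16
	 * CONT_PTE_SIZE = 16 * 4KB = 64KB  (contiguous-PTE huge page)
	 * CONT_PMDS     = 1 << 4   = 16
	 * CONT_PMD_SIZE = 16 * 2MB = 32MB  (contiguous-PMD huge page)
	 */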
......
@@ -167,6 +167,16 @@ extern struct page *empty_zero_page;
((pte_val(pte) & (PTE_VALID | PTE_USER)) == (PTE_VALID | PTE_USER))
#define pte_valid_not_user(pte) \
((pte_val(pte) & (PTE_VALID | PTE_USER)) == PTE_VALID)
#define pte_valid_young(pte) \
((pte_val(pte) & (PTE_VALID | PTE_AF)) == (PTE_VALID | PTE_AF))
/*
* Could the pte be present in the TLB? We must check mm_tlb_flush_pending
* so that we don't erroneously return false for pages that have been
* remapped as PROT_NONE but are yet to be flushed from the TLB.
*/
#define pte_accessible(mm, pte) \
(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid_young(pte))
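
Spelled out, the two cases the macro above distinguishes (a reasoning
sketch restating the comment):

	/*
	 * - no flush pending: only a valid *and* young pte can be in the
	 *   TLB, since a pte without the AF bit faults and is never
	 *   fetched into it.
	 * - flush pending: a pte just remapped PROT_NONE may still have a
	 *   stale TLB entry, so any present pte must be treated as
	 *   accessible until the flush completes.
	 */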
static inline pte_t clear_pte_bit(pte_t pte, pgprot_t prot)
{
@@ -217,7 +227,8 @@ static inline pte_t pte_mkspecial(pte_t pte)
static inline pte_t pte_mkcont(pte_t pte)
{
return set_pte_bit(pte, __pgprot(PTE_CONT));
pte = set_pte_bit(pte, __pgprot(PTE_CONT));
return set_pte_bit(pte, __pgprot(PTE_TYPE_PAGE));
}
static inline pte_t pte_mknoncont(pte_t pte)
@@ -225,6 +236,11 @@ static inline pte_t pte_mknoncont(pte_t pte)
return clear_pte_bit(pte, __pgprot(PTE_CONT));
}
static inline pmd_t pmd_mkcont(pmd_t pmd)
{
return __pmd(pmd_val(pmd) | PMD_SECT_CONT);
}
static inline void set_pte(pte_t *ptep, pte_t pte)
{
*ptep = pte;
@@ -298,7 +314,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
/*
* Hugetlb definitions.
*/
#define HUGE_MAX_HSTATE 2
#define HUGE_MAX_HSTATE 4
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE (_AC(1, UL) << HPAGE_SHIFT)
#define HPAGE_MASK (~(HPAGE_SIZE - 1))
@@ -664,7 +680,8 @@ extern int kern_addr_valid(unsigned long addr);
#include <asm-generic/pgtable.h>
#define pgtable_cache_init() do { } while (0)
void pgd_cache_init(void);
#define pgtable_cache_init pgd_cache_init
/*
* On AArch64, the cache coherency is handled via the set_pte_at() function.
......
@@ -21,7 +21,7 @@
* alignment value. Since we don't have aliasing D-caches, the rest of
* the time we can safely use PAGE_SIZE.
*/
#define COMPAT_SHMLBA 0x4000
#define COMPAT_SHMLBA (4 * PAGE_SIZE)
#include <asm-generic/shmparam.h>
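
Note the arithmetic behind the new definition: (4 * PAGE_SIZE) equals
the legacy 0x4000 (16K) with 4K pages, and grows to 0x40000 (256K) with
a 64K granule.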
......
@@ -26,9 +26,28 @@
* The memory barriers are implicit with the load-acquire and store-release
* instructions.
*/
#define arch_spin_unlock_wait(lock) \
do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
{
unsigned int tmp;
arch_spinlock_t lockval;
asm volatile(
" sevl\n"
"1: wfe\n"
"2: ldaxr %w0, %2\n"
" eor %w1, %w0, %w0, ror #16\n"
" cbnz %w1, 1b\n"
ARM64_LSE_ATOMIC_INSN(
/* LL/SC */
" stxr %w1, %w0, %2\n"
" cbnz %w1, 2b\n", /* Serialise against any concurrent lockers */
/* LSE atomics */
" nop\n"
" nop\n")
: "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
:
: "memory");
}
#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
......
@@ -16,14 +16,19 @@
#ifndef __ASM_STACKTRACE_H
#define __ASM_STACKTRACE_H
struct task_struct;
struct stackframe {
unsigned long fp;
unsigned long sp;
unsigned long pc;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
unsigned int graph;
#endif
};
extern int unwind_frame(struct stackframe *frame);
extern void walk_stackframe(struct stackframe *frame,
extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data);
#endif /* __ASM_STACKTRACE_H */
@@ -73,10 +73,16 @@ register unsigned long current_stack_pointer asm ("sp");
*/
static inline struct thread_info *current_thread_info(void) __attribute_const__;
/*
* struct thread_info can be accessed directly via sp_el0.
*/
static inline struct thread_info *current_thread_info(void)
{
return (struct thread_info *)
(current_stack_pointer & ~(THREAD_SIZE - 1));
unsigned long sp_el0;
asm ("mrs %0, sp_el0" : "=r" (sp_el0));
return (struct thread_info *)sp_el0;
}
#define thread_saved_pc(tsk) \
......
@@ -158,9 +158,3 @@ void apply_alternatives(void *start, size_t length)
__apply_alternatives(&region);
}
void free_alternatives_memory(void)
{
free_reserved_area(__alt_instructions, __alt_instructions_end,
0, "alternatives");
}
@@ -62,7 +62,7 @@ struct insn_emulation {
};
static LIST_HEAD(insn_emulation);
static int nr_insn_emulated;
static int nr_insn_emulated __initdata;
static DEFINE_RAW_SPINLOCK(insn_emulation_lock);
static void register_emulation_hooks(struct insn_emulation_ops *ops)
@@ -173,7 +173,7 @@ static int update_insn_emulation_mode(struct insn_emulation *insn,
return ret;
}
static void register_insn_emulation(struct insn_emulation_ops *ops)
static void __init register_insn_emulation(struct insn_emulation_ops *ops)
{
unsigned long flags;
struct insn_emulation *insn;
@@ -237,7 +237,7 @@ static struct ctl_table ctl_abi[] = {
{ }
};
static void register_insn_emulation_sysctl(struct ctl_table *table)
static void __init register_insn_emulation_sysctl(struct ctl_table *table)
{
unsigned long flags;
int i = 0;
......
@@ -684,7 +684,7 @@ static const struct arm64_cpu_capabilities arm64_hwcaps[] = {
{},
};
static void cap_set_hwcap(const struct arm64_cpu_capabilities *cap)
static void __init cap_set_hwcap(const struct arm64_cpu_capabilities *cap)
{
switch (cap->hwcap_type) {
case CAP_HWCAP:
@@ -729,7 +729,7 @@ static bool __maybe_unused cpus_have_hwcap(const struct arm64_cpu_capabilities *
return rc;
}
static void setup_cpu_hwcaps(void)
static void __init setup_cpu_hwcaps(void)
{
int i;
const struct arm64_cpu_capabilities *hwcaps = arm64_hwcaps;
@@ -758,7 +758,8 @@ void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
* Run through the enabled capabilities and enable() it on all active
* CPUs
*/
static void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
static void __init
enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
{
int i;
@@ -897,7 +898,7 @@ static inline void set_sys_caps_initialised(void)
#endif /* CONFIG_HOTPLUG_CPU */
static void setup_feature_capabilities(void)
static void __init setup_feature_capabilities(void)
{
update_cpu_capabilities(arm64_features, "detected feature:");
enable_cpu_capabilities(arm64_features);
......
@@ -11,317 +11,34 @@
*
*/
#include <linux/atomic.h>
#include <linux/dmi.h>
#include <linux/efi.h>
#include <linux/export.h>
#include <linux/memblock.h>
#include <linux/mm_types.h>
#include <linux/bootmem.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <linux/preempt.h>
#include <linux/rbtree.h>
#include <linux/rwsem.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/init.h>
#include <asm/cacheflush.h>
#include <asm/efi.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
#include <asm/mmu.h>
#include <asm/pgtable.h>
struct efi_memory_map memmap;
static u64 efi_system_table;
static pgd_t efi_pgd[PTRS_PER_PGD] __page_aligned_bss;
static struct mm_struct efi_mm = {
.mm_rb = RB_ROOT,
.pgd = efi_pgd,
.mm_users = ATOMIC_INIT(2),
.mm_count = ATOMIC_INIT(1),
.mmap_sem = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
.page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
};
static int __init is_normal_ram(efi_memory_desc_t *md)
{
if (md->attribute & EFI_MEMORY_WB)
return 1;
return 0;
}
/*
* Translate a EFI virtual address into a physical address: this is necessary,
* as some data members of the EFI system table are virtually remapped after
* SetVirtualAddressMap() has been called.
*/
static phys_addr_t efi_to_phys(unsigned long addr)
int __init efi_create_mapping(struct mm_struct *mm, efi_memory_desc_t *md)
{
efi_memory_desc_t *md;
for_each_efi_memory_desc(&memmap, md) {
if (!(md->attribute & EFI_MEMORY_RUNTIME))
continue;
if (md->virt_addr == 0)
/* no virtual mapping has been installed by the stub */
break;
if (md->virt_addr <= addr &&
(addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
return md->phys_addr + addr - md->virt_addr;
}
return addr;
}
static int __init uefi_init(void)
{
efi_char16_t *c16;
void *config_tables;
u64 table_size;
char vendor[100] = "unknown";
int i, retval;
efi.systab = early_memremap(efi_system_table,
sizeof(efi_system_table_t));
if (efi.systab == NULL) {
pr_warn("Unable to map EFI system table.\n");
return -ENOMEM;
}
set_bit(EFI_BOOT, &efi.flags);
set_bit(EFI_64BIT, &efi.flags);
pteval_t prot_val;
/*
* Verify the EFI Table
*/
/*
* Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
* executable, everything else can be mapped with the XN bits
* set.
*/
if (efi.systab->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) {
pr_err("System table signature incorrect\n");
retval = -EINVAL;
goto out;
}
if ((efi.systab->hdr.revision >> 16) < 2)
pr_warn("Warning: EFI system table version %d.%02d, expected 2.00 or greater\n",
efi.systab->hdr.revision >> 16,
efi.systab->hdr.revision & 0xffff);
/* Show what we know for posterity */
c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
sizeof(vendor) * sizeof(efi_char16_t));
if (c16) {
for (i = 0; i < (int) sizeof(vendor) - 1 && *c16; ++i)
vendor[i] = c16[i];
vendor[i] = '\0';
early_memunmap(c16, sizeof(vendor) * sizeof(efi_char16_t));
}
pr_info("EFI v%u.%.02u by %s\n",
efi.systab->hdr.revision >> 16,
efi.systab->hdr.revision & 0xffff, vendor);
table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
config_tables = early_memremap(efi_to_phys(efi.systab->tables),
table_size);
if (config_tables == NULL) {
pr_warn("Unable to map EFI config table array.\n");
retval = -ENOMEM;
goto out;
}
retval = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
sizeof(efi_config_table_64_t), NULL);
early_memunmap(config_tables, table_size);
out:
early_memunmap(efi.systab, sizeof(efi_system_table_t));
return retval;
}
/*
* Return true for RAM regions we want to permanently reserve.
*/
static __init int is_reserve_region(efi_memory_desc_t *md)
{
switch (md->type) {
case EFI_LOADER_CODE:
case EFI_LOADER_DATA:
case EFI_BOOT_SERVICES_CODE:
case EFI_BOOT_SERVICES_DATA:
case EFI_CONVENTIONAL_MEMORY:
case EFI_PERSISTENT_MEMORY:
return 0;
default:
break;
}
return is_normal_ram(md);
}
static __init void reserve_regions(void)
{
efi_memory_desc_t *md;
u64 paddr, npages, size;
if (efi_enabled(EFI_DBG))
pr_info("Processing EFI memory map:\n");
for_each_efi_memory_desc(&memmap, md) {
paddr = md->phys_addr;
npages = md->num_pages;
if (efi_enabled(EFI_DBG)) {
char buf[64];
pr_info(" 0x%012llx-0x%012llx %s",
paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
efi_md_typeattr_format(buf, sizeof(buf), md));
}
memrange_efi_to_native(&paddr, &npages);
size = npages << PAGE_SHIFT;
if (is_normal_ram(md))
early_init_dt_add_memory_arch(paddr, size);
if (is_reserve_region(md)) {
memblock_reserve(paddr, size);
if (efi_enabled(EFI_DBG))
pr_cont("*");
}
if (efi_enabled(EFI_DBG))
pr_cont("\n");
}
set_bit(EFI_MEMMAP, &efi.flags);
}
void __init efi_init(void)
{
struct efi_fdt_params params;
/* Grab UEFI information placed in FDT by stub */
if (!efi_get_fdt_params(&params))
return;
efi_system_table = params.system_table;
memblock_reserve(params.mmap & PAGE_MASK,
PAGE_ALIGN(params.mmap_size + (params.mmap & ~PAGE_MASK)));
memmap.phys_map = params.mmap;
memmap.map = early_memremap(params.mmap, params.mmap_size);
if (memmap.map == NULL) {
/*
* If we are booting via UEFI, the UEFI memory map is the only
* description of memory we have, so there is little point in
* proceeding if we cannot access it.
*/
panic("Unable to map EFI memory map.\n");
}
memmap.map_end = memmap.map + params.mmap_size;
memmap.desc_size = params.desc_size;
memmap.desc_version = params.desc_ver;
if (uefi_init() < 0)
return;
reserve_regions();
early_memunmap(memmap.map, params.mmap_size);
}
static bool __init efi_virtmap_init(void)
{
efi_memory_desc_t *md;
init_new_context(NULL, &efi_mm);
for_each_efi_memory_desc(&memmap, md) {
pgprot_t prot;
if (!(md->attribute & EFI_MEMORY_RUNTIME))
continue;
if (md->virt_addr == 0)
return false;
pr_info(" EFI remap 0x%016llx => %p\n",
md->phys_addr, (void *)md->virt_addr);
/*
* Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
* executable, everything else can be mapped with the XN bits
* set.
*/
if (!is_normal_ram(md))
prot = __pgprot(PROT_DEVICE_nGnRE);
else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
!PAGE_ALIGNED(md->phys_addr))
prot = PAGE_KERNEL_EXEC;
else
prot = PAGE_KERNEL;
create_pgd_mapping(&efi_mm, md->phys_addr, md->virt_addr,
md->num_pages << EFI_PAGE_SHIFT,
__pgprot(pgprot_val(prot) | PTE_NG));
}
return true;
}
/*
* Enable the UEFI Runtime Services if all prerequisites are in place, i.e.,
* non-early mapping of the UEFI system table and virtual mappings for all
* EFI_MEMORY_RUNTIME regions.
*/
static int __init arm64_enable_runtime_services(void)
{
u64 mapsize;
if (!efi_enabled(EFI_BOOT)) {
pr_info("EFI services will not be available.\n");
return 0;
}
if (efi_runtime_disabled()) {
pr_info("EFI runtime services will be disabled.\n");
return 0;
}
pr_info("Remapping and enabling EFI services.\n");
mapsize = memmap.map_end - memmap.map;
memmap.map = (__force void *)ioremap_cache(memmap.phys_map,
mapsize);
if (!memmap.map) {
pr_err("Failed to remap EFI memory map\n");
return -ENOMEM;
}
memmap.map_end = memmap.map + mapsize;
efi.memmap = &memmap;
efi.systab = (__force void *)ioremap_cache(efi_system_table,
sizeof(efi_system_table_t));
if (!efi.systab) {
pr_err("Failed to remap EFI System Table\n");
return -ENOMEM;
}
set_bit(EFI_SYSTEM_TABLES, &efi.flags);
if (!efi_virtmap_init()) {
pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
return -ENOMEM;
}
/* Set up runtime services function pointers */
efi_native_runtime_setup();
set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
efi.runtime_version = efi.systab->hdr.revision;
if ((md->attribute & EFI_MEMORY_WB) == 0)
prot_val = PROT_DEVICE_nGnRE;
else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
!PAGE_ALIGNED(md->phys_addr))
prot_val = pgprot_val(PAGE_KERNEL_EXEC);
else
prot_val = pgprot_val(PAGE_KERNEL);
create_pgd_mapping(mm, md->phys_addr, md->virt_addr,
md->num_pages << EFI_PAGE_SHIFT,
__pgprot(prot_val | PTE_NG));
return 0;
}
early_initcall(arm64_enable_runtime_services);
static int __init arm64_dmi_init(void)
{
@@ -337,23 +54,6 @@ static int __init arm64_dmi_init(void)
}
core_initcall(arm64_dmi_init);
static void efi_set_pgd(struct mm_struct *mm)
{
switch_mm(NULL, mm, NULL);
}
void efi_virtmap_load(void)
{
preempt_disable();
efi_set_pgd(&efi_mm);
}
void efi_virtmap_unload(void)
{
efi_set_pgd(current->active_mm);
preempt_enable();
}
/*
* UpdateCapsule() depends on the system being shutdown via
* ResetSystem().
......
@@ -27,6 +27,7 @@
#include <asm/cpufeature.h>
#include <asm/errno.h>
#include <asm/esr.h>
#include <asm/irq.h>
#include <asm/thread_info.h>
#include <asm/unistd.h>
@@ -88,9 +89,12 @@
.if \el == 0
mrs x21, sp_el0
get_thread_info tsk // Ensure MDSCR_EL1.SS is clear,
mov tsk, sp
and tsk, tsk, #~(THREAD_SIZE - 1) // Ensure MDSCR_EL1.SS is clear,
ldr x19, [tsk, #TI_FLAGS] // since we can unmask debug
disable_step_tsk x19, x20 // exceptions when scheduling.
mov x29, xzr // fp pointed to user-space
.else
add x21, sp, #S_FRAME_SIZE
.endif
@@ -107,6 +111,13 @@
str x21, [sp, #S_SYSCALLNO]
.endif
/*
* Set sp_el0 to current thread_info.
*/
.if \el == 0
msr sp_el0, tsk
.endif
/*
* Registers that may be useful after this macro is invoked:
*
@@ -164,8 +175,44 @@ alternative_endif
.endm
.macro get_thread_info, rd
mov \rd, sp
and \rd, \rd, #~(THREAD_SIZE - 1) // top of stack
mrs \rd, sp_el0
.endm
.macro irq_stack_entry
mov x19, sp // preserve the original sp
/*
* Compare sp with the current thread_info, if the top
* ~(THREAD_SIZE - 1) bits match, we are on a task stack, and
* should switch to the irq stack.
*/
and x25, x19, #~(THREAD_SIZE - 1)
cmp x25, tsk
b.ne 9998f
this_cpu_ptr irq_stack, x25, x26
mov x26, #IRQ_STACK_START_SP
add x26, x25, x26
/* switch to the irq stack */
mov sp, x26
/*
* Add a dummy stack frame, this non-standard format is fixed up
* by unwind_frame()
*/
stp x29, x19, [sp, #-16]!
mov x29, sp
9998:
.endm
/*
* x19 should be preserved between irq_stack_entry and
* irq_stack_exit.
*/
.macro irq_stack_exit
mov sp, x19
.endm
/*
@@ -183,10 +230,11 @@ tsk .req x28 // current thread_info
* Interrupt handling.
*/
.macro irq_handler
adrp x1, handle_arch_irq
ldr x1, [x1, #:lo12:handle_arch_irq]
ldr_l x1, handle_arch_irq
mov x0, sp
irq_stack_entry
blr x1
irq_stack_exit
.endm
.text
@@ -358,10 +406,10 @@ el1_irq:
bl trace_hardirqs_off
#endif
get_thread_info tsk
irq_handler
#ifdef CONFIG_PREEMPT
get_thread_info tsk
ldr w24, [tsk, #TI_PREEMPT] // get preempt count
cbnz w24, 1f // preempt count != 0
ldr x0, [tsk, #TI_FLAGS] // get flags
@@ -599,6 +647,8 @@ ENTRY(cpu_switch_to)
ldp x29, x9, [x8], #16
ldr lr, [x8]
mov sp, x9
and x9, x9, #~(THREAD_SIZE - 1)
msr sp_el0, x9
ret
ENDPROC(cpu_switch_to)
@@ -626,14 +676,14 @@ ret_fast_syscall_trace:
work_pending:
tbnz x1, #TIF_NEED_RESCHED, work_resched
/* TIF_SIGPENDING, TIF_NOTIFY_RESUME or TIF_FOREIGN_FPSTATE case */
ldr x2, [sp, #S_PSTATE]
mov x0, sp // 'regs'
tst x2, #PSR_MODE_MASK // user mode regs?
b.ne no_work_pending // returning to kernel
enable_irq // enable interrupts for do_notify_resume()
bl do_notify_resume
b ret_to_user
work_resched:
#ifdef CONFIG_TRACE_IRQFLAGS
bl trace_hardirqs_off // the IRQs are off here, inform the tracing code
#endif
bl schedule
/*
@@ -645,7 +695,6 @@ ret_to_user:
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
enable_step_tsk x1, x2
no_work_pending:
kernel_exit 0
ENDPROC(ret_to_user)
......
@@ -289,7 +289,7 @@ static struct notifier_block fpsimd_cpu_pm_notifier_block = {
.notifier_call = fpsimd_cpu_pm_notifier,
};
static void fpsimd_pm_init(void)
static void __init fpsimd_pm_init(void)
{
cpu_pm_register_notifier(&fpsimd_cpu_pm_notifier_block);
}
......
@@ -29,12 +29,11 @@ static int ftrace_modify_code(unsigned long pc, u32 old, u32 new,
/*
* Note:
* Due to modules and __init, code can disappear and change,
* we need to protect against faulting as well as code changing.
* We do this by aarch64_insn_*() which use the probe_kernel_*().
*
* No lock is held here because all the modifications are run
* through stop_machine().
* We are paranoid about modifying text, as if a bug were to happen, it
* could cause us to read or write to someplace that could cause harm.
* Carefully read and modify the code with aarch64_insn_*() which uses
* probe_kernel_*(), and make sure what we read is what we expected it
* to be before modifying it.
*/
if (validate) {
if (aarch64_insn_read((void *)pc, &replaced))
@@ -93,6 +92,11 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
return ftrace_modify_code(pc, old, new, true);
}
void arch_ftrace_update_code(int command)
{
ftrace_modify_all_code(command);
}
int __init ftrace_dyn_arch_init(void)
{
return 0;
@@ -125,23 +129,20 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
* on other archs. It's unlikely on AArch64.
*/
old = *parent;
*parent = return_hooker;
trace.func = self_addr;
trace.depth = current->curr_ret_stack + 1;
/* Only trace if the calling function expects to */
if (!ftrace_graph_entry(&trace)) {
*parent = old;
if (!ftrace_graph_entry(&trace))
return;
}
err = ftrace_push_return_trace(old, self_addr, &trace.depth,
frame_pointer);
if (err == -EBUSY) {
*parent = old;
if (err == -EBUSY)
return;
}
else
*parent = return_hooker;
}
#ifdef CONFIG_DYNAMIC_FTRACE
......
@@ -415,15 +415,17 @@ ENDPROC(__create_page_tables)
*/
.set initial_sp, init_thread_union + THREAD_START_SP
__mmap_switched:
adr_l x6, __bss_start
adr_l x7, __bss_stop
1: cmp x6, x7
b.hs 2f
str xzr, [x6], #8 // Clear BSS
b 1b
2:
// Clear BSS
adr_l x0, __bss_start
mov x1, xzr
adr_l x2, __bss_stop
sub x2, x2, x0
bl __pi_memset
adr_l sp, initial_sp, x4
mov x4, sp
and x4, x4, #~(THREAD_SIZE - 1)
msr sp_el0, x4 // Save thread_info
str_l x21, __fdt_pointer, x5 // Save FDT pointer
str_l x24, memstart_addr, x6 // Save PHYS_OFFSET
mov x29, #0
@@ -606,6 +608,8 @@ ENDPROC(secondary_startup)
ENTRY(__secondary_switched)
ldr x0, [x21] // get secondary_data.stack
mov sp, x0
and x0, x0, #~(THREAD_SIZE - 1)
msr sp_el0, x0 // save thread_info
mov x29, #0
b secondary_start_kernel
ENDPROC(__secondary_switched)
......
@@ -30,6 +30,9 @@
unsigned long irq_err_count;
/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
int arch_show_interrupts(struct seq_file *p, int prec)
{
show_ipi_list(p, prec);
......
@@ -30,9 +30,6 @@
#include <asm/insn.h>
#include <asm/sections.h>
#define AARCH64_INSN_IMM_MOVNZ AARCH64_INSN_IMM_MAX
#define AARCH64_INSN_IMM_MOVK AARCH64_INSN_IMM_16
void *module_alloc(unsigned long size)
{
void *p;
@@ -75,15 +72,18 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, void *place, u64 val)
static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
{
u64 imm_mask = (1 << len) - 1;
s64 sval = do_reloc(op, place, val);
switch (len) {
case 16:
*(s16 *)place = sval;
if (sval < S16_MIN || sval > U16_MAX)
return -ERANGE;
break;
case 32:
*(s32 *)place = sval;
if (sval < S32_MIN || sval > U32_MAX)
return -ERANGE;
break;
case 64:
*(s64 *)place = sval;
@@ -92,34 +92,23 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
pr_err("Invalid length (%d) for data relocation\n", len);
return 0;
}
/*
* Extract the upper value bits (including the sign bit) and
* shift them to bit 0.
*/
sval = (s64)(sval & ~(imm_mask >> 1)) >> (len - 1);
/*
* Overflow has occurred if the value is not representable in
* len bits (i.e the bottom len bits are not sign-extended and
* the top bits are not all zero).
*/
if ((u64)(sval + 1) > 2)
return -ERANGE;
return 0;
}
enum aarch64_insn_movw_imm_type {
AARCH64_INSN_IMM_MOVNZ,
AARCH64_INSN_IMM_MOVKZ,
};
static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
int lsb, enum aarch64_insn_imm_type imm_type)
int lsb, enum aarch64_insn_movw_imm_type imm_type)
{
u64 imm, limit = 0;
u64 imm;
s64 sval;
u32 insn = le32_to_cpu(*(u32 *)place);
sval = do_reloc(op, place, val);
sval >>= lsb;
imm = sval & 0xffff;
imm = sval >> lsb;
if (imm_type == AARCH64_INSN_IMM_MOVNZ) {
/*
@@ -128,7 +117,7 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
* immediate is less than zero.
*/
insn &= ~(3 << 29);
if ((s64)imm >= 0) {
if (sval >= 0) {
/* >=0: Set the instruction to MOVZ (opcode 10b). */
insn |= 2 << 29;
} else {
@@ -140,29 +129,13 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
*/
imm = ~imm;
}
imm_type = AARCH64_INSN_IMM_MOVK;
}
/* Update the instruction with the new encoding. */
insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
*(u32 *)place = cpu_to_le32(insn);
/* Shift out the immediate field. */
sval >>= 16;
/*
* For unsigned immediates, the overflow check is straightforward.
* For signed immediates, the sign bit is actually the bit past the
* most significant bit of the field.
* The AARCH64_INSN_IMM_16 immediate type is unsigned.
*/
if (imm_type != AARCH64_INSN_IMM_16) {
sval++;
limit++;
}
/* Check the upper bits depending on the sign of the immediate. */
if ((u64)sval > limit)
if (imm > U16_MAX)
return -ERANGE;
return 0;
@@ -267,25 +240,25 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
overflow_check = false;
case R_AARCH64_MOVW_UABS_G0:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
AARCH64_INSN_IMM_16);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_UABS_G1_NC:
overflow_check = false;
case R_AARCH64_MOVW_UABS_G1:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
AARCH64_INSN_IMM_16);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_UABS_G2_NC:
overflow_check = false;
case R_AARCH64_MOVW_UABS_G2:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
AARCH64_INSN_IMM_16);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_UABS_G3:
/* We're using the top bits so we can't overflow. */
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
AARCH64_INSN_IMM_16);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_SABS_G0:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
@@ -302,7 +275,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_MOVW_PREL_G0_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
AARCH64_INSN_IMM_MOVK);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_PREL_G0:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
@@ -311,7 +284,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_MOVW_PREL_G1_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
AARCH64_INSN_IMM_MOVK);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_PREL_G1:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
@@ -320,7 +293,7 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_MOVW_PREL_G2_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
AARCH64_INSN_IMM_MOVK);
AARCH64_INSN_IMM_MOVKZ);
break;
case R_AARCH64_MOVW_PREL_G2:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
......
@@ -164,8 +164,11 @@ void perf_callchain_kernel(struct perf_callchain_entry *entry,
frame.fp = regs->regs[29];
frame.sp = regs->sp;
frame.pc = regs->pc;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = current->curr_ret_stack;
#endif
walk_stackframe(&frame, callchain_trace, entry);
walk_stackframe(current, &frame, callchain_trace, entry);
}
unsigned long perf_instruction_pointer(struct pt_regs *regs)
......
@@ -344,11 +344,14 @@ unsigned long get_wchan(struct task_struct *p)
frame.fp = thread_saved_fp(p);
frame.sp = thread_saved_sp(p);
frame.pc = thread_saved_pc(p);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = p->curr_ret_stack;
#endif
stack_page = (unsigned long)task_stack_page(p);
do {
if (frame.sp < stack_page ||
frame.sp >= stack_page + THREAD_SIZE ||
unwind_frame(&frame))
unwind_frame(p, &frame))
return 0;
if (!in_sched_functions(frame.pc))
return frame.pc;
......
@@ -58,6 +58,12 @@
*/
void ptrace_disable(struct task_struct *child)
{
/*
* This would be better off in core code, but PTRACE_DETACH has
* grown its fair share of arch-specific warts and changing it
* is likely to cause regressions on obscure architectures.
*/
user_disable_single_step(child);
}
#ifdef CONFIG_HAVE_HW_BREAKPOINT
......
@@ -43,8 +43,11 @@ void *return_address(unsigned int level)
frame.fp = (unsigned long)__builtin_frame_address(0);
frame.sp = current_stack_pointer;
frame.pc = (unsigned long)return_address; /* dummy */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = current->curr_ret_stack;
#endif
walk_stackframe(&frame, save_return_addr, &data);
walk_stackframe(current, &frame, save_return_addr, &data);
if (!data.level)
return data.addr;
......
@@ -173,6 +173,9 @@ ENTRY(cpu_resume)
/* load physical address of identity map page table in x1 */
adrp x1, idmap_pg_dir
mov sp, x2
/* save thread_info */
and x2, x2, #~(THREAD_SIZE - 1)
msr sp_el0, x2
/*
* cpu_do_resume expects x0 to contain context physical address
* pointer and x1 to contain physical address of 1:1 page tables
......
@@ -17,9 +17,11 @@
*/
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/ftrace.h>
#include <linux/sched.h>
#include <linux/stacktrace.h>
#include <asm/irq.h>
#include <asm/stacktrace.h>
/*
@@ -35,25 +37,83 @@
* ldp x29, x30, [sp]
* add sp, sp, #0x10
*/
int notrace unwind_frame(struct stackframe *frame)
int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
{
unsigned long high, low;
unsigned long fp = frame->fp;
unsigned long irq_stack_ptr;
/*
* Use raw_smp_processor_id() to avoid false-positives from
* CONFIG_DEBUG_PREEMPT. get_wchan() calls unwind_frame() on sleeping
* task stacks, we can be pre-empted in this case, so
* {raw_,}smp_processor_id() may give us the wrong value. Sleeping
* tasks can't ever be on an interrupt stack, so regardless of cpu,
* the checks will always fail.
*/
irq_stack_ptr = IRQ_STACK_PTR(raw_smp_processor_id());
low = frame->sp;
high = ALIGN(low, THREAD_SIZE);
/* irq stacks are not THREAD_SIZE aligned */
if (on_irq_stack(frame->sp, raw_smp_processor_id()))
high = irq_stack_ptr;
else
high = ALIGN(low, THREAD_SIZE) - 0x20;
if (fp < low || fp > high - 0x18 || fp & 0xf)
if (fp < low || fp > high || fp & 0xf)
return -EINVAL;
frame->sp = fp + 0x10;
frame->fp = *(unsigned long *)(fp);
frame->pc = *(unsigned long *)(fp + 8);
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
if (tsk && tsk->ret_stack &&
(frame->pc == (unsigned long)return_to_handler)) {
/*
* This is a case where function graph tracer has
* modified a return address (LR) in a stack frame
* to hook a function return.
* So replace it to an original value.
*/
frame->pc = tsk->ret_stack[frame->graph--].ret;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
/*
* Check whether we are going to walk through from interrupt stack
* to task stack.
* If we reach the end of the stack - and it's an interrupt stack,
* unpack the dummy frame to find the original elr.
*
* Check the frame->fp we read from the bottom of the irq_stack,
* and the original task stack pointer are both in current->stack.
*/
if (frame->sp == irq_stack_ptr) {
struct pt_regs *irq_args;
unsigned long orig_sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
if (object_is_on_stack((void *)orig_sp) &&
object_is_on_stack((void *)frame->fp)) {
frame->sp = orig_sp;
/* orig_sp is the saved pt_regs, find the elr */
irq_args = (struct pt_regs *)orig_sp;
frame->pc = irq_args->pc;
} else {
/*
* This frame has a non-standard format, and we
* didn't fix it, because the data looked wrong.
* Refuse to output this frame.
*/
return -EINVAL;
}
}
return 0;
}
void notrace walk_stackframe(struct stackframe *frame,
void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data)
{
while (1) {
@@ -61,7 +121,7 @@ void notrace walk_stackframe(struct stackframe *frame,
if (fn(frame, data))
break;
ret = unwind_frame(frame);
ret = unwind_frame(tsk, frame);
if (ret < 0)
break;
}
@@ -112,8 +172,11 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
frame.sp = current_stack_pointer;
frame.pc = (unsigned long)save_stack_trace_tsk;
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = tsk->curr_ret_stack;
#endif
walk_stackframe(&frame, save_trace, &data);
walk_stackframe(tsk, &frame, save_trace, &data);
if (trace->nr_entries < trace->max_entries)
trace->entries[trace->nr_entries++] = ULONG_MAX;
}
......
@@ -52,8 +52,11 @@ unsigned long profile_pc(struct pt_regs *regs)
frame.fp = regs->regs[29];
frame.sp = regs->sp;
frame.pc = regs->pc;
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = -1; /* no task info */
#endif
do {
int ret = unwind_frame(&frame);
int ret = unwind_frame(NULL, &frame);
if (ret < 0)
return 0;
} while (in_lock_functions(frame.pc));
......
@@ -146,17 +146,15 @@ static void dump_instr(const char *lvl, struct pt_regs *regs)
static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
{
struct stackframe frame;
unsigned long irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id());
int skip;
pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
if (!tsk)
tsk = current;
if (regs) {
frame.fp = regs->regs[29];
frame.sp = regs->sp;
frame.pc = regs->pc;
} else if (tsk == current) {
if (tsk == current) {
frame.fp = (unsigned long)__builtin_frame_address(0);
frame.sp = current_stack_pointer;
frame.pc = (unsigned long)dump_backtrace;
@@ -168,21 +166,49 @@ static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
frame.sp = thread_saved_sp(tsk);
frame.pc = thread_saved_pc(tsk);
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
frame.graph = tsk->curr_ret_stack;
#endif
pr_emerg("Call trace:\n");
skip = !!regs;
printk("Call trace:\n");
while (1) {
unsigned long where = frame.pc;
unsigned long stack;
int ret;
dump_backtrace_entry(where);
ret = unwind_frame(&frame);
/* skip until specified stack frame */
if (!skip) {
dump_backtrace_entry(where);
} else if (frame.fp == regs->regs[29]) {
skip = 0;
/*
* Mostly, this is the case where this function is
* called in panic/abort. As exception handler's
* stack frame does not contain the corresponding pc
* at which an exception has taken place, use regs->pc
* instead.
*/
dump_backtrace_entry(regs->pc);
}
ret = unwind_frame(tsk, &frame);
if (ret < 0)
break;
stack = frame.sp;
if (in_exception_text(where))
if (in_exception_text(where)) {
/*
* If we switched to the irq_stack before calling this
* exception handler, then the pt_regs will be on the
* task stack. The easiest way to tell is if the large
* pt_regs would overlap with the end of the irq_stack.
*/
if (stack < irq_stack_ptr &&
(stack + sizeof(struct pt_regs)) > irq_stack_ptr)
stack = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
dump_mem("", "Exception stack", stack,
stack + sizeof(struct pt_regs), false);
}
}
}
@@ -456,22 +482,22 @@ asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr)
void __pte_error(const char *file, int line, unsigned long val)
{
pr_crit("%s:%d: bad pte %016lx.\n", file, line, val);
pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
}
void __pmd_error(const char *file, int line, unsigned long val)
{
pr_crit("%s:%d: bad pmd %016lx.\n", file, line, val);
pr_err("%s:%d: bad pmd %016lx.\n", file, line, val);
}
void __pud_error(const char *file, int line, unsigned long val)
{
pr_crit("%s:%d: bad pud %016lx.\n", file, line, val);
pr_err("%s:%d: bad pud %016lx.\n", file, line, val);
}
void __pgd_error(const char *file, int line, unsigned long val)
{
pr_crit("%s:%d: bad pgd %016lx.\n", file, line, val);
pr_err("%s:%d: bad pgd %016lx.\n", file, line, val);
}
/* GENERIC_BUG traps */
......
@@ -113,7 +113,6 @@ SECTIONS
*(.got) /* Global offset table */
}
ALIGN_DEBUG_RO
RO_DATA(PAGE_SIZE)
EXCEPTION_TABLE(8)
NOTES
@@ -128,7 +127,6 @@ SECTIONS
ARM_EXIT_KEEP(EXIT_TEXT)
}
ALIGN_DEBUG_RO_MIN(16)
.init.data : {
INIT_DATA
INIT_SETUP(16)
@@ -143,9 +141,6 @@ SECTIONS
PERCPU_SECTION(L1_CACHE_BYTES)
. = ALIGN(PAGE_SIZE);
__init_end = .;
. = ALIGN(4);
.altinstructions : {
__alt_instructions = .;
@@ -157,6 +152,8 @@ SECTIONS
}
. = ALIGN(PAGE_SIZE);
__init_end = .;
_data = .;
_sdata = .;
RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
......
@@ -81,25 +81,31 @@ ENDPROC(__flush_cache_user_range)
/*
* __flush_dcache_area(kaddr, size)
*
* Ensure that the data held in the page kaddr is written back to the
* page in question.
* Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
* are cleaned and invalidated to the PoC.
*
* - kaddr - kernel address
* - size - size in question
*/
ENTRY(__flush_dcache_area)
dcache_line_size x2, x3
add x1, x0, x1
sub x3, x2, #1
bic x0, x0, x3
1: dc civac, x0 // clean & invalidate D line / unified line
add x0, x0, x2
cmp x0, x1
b.lo 1b
dsb sy
dcache_by_line_op civac, sy, x0, x1, x2, x3
ret
ENDPIPROC(__flush_dcache_area)
/*
* __clean_dcache_area_pou(kaddr, size)
*
* Ensure that any D-cache lines for the interval [kaddr, kaddr+size)
* are cleaned to the PoU.
*
* - kaddr - kernel address
* - size - size in question
*/
ENTRY(__clean_dcache_area_pou)
dcache_by_line_op cvau, ish, x0, x1, x2, x3
ret
ENDPROC(__clean_dcache_area_pou)
/*
* __inval_cache_range(start, end)
* - start - start address of region
......
@@ -24,8 +24,9 @@
void __cpu_copy_user_page(void *kto, const void *kfrom, unsigned long vaddr)
{
struct page *page = virt_to_page(kto);
copy_page(kto, kfrom);
__flush_dcache_area(kto, PAGE_SIZE);
flush_dcache_page(page);
}
EXPORT_SYMBOL_GPL(__cpu_copy_user_page);
......
@@ -40,7 +40,7 @@ static pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
static struct gen_pool *atomic_pool;
#define DEFAULT_DMA_COHERENT_POOL_SIZE SZ_256K
static size_t atomic_pool_size = DEFAULT_DMA_COHERENT_POOL_SIZE;
static size_t atomic_pool_size __initdata = DEFAULT_DMA_COHERENT_POOL_SIZE;
static int __init early_coherent_pool(char *p)
{
@@ -896,7 +896,7 @@ static int __iommu_attach_notifier(struct notifier_block *nb,
return 0;
}
static int register_iommu_dma_ops_notifier(struct bus_type *bus)
static int __init register_iommu_dma_ops_notifier(struct bus_type *bus)
{
struct notifier_block *nb = kzalloc(sizeof(*nb), GFP_KERNEL);
int ret;
......
@@ -34,19 +34,24 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
__flush_icache_all();
}
static void sync_icache_aliases(void *kaddr, unsigned long len)
{
unsigned long addr = (unsigned long)kaddr;
if (icache_is_aliasing()) {
__clean_dcache_area_pou(kaddr, len);
__flush_icache_all();
} else {
flush_icache_range(addr, addr + len);
}
}
static void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
unsigned long uaddr, void *kaddr,
unsigned long len)
{
if (vma->vm_flags & VM_EXEC) {
unsigned long addr = (unsigned long)kaddr;
if (icache_is_aliasing()) {
__flush_dcache_area(kaddr, len);
__flush_icache_all();
} else {
flush_icache_range(addr, addr + len);
}
}
if (vma->vm_flags & VM_EXEC)
sync_icache_aliases(kaddr, len);
}
/*
@@ -74,13 +79,11 @@ void __sync_icache_dcache(pte_t pte, unsigned long addr)
if (!page_mapping(page))
return;
if (!test_and_set_bit(PG_dcache_clean, &page->flags)) {
__flush_dcache_area(page_address(page),
PAGE_SIZE << compound_order(page));
if (!test_and_set_bit(PG_dcache_clean, &page->flags))
sync_icache_aliases(page_address(page),
PAGE_SIZE << compound_order(page));
else if (icache_is_aivivt())
__flush_icache_all();
} else if (icache_is_aivivt()) {
__flush_icache_all();
}
}
/*
......
@@ -41,17 +41,289 @@ int pud_huge(pud_t pud)
#endif
}
static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, size_t *pgsize)
{
pgd_t *pgd = pgd_offset(mm, addr);
pud_t *pud;
pmd_t *pmd;
*pgsize = PAGE_SIZE;
if (!pte_cont(pte))
return 1;
if (!pgd_present(*pgd)) {
VM_BUG_ON(!pgd_present(*pgd));
return 1;
}
pud = pud_offset(pgd, addr);
if (!pud_present(*pud)) {
VM_BUG_ON(!pud_present(*pud));
return 1;
}
pmd = pmd_offset(pud, addr);
if (!pmd_present(*pmd)) {
VM_BUG_ON(!pmd_present(*pmd));
return 1;
}
if ((pte_t *)pmd == ptep) {
*pgsize = PMD_SIZE;
return CONT_PMDS;
}
return CONT_PTES;
}
void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
size_t pgsize;
int i;
int ncontig = find_num_contig(mm, addr, ptep, pte, &pgsize);
unsigned long pfn;
pgprot_t hugeprot;
if (ncontig == 1) {
set_pte_at(mm, addr, ptep, pte);
return;
}
pfn = pte_pfn(pte);
hugeprot = __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte));
for (i = 0; i < ncontig; i++) {
pr_debug("%s: set pte %p to 0x%llx\n", __func__, ptep,
pte_val(pfn_pte(pfn, hugeprot)));
set_pte_at(mm, addr, ptep, pfn_pte(pfn, hugeprot));
ptep++;
pfn += pgsize >> PAGE_SHIFT;
addr += pgsize;
}
}
pte_t *huge_pte_alloc(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
pgd_t *pgd;
pud_t *pud;
pte_t *pte = NULL;
pr_debug("%s: addr:0x%lx sz:0x%lx\n", __func__, addr, sz);
pgd = pgd_offset(mm, addr);
pud = pud_alloc(mm, pgd, addr);
if (!pud)
return NULL;
if (sz == PUD_SIZE) {
pte = (pte_t *)pud;
} else if (sz == (PAGE_SIZE * CONT_PTES)) {
pmd_t *pmd = pmd_alloc(mm, pud, addr);
WARN_ON(addr & (sz - 1));
/*
* Note that if this code were ever ported to the
* 32-bit arm platform then it will cause trouble in
* the case where CONFIG_HIGHPTE is set, since there
* will be no pte_unmap() to correspond with this
* pte_alloc_map().
*/
pte = pte_alloc_map(mm, NULL, pmd, addr);
} else if (sz == PMD_SIZE) {
if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) &&
pud_none(*pud))
pte = huge_pmd_share(mm, addr, pud);
else
pte = (pte_t *)pmd_alloc(mm, pud, addr);
} else if (sz == (PMD_SIZE * CONT_PMDS)) {
pmd_t *pmd;
pmd = pmd_alloc(mm, pud, addr);
WARN_ON(addr & (sz - 1));
return (pte_t *)pmd;
}
pr_debug("%s: addr:0x%lx sz:0x%lx ret pte=%p/0x%llx\n", __func__, addr,
sz, pte, pte_val(*pte));
return pte;
}
pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
{
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd = NULL;
pte_t *pte = NULL;
pgd = pgd_offset(mm, addr);
pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
if (!pgd_present(*pgd))
return NULL;
pud = pud_offset(pgd, addr);
if (!pud_present(*pud))
return NULL;
if (pud_huge(*pud))
return (pte_t *)pud;
pmd = pmd_offset(pud, addr);
if (!pmd_present(*pmd))
return NULL;
if (pte_cont(pmd_pte(*pmd))) {
pmd = pmd_offset(
pud, (addr & CONT_PMD_MASK));
return (pte_t *)pmd;
}
if (pmd_huge(*pmd))
return (pte_t *)pmd;
pte = pte_offset_kernel(pmd, addr);
if (pte_present(*pte) && pte_cont(*pte)) {
pte = pte_offset_kernel(
pmd, (addr & CONT_PTE_MASK));
return pte;
}
return NULL;
}
pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
struct page *page, int writable)
{
size_t pagesize = huge_page_size(hstate_vma(vma));
if (pagesize == CONT_PTE_SIZE) {
entry = pte_mkcont(entry);
} else if (pagesize == CONT_PMD_SIZE) {
entry = pmd_pte(pmd_mkcont(pte_pmd(entry)));
} else if (pagesize != PUD_SIZE && pagesize != PMD_SIZE) {
pr_warn("%s: unrecognized huge page size 0x%lx\n",
__func__, pagesize);
}
return entry;
}
pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
pte_t pte;
if (pte_cont(*ptep)) {
int ncontig, i;
size_t pgsize;
pte_t *cpte;
bool is_dirty = false;
cpte = huge_pte_offset(mm, addr);
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
/* save the 1st pte to return */
pte = ptep_get_and_clear(mm, addr, cpte);
for (i = 1; i < ncontig; ++i) {
/*
* If HW_AFDBM is enabled, then the HW could
* turn on the dirty bit for any of the page
* in the set, so check them all.
*/
++cpte;
if (pte_dirty(ptep_get_and_clear(mm, addr, cpte)))
is_dirty = true;
}
if (is_dirty)
return pte_mkdirty(pte);
else
return pte;
} else {
return ptep_get_and_clear(mm, addr, ptep);
}
}
int huge_ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t pte, int dirty)
{
pte_t *cpte;
if (pte_cont(pte)) {
int ncontig, i, changed = 0;
size_t pgsize = 0;
unsigned long pfn = pte_pfn(pte);
/* Select all bits except the pfn */
pgprot_t hugeprot =
__pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^
pte_val(pte));
cpte = huge_pte_offset(vma->vm_mm, addr);
pfn = pte_pfn(*cpte);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
for (i = 0; i < ncontig; ++i, ++cpte) {
changed = ptep_set_access_flags(vma, addr, cpte,
pfn_pte(pfn,
hugeprot),
dirty);
pfn += pgsize >> PAGE_SHIFT;
}
return changed;
} else {
return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
}
}
void huge_ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
if (pte_cont(*ptep)) {
int ncontig, i;
pte_t *cpte;
size_t pgsize = 0;
cpte = huge_pte_offset(mm, addr);
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
for (i = 0; i < ncontig; ++i, ++cpte)
ptep_set_wrprotect(mm, addr, cpte);
} else {
ptep_set_wrprotect(mm, addr, ptep);
}
}
void huge_ptep_clear_flush(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
{
if (pte_cont(*ptep)) {
int ncontig, i;
pte_t *cpte;
size_t pgsize = 0;
cpte = huge_pte_offset(vma->vm_mm, addr);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
for (i = 0; i < ncontig; ++i, ++cpte)
ptep_clear_flush(vma, addr, cpte);
} else {
ptep_clear_flush(vma, addr, ptep);
}
}
static __init int setup_hugepagesz(char *opt)
{
unsigned long ps = memparse(opt, &opt);
if (ps == PMD_SIZE) {
hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
} else if (ps == PUD_SIZE) {
hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
} else if (ps == (PAGE_SIZE * CONT_PTES)) {
hugetlb_add_hstate(CONT_PTE_SHIFT);
} else if (ps == (PMD_SIZE * CONT_PMDS)) {
hugetlb_add_hstate((PMD_SHIFT + CONT_PMD_SHIFT) - PAGE_SHIFT);
} else {
pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20);
pr_err("hugepagesz: Unsupported page size %lu K\n", ps >> 10);
return 0;
}
return 1;
}
__setup("hugepagesz=", setup_hugepagesz);
#ifdef CONFIG_ARM64_64K_PAGES
static __init int add_default_hugepagesz(void)
{
if (size_to_hstate(CONT_PTES * PAGE_SIZE) == NULL)
hugetlb_add_hstate(CONT_PMD_SHIFT);
return 0;
}
arch_initcall(add_default_hugepagesz);
#endif
@@ -71,7 +71,7 @@ early_param("initrd", early_initrd);
* currently assumes that for memory starting above 4G, 32-bit devices will
* use a DMA offset.
*/
static phys_addr_t max_zone_dma_phys(void)
static phys_addr_t __init max_zone_dma_phys(void)
{
phys_addr_t offset = memblock_start_of_DRAM() & GENMASK_ULL(63, 32);
return min(offset + (1ULL << 32), memblock_end_of_DRAM());
@@ -120,17 +120,17 @@ static void __init zone_sizes_init(unsigned long min, unsigned long max)
#ifdef CONFIG_HAVE_ARCH_PFN_VALID
int pfn_valid(unsigned long pfn)
{
-	return memblock_is_memory(pfn << PAGE_SHIFT);
+	return memblock_is_map_memory(pfn << PAGE_SHIFT);
}
EXPORT_SYMBOL(pfn_valid);
#endif
#ifndef CONFIG_SPARSEMEM
-static void arm64_memory_present(void)
+static void __init arm64_memory_present(void)
{
}
#else
-static void arm64_memory_present(void)
+static void __init arm64_memory_present(void)
{
struct memblock_region *reg;
@@ -360,7 +360,6 @@ void free_initmem(void)
{
fixup_init();
free_initmem_default(0);
-	free_alternatives_memory();
}
#ifdef CONFIG_BLK_DEV_INITRD
@@ -251,6 +251,14 @@ static void __create_mapping(struct mm_struct *mm, pgd_t *pgd,
{
unsigned long addr, length, end, next;
/*
* If the virtual and physical addresses don't have the same offset
* within a page, we cannot map the region as the caller expects.
*/
if (WARN_ON((phys ^ virt) & ~PAGE_MASK))
return;
phys &= PAGE_MASK;
addr = virt & PAGE_MASK;
length = PAGE_ALIGN(size + (virt & ~PAGE_MASK));
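The new guard rejects requests where the physical and virtual addresses disagree in their in-page offset, since such a region cannot be expressed with page-granular mappings. A self-contained sketch of the same test (standalone constants, assuming 4K pages; values in the trailing comment are hypothetical):

	#include <stdbool.h>
	#include <stdint.h>

	#define PAGE_SHIFT	12
	#define PAGE_SIZE	(1UL << PAGE_SHIFT)
	#define PAGE_MASK	(~(PAGE_SIZE - 1))

	static bool same_page_offset(uint64_t phys, uint64_t virt)
	{
		/*
		 * XOR cancels the bits that agree; masking with ~PAGE_MASK
		 * keeps only the in-page offset bits, which must all match.
		 */
		return ((phys ^ virt) & ~PAGE_MASK) == 0;
	}

	/* same_page_offset(0x80001234, 0xffff000000001234) -> true  */
	/* same_page_offset(0x80001234, 0xffff000000001000) -> false */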
@@ -280,7 +288,7 @@ static void __init create_mapping(phys_addr_t phys, unsigned long virt,
&phys, virt);
return;
}
-	__create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK), phys, virt,
+	__create_mapping(&init_mm, pgd_offset_k(virt), phys, virt,
size, prot, early_alloc);
}
@@ -301,7 +309,7 @@ static void create_mapping_late(phys_addr_t phys, unsigned long virt,
return;
}
-	return __create_mapping(&init_mm, pgd_offset_k(virt & PAGE_MASK),
+	return __create_mapping(&init_mm, pgd_offset_k(virt),
phys, virt, size, prot, late_alloc);
}
@@ -372,6 +380,8 @@ static void __init map_mem(void)
if (start >= end)
break;
if (memblock_is_nomap(reg))
continue;
if (ARM64_SWAPPER_USES_SECTION_MAPS) {
/*
@@ -456,6 +466,9 @@ void __init paging_init(void)
empty_zero_page = virt_to_page(zero_page);
/* Ensure the zero page is visible to the page table walker */
dsb(ishst);
/*
* TTBR0 is only used for the identity mapping at this stage. Make it
* point to zero page to avoid speculatively fetching new entries.
@@ -46,14 +46,14 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
kmem_cache_free(pgd_cache, pgd);
}
-static int __init pgd_cache_init(void)
+void __init pgd_cache_init(void)
{
+	if (PGD_SIZE == PAGE_SIZE)
+		return;
	/*
	 * Naturally aligned pgds required by the architecture.
	 */
-	if (PGD_SIZE != PAGE_SIZE)
-		pgd_cache = kmem_cache_create("pgd_cache", PGD_SIZE, PGD_SIZE,
-					      SLAB_PANIC, NULL);
-	return 0;
+	pgd_cache = kmem_cache_create("pgd_cache", PGD_SIZE, PGD_SIZE,
+				      SLAB_PANIC, NULL);
}
-core_initcall(pgd_cache_init);
@@ -62,3 +62,25 @@
bfi \valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
#endif
.endm
/*
* Macro to perform data cache maintenance for the interval
* [kaddr, kaddr + size)
*
* op:		operation passed to dc instruction
* domain:	domain used in dsb instruction
* kaddr: starting virtual address of the region
* size: size of the region
* Corrupts: kaddr, size, tmp1, tmp2
*/
.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
dcache_line_size \tmp1, \tmp2
add \size, \kaddr, \size
sub \tmp2, \tmp1, #1
bic \kaddr, \kaddr, \tmp2
9998: dc \op, \kaddr
add \kaddr, \kaddr, \tmp1
cmp \kaddr, \size
b.lo 9998b
dsb \domain
.endm
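A hypothetical call site for the macro, cleaning and invalidating a buffer to the point of coherency:

	/*
	 * Illustrative use: clean+invalidate [x0, x0 + x1) to PoC, then
	 * synchronise with a full-system dsb. x0 and x1 are clobbered by
	 * the macro; x2 and x3 are scratch.
	 */
	dcache_by_line_op civac, sy, x0, x1, x2, x3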
@@ -139,8 +139,6 @@ ENTRY(cpu_do_switch_mm)
ret
ENDPROC(cpu_do_switch_mm)
.section ".text.init", #alloc, #execinstr
/*
* __cpu_setup
*
@@ -18,3 +18,6 @@ obj-$(CONFIG_EFI_RUNTIME_MAP) += runtime-map.o
obj-$(CONFIG_EFI_RUNTIME_WRAPPERS) += runtime-wrappers.o
obj-$(CONFIG_EFI_STUB) += libstub/
obj-$(CONFIG_EFI_FAKE_MEMMAP) += fake_mem.o
arm-obj-$(CONFIG_EFI) := arm-init.o arm-runtime.o
obj-$(CONFIG_ARM64) += $(arm-obj-y)
/*
* Extensible Firmware Interface
*
* Based on Extensible Firmware Interface Specification version 2.4
*
* Copyright (C) 2013 - 2015 Linaro Ltd.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
*/
#include <linux/efi.h>
#include <linux/init.h>
#include <linux/memblock.h>
#include <linux/mm_types.h>
#include <linux/of.h>
#include <linux/of_fdt.h>
#include <asm/efi.h>
struct efi_memory_map memmap;
u64 efi_system_table;
static int __init is_normal_ram(efi_memory_desc_t *md)
{
if (md->attribute & EFI_MEMORY_WB)
return 1;
return 0;
}
/*
* Translate an EFI virtual address into a physical address: this is necessary,
* as some data members of the EFI system table are virtually remapped after
* SetVirtualAddressMap() has been called.
*/
static phys_addr_t efi_to_phys(unsigned long addr)
{
efi_memory_desc_t *md;
for_each_efi_memory_desc(&memmap, md) {
if (!(md->attribute & EFI_MEMORY_RUNTIME))
continue;
if (md->virt_addr == 0)
/* no virtual mapping has been installed by the stub */
break;
if (md->virt_addr <= addr &&
(addr - md->virt_addr) < (md->num_pages << EFI_PAGE_SHIFT))
return md->phys_addr + addr - md->virt_addr;
}
return addr;
}
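A worked example with one hypothetical runtime descriptor:

	/*
	 * Hypothetical descriptor: md->phys_addr = 0x80000000,
	 * md->virt_addr = 0xffff000000000000, md->num_pages = 16
	 * (16 EFI pages of 4K = 64K).
	 *
	 *   efi_to_phys(0xffff000000001234) == 0x80001234  (inside the region)
	 *   efi_to_phys(0x09000000)         == 0x09000000  (no match: returned as-is)
	 */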
static int __init uefi_init(void)
{
efi_char16_t *c16;
void *config_tables;
size_t table_size;
char vendor[100] = "unknown";
int i, retval;
efi.systab = early_memremap(efi_system_table,
sizeof(efi_system_table_t));
if (efi.systab == NULL) {
pr_warn("Unable to map EFI system table.\n");
return -ENOMEM;
}
set_bit(EFI_BOOT, &efi.flags);
if (IS_ENABLED(CONFIG_64BIT))
set_bit(EFI_64BIT, &efi.flags);
/*
* Verify the EFI Table
*/
if (efi.systab->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) {
pr_err("System table signature incorrect\n");
retval = -EINVAL;
goto out;
}
if ((efi.systab->hdr.revision >> 16) < 2)
pr_warn("Warning: EFI system table version %d.%02d, expected 2.00 or greater\n",
efi.systab->hdr.revision >> 16,
efi.systab->hdr.revision & 0xffff);
/* Show what we know for posterity */
c16 = early_memremap(efi_to_phys(efi.systab->fw_vendor),
sizeof(vendor) * sizeof(efi_char16_t));
if (c16) {
for (i = 0; i < (int) sizeof(vendor) - 1 && c16[i]; ++i)
vendor[i] = c16[i];
vendor[i] = '\0';
early_memunmap(c16, sizeof(vendor) * sizeof(efi_char16_t));
}
pr_info("EFI v%u.%.02u by %s\n",
efi.systab->hdr.revision >> 16,
efi.systab->hdr.revision & 0xffff, vendor);
table_size = sizeof(efi_config_table_64_t) * efi.systab->nr_tables;
config_tables = early_memremap(efi_to_phys(efi.systab->tables),
table_size);
if (config_tables == NULL) {
pr_warn("Unable to map EFI config table array.\n");
retval = -ENOMEM;
goto out;
}
retval = efi_config_parse_tables(config_tables, efi.systab->nr_tables,
sizeof(efi_config_table_t), NULL);
early_memunmap(config_tables, table_size);
out:
early_memunmap(efi.systab, sizeof(efi_system_table_t));
return retval;
}
/*
* Return true for RAM regions we want to permanently reserve.
*/
static __init int is_reserve_region(efi_memory_desc_t *md)
{
switch (md->type) {
case EFI_LOADER_CODE:
case EFI_LOADER_DATA:
case EFI_BOOT_SERVICES_CODE:
case EFI_BOOT_SERVICES_DATA:
case EFI_CONVENTIONAL_MEMORY:
case EFI_PERSISTENT_MEMORY:
return 0;
default:
break;
}
return is_normal_ram(md);
}
static __init void reserve_regions(void)
{
efi_memory_desc_t *md;
u64 paddr, npages, size;
if (efi_enabled(EFI_DBG))
pr_info("Processing EFI memory map:\n");
for_each_efi_memory_desc(&memmap, md) {
paddr = md->phys_addr;
npages = md->num_pages;
if (efi_enabled(EFI_DBG)) {
char buf[64];
pr_info(" 0x%012llx-0x%012llx %s",
paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
efi_md_typeattr_format(buf, sizeof(buf), md));
}
memrange_efi_to_native(&paddr, &npages);
size = npages << PAGE_SHIFT;
if (is_normal_ram(md))
early_init_dt_add_memory_arch(paddr, size);
if (is_reserve_region(md)) {
memblock_mark_nomap(paddr, size);
if (efi_enabled(EFI_DBG))
pr_cont("*");
}
if (efi_enabled(EFI_DBG))
pr_cont("\n");
}
set_bit(EFI_MEMMAP, &efi.flags);
}
void __init efi_init(void)
{
struct efi_fdt_params params;
/* Grab UEFI information placed in FDT by stub */
if (!efi_get_fdt_params(&params))
return;
efi_system_table = params.system_table;
memmap.phys_map = params.mmap;
memmap.map = early_memremap(params.mmap, params.mmap_size);
if (memmap.map == NULL) {
/*
* If we are booting via UEFI, the UEFI memory map is the only
* description of memory we have, so there is little point in
* proceeding if we cannot access it.
*/
panic("Unable to map EFI memory map.\n");
}
memmap.map_end = memmap.map + params.mmap_size;
memmap.desc_size = params.desc_size;
memmap.desc_version = params.desc_ver;
if (uefi_init() < 0)
return;
reserve_regions();
early_memunmap(memmap.map, params.mmap_size);
memblock_mark_nomap(params.mmap & PAGE_MASK,
PAGE_ALIGN(params.mmap_size +
(params.mmap & ~PAGE_MASK)));
}
/*
* Extensible Firmware Interface
*
* Based on Extensible Firmware Interface Specification version 2.4
*
* Copyright (C) 2013, 2014 Linaro Ltd.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
*/
#include <linux/efi.h>
#include <linux/io.h>
#include <linux/memblock.h>
#include <linux/mm_types.h>
#include <linux/preempt.h>
#include <linux/rbtree.h>
#include <linux/rwsem.h>
#include <linux/sched.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <asm/cacheflush.h>
#include <asm/efi.h>
#include <asm/mmu.h>
#include <asm/pgalloc.h>
#include <asm/pgtable.h>
extern u64 efi_system_table;
static struct mm_struct efi_mm = {
.mm_rb = RB_ROOT,
.mm_users = ATOMIC_INIT(2),
.mm_count = ATOMIC_INIT(1),
.mmap_sem = __RWSEM_INITIALIZER(efi_mm.mmap_sem),
.page_table_lock = __SPIN_LOCK_UNLOCKED(efi_mm.page_table_lock),
.mmlist = LIST_HEAD_INIT(efi_mm.mmlist),
};
static bool __init efi_virtmap_init(void)
{
efi_memory_desc_t *md;
efi_mm.pgd = pgd_alloc(&efi_mm);
init_new_context(NULL, &efi_mm);
for_each_efi_memory_desc(&memmap, md) {
phys_addr_t phys = md->phys_addr;
int ret;
if (!(md->attribute & EFI_MEMORY_RUNTIME))
continue;
if (md->virt_addr == 0)
return false;
ret = efi_create_mapping(&efi_mm, md);
if (!ret) {
pr_info(" EFI remap %pa => %p\n",
&phys, (void *)(unsigned long)md->virt_addr);
} else {
pr_warn(" EFI remap %pa: failed to create mapping (%d)\n",
&phys, ret);
return false;
}
}
return true;
}
/*
* Enable the UEFI Runtime Services if all prerequisites are in place, i.e.,
* non-early mapping of the UEFI system table and virtual mappings for all
* EFI_MEMORY_RUNTIME regions.
*/
static int __init arm_enable_runtime_services(void)
{
u64 mapsize;
if (!efi_enabled(EFI_BOOT)) {
pr_info("EFI services will not be available.\n");
return 0;
}
if (efi_runtime_disabled()) {
pr_info("EFI runtime services will be disabled.\n");
return 0;
}
pr_info("Remapping and enabling EFI services.\n");
mapsize = memmap.map_end - memmap.map;
memmap.map = (__force void *)ioremap_cache(memmap.phys_map,
mapsize);
if (!memmap.map) {
pr_err("Failed to remap EFI memory map\n");
return -ENOMEM;
}
memmap.map_end = memmap.map + mapsize;
efi.memmap = &memmap;
efi.systab = (__force void *)ioremap_cache(efi_system_table,
sizeof(efi_system_table_t));
if (!efi.systab) {
pr_err("Failed to remap EFI System Table\n");
return -ENOMEM;
}
set_bit(EFI_SYSTEM_TABLES, &efi.flags);
if (!efi_virtmap_init()) {
pr_err("No UEFI virtual mapping was installed -- runtime services will not be available\n");
return -ENOMEM;
}
/* Set up runtime services function pointers */
efi_native_runtime_setup();
set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
efi.runtime_version = efi.systab->hdr.revision;
return 0;
}
early_initcall(arm_enable_runtime_services);
void efi_virtmap_load(void)
{
preempt_disable();
efi_set_pgd(&efi_mm);
}
void efi_virtmap_unload(void)
{
efi_set_pgd(current->active_mm);
preempt_enable();
}
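The load/unload pair is meant to bracket each runtime-services call so that the firmware executes under efi_mm's page tables. A minimal sketch of the call pattern (the GetTime call and function name are illustrative only):

	/* illustrative only: fetch the RTC via the runtime services */
	static efi_status_t example_get_time(void)
	{
		efi_time_t tm;
		efi_status_t status;

		efi_virtmap_load();                /* switch to efi_mm */
		status = efi.get_time(&tm, NULL);  /* firmware runs under efi_mm */
		efi_virtmap_unload();              /* restore the caller's mm */

		return status;
	}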
@@ -25,6 +25,8 @@
#include <linux/io.h>
#include <linux/platform_device.h>
#include <asm/efi.h>
struct efi __read_mostly efi = {
.mps = EFI_INVALID_TABLE_ADDR,
.acpi = EFI_INVALID_TABLE_ADDR,
@@ -8,7 +8,7 @@ cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
-fPIC -fno-strict-aliasing -mno-red-zone \
-				   -mno-mmx -mno-sse -DDISABLE_BRANCH_PROFILING
+				   -mno-mmx -mno-sse
cflags-$(CONFIG_ARM64) := $(subst -pg,,$(KBUILD_CFLAGS))
cflags-$(CONFIG_ARM) := $(subst -pg,,$(KBUILD_CFLAGS)) \
@@ -16,7 +16,7 @@ cflags-$(CONFIG_ARM) := $(subst -pg,,$(KBUILD_CFLAGS)) \
cflags-$(CONFIG_EFI_ARMSTUB) += -I$(srctree)/scripts/dtc/libfdt
-KBUILD_CFLAGS			:= $(cflags-y) \
+KBUILD_CFLAGS			:= $(cflags-y) -DDISABLE_BRANCH_PROFILING \
$(call cc-option,-ffreestanding) \
$(call cc-option,-fno-stack-protector)
@@ -96,9 +96,7 @@ u32 hugetlb_fault_mutex_hash(struct hstate *h, struct mm_struct *mm,
struct address_space *mapping,
pgoff_t idx, unsigned long address);
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud);
-#endif
extern int hugepages_treat_as_movable;
extern int sysctl_hugetlb_shm_group;
@@ -25,6 +25,7 @@ enum {
MEMBLOCK_NONE = 0x0, /* No special request */
MEMBLOCK_HOTPLUG = 0x1, /* hotpluggable region */
MEMBLOCK_MIRROR = 0x2, /* mirrored region */
MEMBLOCK_NOMAP = 0x4, /* don't add to kernel direct mapping */
};
struct memblock_region {
@@ -82,6 +83,7 @@ bool memblock_overlaps_region(struct memblock_type *type,
int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size);
int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
ulong choose_memblock_flags(void);
/* Low level functions */
@@ -184,6 +186,11 @@ static inline bool memblock_is_mirror(struct memblock_region *m)
return m->flags & MEMBLOCK_MIRROR;
}
static inline bool memblock_is_nomap(struct memblock_region *m)
{
return m->flags & MEMBLOCK_NOMAP;
}
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
int memblock_search_pfn_nid(unsigned long pfn, unsigned long *start_pfn,
unsigned long *end_pfn);
@@ -319,6 +326,7 @@ phys_addr_t memblock_start_of_DRAM(void);
phys_addr_t memblock_end_of_DRAM(void);
void memblock_enforce_memory_limit(phys_addr_t memory_limit);
int memblock_is_memory(phys_addr_t addr);
int memblock_is_map_memory(phys_addr_t addr);
int memblock_is_region_memory(phys_addr_t base, phys_addr_t size);
int memblock_is_reserved(phys_addr_t addr);
bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size);
@@ -822,6 +822,17 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
return memblock_setclr_flag(base, size, 1, MEMBLOCK_MIRROR);
}
/**
* memblock_mark_nomap - Mark a memory region with flag MEMBLOCK_NOMAP.
* @base: the base phys addr of the region
* @size: the size of the region
*
* Return 0 on success, -errno on failure.
*/
int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
{
return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
}
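A brief usage sketch (function name, addresses and size are made up): a region is registered as RAM but excluded from the linear mapping, after which memblock_is_map_memory() reports false for it while memblock_is_memory() still reports true:

	/* illustrative only */
	static void __init nomap_example(void)
	{
		memblock_add(0x80000000, SZ_64K);        /* register 64K of RAM */
		memblock_mark_nomap(0x80000000, SZ_64K); /* keep it out of the linear map */

		WARN_ON(!memblock_is_memory(0x80000000));    /* still RAM */
		WARN_ON(memblock_is_map_memory(0x80000000)); /* but not mappable */
	}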
/**
* __next_reserved_mem_region - next function for for_each_reserved_region()
@@ -913,6 +924,10 @@ void __init_memblock __next_mem_range(u64 *idx, int nid, ulong flags,
if ((flags & MEMBLOCK_MIRROR) && !memblock_is_mirror(m))
continue;
/* skip nomap memory unless we were asked for it explicitly */
if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m))
continue;
if (!type_b) {
if (out_start)
*out_start = m_start;
@@ -1022,6 +1037,10 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid, ulong flags,
if ((flags & MEMBLOCK_MIRROR) && !memblock_is_mirror(m))
continue;
/* skip nomap memory unless we were asked for it explicitly */
if (!(flags & MEMBLOCK_NOMAP) && memblock_is_nomap(m))
continue;
if (!type_b) {
if (out_start)
*out_start = m_start;
@@ -1519,6 +1538,15 @@ int __init_memblock memblock_is_memory(phys_addr_t addr)
return memblock_search(&memblock.memory, addr) != -1;
}
int __init_memblock memblock_is_map_memory(phys_addr_t addr)
{
int i = memblock_search(&memblock.memory, addr);
if (i == -1)
return false;
return !memblock_is_nomap(&memblock.memory.regions[i]);
}
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
unsigned long *start_pfn, unsigned long *end_pfn)