1. 18 Aug, 2017 1 commit
  2. 15 Aug, 2017 16 commits
    • Catalin Marinas's avatar
      Merge branch 'arm64/vmap-stack' of... · df5b95be
      Catalin Marinas authored
      Merge branch 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
      
      * 'arm64/vmap-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
        arm64: add VMAP_STACK overflow detection
        arm64: add on_accessible_stack()
        arm64: add basic VMAP_STACK support
        arm64: use an irq stack pointer
        arm64: assembler: allow adr_this_cpu to use the stack pointer
        arm64: factor out entry stack manipulation
        efi/arm64: add EFI_KIMG_ALIGN
        arm64: move SEGMENT_ALIGN to <asm/memory.h>
        arm64: clean up irq stack definitions
        arm64: clean up THREAD_* definitions
        arm64: factor out PAGE_* and CONT_* definitions
        arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP
        fork: allow arch-override of VMAP stack alignment
        arm64: remove __die()'s stack dump
      df5b95be
    • Mark Rutland's avatar
      arm64: add VMAP_STACK overflow detection · 872d8327
      Mark Rutland authored
      This patch adds stack overflow detection to arm64, usable when vmap'd stacks
      are in use.
      
      Overflow is detected in a small preamble executed for each exception entry,
      which checks whether there is enough space on the current stack for the general
      purpose registers to be saved. If there is not enough space, the overflow
      handler is invoked on a per-cpu overflow stack. This approach preserves the
      original exception information in ESR_EL1 (and where appropriate, FAR_EL1).
      
      Task and IRQ stacks are aligned to double their size, enabling overflow to be
      detected with a single bit test. For example, a 16K stack is aligned to 32K,
      ensuring that bit 14 of the SP must be zero. On an overflow (or underflow),
      this bit is flipped. Thus, overflow (of less than the size of the stack) can be
      detected by testing whether this bit is set.
      
      The overflow check is performed before any attempt is made to access the
      stack, avoiding recursive faults (and the loss of exception information
      these would entail). As logical operations cannot be performed on the SP
      directly, the SP is temporarily swapped with a general purpose register
      using arithmetic operations to enable the test to be performed.
      
      This gives us a useful error message on stack overflow, as can be trigger with
      the LKDTM overflow test:
      
      [  305.388749] lkdtm: Performing direct entry OVERFLOW
      [  305.395444] Insufficient stack space to handle exception!
      [  305.395482] ESR: 0x96000047 -- DABT (current EL)
      [  305.399890] FAR: 0xffff00000a5e7f30
      [  305.401315] Task stack:     [0xffff00000a5e8000..0xffff00000a5ec000]
      [  305.403815] IRQ stack:      [0xffff000008000000..0xffff000008004000]
      [  305.407035] Overflow stack: [0xffff80003efce4e0..0xffff80003efcf4e0]
      [  305.409622] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
      [  305.412785] Hardware name: linux,dummy-virt (DT)
      [  305.415756] task: ffff80003d051c00 task.stack: ffff00000a5e8000
      [  305.419221] PC is at recursive_loop+0x10/0x48
      [  305.421637] LR is at recursive_loop+0x38/0x48
      [  305.423768] pc : [<ffff00000859f330>] lr : [<ffff00000859f358>] pstate: 40000145
      [  305.428020] sp : ffff00000a5e7f50
      [  305.430469] x29: ffff00000a5e8350 x28: ffff80003d051c00
      [  305.433191] x27: ffff000008981000 x26: ffff000008f80400
      [  305.439012] x25: ffff00000a5ebeb8 x24: ffff00000a5ebeb8
      [  305.440369] x23: ffff000008f80138 x22: 0000000000000009
      [  305.442241] x21: ffff80003ce65000 x20: ffff000008f80188
      [  305.444552] x19: 0000000000000013 x18: 0000000000000006
      [  305.446032] x17: 0000ffffa2601280 x16: ffff0000081fe0b8
      [  305.448252] x15: ffff000008ff546d x14: 000000000047a4c8
      [  305.450246] x13: ffff000008ff7872 x12: 0000000005f5e0ff
      [  305.452953] x11: ffff000008ed2548 x10: 000000000005ee8d
      [  305.454824] x9 : ffff000008545380 x8 : ffff00000a5e8770
      [  305.457105] x7 : 1313131313131313 x6 : 00000000000000e1
      [  305.459285] x5 : 0000000000000000 x4 : 0000000000000000
      [  305.461781] x3 : 0000000000000000 x2 : 0000000000000400
      [  305.465119] x1 : 0000000000000013 x0 : 0000000000000012
      [  305.467724] Kernel panic - not syncing: kernel stack overflow
      [  305.470561] CPU: 0 PID: 1219 Comm: sh Not tainted 4.13.0-rc3-00021-g9636aea #5
      [  305.473325] Hardware name: linux,dummy-virt (DT)
      [  305.475070] Call trace:
      [  305.476116] [<ffff000008088ad8>] dump_backtrace+0x0/0x378
      [  305.478991] [<ffff000008088e64>] show_stack+0x14/0x20
      [  305.481237] [<ffff00000895a178>] dump_stack+0x98/0xb8
      [  305.483294] [<ffff0000080c3288>] panic+0x118/0x280
      [  305.485673] [<ffff0000080c2e9c>] nmi_panic+0x6c/0x70
      [  305.486216] [<ffff000008089710>] handle_bad_stack+0x118/0x128
      [  305.486612] Exception stack(0xffff80003efcf3a0 to 0xffff80003efcf4e0)
      [  305.487334] f3a0: 0000000000000012 0000000000000013 0000000000000400 0000000000000000
      [  305.488025] f3c0: 0000000000000000 0000000000000000 00000000000000e1 1313131313131313
      [  305.488908] f3e0: ffff00000a5e8770 ffff000008545380 000000000005ee8d ffff000008ed2548
      [  305.489403] f400: 0000000005f5e0ff ffff000008ff7872 000000000047a4c8 ffff000008ff546d
      [  305.489759] f420: ffff0000081fe0b8 0000ffffa2601280 0000000000000006 0000000000000013
      [  305.490256] f440: ffff000008f80188 ffff80003ce65000 0000000000000009 ffff000008f80138
      [  305.490683] f460: ffff00000a5ebeb8 ffff00000a5ebeb8 ffff000008f80400 ffff000008981000
      [  305.491051] f480: ffff80003d051c00 ffff00000a5e8350 ffff00000859f358 ffff00000a5e7f50
      [  305.491444] f4a0: ffff00000859f330 0000000040000145 0000000000000000 0000000000000000
      [  305.492008] f4c0: 0001000000000000 0000000000000000 ffff00000a5e8350 ffff00000859f330
      [  305.493063] [<ffff00000808205c>] __bad_stack+0x88/0x8c
      [  305.493396] [<ffff00000859f330>] recursive_loop+0x10/0x48
      [  305.493731] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494088] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494425] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494649] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.494898] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495205] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495453] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.495708] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496000] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496302] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496644] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.496894] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.497138] [<ffff00000859f358>] recursive_loop+0x38/0x48
      [  305.497325] [<ffff00000859f3dc>] lkdtm_OVERFLOW+0x14/0x20
      [  305.497506] [<ffff00000859f314>] lkdtm_do_action+0x1c/0x28
      [  305.497786] [<ffff00000859f178>] direct_entry+0xe0/0x170
      [  305.498095] [<ffff000008345568>] full_proxy_write+0x60/0xa8
      [  305.498387] [<ffff0000081fb7f4>] __vfs_write+0x1c/0x128
      [  305.498679] [<ffff0000081fcc68>] vfs_write+0xa0/0x1b0
      [  305.498926] [<ffff0000081fe0fc>] SyS_write+0x44/0xa0
      [  305.499182] Exception stack(0xffff00000a5ebec0 to 0xffff00000a5ec000)
      [  305.499429] bec0: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
      [  305.499674] bee0: 574f4c465245564f 0000000000000000 0000000000000000 8000000080808080
      [  305.499904] bf00: 0000000000000040 0000000000000038 fefefeff1b4bc2ff 7f7f7f7f7f7fff7f
      [  305.500189] bf20: 0101010101010101 0000000000000000 000000000047a4c8 0000000000000038
      [  305.500712] bf40: 0000000000000000 0000ffffa2601280 0000ffffc63f6068 00000000004b5000
      [  305.501241] bf60: 0000000000000001 000000001c4cf5e0 0000000000000009 000000001c4cf5e0
      [  305.501791] bf80: 0000000000000020 0000000000000000 00000000004b5000 000000001c4cc458
      [  305.502314] bfa0: 0000000000000000 0000ffffc63f7950 000000000040a3c4 0000ffffc63f70e0
      [  305.502762] bfc0: 0000ffffa2601268 0000000080000000 0000000000000001 0000000000000040
      [  305.503207] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [  305.503680] [<ffff000008082fb0>] el0_svc_naked+0x24/0x28
      [  305.504720] Kernel Offset: disabled
      [  305.505189] CPU features: 0x002082
      [  305.505473] Memory Limit: none
      [  305.506181] ---[ end Kernel panic - not syncing: kernel stack overflow
      
      This patch was co-authored by Ard Biesheuvel and Mark Rutland.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      872d8327
    • Mark Rutland's avatar
      arm64: add on_accessible_stack() · 12964443
      Mark Rutland authored
      Both unwind_frame() and dump_backtrace() try to check whether a stack
      address is sane to access, with very similar logic. Both will need
      updating in order to handle overflow stacks.
      
      Factor out this logic into a helper, so that we can avoid further
      duplication when we add overflow stacks.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      12964443
    • Mark Rutland's avatar
      arm64: add basic VMAP_STACK support · e3067861
      Mark Rutland authored
      This patch enables arm64 to be built with vmap'd task and IRQ stacks.
      
      As vmap'd stacks are mapped at page granularity, stacks must be a multiple of
      PAGE_SIZE. This means that a 64K page kernel must use stacks of at least 64K in
      size.
      
      To minimize the increase in Image size, IRQ stacks are dynamically allocated at
      boot time, rather than embedding the boot CPU's IRQ stack in the kernel image.
      
      This patch was co-authored by Ard Biesheuvel and Mark Rutland.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      e3067861
    • Mark Rutland's avatar
      arm64: use an irq stack pointer · f60fe78f
      Mark Rutland authored
      We allocate our IRQ stacks using a percpu array. This allows us to generate our
      IRQ stack pointers with adr_this_cpu, but bloats the kernel Image with the boot
      CPU's IRQ stack. Additionally, these are packed with other percpu variables,
      and aren't guaranteed to have guard pages.
      
      When we enable VMAP_STACK we'll want to vmap our IRQ stacks also, in order to
      provide guard pages and to permit more stringent alignment requirements. Doing
      so will require that we use a percpu pointer to each IRQ stack, rather than
      allocating a percpu IRQ stack in the kernel image.
      
      This patch updates our IRQ stack code to use a percpu pointer to the base of
      each IRQ stack. This will allow us to change the way the stack is allocated
      with minimal changes elsewhere. In some cases we may try to backtrace before
      the IRQ stack pointers are initialised, so on_irq_stack() is updated to account
      for this.
      
      In testing with cyclictest, there was no measureable difference between using
      adr_this_cpu (for irq_stack) and ldr_this_cpu (for irq_stack_ptr) in the IRQ
      entry path.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      f60fe78f
    • Ard Biesheuvel's avatar
      arm64: assembler: allow adr_this_cpu to use the stack pointer · 8ea41b11
      Ard Biesheuvel authored
      Given that adr_this_cpu already requires a temp register in addition
      to the destination register, tweak the instruction sequence so that sp
      may be used as well.
      
      This will simplify switching to per-cpu stacks in subsequent patches. While
      this limits the range of adr_this_cpu, to +/-4GiB, we don't currently use
      adr_this_cpu in modules, and this is not problematic for the main kernel image.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: add more commit text]
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      8ea41b11
    • Mark Rutland's avatar
      arm64: factor out entry stack manipulation · b11e5759
      Mark Rutland authored
      In subsequent patches, we will detect stack overflow in our exception
      entry code, by verifying the SP after it has been decremented to make
      space for the exception regs.
      
      This verification code is small, and we can minimize its impact by
      placing it directly in the vectors. To avoid redundant modification of
      the SP, we also need to move the initial decrement of the SP into the
      vectors.
      
      As a preparatory step, this patch introduces kernel_ventry, which
      performs this decrement, and updates the entry code accordingly.
      Subsequent patches will fold SP verification into kernel_ventry.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: turn into prep patch, expand commit msg]
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      b11e5759
    • Mark Rutland's avatar
      efi/arm64: add EFI_KIMG_ALIGN · 170976bc
      Mark Rutland authored
      The EFI stub is intimately coupled with the kernel, and takes advantage
      of this by relocating the kernel at a weaker alignment than the
      documented boot protocol mandates.
      
      However, it does so by assuming it can align the kernel to the segment
      alignment, and assumes that this is 64K. In subsequent patches, we'll
      have to consider other details to determine this de-facto alignment
      constraint.
      
      This patch adds a new EFI_KIMG_ALIGN definition that will track the
      kernel's de-facto alignment requirements. Subsequent patches will modify
      this as required.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      170976bc
    • Mark Rutland's avatar
      arm64: move SEGMENT_ALIGN to <asm/memory.h> · 8018ba4e
      Mark Rutland authored
      Currently we define SEGMENT_ALIGN directly in our vmlinux.lds.S.
      
      This is unfortunate, as the EFI stub currently open-codes the same
      number, and in future we'll want to fiddle with this.
      
      This patch moves the definition to our <asm/memory.h>, where it can be
      used by both vmlinux.lds.S and the EFI stub code.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      8018ba4e
    • Mark Rutland's avatar
      arm64: clean up irq stack definitions · f60ad4ed
      Mark Rutland authored
      Before we add yet another stack to the kernel, it would be nice to
      ensure that we consistently organise stack definitions and related
      helper functions.
      
      This patch moves the basic IRQ stack defintions to <asm/memory.h> to
      live with their task stack counterparts. Helpers used for unwinding are
      moved into <asm/stacktrace.h>, where subsequent patches will add helpers
      for other stacks. Includes are fixed up accordingly.
      
      This patch is a pure refactoring -- there should be no functional
      changes as a result of this patch.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      f60ad4ed
    • Mark Rutland's avatar
      arm64: clean up THREAD_* definitions · dbc9344a
      Mark Rutland authored
      Currently we define THREAD_SIZE and THREAD_SIZE_ORDER separately, with
      the latter dependent on particular CONFIG_ARM64_*K_PAGES definitions.
      This is somewhat opaque, and will get in the way of future modifications
      to THREAD_SIZE.
      
      This patch cleans this up, defining both in terms of a common
      THREAD_SHIFT, and using PAGE_SHIFT to calculate THREAD_SIZE_ORDER,
      rather than using a number of definitions dependent on config symbols.
      Subsequent patches will make use of this to alter the stack size used in
      some configurations.
      
      At the same time, these are moved into <asm/memory.h>, which will avoid
      circular include issues in subsequent patches. To ensure that existing
      code isn't adversely affected, <asm/thread_info.h> is updated to
      transitively include these definitions.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      dbc9344a
    • Mark Rutland's avatar
      arm64: factor out PAGE_* and CONT_* definitions · b6531456
      Mark Rutland authored
      Some headers rely on PAGE_* definitions from <asm/page.h>, but cannot
      include this due to potential circular includes. For example, a number
      of definitions in <asm/memory.h> rely on PAGE_SHIFT, and <asm/page.h>
      includes <asm/memory.h>.
      
      This requires users of these definitions to include both headers, which
      is fragile and error-prone.
      
      This patch ameliorates matters by moving the basic definitions out to a
      new header, <asm/page-def.h>. Both <asm/page.h> and <asm/memory.h> are
      updated to include this, avoiding this fragility, and avoiding the
      possibility of circular include dependencies.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      b6531456
    • Ard Biesheuvel's avatar
      arm64: kernel: remove {THREAD,IRQ_STACK}_START_SP · 34be98f4
      Ard Biesheuvel authored
      For historical reasons, we leave the top 16 bytes of our task and IRQ
      stacks unused, a practice used to ensure that the SP can always be
      masked to find the base of the current stack (historically, where
      thread_info could be found).
      
      However, this is not necessary, as:
      
      * When an exception is taken from a task stack, we decrement the SP by
        S_FRAME_SIZE and stash the exception registers before we compare the
        SP against the task stack. In such cases, the SP must be at least
        S_FRAME_SIZE below the limit, and can be safely masked to determine
        whether the task stack is in use.
      
      * When transitioning to an IRQ stack, we'll place a dummy frame onto the
        IRQ stack before enabling asynchronous exceptions, or executing code
        we expect to trigger faults. Thus, if an exception is taken from the
        IRQ stack, the SP must be at least 16 bytes below the limit.
      
      * We no longer mask the SP to find the thread_info, which is now found
        via sp_el0. Note that historically, the offset was critical to ensure
        that cpu_switch_to() found the correct stack for new threads that
        hadn't yet executed ret_from_fork().
      
      Given that, this initial offset serves no purpose, and can be removed.
      This brings us in-line with other architectures (e.g. x86) which do not
      rely on this masking.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: rebase, kill THREAD_START_SP, commit msg additions]
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      34be98f4
    • Mark Rutland's avatar
      fork: allow arch-override of VMAP stack alignment · 48ac3c18
      Mark Rutland authored
      In some cases, an architecture might wish its stacks to be aligned to a
      boundary larger than THREAD_SIZE. For example, using an alignment of
      double THREAD_SIZE can allow for stack overflows smaller than
      THREAD_SIZE to be detected by checking a single bit of the stack
      pointer.
      
      This patch allows architectures to override the alignment of VMAP'd
      stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
      THREAD_SIZE, as is the case today.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: linux-kernel@vger.kernel.org
      48ac3c18
    • Mark Rutland's avatar
      arm64: remove __die()'s stack dump · c5bc503c
      Mark Rutland authored
      Our __die() implementation tries to dump the stack memory, in addition
      to a backtrace, which is problematic.
      
      For contemporary 16K stacks, this can be a lot of data, which can take a
      long time to dump, and can push other useful context out of the kernel's
      printk ringbuffer (and/or a user's scrollback buffer on an attached
      console).
      
      Additionally, the code implicitly assumes that the SP is on the task's
      stack, and tries to dump everything between the SP and the highest task
      stack address. When the SP points at an IRQ stack (or is corrupted),
      this makes the kernel attempt to dump vast amounts of VA space. With
      vmap'd stacks, this may result in erroneous accesses to peripherals.
      
      This patch removes the memory dump, leaving us to rely on the backtrace,
      and other means of dumping stack memory such as kdump.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Tested-by: default avatarLaura Abbott <labbott@redhat.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      c5bc503c
    • Dou Liyang's avatar
      arm64: numa: Remove the unused parent_node() macro · 969ff73e
      Dou Liyang authored
      Commit a7be6e5a ("mm: drop useless local parameters of
      __register_one_node()") removes the last user of parent_node().
      
      The parent_node() macro in ARM64 platform is unnecessary.
      
      Remove it for cleanup.
      Reported-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarDou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linux-arm-kernel@lists.infradead.org
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      969ff73e
  3. 10 Aug, 2017 8 commits
  4. 09 Aug, 2017 13 commits
    • Ard Biesheuvel's avatar
      md/raid6: implement recovery using ARM NEON intrinsics · 6ec4e251
      Ard Biesheuvel authored
      Provide a NEON accelerated implementation of the recovery algorithm,
      which supersedes the default byte-by-byte one.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      6ec4e251
    • Ard Biesheuvel's avatar
      md/raid6: use faster multiplication for ARM NEON delta syndrome · 35129dde
      Ard Biesheuvel authored
      The P/Q left side optimization in the delta syndrome simply involves
      repeatedly multiplying a value by polynomial 'x' in GF(2^8). Given
      that 'x * x * x * x' equals 'x^4' even in the polynomial world, we
      can accelerate this substantially by performing up to 4 such operations
      at once, using the NEON instructions for polynomial multiplication.
      
      Results on a Cortex-A57 running in 64-bit mode:
      
        Before:
        -------
        raid6: neonx1   xor()  1680 MB/s
        raid6: neonx2   xor()  2286 MB/s
        raid6: neonx4   xor()  3162 MB/s
        raid6: neonx8   xor()  3389 MB/s
      
        After:
        ------
        raid6: neonx1   xor()  2281 MB/s
        raid6: neonx2   xor()  3362 MB/s
        raid6: neonx4   xor()  3787 MB/s
        raid6: neonx8   xor()  4239 MB/s
      
      While we're at it, simplify MASK() by using a signed shift rather than
      a vector compare involving a temp register.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      35129dde
    • Catalin Marinas's avatar
      Merge tag 'arm64-iort-for-v4.14' of... · f39c3f9b
      Catalin Marinas authored
      Merge tag 'arm64-iort-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux into for-next/core
      
      Two patches for the 4.14 release merge window:
      
      - IORT PCI aliases detection improvement to cater for systems with
        complex PCI topologies that current code mishandles (R.Murphy)
      - IORT SMMUv3 proximity domain (ie NUMA) handling (respective ACPICA
        changes will be merged separately) (G. Kulkarni)
      
      * tag 'arm64-iort-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lpieralisi/linux:
        ACPI/IORT: numa: Add numa node mapping for smmuv3 devices
        ACPI/IORT: Handle PCI aliases properly for IOMMUs
      f39c3f9b
    • Catalin Marinas's avatar
      Merge branch 'arm64/exception-stack' of... · 05538967
      Catalin Marinas authored
      Merge branch 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core
      
      * 'arm64/exception-stack' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux:
        arm64: unwind: remove sp from struct stackframe
        arm64: unwind: reference pt_regs via embedded stack frame
        arm64: unwind: disregard frame.sp when validating frame pointer
        arm64: unwind: avoid percpu indirection for irq stack
        arm64: move non-entry code out of .entry.text
        arm64: consistently use bl for C exception entry
        arm64: Add ASM_BUG()
      05538967
    • Ard Biesheuvel's avatar
      arm64: unwind: remove sp from struct stackframe · 31e43ad3
      Ard Biesheuvel authored
      The unwind code sets the sp member of struct stackframe to
      'frame pointer + 0x10' unconditionally, without regard for whether
      doing so produces a legal value. So let's simply remove it now that
      we have stopped using it anyway.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      31e43ad3
    • Ard Biesheuvel's avatar
      arm64: unwind: reference pt_regs via embedded stack frame · 73267498
      Ard Biesheuvel authored
      As it turns out, the unwind code is slightly broken, and probably has
      been for a while. The problem is in the dumping of the exception stack,
      which is intended to dump the contents of the pt_regs struct at each
      level in the call stack where an exception was taken and routed to a
      routine marked as __exception (which means its stack frame is right
      below the pt_regs struct on the stack).
      
      'Right below the pt_regs struct' is ill defined, though: the unwind
      code assigns 'frame pointer + 0x10' to the .sp member of the stackframe
      struct at each level, and dump_backtrace() happily dereferences that as
      the pt_regs pointer when encountering an __exception routine. However,
      the actual size of the stack frame created by this routine (which could
      be one of many __exception routines we have in the kernel) is not known,
      and so frame.sp is pretty useless to figure out where struct pt_regs
      really is.
      
      So it seems the only way to ensure that we can find our struct pt_regs
      when walking the stack frames is to put it at a known fixed offset of
      the stack frame pointer that is passed to such __exception routines.
      The simplest way to do that is to put it inside pt_regs itself, which is
      the main change implemented by this patch. As a bonus, doing this allows
      us to get rid of a fair amount of cruft related to walking from one stack
      to the other, which is especially nice since we intend to introduce yet
      another stack for overflow handling once we add support for vmapped
      stacks. It also fixes an inconsistency where we only add a stack frame
      pointing to ELR_EL1 if we are executing from the IRQ stack but not when
      we are executing from the task stack.
      
      To consistly identify exceptions regs even in the presence of exceptions
      taken from entry code, we must check whether the next frame was created
      by entry text, rather than whether the current frame was crated by
      exception text.
      
      To avoid backtracing using PCs that fall in the idmap, or are controlled
      by userspace, we must explcitly zero the FP and LR in startup paths, and
      must ensure that the frame embedded in pt_regs is zeroed upon entry from
      EL0. To avoid these NULL entries showin in the backtrace, unwind_frame()
      is updated to avoid them.
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      [Mark: compare current frame against .entry.text, avoid bogus PCs]
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      73267498
    • Dmitry Safonov's avatar
      arm64/vdso: Support mremap() for vDSO · 73958695
      Dmitry Safonov authored
      vDSO VMA address is saved in mm_context for the purpose of using
      restorer from vDSO page to return to userspace after signal handling.
      
      In Checkpoint Restore in Userspace (CRIU) project we place vDSO VMA
      on restore back to the place where it was on the dump.
      With the exception for x86 (where there is API to map vDSO with
      arch_prctl()), we move vDSO inherited from CRIU task to restoree
      position by mremap().
      
      CRIU does support arm64 architecture, but kernel doesn't update
      context.vdso pointer after mremap(). Which results in translation
      fault after signal handling on restored application:
      https://github.com/xemul/criu/issues/288
      
      Make vDSO code track the VMA address by supplying .mremap() fops
      the same way it's done for x86 and arm32 by:
      commit b059a453 ("x86/vdso: Add mremap hook to vm_special_mapping")
      commit 280e87e9 ("ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO").
      
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Cc: Christopher Covington <cov@codeaurora.org>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarDmitry Safonov <dsafonov@virtuozzo.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      73958695
    • Robin Murphy's avatar
      arm64: uaccess: Implement *_flushcache variants · 5d7bdeb1
      Robin Murphy authored
      Implement the set of copy functions with guarantees of a clean cache
      upon completion necessary to support the pmem driver.
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      5d7bdeb1
    • Robin Murphy's avatar
      arm64: Implement pmem API support · d50e071f
      Robin Murphy authored
      Add a clean-to-point-of-persistence cache maintenance helper, and wire
      up the basic architectural support for the pmem driver based on it.
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      [catalin.marinas@arm.com: move arch_*_pmem() functions to arch/arm64/mm/flush.c]
      [catalin.marinas@arm.com: change dmb(sy) to dmb(osh)]
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      d50e071f
    • Robin Murphy's avatar
      arm64: Handle trapped DC CVAP · e1bc5d1b
      Robin Murphy authored
      Cache clean to PoP is subject to the same access controls as to PoC, so
      if we are trapping userspace cache maintenance with SCTLR_EL1.UCI, we
      need to be prepared to handle it. To avoid getting into complicated
      fights with binutils about ARMv8.2 options, we'll just cheat and use the
      raw SYS instruction rather than the 'proper' DC alias.
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      e1bc5d1b
    • Robin Murphy's avatar
      arm64: Expose DC CVAP to userspace · 7aac405e
      Robin Murphy authored
      The ARMv8.2-DCPoP feature introduces persistent memory support to the
      architecture, by defining a point of persistence in the memory
      hierarchy, and a corresponding cache maintenance operation, DC CVAP.
      Expose the support via HWCAP and MRS emulation.
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      7aac405e
    • Robin Murphy's avatar
      arm64: Convert __inval_cache_range() to area-based · d46befef
      Robin Murphy authored
      __inval_cache_range() is already the odd one out among our data cache
      maintenance routines as the only remaining range-based one; as we're
      going to want an invalidation routine to call from C code for the pmem
      API, let's tweak the prototype and name to bring it in line with the
      clean operations, and to make its relationship with __dma_inv_area()
      neatly mirror that of __clean_dcache_area_poc() and __dma_clean_area().
      The loop clearing the early page tables gets mildly massaged in the
      process for the sake of consistency.
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      d46befef
    • Robin Murphy's avatar
      arm64: mm: Fix set_memory_valid() declaration · 09c2a7dc
      Robin Murphy authored
      Clearly, set_memory_valid() has never been seen in the same room as its
      declaration... Whilst the type mismatch is such that kexec probably
      wasn't broken in practice, fix it to match the definition as it should.
      
      Fixes: 9b0aa14e ("arm64: mm: add set_memory_valid()")
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      09c2a7dc
  5. 08 Aug, 2017 2 commits