1. 17 Aug, 2018 1 commit
    • arm64: Avoid calling stop_machine() when patching jump labels · f6cc0c50
      Will Deacon authored
      Patching a jump label involves patching a single instruction at a time,
      swizzling between a branch and a NOP. The architecture treats these
      instructions specially, so a concurrently executing CPU is guaranteed to
      see either the NOP or the branch, rather than an amalgamation of the two
      instruction encodings.
      
      However, in order to guarantee that the new instruction is visible, it
      is necessary to send an IPI to the concurrently executing CPU so that it
      discards any previously fetched instructions from its pipeline. This
      operation therefore cannot be completed from a context with IRQs
      disabled, but this is exactly what happens on the jump label path where
      the hotplug lock is held and IRQs are subsequently disabled by
      stop_machine_cpuslocked(). This results in a deadlock during boot on
      Hikey-960.
      
      Due to the architectural guarantees around patching NOPs and branches,
      we don't actually need to stop_machine() at all on the jump label path,
      so we can avoid the deadlock by using the "nosync" variant of our
      instruction patching routine.
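
As a rough user-space illustration of why one aligned 32-bit store is enough for this case, the sketch below encodes the two AArch64 instructions involved (NOP and an unconditional B) and swaps between them with a single atomic store. The helper names `aarch64_encode_b`, `patch_insn_nosync` and `jump_label_set` are hypothetical and are not the kernel's routines; cache maintenance, which a real patcher still needs, is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* A64 encoding of NOP. */
#define AARCH64_INSN_NOP 0xd503201fu

/*
 * Encode "B <pc + offset>". The offset is in bytes, must be a multiple
 * of 4 and must lie within the +/-128 MiB range of the imm26 field.
 */
static uint32_t aarch64_encode_b(int32_t offset)
{
    assert((offset & 3) == 0);
    assert(offset >= -(1 << 27) && offset < (1 << 27));
    return 0x14000000u | (((uint32_t)offset >> 2) & 0x03ffffffu);
}

/*
 * Hypothetical "nosync" patch: a single aligned 32-bit store, so a
 * concurrent reader sees either the old or the new word, never a mix.
 */
static void patch_insn_nosync(_Atomic uint32_t *slot, uint32_t insn)
{
    atomic_store_explicit(slot, insn, memory_order_release);
}

/* Flip a jump label site between the branch and the NOP. */
static void jump_label_set(_Atomic uint32_t *slot, int enable, int32_t offset)
{
    patch_insn_nosync(slot, enable ? aarch64_encode_b(offset)
                                   : AARCH64_INSN_NOP);
}
```

The point of the sketch is only that the whole update is one word-sized write; it does not model instruction fetch or pipeline behaviour.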
      
      Fixes: 693350a7 ("arm64: insn: Don't fallback on nosync path for general insn patching")
      Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
      Reported-by: John Stultz <john.stultz@linaro.org>
      Tested-by: Valentin Schneider <valentin.schneider@arm.com>
      Tested-by: Tuomas Tynkkynen <tuomas@tuxera.com>
      Tested-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  2. 08 Aug, 2018 1 commit
  3. 31 Jul, 2018 6 commits
    • arm64: kexec: Add comment to explain use of __flush_icache_range() · dcab90d9
      Will Deacon authored
      Now that we understand the deadlock arising from flush_icache_range()
      on the kexec crash kernel path, add a comment to justify the use of
      __flush_icache_range() here.
      Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: sdei: Mark sdei stack helper functions as static · eab1cecc
      Will Deacon authored
      The SDEI stack helper functions are only used by _on_sdei_stack() and
      refer to symbols (e.g. sdei_stack_normal_ptr) that are only defined if
      CONFIG_VMAP_STACK=y.
      
      Mark these functions as static, so we don't run into errors at link-time
      due to references to undefined symbols. Stick all the parameters onto
      the same line whilst we're passing through.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64, kaslr: export offset in VMCOREINFO ELF notes · e401b7c2
      Bhupesh Sharma authored
      Include KASLR offset in arm64 VMCOREINFO ELF notes to assist in
      debugging. vmcore parsing in user-space already expects this value in
      the notes and we are providing it for portability of those existing
      tools with x86.
      
      Ideally we would like core code to do this (so that this information
      won't be missed when an architecture adds KASLR support), but mips has
      CONFIG_RANDOMIZE_BASE and doesn't provide kaslr_offset(), so I am not
      sure if this is needed for mips (and other similar arch cases in
      future). So, let's keep this architecture-specific for now.
      
      As an example of a user-space use case, consider the makedumpfile
      utility, which needs a fixup to use this KASLR offset when translating
      a symbol address from vmlinux to the kernel run-time address on an
      arm64 KASLR boot.
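
The translation described above amounts to adding the exported offset to the link-time symbol address. A minimal sketch follows, assuming the note carries a `KERNELOFFSET=<hex>` line as on x86; the key name, the sample addresses and both helper names are assumptions for illustration, not makedumpfile's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the KASLR offset found in a vmcoreinfo text blob, or 0 if no
 * such line exists. The "KERNELOFFSET=" key is an assumption here.
 */
static uint64_t vmcoreinfo_kaslr_offset(const char *vmcoreinfo)
{
    static const char key[] = "KERNELOFFSET=";
    const char *p = vmcoreinfo;

    assert(vmcoreinfo != NULL);
    while ((p = strstr(p, key)) != NULL) {
        /* Only accept the key at the start of a line. */
        if (p == vmcoreinfo || p[-1] == '\n')
            return strtoull(p + sizeof(key) - 1, NULL, 16);
        p += sizeof(key) - 1;
    }
    return 0;
}

/* Translate a vmlinux (link-time) symbol address to its run-time address. */
static uint64_t vmlinux_to_runtime(uint64_t vmlinux_addr, uint64_t kaslr_offset)
{
    return vmlinux_addr + kaslr_offset;
}
```

On a non-KASLR boot the offset is zero, so the same code path covers both cases.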
      
      I have already submitted the makedumpfile user-space patch upstream
      and the maintainer has suggested waiting for the kernel changes to be
      included (see [0]).
      
      I tested this on my Qualcomm Amberwing board for both KASLR and
      non-KASLR boot cases:
      
      Without this patch:
        # cat > scrub.conf << EOF
        [vmlinux]
        erase jiffies
        erase init_task.utime
        for tsk in init_task.tasks.next within task_struct:tasks
            erase tsk.utime
        endfor
        EOF
      
        # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}
        readpage_elf: Attempt to read non-existent page at 0xffffa8a5bf180000.
        readmem: type_addr: 1, addr:ffffa8a5bf180000, size:8
        vaddr_to_paddr_arm64: Can't read pgd
        readmem: Can't convert a virtual address(ffff0000092a542c) to physical
        address.
        readmem: type_addr: 0, addr:ffff0000092a542c, size:390
        check_release: Can't get the address of system_utsname
      
      After this patch, check_release() succeeds and we are also able to
      erase symbols from the vmcore (I checked this with kernel 4.18.0-rc4+):
      
        # makedumpfile --split -d 31 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}
        The kernel version is not supported.
        The makedumpfile operation may be incomplete.
        Checking for memory holes                         : [100.0 %] \
        Checking for memory holes                         : [100.0 %] |
        Excluding unnecessary pages                       : [100.0 %] \
        Excluding unnecessary pages                       : [100.0 %] \
      
        The dumpfiles are saved to dumpfile_1, dumpfile_2, and dumpfile_3.
      
        makedumpfile Completed.
      
      [0] https://www.spinics.net/lists/kexec/msg21195.html
      
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: perf: Add cap_user_time aarch64 · 9d2dcc8f
      Michael O'Farrell authored
      It is useful to get the running time of a thread.  Doing so in an
      efficient manner can be important for performance of user applications.
      Avoiding system calls in `clock_gettime` when handling
      CLOCK_THREAD_CPUTIME_ID is important.  Other clocks are handled in the
      VDSO, but CLOCK_THREAD_CPUTIME_ID falls back on the system call.
      
      CLOCK_THREAD_CPUTIME_ID is not handled in the VDSO since it would have
      costs associated with maintaining updated user space accessible time
      offsets.  These offsets have to be updated every time a thread is
      scheduled/descheduled.  However, for programs regularly checking the
      running time of a thread, handling it in user space would be a
      performance improvement.
      
      This patch takes a middle ground, and adds support for cap_user_time,
      an optional feature of the perf_event API.  This way costs are only
      incurred when the perf_event API is enabled.  This is done the same way
      as it is in x86.
      
      Ultimately this allows calculating the thread running time in userspace
      on aarch64 as follows (adapted from perf_event_open manpage):
      
      u32 seq, time_mult, time_shift;
      u64 running, count, time_offset, quot, rem, delta;
      struct perf_event_mmap_page *pc;
      pc = buf;  // buf is the perf event mmaped page as documented in the API.
      
      if (pc->cap_user_time) {
          do {
              seq = pc->lock;
              barrier();
              running = pc->time_running;
      
              count = readCNTVCT_EL0();  // Read ARM hardware clock.
              time_offset = pc->time_offset;
              time_mult   = pc->time_mult;
              time_shift  = pc->time_shift;
      
              barrier();
          } while (pc->lock != seq);
      
          quot = (count >> time_shift);
          rem = count & (((u64)1 << time_shift) - 1);
          delta = time_offset + quot * time_mult +
                  ((rem * time_mult) >> time_shift);
      
          running += delta;
          // running now has the current nanosecond level thread time.
      }
      
      Summary of changes in the patch:
      
      For aarch64 systems, make arch_perf_update_userpage update the timing
      information stored in the perf_event page.  This requires the following
      calculations:
        - Calculate the appropriate time_mult and time_shift factors to convert
          ticks to nanoseconds for the current clock frequency.
        - Adjust the mult and shift factors to avoid shift factors of 32 bits.
          (possibly unnecessary)
        - The time_offset userspace should apply when doing calculations:
          the negative of the current sched time (now), because the
          time_running and time_enabled fields of the perf_event page have
          just been updated.
      Toggle bits to appropriate values:
        - Enable cap_user_time
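
The first two calculations in the list above can be sketched as follows. This is a simplified stand-in and not the patch's own code: `struct tick_conv`, `calc_mult_shift` and `ticks_to_ns` are hypothetical names, and the 50 MHz counter frequency used in the note below is just an example value. The sketch picks the largest shift below 32 whose multiplier still fits in 32 bits, then converts ticks with the same quotient/remainder split as the user-space snippet above.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Hypothetical container for the two conversion factors. */
struct tick_conv {
    uint32_t mult;
    uint32_t shift;
};

/*
 * Choose mult/shift so that ns ~= (ticks * mult) >> shift.
 * The shift is kept strictly below 32 and mult must fit in 32 bits.
 */
static struct tick_conv calc_mult_shift(uint32_t freq_hz)
{
    struct tick_conv c = { 0, 0 };

    assert(freq_hz != 0);
    for (int shift = 31; shift >= 0; shift--) {
        uint64_t mult = (NSEC_PER_SEC << shift) / freq_hz;

        if (mult <= UINT32_MAX) {
            c.mult = (uint32_t)mult;
            c.shift = (uint32_t)shift;
            break;
        }
    }
    return c;
}

/* Convert ticks to nanoseconds without overflowing 64 bits. */
static uint64_t ticks_to_ns(uint64_t ticks, struct tick_conv c)
{
    uint64_t quot = ticks >> c.shift;
    uint64_t rem = ticks & (((uint64_t)1 << c.shift) - 1);

    return quot * c.mult + ((rem * c.mult) >> c.shift);
}
```

For a 50 MHz counter each tick is 20 ns, so `ticks_to_ns(1000, calc_mult_shift(50000000))` yields 20000.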
      Signed-off-by: Michael O'Farrell <micpof@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • efi/libstub: Only disable stackleak plugin for arm64 · ce279d37
      Laura Abbott authored
      arm64 uses the full KBUILD_CFLAGS for building libstub, whereas x86
      doesn't. This means that x86 doesn't pick up the gcc-plugins. We
      need to disable the stackleak plugin, but doing this unconditionally
      breaks the x86 build since it doesn't have any plugins. Switch to
      disabling the stackleak plugin for arm64 only.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: drop unused kernel_neon_begin_partial() macro · d26de6c9
      Ard Biesheuvel authored
      When kernel mode NEON was first introduced to the arm64 kernel,
      every call to kernel_neon_begin()/_end() stacked resp. unstacked
      the entire NEON register file, making it worthwhile to reduce the
      number of used NEON registers to a bare minimum, and only stack
      those. kernel_neon_begin_partial() was introduced for this purpose,
      but after the refactoring for SVE and other changes, it no longer
      exists and was simply #define'd to kernel_neon_begin() directly.
      
      In the meantime, all users have been updated, so let's remove
      the fallback macro.
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 30 Jul, 2018 2 commits
  5. 27 Jul, 2018 2 commits
  6. 26 Jul, 2018 3 commits
  7. 24 Jul, 2018 2 commits
  8. 23 Jul, 2018 8 commits
    • rseq/selftests: Add support for arm64 · b9657463
      Will Deacon authored
      Hook up arm64 support to the rseq selftests.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: acpi: fix alignment fault in accessing ACPI · 09ffcb0d
      AKASHI Takahiro authored
      This fixes a hang of the crash dump kernel during boot, which can
      happen on any ACPI-based system with "ACPI Reclaim Memory."
      
      (kernel messages after panic kicked off kdump)
      	   (snip...)
      	Bye!
      	   (snip...)
      	ACPI: Core revision 20170728
      	pud=000000002e7d0003, *pmd=000000002e7c0003, *pte=00e8000039710707
      	Internal error: Oops: 96000021 [#1] SMP
      	Modules linked in:
      	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc6 #1
      	task: ffff000008d05180 task.stack: ffff000008cc0000
      	PC is at acpi_ns_lookup+0x25c/0x3c0
      	LR is at acpi_ds_load1_begin_op+0xa4/0x294
      	   (snip...)
      	Process swapper/0 (pid: 0, stack limit = 0xffff000008cc0000)
      	Call trace:
      	   (snip...)
      	[<ffff0000084a6764>] acpi_ns_lookup+0x25c/0x3c0
      	[<ffff00000849b4f8>] acpi_ds_load1_begin_op+0xa4/0x294
      	[<ffff0000084ad4ac>] acpi_ps_build_named_op+0xc4/0x198
      	[<ffff0000084ad6cc>] acpi_ps_create_op+0x14c/0x270
      	[<ffff0000084acfa8>] acpi_ps_parse_loop+0x188/0x5c8
      	[<ffff0000084ae048>] acpi_ps_parse_aml+0xb0/0x2b8
      	[<ffff0000084a8e10>] acpi_ns_one_complete_parse+0x144/0x184
      	[<ffff0000084a8e98>] acpi_ns_parse_table+0x48/0x68
      	[<ffff0000084a82cc>] acpi_ns_load_table+0x4c/0xdc
      	[<ffff0000084b32f8>] acpi_tb_load_namespace+0xe4/0x264
      	[<ffff000008baf9b4>] acpi_load_tables+0x48/0xc0
      	[<ffff000008badc20>] acpi_early_init+0x9c/0xd0
      	[<ffff000008b70d50>] start_kernel+0x3b4/0x43c
      	Code: b9008fb9 2a000318 36380054 32190318 (b94002c0)
      	---[ end trace c46ed37f9651c58e ]---
      	Kernel panic - not syncing: Fatal exception
      	Rebooting in 10 seconds..
      
      (diagnosis)
      * This fault is a data abort, alignment fault (ESR=0x96000021)
        while reading an ACPI table.
      * Initial ACPI tables are normally stored in system RAM and marked as
        "ACPI Reclaim memory" by the firmware.
      * After the commit f56ab9a5 ("efi/arm: Don't mark ACPI reclaim
        memory as MEMBLOCK_NOMAP"), those regions are handled differently:
        they are "memblock-reserved", without the NOMAP bit.
      * So they are now excluded from the device tree's "usable-memory-range",
        which kexec-tools determines based on the current view of /proc/iomem.
      * When the crash dump kernel boots up, it tries to access ACPI tables by
        mapping them with ioremap(), not ioremap_cache(), in acpi_os_ioremap()
        since they are no longer part of mapped system RAM.
      * Given that ACPI accessor/helper functions are compiled in without
        unaligned access support (ACPI_MISALIGNMENT_NOT_SUPPORTED),
        any unaligned access to ACPI tables can cause a fatal panic.
      
      With this patch, acpi_os_ioremap() always honors the memory attribute
      information provided by the firmware (EFI); retaining cacheability
      allows the kernel to access ACPI tables safely.
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Reviewed-by: James Morse <james.morse@arm.com>
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reported-by and Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • efi/arm: map UEFI memory map even w/o runtime services enabled · 20d12cf9
      AKASHI Takahiro authored
      Under the current implementation, the UEFI memory map is mapped and
      made available in virtual mappings only if runtime services are enabled.
      But in a later patch, we want to use the UEFI memory map in
      acpi_os_ioremap() to create mappings of ACPI tables using the memory
      attributes described in the UEFI memory map.
      See the following commit:
          arm64: acpi: fix alignment fault in accessing ACPI tables
      
      So, as a first step, arm_enable_runtime_services() is modified,
      alongside Ard's patch[1], so that the UEFI memory map will not be freed
      even if efi=noruntime.
      
      [1] https://marc.info/?l=linux-efi&m=152930773507524&w=2
      
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • efi/arm: preserve early mapping of UEFI memory map longer for BGRT · 3ea86495
      Ard Biesheuvel authored
      The BGRT code validates the contents of the table against the UEFI
      memory map, and so it expects it to be mapped when the code runs.
      
      On ARM, this is currently not the case, since we tear down the early
      mapping after efi_init() completes, and only create the permanent
      mapping in arm_enable_runtime_services(), which executes as an early
      initcall, but still leaves a window where the UEFI memory map is not
      mapped.
      
      So move the call to efi_memmap_unmap() from efi_init() to
      arm_enable_runtime_services().
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      [will: fold in EFI_MEMMAP attribute check from Ard]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • drivers: acpi: add dependency of EFI for arm64 · 5bcd4408
      AKASHI Takahiro authored
      As Ard suggested, CONFIG_ACPI && !CONFIG_EFI doesn't make sense on arm64,
      and CONFIG_ACPI with CONFIG_CPU_BIG_ENDIAN doesn't make sense either.
      
      As CONFIG_EFI already depends on !CONFIG_CPU_BIG_ENDIAN, adding a
      dependency on CONFIG_EFI is enough to rule out both useless
      combinations of configuration.
      
      This bug, reported by Will, will be revealed when my patch series,
      "arm64: kexec,kdump: fix boot failures on acpi-only system," is applied
      and the kernel is built under allmodconfig.
      Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: export memblock_reserve()d regions via /proc/iomem · 50d7ba36
      James Morse authored
      There has been some confusion around what is necessary to prevent kexec
      overwriting important memory regions. memblock: reserve, or nomap?
      Only memblock nomap regions are reported via /proc/iomem, kexec's
      user-space doesn't know about memblock_reserve()d regions.
      
      Until commit f56ab9a5 ("efi/arm: Don't mark ACPI reclaim memory
      as MEMBLOCK_NOMAP") the ACPI tables were nomap; now they are reserved
      and can thus be overwritten by kexec with the new kernel or initrd.
      But this was always broken, as the UEFI memory map is also reserved
      and not marked as nomap.
      
      Exporting both nomap and reserved memblock types is a nuisance as
      they live in different memblock structures which we can't walk at
      the same time.
      
      Take a second walk over memblock.reserved and add new 'reserved'
      subnodes for the memblock_reserve()d regions that aren't already
      described by the existing code (e.g. Kernel code).
      
      We use reserve_region_with_split() to find the gaps in existing named
      regions. This handles the gap between 'kernel code' and 'kernel data'
      which is memblock_reserve()d, but already partially described by
      request_standard_resources(). e.g.:
      | 80000000-dfffffff : System RAM
      |   80080000-80ffffff : Kernel code
      |   81000000-8158ffff : reserved
      |   81590000-8237efff : Kernel data
      |   a0000000-dfffffff : Crash kernel
      | e00f0000-f949ffff : System RAM
      
      reserve_region_with_split() needs kzalloc(), which isn't available when
      request_standard_resources() is called, so use an initcall.
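
The gap-finding idea can be illustrated outside the kernel with a small sketch. It is not reserve_region_with_split() itself: `struct range` and `find_unnamed_gaps` are hypothetical names, ranges are inclusive, and the already-named regions are assumed to be sorted by start address and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range. */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Write into out[] the parts of resv that no entry of named[] covers
 * and return how many gaps were found. named[] must be sorted by
 * start address and must not overlap.
 */
static size_t find_unnamed_gaps(struct range resv,
                                const struct range *named, size_t n_named,
                                struct range *out, size_t out_cap)
{
    size_t n = 0;
    uint64_t cursor = resv.start;

    assert(resv.start <= resv.end);
    for (size_t i = 0; i < n_named; i++) {
        if (named[i].end < cursor)
            continue;               /* entirely before the cursor */
        if (named[i].start > resv.end)
            break;                  /* entirely after the reserved range */
        if (named[i].start > cursor) {
            assert(n < out_cap);
            out[n++] = (struct range){ cursor, named[i].start - 1 };
        }
        if (named[i].end >= resv.end)
            return n;               /* tail is already described */
        cursor = named[i].end + 1;
    }
    assert(n < out_cap);
    out[n++] = (struct range){ cursor, resv.end };
    return n;
}
```

With the addresses from the listing above, a reserved range of 80080000-8237efff and named regions 80080000-80ffffff and 81590000-8237efff leave the single gap 81000000-8158ffff, which is the entry shown as 'reserved'.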
      Reported-by: Bhupesh Sharma <bhsharma@redhat.com>
      Reported-by: Tyler Baicar <tbaicar@codeaurora.org>
      Suggested-by: Akashi Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: James Morse <james.morse@arm.com>
      Fixes: d28f6df1 ("arm64/kexec: Add core kexec support")
      Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      CC: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: build with baremetal linker target instead of Linux when available · c931d34e
      Olof Johansson authored
      Not all toolchains have the baremetal elf targets, Red Hat/Fedora ones
      in particular. So, probe for whether it's available and use the previous
      (linux) targets if it isn't.
      Reported-by: Laura Abbott <labbott@redhat.com>
      Tested-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Paul Kocialkowski <contact@paulk.fr>
      Signed-off-by: Olof Johansson <olof@lixom.net>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: fix possible spectre-v1 write in ptrace_hbp_set_event() · 14d6e289
      Mark Rutland authored
      It's possible for userspace to control idx. Sanitize idx when using it
      as an array index, to inhibit the potential spectre-v1 write gadget.
      
      Found by smatch.
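
In the kernel this kind of sanitisation is normally done with array_index_nospec() from <linux/nospec.h>. The user-space sketch below only illustrates the branchless masking idea behind such a helper: `index_mask` and `sanitize_index` are hypothetical names, the code assumes that right-shifting a negative `long` is an arithmetic shift (true for GCC and Clang), and an ordinary compiler gives no guarantee that the result is speculation-safe.

```c
#include <assert.h>
#include <limits.h>

/*
 * All-ones when idx < size, zero otherwise, computed without a branch.
 * Assumes size <= LONG_MAX and an arithmetic right shift of signed long.
 */
static unsigned long index_mask(unsigned long idx, unsigned long size)
{
    return (unsigned long)(~(long)(idx | (size - 1UL - idx)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an untrusted index: out-of-bounds values collapse to 0. */
static unsigned long sanitize_index(unsigned long idx, unsigned long size)
{
    assert(size <= (unsigned long)LONG_MAX);
    return idx & index_mask(idx, size);
}
```

A caller still performs its normal bounds check first and then indexes the array with the sanitised value, so that a mispredicted check cannot steer the access outside the array.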
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  9. 12 Jul, 2018 15 commits