1. 27 Aug, 2015 1 commit
    • arm64: flush FP/SIMD state correctly after execve() · 674c242c
      Ard Biesheuvel authored
      When a task calls execve(), its FP/SIMD state is flushed so that
      none of the original program state is observable by the incoming
      program.
      
      However, since this flushing consists of setting the in-memory copy
      of the FP/SIMD state to all zeroes, the CPU field is set to CPU 0 as
      well, which indicates to the lazy FP/SIMD preserve/restore code that
      the FP/SIMD state does not need to be reread from memory if the task
      is scheduled again on CPU 0 without any other tasks having entered
      userland (or used the FP/SIMD in kernel mode) on the same CPU in the
      meantime. If this happens, the FP/SIMD state of the old program will
      still be present in the registers when the new program starts.
      
      So set the CPU field to the invalid value of NR_CPUS when performing
      the flush, by calling fpsimd_flush_task_state().
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com>
      Reported-by: Janet Liu <janet.liu@spreadtrum.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
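The failure mode described in this commit can be sketched in plain C. This is a userspace model, not the kernel code: the struct layout, the `last_loaded` array, the `_sketch` names and the value chosen for `NR_CPUS` are assumptions made for illustration. It shows why a zeroed CPU field looks like "state already live on CPU 0" to a lazy-restore check, and why an out-of-range value does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Simplified stand-in for a task's saved FP/SIMD state (not the kernel layout). */
struct fpsimd_state_sketch {
	unsigned long vregs[4];
	unsigned int cpu;	/* last CPU whose registers held this state */
};

/* Stand-in for the per-CPU record of whose state is currently in the registers. */
static const struct fpsimd_state_sketch *last_loaded[NR_CPUS];

/* Lazy-restore test: true when the registers on 'cpu' would be reused as-is. */
static bool can_skip_reload(const struct fpsimd_state_sketch *st, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return last_loaded[cpu] == st && st->cpu == cpu;
}

/* Flush by zeroing everything: the cpu field becomes 0, a valid CPU number. */
static void flush_buggy(struct fpsimd_state_sketch *st)
{
	memset(st, 0, sizeof(*st));
}

/* Flush and then invalidate the cpu field so that no real CPU matches it. */
static void flush_fixed(struct fpsimd_state_sketch *st)
{
	memset(st, 0, sizeof(*st));
	st->cpu = NR_CPUS;
}
```

In this model, a task whose state was last loaded on CPU 0 still passes `can_skip_reload()` on CPU 0 after `flush_buggy()`, so the stale registers would be kept; after `flush_fixed()` the check fails on every CPU and the zeroed state is reloaded from memory.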
  2. 24 Aug, 2015 4 commits
  3. 21 Aug, 2015 1 commit
    • arm64: entry: always restore x0 from the stack on syscall return · 412fcb6c
      Will Deacon authored
      We have a micro-optimisation on the fast syscall return path where we
      take care to keep x0 live with the return value from the syscall so that
      we can avoid restoring it from the stack. The benefit of doing this is
      fairly suspect, since we will be restoring x1 from the stack anyway
      (which lives adjacent in the pt_regs structure) and the only additional
      cost is saving x0 back to pt_regs after the syscall handler, which could
      be seen as a poor man's prefetch.
      
      More importantly, this causes issues with the context tracking code.
      
      The ct_user_enter macro ends up branching into C code, which is free to
      use x0 as a scratch register and consequently leads to us returning junk
      back to userspace as the syscall return value. Rather than special case
      the context-tracking code, this patch removes the questionable
      optimisation entirely.
      
      Cc: <stable@vger.kernel.org>
      Cc: Larry Bassel <larry.bassel@linaro.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Hanjun Guo <hanjun.guo@linaro.org>
      Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. 20 Aug, 2015 1 commit
  5. 19 Aug, 2015 1 commit
  6. 12 Aug, 2015 1 commit
    • arm64: Add __exception_irq_entry definition for function graph · 9a5ad7d0
      Jungseok Lee authored
      gic_handle_irq() is defined with the __exception_irq_entry attribute.
      The only remaining work is to add its definition, as ARM did. The traces
      below show how the function graph data changes with these hunks.
      
      A prologue of an interrupt handler is drawn as follows.
      
      - current status
      
       0)   0.208 us    |  cpuidle_not_available();
       0)               |  default_idle_call() {
       0)               |    arch_cpu_idle() {
       0)               |      __handle_domain_irq() {
       0)               |        irq_enter() {
       0)   0.313 us    |          rcu_irq_enter();
       0)   0.261 us    |          __local_bh_disable_ip();
      
      - with this change
      
       0)   0.625 us    |  cpuidle_not_available();
       0)               |  default_idle_call() {
       0)               |    arch_cpu_idle() {
       0)   ==========> |
       0)               |      gic_handle_irq() {
       0)               |        __handle_domain_irq() {
       0)               |          irq_enter() {
       0)   0.885 us    |            rcu_irq_enter();
       0)   0.781 us    |            __local_bh_disable_ip();
      
      An epilogue of an interrupt handler is recorded as follows.
      
      - current status
      
       0)   0.261 us    |          idle_cpu();
       0)               |          rcu_irq_exit() {
       0)   0.521 us    |            rcu_eqs_enter_common.isra.46();
       0)   2.552 us    |          }
       0) ! 322.448 us  |        }
       0) ! 583.437 us  |      }
       0) # 1656.041 us |    }
       0) # 1658.073 us |  }
      
      - with this change
      
       0)   0.677 us    |            idle_cpu();
       0)               |            rcu_irq_exit() {
       0)   1.770 us    |              rcu_eqs_enter_common.isra.46();
       0)   7.968 us    |            }
       0) # 1803.541 us |          }
       0) # 2626.667 us |        }
       0) # 2632.969 us |      }
       0)   <========== |
       0) # 14425.00 us |    }
       0) # 14430.98 us |  }
      
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Rabin Vincent <rabin@rab.in>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  7. 05 Aug, 2015 2 commits
    • Merge branch 'aarch64/psci/drivers' into aarch64/for-next/core · d422e625
      Will Deacon authored
      Move our PSCI implementation out into drivers/firmware/ where it can be
      shared with arch/arm/.
      
      Conflicts:
      	arch/arm64/kernel/psci.c
    • arm64: mm: ensure patched kernel text is fetched from PoU · 8ec41987
      Will Deacon authored
      The arm64 booting document requires that the bootloader has cleaned the
      kernel image to the PoC. However, when a CPU re-enters the kernel due to
      either a CPU hotplug "on" event or resuming from a low-power state (e.g.
      cpuidle), the kernel text may in fact be dirty at the PoU due to things
      like alternative patching or even module loading.
      
      Thanks to I-cache speculation with the MMU off, stale instructions could
      be fetched prior to enabling the MMU, potentially leading to crashes
      when executing regions of code that have been modified at runtime.
      
      This patch addresses the issue by ensuring that the local I-cache is
      invalidated immediately after a CPU has enabled its MMU but before
      jumping out of the identity mapping. Any stale instructions fetched from
      the PoC will then be discarded and refetched correctly from the PoU.
      Patching kernel text executed prior to the MMU being enabled is
      prohibited, so the early entry code will always be clean.
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  8. 04 Aug, 2015 2 commits
  9. 03 Aug, 2015 4 commits
  10. 31 Jul, 2015 1 commit
    • arm64: restore cpu suspend/resume functionality · b511a659
      Sudeep Holla authored
      Commit 4b3dc967 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs")
      accidentally retained code for !CONFIG_SMP in cpu_resume function. This
      resulted in the hash index being zeroed in x7 after proper computation,
      which is then used to get the cpu context pointer while resuming.
      
      This patch removes the remnant code and restores the cpu suspend/
      resume functionality.
      
      Fixes: 4b3dc967 ("arm64: force CONFIG_SMP=y and remove redundant #ifdefs")
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  11. 30 Jul, 2015 3 commits
    • ARM64: PCI: do not enable resources on PROBE_ONLY systems · 72407514
      Lorenzo Pieralisi authored
      On ARM64 PROBE_ONLY PCI systems resources are not currently claimed,
      therefore they can't be enabled since they do not have a valid
      parent pointer; this in turn prevents enabling PCI devices on
      ARM64 PROBE_ONLY systems, causing PCI device initialization to
      fail.
      
      To solve this issue, resources must be claimed when devices are
      added on PROBE_ONLY systems, which ensures that the resource hierarchy
      is validated and the resource tree is sane, but this requires changes
      in the ARM64 resource management that can adversely affect existing
      PCI set-ups (claiming resources on !PROBE_ONLY systems might break
      existing ARM64 PCI platform implementations).
      
      As a temporary solution in preparation for a proper resources claiming
      implementation in ARM64 core, to enable PCI PROBE_ONLY systems on ARM64,
      this patch adds a pcibios_enable_device() arch implementation that
      simply prevents enabling resources on PROBE_ONLY systems (mirroring ARM
      behaviour).
      
      This is always a safe thing to do because on PROBE_ONLY systems the
      configuration space set-up can be considered immutable, and it is in
      preparation of proper resource claiming that would finally validate
      the PCI resources tree in the ARM64 arch implementation on PROBE_ONLY
      systems.
      
      For !PROBE_ONLY systems, resource enablement in pcibios_enable_device()
      on ARM64 is implemented as in current PCI core, leaving the behaviour
      unchanged.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
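The shape of the arch hook described in this commit can be sketched as below. This is a standalone model rather than the patch itself: `struct pci_dev_sketch`, the `probe_only` flag and `enable_resources_sketch()` are hypothetical stand-ins for the device structure, the PROBE_ONLY flag test and the generic PCI core resource-enable path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; only tracks which resources are enabled. */
struct pci_dev_sketch {
	int enabled_mask;
};

/* Stand-in for the system-wide PROBE_ONLY flag. */
static bool probe_only;

/* Stand-in for the generic resource-enable path in the PCI core. */
static int enable_resources_sketch(struct pci_dev_sketch *dev, int mask)
{
	dev->enabled_mask |= mask;
	return 0;
}

/* Arch hook: skip resource enablement entirely on PROBE_ONLY systems. */
static int pcibios_enable_device_sketch(struct pci_dev_sketch *dev, int mask)
{
	assert(dev != NULL);
	if (probe_only)
		return 0;	/* firmware set-up is treated as immutable */
	return enable_resources_sketch(dev, mask);
}
```

With `probe_only` set, the hook reports success without touching the device, so unclaimed resources never reach the enable path; with it clear, behaviour is the same as calling the generic path directly.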
    • arm64: cmpxchg: truncate sub-word signed types before comparison · a14949e0
      Will Deacon authored
      When performing a cmpxchg operation on a signed sub-word type (e.g. s8),
      we need to ensure that the upper register bits of the "old" value used
      for comparison are zeroed, otherwise we may erroneously fail the cmpxchg
      which may even be interpreted as success by the caller (if the compiler
      performs the truncation as part of its check). This has been observed
      in mod_state, where negative values were causing problems with
      this_cpu_cmpxchg.
      
      This patch fixes the issue by explicitly casting 8-bit and 16-bit "old"
      values using unsigned types in our cmpxchg wrappers. 32-bit types can be
      left alone, since the underlying asm makes use of W registers in this
      case.
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
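The comparison problem can be reproduced with ordinary C integer conversions. The sketch below is a model of the comparison only, not the kernel's cmpxchg wrappers: the function names are made up, and the "zero-extended load" is an assumption standing in for a byte-sized load into a full-width register.

```c
#include <assert.h>
#include <stdint.h>

/* The model needs a "register" wider than the sub-word type. */
static_assert(sizeof(unsigned long) > sizeof(int8_t), "register must be wider than s8");

/* Compare as if 'old' were passed in a full-width register without truncation. */
static int cmp_matches_untruncated(int8_t mem, int8_t old)
{
	unsigned long loaded = (uint8_t)mem;		/* byte load, zero-extended */
	unsigned long expected = (unsigned long)old;	/* sign-extended if negative */
	return loaded == expected;
}

/* Compare after casting 'old' through an unsigned type of the same width. */
static int cmp_matches_truncated(int8_t mem, int8_t old)
{
	unsigned long loaded = (uint8_t)mem;		/* byte load, zero-extended */
	unsigned long expected = (uint8_t)old;		/* upper bits zeroed */
	return loaded == expected;
}
```

For non-negative values both variants agree. For a negative value such as -1, the untruncated comparison sees 0xff against an all-ones register and reports a mismatch even though memory holds the expected byte, which is the spurious failure the commit describes.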
    • arm64: alternative: put secondary CPUs into polling loop during patch · ef5e724b
      Will Deacon authored
      When patching the kernel text with alternatives, we may end up patching
      parts of the stop_machine state machine (e.g. atomic_dec_and_test in
      ack_state) and consequently corrupt the instruction stream of any
      secondary CPUs.
      
      This patch passes the cpu_online_mask to stop_machine, forcing all of
      the CPUs into our own callback which can place the secondary cores into
      a dumb (but safe!) polling loop whilst the patching is carried out.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
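A rough sketch of such a callback, using C11 atomics in userspace, is shown below. It is not the patch: the choice of CPU 0 as the patching CPU, the `patched` flag and every `_sketch` name are assumptions for illustration, and the instruction barrier a real CPU would need is only noted in a comment.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int patched;	/* set once the text has been rewritten */
static int patch_runs;		/* counts how often the patch step executed */

/* Hypothetical stand-in for the routine that rewrites kernel text. */
static void apply_patches_sketch(void)
{
	patch_runs++;
}

/* Callback that every online CPU would run while the machine is stopped. */
static int patch_callback_sketch(unsigned int cpu)
{
	if (cpu != 0) {
		/* Secondary: spin on a plain flag, calling no patchable helpers. */
		while (!atomic_load_explicit(&patched, memory_order_acquire))
			;
		/* A real CPU would also need an instruction barrier here. */
	} else {
		assert(!atomic_load(&patched));	/* patch exactly once */
		apply_patches_sketch();
		atomic_store_explicit(&patched, 1, memory_order_release);
	}
	return 0;
}
```

The point of the design is that the secondaries execute nothing but a load and a branch while the text is being modified. Run single-threaded, the sketch only terminates if the CPU 0 path is called before any secondary path.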
  12. 29 Jul, 2015 4 commits
  13. 28 Jul, 2015 5 commits
    • arm64: pgtable: fix definition of pte_valid · 766ffb69
      Will Deacon authored
      pte_valid should check if the PTE_VALID bit (1 << 0) is set in the pte,
      so fix the macro definition to use bitwise & instead of logical &&.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
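The difference between the two operators is easy to demonstrate in isolation. The following is a minimal sketch, not the kernel macro: `pteval_sketch_t` and both helper names are hypothetical, and the mask is passed as a parameter purely to keep the example generic.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pteval_sketch_t;	/* hypothetical stand-in for a raw pte value */

#define PTE_VALID ((pteval_sketch_t)1 << 0)

/* Logical AND: true whenever both operands are non-zero, whichever bits are set. */
static int test_logical(pteval_sketch_t pte, pteval_sketch_t mask)
{
	assert(mask != 0);
	return pte && mask;
}

/* Bitwise AND: true only when the bit selected by the mask is set in the pte. */
static int test_bitwise(pteval_sketch_t pte, pteval_sketch_t mask)
{
	assert(mask != 0);
	return (pte & mask) != 0;
}
```

A value such as 0x2, which has bit 0 clear but is non-zero, passes the logical test and fails the bitwise one, so a check written with && would treat any non-zero entry as valid.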
    • arm64: spinlock: fix ll/sc unlock on big-endian systems · c1d7cd22
      Will Deacon authored
      When unlocking a spinlock, we perform a read-modify-write on the owner
      ticket in order to increment it and store it back with release
      semantics.
      
      In the LL/SC case, we load the 16-bit ticket using a 32-bit load and
      therefore store back the wrong halfword on a big-endian system,
      corrupting the lock after the first unlock and killing the system dead.
      
      This patch fixes the unlock code to use 16-bit accessors consistently.
      Signed-off-by: Will Deacon <will.deacon@arm.com>
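The halfword mix-up can be modelled on a byte buffer with explicit big-endian accessors. This is an illustrative model only: the layout (a 16-bit next ticket followed by a 16-bit owner ticket), the `OWNER_OFF` offset and all helper names are assumptions, and the buffer includes bytes beyond the 4-byte lock because the 32-bit load in this model reads past it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OWNER_OFF 2	/* assumed big-endian layout: bytes 0-1 next, bytes 2-3 owner */

static uint32_t be_load32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static uint16_t be_load16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static void be_store16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* 32-bit load, 16-bit store: the low half of the loaded word is not the owner. */
static void unlock_buggy(uint8_t *mem)
{
	assert(mem != NULL);
	uint32_t w = be_load32(mem + OWNER_OFF);
	be_store16(mem + OWNER_OFF, (uint16_t)(w + 1));
}

/* 16-bit load and 16-bit store: increments the owner ticket itself. */
static void unlock_fixed(uint8_t *mem)
{
	assert(mem != NULL);
	uint16_t h = be_load16(mem + OWNER_OFF);
	be_store16(mem + OWNER_OFF, (uint16_t)(h + 1));
}
```

Starting from next = 1 and owner = 0, the consistent 16-bit version leaves owner = 1, whereas the mixed-width version stores whatever sat in the low half of the 32-bit word plus one, corrupting the lock on the first unlock.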
    • arm64: Use last level TLBI for user pte changes · 4150e50b
      Catalin Marinas authored
      The flush_tlb_page() function is used on user address ranges when PTEs
      (or PMDs/PUDs for huge pages) were changed (attributes or clearing). For
      such cases, it is more efficient to invalidate only the last level of
      the TLB with the "tlbi vale1is" instruction.
      
      In the TLB shoot-down case, the TLB caching of the intermediate page
      table levels (pmd, pud, pgd) is handled by __flush_tlb_pgtable() via the
      __(pte|pmd|pud)_free_tlb() functions and it is not deferred to
      tlb_finish_mmu() (as of commit 285994a6 - "arm64: Invalidate the TLB
      corresponding to intermediate page table levels"). The tlb_flush()
      function only needs to invalidate the TLB for the last level of page
      tables; the __flush_tlb_range() function gains a fourth argument for
      last level TLBI.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Clean up __flush_tlb(_kernel)_range functions · da4e7330
      Catalin Marinas authored
      This patch moves the MAX_TLB_RANGE check into the
      flush_tlb(_kernel)_range functions directly to avoid the
      underscore-prefixed definitions (and for consistency with a subsequent
      patch).
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: mm: mark create_mapping as __init · c53e0baa
      Mark Rutland authored
      Currently create_mapping is marked with __ref, apparently because it
      refers to early_alloc. However, create_mapping has no logic to prevent
      erroneous use of early_alloc after it has been freed, and is only ever
      called by __init functions anyway. Thus the __ref marker is misleading
      and unnecessary.
      
      Instead, this patch marks create_mapping as __init, resulting in
      warnings if it is used from a non-__init function, and allowing its
      memory to be reclaimed.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  14. 27 Jul, 2015 10 commits