- 20 Jan, 2018 9 commits
-
-
Ram Pai authored
This patch provides the implementation of execute-only pkey. The architecture-independent layer expects the arch-dependent layer, to support the ability to create and enable a special key which has execute-only permission. Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Store and restore the AMR, IAMR and UAMOR register state of the task before scheduling out and after scheduling in, respectively. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
powerpc has hardware support to disable execute on a pkey. This patch enables the ability to create execute-disabled keys. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
This patch provides the detailed implementation for a user to allocate a key and enable it in the hardware. It provides the plumbing, but it cannot be used till the system call is implemented. The next patch will do so. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Cleanup the bits corresponding to a key in the AMR, and IAMR register, when the key is newly allocated/activated or is freed. We dont want some residual bits cause the hardware enforce unintended behavior when the key is activated or freed. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Introduce helper functions that can initialize the bits in the AMR, IAMR and UAMOR register; the bits that correspond to the given pkey. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Implements helper functions to read and write the key related registers; AMR, IAMR, UAMOR. AMR register tracks the read,write permission of a key IAMR register tracks the execute permission of a key UAMOR register enables and disables a key Acked-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Total 32 keys are available on power7 and above. However pkey 0,1 are reserved. So effectively we have 30 pkeys. On 4K kernels, we do not have 5 bits in the PTE to represent all the keys; we only have 3bits. Two of those keys are reserved; pkey 0 and pkey 1. So effectively we have 6 pkeys. This patch keeps track of reserved keys, allocated keys and keys that are currently free. Also it adds skeletal functions and macros, that the architecture-independent code expects to be available. Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Ram Pai authored
Basic plumbing to initialize the pkey system. Nothing is enabled yet. A later patch will enable it once all the infrastructure is in place. Signed-off-by: Ram Pai <linuxram@us.ibm.com> [mpe: Rework copyrights to use SPDX tags] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 19 Jan, 2018 24 commits
-
-
Christophe Lombard authored
The POWER9 core supports a new feature: ASB_Notify which requires the support of the Special Purpose Register: TIDR. The ASB_Notify command, generated by the AFU, will attempt to wake-up the host thread identified by the particular LPID:PID:TID. This patch assign a unique TIDR (thread id) for the current thread which will be used in the process element entry. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Change the data type for the variable 'ncpu' in ppc_core_imc_cpu_offline(), since cpumask_any_but() returns an 'int' value. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
In memory Collection (IMC) counter pmu driver controls the ucode's execution state. At the system boot, IMC perf driver pause the ucode. Ucode state is changed to "running" only when any of the nest units are monitored or profiled using perf tool. Nest units support only limited set of hardware counters and ucode is always programmed in the "production mode" ("accumulation") mode. This mode is configured to provide key performance metric data for most of the nest units. But ucode also supports other modes which would be used for "debug" to drill down specific nest units. That is, ucode when switched to "powerbus" debug mode (for example), will dynamically reconfigure the nest counters to target only "powerbus" related events in the hardware counters. This allows the IMC nest unit to focus on powerbus related transactions in the system in more detail. At this point, production mode events may or may not be counted. IMC nest counters has both in-band (ucode access) and out of band access to it. Since not all nest counter configurations are supported by ucode, out of band tools are used to characterize other nest counter configurations. Patch provides an interface via "debugfs" to enable the switching of ucode modes in the system. To switch ucode mode, one has to first pause the microcode (imc_cmd), and then write the target mode value to the "imc_mode" file. Proposed Approach: In the proposed approach, the function (export_imc_mode_and_cmd) which creates the debugfs interface for imc mode and command is implemented in opal-imc.c. Thus we can use imc_get_mem_addr() to get the homer base address for each chip. The interface to expose imc mode and command is required only if we have nest pmu units registered. Employing the existing data structures to track whether we have any nest units registered will require to extend data from perf side to opal-imc.c. Instead an integer is introduced to hold that information by counting successful nest unit registration. Debugfs interface is removed based on the integer count. Example for the interface: $ ls /sys/kernel/debug/imc imc_cmd_0 imc_cmd_8 imc_mode_0 imc_mode_8 Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Remove the allocation of struct imc_events from imc_parse_event(). Instead pass imc_events as a parameter to imc_parse_event(), which is a pointer to a slot in the array allocated in update_events_in_group(). Reported-by: Dan Carpenter ("powerpc/perf: Fix a sizeof() typo so we allocate less memory") Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Factor out memory freeing part for attribute elements from imc_common_cpuhp_mem_free(). Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Remove the global variable 'thread_imc_pmu', since it is not used in the code. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
local_t is used for atomic modifications for per-CPU data, versus re-entrant modifications via interrupts. local_t read-modify-write atomic operations are currently implemented with hardware atomics (larx/stcx), which are quite slow. This patch implements them by masking all types of interrupts that may do local_t operations ("standard" and perf interrupts). Rusty's benchmark (https://lkml.org/lkml/2008/12/16/450) gives the following timings for the local_t test, in nanoseconds per iteration: larx/stcx irq+pmu disable _inc 38 10 _add 38 10 _read 4 4 _add_return 38 10 There are still some interrupt types (system reset, machine check, and watchdog), which can not safely use local_t operations, because they are not masked. An alternative approach was proposed, using a CR bit to mark a critical section, which is tested in the interrupt return path, and would then branch to a fixup handler (similar to exception fixups), which re-starts the operation. The problem with this was the complexity of the fixup handler and the latency of the slow path. https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-November/123024.htmlSigned-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
powerpc implements local_t with atomic operations. There is already an asm-generic implementation which does this using atomic_t. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
To support soft-masking of the performance monitor interrupt, a set of new powerpc_local_irq_pmu_save() and powerpc_local_irq_restore() functions are added. And powerpc_local_irq_save() implemented, by adding a new irq_soft_mask manipulation function irq_soft_mask_or_return(). Local_irq_pmu_* macros are provided to access these powerpc_local_irq_pmu* functions which includes trace_hardirqs_on|off() to match what we have in include/linux/irqflags.h. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
New Kconfig is added "CONFIG_PPC_IRQ_SOFT_MASK_DEBUG" to add WARN_ON to alert the invalid transitions. Also moved the code under the CONFIG_TRACE_IRQFLAGS in arch_local_irq_restore() to new Kconfig. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Fix name of CONFIG option in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Two new bit mask field "IRQ_DISABLE_MASK_PMU" is introduced to support the masking of PMI and "IRQ_DISABLE_MASK_ALL" to aid interrupt masking checking. Couple of new irq #defs "PACA_IRQ_PMI" and "SOFTEN_VALUE_0xf0*" added to use in the exception code to check for PMI interrupts. In the masked_interrupt handler, for PMIs we reset the MSR[EE] and return. In the __check_irq_replay(), replay the PMI interrupt by calling performance_monitor_common handler. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
To support addition of "bitmask" to MASKABLE_* macros, factor out the EXCPETION_PROLOG_1 macro. Make it explicit the interrupt masking supported by a gievn interrupt handler. Patch correspondingly extends the MASKABLE_* macros with an addition's parameter. "bitmask" parameter is passed to SOFTEN_TEST macro to decide on masking the interrupt. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Currently we use both EXCEPTION_PROLOG_1 and __EXCEPTION_PROLOG_1 in the MASKABLE_* macros. As a cleanup, this patch makes MASKABLE_* to use only __EXCEPTION_PROLOG_1. There is not logic change. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Rename the paca->soft_enabled to paca->irq_soft_mask as it is no longer used as a flag for interrupt state, but a mask. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
"paca->soft_enabled" is used as a flag to mask some of interrupts. Currently supported flags values and their details: soft_enabled MSR[EE] 0 0 Disabled (PMI and HMI not masked) 1 1 Enabled "paca->soft_enabled" is initialized to 1 to make the interripts as enabled. arch_local_irq_disable() will toggle the value when interrupts needs to disbled. At this point, the interrupts are not actually disabled, instead, interrupt vector has code to check for the flag and mask it when it occurs. By "mask it", it update interrupt paca->irq_happened and return. arch_local_irq_restore() is called to re-enable interrupts, which checks and replays interrupts if any occured. Now, as mentioned, current logic doesnot mask "performance monitoring interrupts" and PMIs are implemented as NMI. But this patchset depends on local_irq_* for a successful local_* update. Meaning, mask all possible interrupts during local_* update and replay them after the update. So the idea here is to reserve the "paca->soft_enabled" logic. New values and details: soft_enabled MSR[EE] 1 0 Disabled (PMI and HMI not masked) 0 1 Enabled Reason for the this change is to create foundation for a third mask value "0x2" for "soft_enabled" to add support to mask PMIs. When ->soft_enabled is set to a value "3", PMI interrupts are mask and when set to a value of "1", PMI are not mask. With this patch also extends soft_enabled as interrupt disable mask. Current flags are renamed from IRQ_[EN?DIS}ABLED to IRQS_ENABLED and IRQS_DISABLED. Patch also fixes the ptrace call to force the user to see the softe value to be alway 1. Reason being, even though userspace has no business knowing about softe, it is part of pt_regs. Like-wise in signal context. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Minor cleanup to use helper function for manipulating paca->soft_enabled variable. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Add a new wrapper function, soft_enabled_set_return(), added to do the paca->soft_enabled updates requiring a set-return. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Add a new wrapper function, soft_enabled_return(), added to return paca->soft_enabled value. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Move set_soft_enabled() from powerpc/kernel/irq.c to asm/hw_irq.c, to encourage updates to paca->soft_enabled done via these access function. Add "memory" clobber to hint compiler since paca->soft_enabled memory is the target here. Renaming it as soft_enabled_set() will make namespaces works better as prefix than a postfix when new soft_enabled manipulation functions are introduced. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
In powerpc/64, the arch_local_irq_disable() function returns unsigned long, which is not consistent with other architectures. Move that set-return asm implementation into arch_local_irq_save(), and make arch_local_irq_disable() return void, simplifying the assembly. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
arch_local_irq_disable is implemented strangely, with a temporary output register being set to the desired soft_enabled value via an immediate input, which is then used to store to memory. This is not required, the immediate can be specified directly as a register input. For simple cases at least, assembly is unchanged except register mapping. Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Two #defines IRQS_ENABLED and IRQS_DISABLED are added to be used when updating paca->soft_enabled. Replace the hardcoded values used when updating paca->soft_enabled with IRQ_(EN|DIS)ABLED #define. No logic change. Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
We have always had softe in pt_regs, and accessible via PT_SOFTE, even though it is not userspace state. The value userspace sees should always be 1, because we should never be in userspace with interrupts soft disabled. In a subsequent patch we will be changing the semantics of the kernel softe value, so hard wire the value to 1 to retain the existing semantics. As far as we know nothing ever looks at it, but better safe than sorry. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Split out of larger patch, write change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
The recent changes to TLB handling broke the PS3 build: arch/powerpc/include/asm/book3s/64/tlbflush.h:30: undefined reference to `.hash__tlbiel_all' Fix it by adding an fallback version of tlbiel_all() for non-native builds. It should never be called, due to checks in callers so it calls BUG(). We should probably clean it up further but this will suffice for now. Fixes: d4748276 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 18 Jan, 2018 6 commits
-
-
Nicholas Piggin authored
For shared processor guests (e.g., KVM), add an idle polling mode rather than immediately returning to the hypervisor when the guest CPU goes idle. Test setup is a 2 socket POWER9 with 4 guests running, each with vCPUs equal to 1/2 of real of CPUs. Saturated each guest with tbench. Using polling idle gives about 1.4x throughput. Kernel compile speed was not changed significantly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Since e1689795 ("cpuidle: Add common time keeping and irq enabling"), cpuidle drivers are expected to return from ->enter with irqs disabled. Update the cpuidle-powernv snooze and cede loops to disable irqs before returning. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Since e1689795 ("cpuidle: Add common time keeping and irq enabling"), cpuidle drivers are expected to return from ->enter with irqs disabled. Update the cpuidle-powernv snooze loop to disable irqs before returning. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
powerpc calls irq_exit() with local irqs disabled, therefore it can define __ARCH_IRQ_EXIT_IRQS_DISABLED. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
The powerpc NMI IPIs may not be recoverable if they are taken in some sections of code, and also there have been and still are issues with taking NMIs (in KVM guest code, in firmware, etc) which makes them a bit dangerous to use. Generic code like softlockup detector and rcu stall detectors really hammer on trigger_*_backtrace, which has lead to further problems because we've implemented it with the NMI. So stop providing NMI backtraces for now. Importantly, the powerpc code uses NMI IPIs in crash/debug, and the SMP hardlockup watchdog. So if the softlockup and rcu hang detection traces are not being printed because the CPU is stuck with interrupts off, then the hard lockup watchdog should get it with the NMI IPI. Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Book3S PACA memory allocation is restricted by the RMA limit and also must not take SLB faults when accessed in virtual mode. Currently a fixed 256MB limit is used for this, which is imprecise and sub-optimal. Update the paca allocation limits to use use the ppc64_rma_size for RMA limit, and share the safe_stack_limit() that is currently used for stack allocations that must not take virtual mode faults. The safe_stack_limit() name is changed to ppc64_bolted_size() to match ppc64_rma_size and some comments are updated. We also need to use early_mmu_has_feature() because we are now calling this function prior to the jump label patching that enables mmu_has_feature(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Change mmu_has_feature() to early_mmu_has_feature()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 17 Jan, 2018 1 commit
-
-
Nicholas Piggin authored
With the previous patch to switch to 64-bit mode after returning from RTAS and before doing any memory accesses, the RMA limit need not be clamped to 1GB to avoid RTAS bugs. Keep the 1GB limit for older firmware (although this is more of a kernel concern than RTAS), and remove it starting with POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-