- 09 May, 2019 1 commit
-
-
Michael Ellerman authored
When implementing the KUAP support on Radix we fixed one case where mmu_has_feature() was being called too early in boot via __put_user_size(). However since then some new code in linux-next has created a new path via which we can end up calling mmu_has_feature() too early. On P9 this leads to crashes early in boot if we have both PPC_KUAP and CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code calls printk() which calls probe_kernel_read(), that does a __copy_from_user_inatomic() which calls into set_kuap() and that uses mmu_has_feature(). At that point in boot we haven't patched MMU features yet so the debug code in mmu_has_feature() complains, and calls printk(). At that point we recurse, eg: ... dump_stack+0xdc probe_kernel_read+0x1a4 check_pointer+0x58 ... printk+0x40 dump_stack_print_info+0xbc dump_stack+0x8 probe_kernel_read+0x1a4 probe_kernel_read+0x19c check_pointer+0x58 ... printk+0x40 cpufeatures_process_feature+0xc8 scan_cpufeatures_subnodes+0x380 of_scan_flat_dt_subnodes+0xb4 dt_cpu_ftrs_scan_callback+0x158 of_scan_flat_dt+0xf0 dt_cpu_ftrs_scan+0x3c early_init_devtree+0x360 early_setup+0x9c And so on for infinity, symptom is a dead system. Even more fun is what happens when using the hash MMU (ie. p8 or p9 with Radix disabled), and when we don't have CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled we don't check if static keys have been initialised, we just rely on the jump label. But the jump label defaults to true so we just whack the AMR even though Radix is not enabled. Clearing the AMR is fine, but after we've done the user copy we write (0b11 << 62) into AMR. When using hash that makes all pages with key zero no longer readable or writable. All kernel pages implicitly have key zero, and so all of a sudden the kernel can't read or write any of its memory. Again dead system. In the medium term we have several options for fixing this. probe_kernel_read() doesn't need to touch AMR at all, it's not doing a user access after all, but it uses __copy_from_user_inatomic() just because it's easy, we could fix that. It would also be safe to default to not writing to the AMR during early boot, until we've detected features. But it's not clear that flipping all the MMU features to static_key_false won't introduce other bugs. But for now just switch to early_mmu_has_feature() in set_kuap(), that avoids all the problems with jump labels. It adds the overhead of a global lookup and test, but that's probably trivial compared to the writes to the AMR anyway. Fixes: 890274c2 ("powerpc/64s: Implement KUAP for Radix MMU") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Russell Currey <ruscur@russell.cc>
-
- 07 May, 2019 1 commit
-
-
Rick Lindsley authored
When the memset code was added to pgd_alloc(), it failed to consider that kmem_cache_alloc() can return NULL. It's uncommon, but not impossible under heavy memory contention. Example oops: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000000a4000 Oops: Kernel access of bad area, sig: 11 [#1] LE SMP NR_CPUS=2048 NUMA pSeries CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1 task: c000000334a00000 task.stack: c000000331c00000 NIP: c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020 REGS: c000000331c039c0 TRAP: 0300 Not tainted (4.14.0-115.6.1.el7a.ppc64le) MSR: 800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 44022840 XER: 20040000 CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1 ... NIP [c0000000000a4000] memset+0x68/0x104 LR [c00000000012f43c] mm_init+0x27c/0x2f0 Call Trace: mm_init+0x260/0x2f0 (unreliable) copy_mm+0x11c/0x638 copy_process.isra.28.part.29+0x6fc/0x1080 _do_fork+0xdc/0x4c0 ppc_clone+0x8/0xc Instruction dump: 409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0 7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018 Fixes: fc5c2f4a ("powerpc/mm/hash64: Zero PGD pages on allocation") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Rick Lindsley <ricklind@vnet.linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 06 May, 2019 6 commits
-
-
Sachin Sant authored
This patch fixes a regression by using correct kernel config variable for HUGETLB_PAGE_SIZE_VARIABLE. Without this huge pages are disabled during kernel boot. [0.309496] hugetlbfs: disabling because there are no supported hugepage sizes Fixes: c5710cd2 ("powerpc/mm: cleanup HPAGE_SHIFT setup") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Wei Yongjun authored
In case of error, the function eventfd_ctx_fdget() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). This issue was detected by using the Coccinelle software. Fixes: 06014661 ("ocxl: move event_fd handling to frontend") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
commit b28c9750 ("powerpc/64: Setup KUP on secondary CPUs") moved setup_kup() out of the __init section. As stated in that commit, "this is only for 64-bit". But this function is also used on PPC32, where the two functions called by setup_kup() are in the __init section, so setup_kup() has to either be kept in the __init section on PPC32 or marked __ref. This patch marks it __ref, it fixes the below build warnings. MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x169ec): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuep() The function setup_kup() references the function __init setup_kuep(). This is often because setup_kup lacks a __init annotation or the annotation of setup_kuep is wrong. WARNING: vmlinux.o(.text+0x16a04): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuap() The function setup_kup() references the function __init setup_kuap(). This is often because setup_kup lacks a __init annotation or the annotation of setup_kuap is wrong. Fixes: b28c9750 ("powerpc/64: Setup KUP on secondary CPUs") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
The patch identified below added pgtable-frag.o to obj-y but some merge witchery kept it also for obj-CONFIG_PPC_BOOK3S_64 This patch clears the duplication. Fixes: 737b434d ("powerpc/mm: convert Book3E 64 to pte_fragment") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
In commit 17312f25 ("powerpc/mm: Move book3s32 specifics in subdirectory mm/book3s64"), ppc_mmu_32.c was moved and renamed. This patch fixes Makefiles to disable KASAN instrumentation on the new name and location. Fixes: f072015c ("powerpc: disable KASAN instrumentation on early/critical files.") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
For unknown reason (aka. mpe is a doofus), the new Makefile added via the KASAN support patch didn't land into arch/powerpc/mm/kasan/ This patch restores it. Fixes: 2edb16ef ("powerpc/32: Add KASAN support") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 02 May, 2019 32 commits
-
-
Breno Leitao authored
This is a new selftest that raises SIGUSR1 signals and handles it in a set of different ways, trying to create different scenario for testing purpose. This test works raising a signal and calling sigreturn interleaved with TM operations, as starting, suspending and terminating a transaction. The test depends on random numbers, and, based on them, it sets different TM states. Other than that, the test fills out the user context struct that is passed to the sigreturn system call with random data, in order to make sure that the signal handler syscall can handle different and invalid states properly. This selftest has command line parameters to control what kind of tests the user wants to run, as for example, if a transaction should be started prior to signal being raised, or, after the signal being raised and before the sigreturn. If no parameter is given, the default is enabling all options. This test does not check if the user context is being read and set properly by the kernel. Its purpose, at this time, is basically guaranteeing that the kernel does not crash on invalid scenarios. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Laurentiu Tudor authored
Set RI in the default kernel's MSR so that the architected way of detecting unrecoverable machine check interrupts has a chance to work. This is inline with the MSR setup of the rest of booke powerpc architectures configured here. Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
External drivers that communicate via OpenCAPI will need to make MMIO calls to interact with the devices. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Event_fd is only used in the driver frontend, so it does not need to exist in the backend code. Relocate it to the frontend and provide an opaque mechanism for consumers instead. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
The use of offsets is required only in the frontend, so alter the IRQ API to only work with IRQ IDs in the backend. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Most OpenCAPI operations require a valid context, so exposing these functions to external drivers is necessary. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
The OCXL driver contains both frontend code for interacting with userspace, as well as backend code for interacting with the hardware. This patch separates the backend code from the frontend so that it can be used by other device drivers that communicate via OpenCAPI. Relocate dev, cdev & sysfs files to the frontend code to allow external drivers to maintain their own devices. Reference counting on the device in the backend is replaced with kref counting. Move file & sysfs layer initialisation from core.c (backend) to pci.c (frontend). Create an ocxl_function oriented interface for initing devices & enumerating AFUs. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
This data is already available in a struct Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
In preparation for making core code available for external drivers, move the core code out of pci.c and into core.c Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
Remove some unused exported symbols. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
The 'extern' keyword adds no value here. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
No need for a return value in read_pasid as it only returns 0. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alastair D'Silva authored
The term 'link' is ambiguous (especially when the struct is used for a list), so rename it for clarity. Signed-off-by: Alastair D'Silva <alastair@d-silva.org> Reviewed-by: Greg Kurz <groug@kaod.org> Acked-by: Frederic Barrat <fbarrat@linux.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Add PMU functions to support trace-imc. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Patch detects trace-imc events, does memory initilizations for each online cpu, and registers cpuhotplug call-backs. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Add code to restrict user access to thread_imc pmu since some event report privilege level information. Fixes: f74c89bd ("powerpc/perf: Add thread IMC PMU support") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
LDBAR holds the memory address allocated for each cpu. For thread-imc the mode bit (i.e bit 1) of LDBAR is set to accumulation. Currently, ldbar is loaded with per cpu memory address and mode set to accumulation at boot time. To enable trace-imc, the mode bit of ldbar should be set to 'trace'. So to accommodate trace-mode of IMC, reposition setting of ldbar for thread-imc to thread_imc_event_add(). Also reset ldbar at thread_imc_event_del(). Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Add the macros needed for IMC (In-Memory Collection Counters) trace-mode and data structure to hold the trace-imc record data. Also, add the new type "OPAL_IMC_COUNTERS_TRACE" in 'opal-api.h', since there is a new switch case added in the opal-calls for IMC. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
The data structure (i.e struct imc_mem_info) to hold the memory address information for nest imc units is allocated based on the number of nodes in the system. nest_imc_event_init() traverse this struct array to calculate the memory base address for the event-cpu. If we fail to find a match for the event cpu's chip-id in imc_mem_info struct array, then the do-while loop will iterate until we crash. Fix this by changing the loop exit condition based on the number of non zero vbase elements in the array, since the allocation is done for nr_chips + 1. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Anju T Sudhakar authored
Nest hardware counter memory resides in a per-chip reserve-memory. During nest_imc_event_init(), chip-id of the event-cpu is considered to calculate the base memory addresss for that cpu. Return, proper error condition if the chip_id calculated is invalid. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support") Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
PM_BR_CMPL_ALT event is not supported, remove it from the power9 event list. Fixes: 24bedcb7 ("powerpc/perf: Fix branch event code for power9") Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Most of the power processor generation performance monitoring unit (PMU) driver code is bundled in the kernel and one of those is enabled/registered based on the oprofile_cpu_type check at the boot. But things get little tricky incase of "compat" mode boot. IBM POWER System Server based processors has a compactibility mode feature, which simpily put is, Nth generation processor (lets say POWER8) will act and appear in a mode consistent with an earlier generation (N-1) processor (that is POWER7). And in this "compat" mode boot, kernel modify the "oprofile_cpu_type" to be Nth generation (POWER8). If Nth generation pmu driver is bundled (POWER8), it gets registered. Key dependency here is to have distro support for latest processor performance monitoring support. Patch here adds a generic "compat-mode" performance monitoring driver to be register in absence of powernv platform specific pmu driver. Driver supports only "cycles" and "instruction" events. "0x0001e" used as event code for "cycles" and "0x00002" used as event code for "instruction" events. New file called "generic-compat-pmu.c" is created to contain the driver specific code. And base raw event code format modeled on PPMU_ARCH_207S. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Use SPDX tag for license] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Madhavan Srinivasan authored
Currenty pmu driver file for each ppc64 generation processor has a __init call in itself. Refactor the code by moving the __init call to core-books.c. This also clean's up compat mode pmu driver registration. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> [mpe: Use SPDX tag for license] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Joe Perches authored
Fix fallout too. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Joel Stanley authored
Those not of us not drowning in POWER might not know what this means. Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Joel Stanley authored
It turns out that some defconfig changes and kernel config option changes meant we accidentally dropped Ethernet support for Mellanox CLX5 cards. Fixes: cbc39809 ("powerpc/configs: Update skiroot defconfig") Reported-by: Carol L Soto <clsoto@us.ibm.com> Suggested-by: Carol L Soto <clsoto@us.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Mahesh Salgaonkar authored
On TOD/TB errors timebase register stops/freezes until HMI error recovery gets TOD/TB back into running state. On successful recovery, TB starts running again and udelay() that relies on TB value continues to function properly. But in case when HMI fails to recover from TOD/TB errors, the TB register stay freezed. With TB not running the __delay() function keeps looping and never return. If __delay() is called while in panic path then system hangs and never reboots after panic. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christopher M. Riedl authored
Operations which write to memory and special purpose registers should be restricted on systems with integrity guarantees (such as Secure Boot) and, optionally, to avoid self-destructive behaviors. Add a config option, XMON_DEFAULT_RO_MODE, to set default xmon behavior. The kernel cmdline options xmon=ro and xmon=rw override this default. The following xmon operations are affected: memops: disable memmove disable memset disable memzcan memex: no-op'd mwrite super_regs: no-op'd write_spr bpt_cmds: disable proc_call: disable Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf> Reviewed-by: Oliver O'Halloran <oohall@gmail.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Bo YU authored
This is detected by Coverity scan: CID: 1440481 Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Valentin Schneider authored
Since the enabling and disabling of IRQs within preempt_schedule_irq() is contained in a need_resched() loop, we don't need the outer arch code loop. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> [mpe: Rebase since CURRENT_THREAD_INFO() removal] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Horia Geantă authored
crypto node alias is needed by U-boot to identify the node and perform fix-ups, like adding "fsl,sec-era" property. Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
PROM_SCRATCH_SIZE is same as sizeof(prom_scratch) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-