Commit 68fb18aa authored by Srivatsa S. Bhat, committed by Benjamin Herrenschmidt

powerpc: Add debug checks to catch invalid cpu-to-node mappings

There have been some weird bugs in the past where the kernel tried to associate
threads of the same core to different NUMA nodes, and things went haywire after
that point (as expected).

Unfortunately, root-causing such issues has been quite challenging, due to
the lack of appropriate debug checks in the kernel. These bugs usually lead to
odd soft-lockups in the scheduler's build-sched-domain code in the CPU
hotplug path, which makes it very hard to trace them back to the incorrect
cpu-to-node mappings.

So add appropriate debug checks to catch such invalid cpu-to-node mappings
as early as possible.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
parent d4edc5b6
@@ -570,16 +570,38 @@ static int numa_setup_cpu(unsigned long lcpu)
 	return nid;
 }
 
+static void verify_cpu_node_mapping(int cpu, int node)
+{
+	int base, sibling, i;
+
+	/* Verify that all the threads in the core belong to the same node */
+	base = cpu_first_thread_sibling(cpu);
+
+	for (i = 0; i < threads_per_core; i++) {
+		sibling = base + i;
+
+		if (sibling == cpu || cpu_is_offline(sibling))
+			continue;
+
+		if (cpu_to_node(sibling) != node) {
+			WARN(1, "CPU thread siblings %d and %d don't belong"
+				" to the same node!\n", cpu, sibling);
+			break;
+		}
+	}
+}
+
 static int cpu_numa_callback(struct notifier_block *nfb, unsigned long action,
 			     void *hcpu)
 {
 	unsigned long lcpu = (unsigned long)hcpu;
-	int ret = NOTIFY_DONE;
+	int ret = NOTIFY_DONE, nid;
 
 	switch (action) {
 	case CPU_UP_PREPARE:
 	case CPU_UP_PREPARE_FROZEN:
-		numa_setup_cpu(lcpu);
+		nid = numa_setup_cpu(lcpu);
+		verify_cpu_node_mapping((int)lcpu, nid);
 		ret = NOTIFY_OK;
 		break;
 #ifdef CONFIG_HOTPLUG_CPU
......
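
For readers who want to try the sibling check outside the kernel, here is a
minimal user-space sketch of the same idea. It is not the kernel code. The
arrays node_of_cpu[] and cpu_online_map[], the constants NR_CPUS and
THREADS_PER_CORE, and the helpers first_thread_sibling() and
find_mismatched_sibling() are hypothetical stand-ins for the kernel's per-CPU
state and helpers. The sketch also assumes that the threads of a core occupy
consecutive CPU numbers and that the thread count is a power of two. Instead
of calling WARN(), it prints a message and returns the offending sibling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS          8	/* hypothetical: 2 cores x 4 threads */
#define THREADS_PER_CORE 4	/* hypothetical; assumed to be a power of two */

/* Hypothetical stand-ins for the kernel's cpu-to-node map and online mask. */
static int node_of_cpu[NR_CPUS];
static bool cpu_online_map[NR_CPUS];

/* First thread of the core that 'cpu' belongs to (assumed numbering). */
static int first_thread_sibling(int cpu)
{
	return cpu & ~(THREADS_PER_CORE - 1);
}

/*
 * Return the first online sibling of 'cpu' whose node differs from 'node',
 * or -1 if every online sibling agrees with 'node'.
 */
static int find_mismatched_sibling(int cpu, int node)
{
	int base, i;

	assert(cpu >= 0 && cpu < NR_CPUS);
	base = first_thread_sibling(cpu);

	for (i = 0; i < THREADS_PER_CORE; i++) {
		int sibling = base + i;

		/* Skip the CPU being brought up and any offline thread. */
		if (sibling == cpu || !cpu_online_map[sibling])
			continue;

		if (node_of_cpu[sibling] != node) {
			fprintf(stderr,
				"CPU thread siblings %d and %d don't belong to the same node!\n",
				cpu, sibling);
			return sibling;
		}
	}

	return -1;
}
```

As in the patch, offline siblings are skipped, so a stale mapping on an offline
thread is not reported until that thread itself is brought online.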