Commit b7cb707c authored by Gerald Schaefer, committed by Martin Schwidefsky

s390/smp: fix CPU hotplug deadlock with CPU rescan

smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a deadlock when a new CPU is found and immediately set online by a udev
rule.

This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.

For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

        chcpu (rescan)                       systemd-udevd

 echo 1 > /sys/../rescan
 -> smp_rescan_cpus()
 -> (*) get_online_cpus()
    (increases refcount)
 -> smp_add_present_cpu()
    (new CPU found)
 -> register_cpu()
 -> device_add()
 -> udev "add" event triggered -----------> udev rule sets CPU online
                                         -> echo 1 > /sys/.../online
                                         -> lock_device_hotplug_sysfs()
                                            (this is missing in rescan path)
                                         -> device_online()
                                         -> (**) device_lock(new CPU dev)
                                         -> cpu_up()
                                         -> cpu_hotplug_begin()
                                            (loops until refcount == 0)
                                            -> deadlock with (*)
 -> bus_probe_device()
 -> device_attach()
 -> device_lock(new CPU dev)
    -> deadlock with (**)

Fix this by taking the device_hotplug_lock in the CPU rescan path.
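
The effect of the fix can be illustrated with a small userspace sketch. This
is only an analogy under stated assumptions, not the kernel code: pthread
mutexes stand in for device_hotplug_lock and the per-device lock, and the
names hotplug_lock, cpu_dev_lock, online_path() and rescan_path() are
hypothetical. It shows that once the rescan path holds the outer lock, the
"online" request triggered by the add event cannot start until the rescan has
finished with the new device.

```c
#include <pthread.h>

/* Userspace stand-ins (hypothetical names, not kernel API). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ device_hotplug_lock */
static pthread_mutex_t cpu_dev_lock = PTHREAD_MUTEX_INITIALIZER; /* ~ device_lock(new CPU dev) */
static int cpu_present;
static int cpu_online;

/* Online path: roughly what the udev rule triggers via .../online. */
static void *online_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&hotplug_lock);   /* blocks while a rescan is running */
	pthread_mutex_lock(&cpu_dev_lock);
	if (cpu_present)
		cpu_online = 1;
	pthread_mutex_unlock(&cpu_dev_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

/*
 * Rescan path with the fix applied: the outer lock is held across the
 * whole operation. Returns the value of cpu_online observed while the
 * rescan still held its locks.
 */
static int rescan_path(void)
{
	pthread_t udev;
	int online_seen_during_rescan;

	pthread_mutex_lock(&hotplug_lock);
	cpu_present = 1;                                 /* new CPU found and registered */
	pthread_create(&udev, NULL, online_path, NULL);  /* "add" event fires */
	pthread_mutex_lock(&cpu_dev_lock);               /* ~ device_attach() */
	online_seen_during_rescan = cpu_online;
	pthread_mutex_unlock(&cpu_dev_lock);
	pthread_mutex_unlock(&hotplug_lock);

	pthread_join(udev, NULL);                        /* online path runs only now */
	return online_seen_during_rescan;
}
```

In this sketch rescan_path() never observes the CPU as online while it still
holds its locks, and the online request completes only after the rescan has
released hotplug_lock. Both paths take the two locks in the same order.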

Cc: <stable@vger.kernel.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
parent a3866208
@@ -1166,7 +1166,11 @@ static ssize_t __ref rescan_store(struct device *dev,
 {
 	int rc;
+	rc = lock_device_hotplug_sysfs();
+	if (rc)
+		return rc;
 	rc = smp_rescan_cpus();
+	unlock_device_hotplug();
 	return rc ? rc : count;
 }
 static DEVICE_ATTR_WO(rescan);
...
@@ -60,7 +60,9 @@ static void sclp_cpu_capability_notify(struct work_struct *work)
 static void __ref sclp_cpu_change_notify(struct work_struct *work)
 {
+	lock_device_hotplug();
 	smp_rescan_cpus();
+	unlock_device_hotplug();
 }
 static void sclp_conf_receiver_fn(struct evbuf_header *evbuf)
...