-
Marc Zyngier authored
Running a lockedp-enabled kernel on a vim3l board (Amlogic SM1) leads to the following splat: [ 13.557138] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected [ 13.587485] ip/456 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 13.625922] ffff000059908cf0 (&irq_desc_lock_class){-.-.}-{2:2}, at: __setup_irq+0xf8/0x8d8 [ 13.632273] which would create a new lock dependency: [ 13.637272] (&irq_desc_lock_class){-.-.}-{2:2} -> (&ctl->lock){+.+.}-{2:2} [ 13.644209] [ 13.644209] but this new dependency connects a HARDIRQ-irq-safe lock: [ 13.654122] (&irq_desc_lock_class){-.-.}-{2:2} [ 13.654125] [ 13.654125] ... which became HARDIRQ-irq-safe at: [ 13.664759] lock_acquire+0xec/0x368 [ 13.666926] _raw_spin_lock+0x60/0x88 [ 13.669979] handle_fasteoi_irq+0x30/0x178 [ 13.674082] generic_handle_irq+0x38/0x50 [ 13.678098] __handle_domain_irq+0x6c/0xc8 [ 13.682209] gic_handle_irq+0x5c/0xb0 [ 13.685872] el1_irq+0xd0/0x180 [ 13.689010] arch_cpu_idle+0x40/0x220 [ 13.692732] default_idle_call+0x54/0x60 [ 13.696677] do_idle+0x23c/0x2e8 [ 13.699903] cpu_startup_entry+0x30/0x50 [ 13.703852] rest_init+0x1e0/0x2b4 [ 13.707301] arch_call_rest_init+0x18/0x24 [ 13.711449] start_kernel+0x4ec/0x51c [ 13.715167] [ 13.715167] to a HARDIRQ-irq-unsafe lock: [ 13.722426] (&ctl->lock){+.+.}-{2:2} [ 13.722430] [ 13.722430] ... which became HARDIRQ-irq-unsafe at: [ 13.732319] ... [ 13.732324] lock_acquire+0xec/0x368 [ 13.735985] _raw_spin_lock+0x60/0x88 [ 13.739452] meson_gpio_irq_domain_alloc+0xcc/0x290 [ 13.744392] irq_domain_alloc_irqs_hierarchy+0x24/0x60 [ 13.749586] __irq_domain_alloc_irqs+0x160/0x2f0 [ 13.754254] irq_create_fwspec_mapping+0x118/0x320 [ 13.759073] irq_create_of_mapping+0x78/0xa0 [ 13.763360] of_irq_get+0x6c/0x80 [ 13.766701] of_mdiobus_register_phy+0x10c/0x238 [of_mdio] [ 13.772227] of_mdiobus_register+0x158/0x380 [of_mdio] [ 13.777388] mdio_mux_init+0x180/0x2e8 [mdio_mux] [ 13.782128] g12a_mdio_mux_probe+0x290/0x398 [mdio_mux_meson_g12a] [ 13.788349] platform_drv_probe+0x5c/0xb0 [ 13.792379] really_probe+0xe4/0x448 [ 13.795979] driver_probe_device+0xe8/0x140 [ 13.800189] __device_attach_driver+0x94/0x120 [ 13.804639] bus_for_each_drv+0x84/0xd8 [ 13.808474] __device_attach+0xe4/0x168 [ 13.812361] device_initial_probe+0x1c/0x28 [ 13.816592] bus_probe_device+0xa4/0xb0 [ 13.820430] deferred_probe_work_func+0xa8/0x100 [ 13.825064] process_one_work+0x264/0x688 [ 13.829088] worker_thread+0x4c/0x458 [ 13.832768] kthread+0x154/0x158 [ 13.836018] ret_from_fork+0x10/0x18 [ 13.839612] [ 13.839612] other info that might help us debug this: [ 13.839612] [ 13.850354] Possible interrupt unsafe locking scenario: [ 13.850354] [ 13.855720] CPU0 CPU1 [ 13.858774] ---- ---- [ 13.863242] lock(&ctl->lock); [ 13.866330] local_irq_disable(); [ 13.872233] lock(&irq_desc_lock_class); [ 13.878705] lock(&ctl->lock); [ 13.884297] <Interrupt> [ 13.886857] lock(&irq_desc_lock_class); [ 13.891014] [ 13.891014] *** DEADLOCK *** The issue can occur when CPU1 is doing something like irq_set_type() and CPU0 performing an interrupt allocation, for example. Taking an interrupt (like the one being reconfigured) would lead to a deadlock. A solution to this is: - Reorder the locking so that meson_gpio_irq_update_bits takes the lock itself at all times, instead of relying on the caller to lock or not, hence making the RMW sequence atomic, - Rework the critical section in meson_gpio_irq_request_channel to only cover the allocation itself, and let the gpio_irq_sel_pin callback deal with its own locking if required, - Take the private spin-lock with interrupts disabled at all times Reviewed-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
0a66d6f9