net: dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()

In the blamed commit, it was not noticed that one implementation of chip->info->ops->phylink_get_caps(), called by mv88e6xxx_get_caps(), may access hardware registers, and in doing so, it takes the mv88e6xxx_reg_lock(). Namely, this is mv88e6352_phylink_get_caps(). This is a problem because mv88e6xxx_get_caps(), apart from being a top-level function (method invoked by dsa_switch_ops), is now also directly called from mv88e6xxx_setup_port(), which runs under the mv88e6xxx_reg_lock() taken by mv88e6xxx_setup(). Therefore, when running on mv88e6352, the reg_lock would be acquired a second time and the system would deadlock on driver probe. The things that mv88e6xxx_setup() can compete with in terms of register access with are the IRQ handlers and MDIO bus operations registered by mv88e6xxx_probe(). So there is a real need to acquire the register lock. The register lock can, in principle, be dropped and re-acquired pretty much at will within the driver, as long as no operations that involve waiting for indirect access to complete (essentially, callers of mv88e6xxx_smi_direct_wait() and mv88e6xxx_wait_mask()) are interrupted with the lock released. However, I would guess that in mv88e6xxx_setup(), the critical section is kept open for such a long time just in order to optimize away multiple lock/unlock operations on the registers. We could, in principle, drop the reg_lock right before the mv88e6xxx_setup_port() -> mv88e6xxx_get_caps() call, and re-acquire it immediately afterwards. But this would look ugly, because mv88e6xxx_setup_port() would release a lock which it didn't acquire, but the caller did. A cleaner solution to this issue comes from the observation that struct mv88e6xxxx_ops methods generally assume they are called with the reg_lock already acquired. Whereas mv88e6352_phylink_get_caps() is more the exception rather than the norm, in that it acquires the lock itself. Let's enforce the same locking pattern/convention for chip->info->ops->phylink_get_caps() as well, and make mv88e6xxx_get_caps(), the top-level function, acquire the register lock explicitly, for this one implementation that will access registers for port 4 to work properly. This means that mv88e6xxx_setup_port() will no longer call the top-level function, but the low-level mv88e6xxx_ops method which expects the correct calling context (register lock held). Compared to chip->info->ops->phylink_get_caps(), mv88e6xxx_get_caps() also fixes up the supported_interfaces bitmap for internal ports, since that can be done generically and does not require per-switch knowledge. That's code which will no longer execute, however mv88e6xxx_setup_port() doesn't need that. It just needs to look at the mac_capabilities bitmap. Fixes: cc1049cc ("net: dsa: mv88e6xxx: fix speed setting for CPU/DSA ports") Reported-by: Maksim Kiselev <bigunclemax@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Maksim Kiselev <bigunclemax@gmail.com> Link: https://lore.kernel.org/r/20221214110120.3368472-1-vladimir.oltean@nxp.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>

net: dsa: mv88e6xxx: avoid reg_lock deadlock in mv88e6xxx_setup_port()
In the blamed commit, it was not noticed that one implementation of chip->info->ops->phylink_get_caps(), called by mv88e6xxx_get_caps(), may access hardware registers, and in doing so, it takes the mv88e6xxx_reg_lock(). Namely, this is mv88e6352_phylink_get_caps(). This is a problem because mv88e6xxx_get_caps(), apart from being a top-level function (method invoked by dsa_switch_ops), is now also directly called from mv88e6xxx_setup_port(), which runs under the mv88e6xxx_reg_lock() taken by mv88e6xxx_setup(). Therefore, when running on mv88e6352, the reg_lock would be acquired a second time and the system would deadlock on driver probe. The things that mv88e6xxx_setup() can compete with in terms of register access with are the IRQ handlers and MDIO bus operations registered by mv88e6xxx_probe(). So there is a real need to acquire the register lock. The register lock can, in principle, be dropped and re-acquired pretty much at will within the driver, as long as no operations that involve waiting for indirect access to complete (essentially, callers of mv88e6xxx_smi_direct_wait() and mv88e6xxx_wait_mask()) are interrupted with the lock released. However, I would guess that in mv88e6xxx_setup(), the critical section is kept open for such a long time just in order to optimize away multiple lock/unlock operations on the registers. We could, in principle, drop the reg_lock right before the mv88e6xxx_setup_port() -> mv88e6xxx_get_caps() call, and re-acquire it immediately afterwards. But this would look ugly, because mv88e6xxx_setup_port() would release a lock which it didn't acquire, but the caller did. A cleaner solution to this issue comes from the observation that struct mv88e6xxxx_ops methods generally assume they are called with the reg_lock already acquired. Whereas mv88e6352_phylink_get_caps() is more the exception rather than the norm, in that it acquires the lock itself. Let's enforce the same locking pattern/convention for chip->info->ops->phylink_get_caps() as well, and make mv88e6xxx_get_caps(), the top-level function, acquire the register lock explicitly, for this one implementation that will access registers for port 4 to work properly. This means that mv88e6xxx_setup_port() will no longer call the top-level function, but the low-level mv88e6xxx_ops method which expects the correct calling context (register lock held). Compared to chip->info->ops->phylink_get_caps(), mv88e6xxx_get_caps() also fixes up the supported_interfaces bitmap for internal ports, since that can be done generically and does not require per-switch knowledge. That's code which will no longer execute, however mv88e6xxx_setup_port() doesn't need that. It just needs to look at the mac_capabilities bitmap. Fixes: cc1049cc ("net: dsa: mv88e6xxx: fix speed setting for CPU/DSA ports") Reported-by: Maksim Kiselev <bigunclemax@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Maksim Kiselev <bigunclemax@gmail.com> Link: https://lore.kernel.org/r/20221214110120.3368472-1-vladimir.oltean@nxp.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
a7d82367 · Vladimir Oltean · Paolo Abeni · c72a7e42 · a7d82367
Commit a7d82367 authored Dec 14, 2022 by Vladimir Oltean Committed by Paolo Abeni Dec 15, 2022
Hide whitespace changes
Inline Side-by-side

Showing with 4 additions and 5 deletions

drivers/net/dsa/mv88e6xxx/chip.c drivers/net/dsa/mv88e6xxx/chip.c +4 -5

No files found.
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -689,13 +689,12 @@ static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,

 	/* Port 4 supports automedia if the serdes is associated with it. */
 	if (port == 4) {
-		mv88e6xxx_reg_lock(chip);
 		err = mv88e6352_g2_scratch_port_has_serdes(chip, port);
 		if (err < 0)
 			dev_err(chip->dev, "p%d: failed to read scratch\n",
 				port);
 		if (err <= 0)
-			goto unlock;
+			return;

 		cmode = mv88e6352_get_port4_serdes_cmode(chip);
 		if (cmode < 0)
@@ -703,8 +702,6 @@ static void mv88e6352_phylink_get_caps(struct mv88e6xxx_chip *chip, int port,
 				port);
 		else
 			mv88e6xxx_translate_cmode(cmode, supported);
-unlock:
-		mv88e6xxx_reg_unlock(chip);
 	}
 }

@@ -831,7 +828,9 @@ static void mv88e6xxx_get_caps(struct dsa_switch *ds, int port,
 {
 	struct mv88e6xxx_chip *chip = ds->priv;

+	mv88e6xxx_reg_lock(chip);
 	chip->info->ops->phylink_get_caps(chip, port, config);
+	mv88e6xxx_reg_unlock(chip);

 	if (mv88e6xxx_phy_is_internal(ds, port)) {
 		__set_bit(PHY_INTERFACE_MODE_INTERNAL,
@@ -3307,7 +3306,7 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
 		struct phylink_config pl_config = {};
 		unsigned long caps;

-		mv88e6xxx_get_caps(ds, port, &pl_config);
+		chip->info->ops->phylink_get_caps(chip, port, &pl_config);

 		caps = pl_config.mac_capabilities;