    net: phylink: don't start and stop SGMII PHYs in SFP modules twice · c7fa7f56
    Arseny Solokha authored
    SFP modules connected using the SGMII interface have their own PHYs, which
    are handled by the struct phylink's phydev field. For modules connected
    using the 1000Base-X interface, on the other hand, that field is not set.
    
    Since commit ce0aa27f ("sfp: add sfp-bus to bridge between network
    devices and sfp cages") phylink_start() ends up setting the phydev field
    through the sfp-bus infrastructure, which eventually calls phy_start() on
    it, and then calls phy_start() again on the same phydev from
    phylink_start() itself. A similar call sequence holds for phylink_stop(),
    only in the reverse order. This results in WARNs during network interface
    bringup and shutdown when a copper SFP module is connected, as phy_start()
    and phy_stop() are called twice in a row for the same phy_device:
    
      % ip link set up dev eth0
      ------------[ cut here ]------------
      called from state UP
      WARNING: CPU: 1 PID: 155 at drivers/net/phy/phy.c:895 phy_start+0x74/0xc0
      Modules linked in:
      CPU: 1 PID: 155 Comm: backend Not tainted 5.2.0+ #1
      NIP:  c0227bf0 LR: c0227bf0 CTR: c004d224
      REGS: df547720 TRAP: 0700   Not tainted  (5.2.0+)
      MSR:  00029000 <CE,EE,ME>  CR: 24002822  XER: 00000000
    
      GPR00: c0227bf0 df5477d8 df5d7080 00000014 df9d2370 df9d5ac4 1f4eb000 00000001
      GPR08: c061fe58 00000000 00000000 df5477d8 0000003c 100c8768 00000000 00000000
      GPR16: df486a00 c046f1c8 c046eea0 00000000 c046e904 c0239604 db68449c 00000000
      GPR24: e9083204 00000000 00000001 db684460 e9083404 00000000 db6dce00 db6dcc00
      NIP [c0227bf0] phy_start+0x74/0xc0
      LR [c0227bf0] phy_start+0x74/0xc0
      Call Trace:
      [df5477d8] [c0227bf0] phy_start+0x74/0xc0 (unreliable)
      [df5477e8] [c023cad0] startup_gfar+0x398/0x3f4
      [df547828] [c023cf08] gfar_enet_open+0x364/0x374
      [df547898] [c029d870] __dev_open+0xe4/0x140
      [df5478c8] [c029db70] __dev_change_flags+0xf0/0x188
      [df5478f8] [c029dc28] dev_change_flags+0x20/0x54
      [df547918] [c02ae304] do_setlink+0x310/0x818
      [df547a08] [c02b1eb8] __rtnl_newlink+0x384/0x6b0
      [df547c28] [c02b222c] rtnl_newlink+0x48/0x68
      [df547c48] [c02ad7c8] rtnetlink_rcv_msg+0x240/0x27c
      [df547c98] [c02cc068] netlink_rcv_skb+0x8c/0xf0
      [df547cd8] [c02cba3c] netlink_unicast+0x114/0x19c
      [df547d08] [c02cbd74] netlink_sendmsg+0x2b0/0x2c0
      [df547d58] [c027b668] sock_sendmsg_nosec+0x20/0x40
      [df547d68] [c027d080] ___sys_sendmsg+0x17c/0x1dc
      [df547e98] [c027df7c] __sys_sendmsg+0x68/0x84
      [df547ef8] [c027e430] sys_socketcall+0x1a0/0x204
      [df547f38] [c000d1d8] ret_from_syscall+0x0/0x38
      --- interrupt: c01 at 0xfd4e030
          LR = 0xfd4e010
      Instruction dump:
      813f0188 38800000 2b890005 419d0014 3d40c046 5529103a 394aa208 7c8a482e
      3c60c046 3863a1b8 4cc63182 4be009a1 <0fe00000> 48000030 3c60c046 3863a1d0
      ---[ end trace d4c095aeaf6ea998 ]---
    
    and
    
      % ip link set down dev eth0
      ------------[ cut here ]------------
      called from state HALTED
      WARNING: CPU: 1 PID: 184 at drivers/net/phy/phy.c:858 phy_stop+0x3c/0x88
    
      <...>
    
      Call Trace:
      [df581788] [c0228450] phy_stop+0x3c/0x88 (unreliable)
      [df581798] [c022d548] sfp_sm_phy_detach+0x1c/0x44
      [df5817a8] [c022e8cc] sfp_sm_event+0x4b0/0x87c
      [df581848] [c022f04c] sfp_upstream_stop+0x34/0x44
      [df581858] [c0225608] phylink_stop+0x7c/0xe4
      [df581868] [c023c57c] stop_gfar+0x7c/0x94
      [df581888] [c023c5b8] gfar_close+0x24/0x94
      [df5818a8] [c0298688] __dev_close_many+0xdc/0xf8
      [df5818c8] [c029db58] __dev_change_flags+0xd8/0x188
      [df5818f8] [c029dc28] dev_change_flags+0x20/0x54
      [df581918] [c02ae304] do_setlink+0x310/0x818
      [df581a08] [c02b1eb8] __rtnl_newlink+0x384/0x6b0
      [df581c28] [c02b222c] rtnl_newlink+0x48/0x68
      [df581c48] [c02ad7c8] rtnetlink_rcv_msg+0x240/0x27c
      [df581c98] [c02cc068] netlink_rcv_skb+0x8c/0xf0
      [df581cd8] [c02cba3c] netlink_unicast+0x114/0x19c
      [df581d08] [c02cbd74] netlink_sendmsg+0x2b0/0x2c0
      [df581d58] [c027b668] sock_sendmsg_nosec+0x20/0x40
      [df581d68] [c027d080] ___sys_sendmsg+0x17c/0x1dc
      [df581e98] [c027df7c] __sys_sendmsg+0x68/0x84
      [df581ef8] [c027e430] sys_socketcall+0x1a0/0x204
      [df581f38] [c000d1d8] ret_from_syscall+0x0/0x38
    
      <...>
    
      ---[ end trace d4c095aeaf6ea999 ]---
    
    SFP modules with the 1000Base-X interface are not affected.
    
    Place the explicit calls to phy_start() and phy_stop() before enabling or
    after disabling an attached SFP module, where phydev is not yet set (or is
    already unset), so that for SFP module PHYs they are made only from inside
    sfp-bus, if needed.
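
The effect of the reordering can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not the kernel code: every toy_* name, the reduced state set and the warning counter are hypothetical stand-ins for struct phylink, struct phy_device, the sfp-bus start/stop hooks and the WARN in phy_start()/phy_stop().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures (hypothetical, heavily reduced). */
enum toy_phy_state { TOY_PHY_READY, TOY_PHY_UP, TOY_PHY_HALTED };

struct toy_phy {
	enum toy_phy_state state;
	int warnings;			/* counts what would be WARNs */
};

struct toy_phylink {
	struct toy_phy *phydev;		/* NULL until a PHY is attached */
	struct toy_phy *sfp_phy;	/* PHY inside a copper SFP module, if any */
};

static void toy_phy_start(struct toy_phy *phy)
{
	if (phy->state == TOY_PHY_UP) {		/* started twice in a row */
		phy->warnings++;
		fprintf(stderr, "toy_phy_start: called from state UP\n");
		return;
	}
	phy->state = TOY_PHY_UP;
}

static void toy_phy_stop(struct toy_phy *phy)
{
	if (phy->state != TOY_PHY_UP) {		/* stopped twice in a row */
		phy->warnings++;
		fprintf(stderr, "toy_phy_stop: called from state HALTED\n");
		return;
	}
	phy->state = TOY_PHY_HALTED;
}

/* Models sfp-bus: attach the module PHY to phylink and start it. */
static void toy_sfp_start(struct toy_phylink *pl)
{
	if (pl->sfp_phy) {
		pl->phydev = pl->sfp_phy;
		toy_phy_start(pl->phydev);
	}
}

/* Models sfp-bus: stop the module PHY and detach it from phylink. */
static void toy_sfp_stop(struct toy_phylink *pl)
{
	if (pl->sfp_phy && pl->phydev == pl->sfp_phy) {
		toy_phy_stop(pl->phydev);
		pl->phydev = NULL;
	}
}

static void toy_phylink_start(struct toy_phylink *pl, bool fixed)
{
	if (!fixed)
		toy_sfp_start(pl);		/* old order: SFP first */
	if (pl->phydev)
		toy_phy_start(pl->phydev);
	if (fixed)
		toy_sfp_start(pl);		/* new order: SFP last */
}

static void toy_phylink_stop(struct toy_phylink *pl, bool fixed)
{
	if (fixed)
		toy_sfp_stop(pl);		/* new order: SFP first */
	if (pl->phydev)
		toy_phy_stop(pl->phydev);
	if (!fixed)
		toy_sfp_stop(pl);		/* old order: SFP last */
}

/* Run one up/down cycle with a copper SFP and return the warning count. */
static int toy_count_warnings(bool fixed)
{
	struct toy_phy phy = { TOY_PHY_READY, 0 };
	struct toy_phylink pl = { NULL, &phy };

	toy_phylink_start(&pl, fixed);
	toy_phylink_stop(&pl, fixed);
	return phy.warnings;
}
```

In this model toy_count_warnings(false) reports one warning on bringup and one on shutdown, while toy_count_warnings(true) reports none, because phydev is still NULL (or already NULL again) whenever the explicit calls are reached.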
    
    Fixes: 21796261 ("net: phy: warn if phy_start is called from invalid state")
    Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
    Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>