• Vladimir Oltean's avatar
    net: dsa: fix panic when DSA master device unbinds on shutdown · ee534378
    Vladimir Oltean authored
    Rafael reports that on a system with LX2160A and Marvell DSA switches,
    if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
    panic can be seen:
    
    systemd-shutdown[1]: Rebooting.
    Unable to handle kernel paging request at virtual address 00a0000800000041
    [00a0000800000041] address between user and kernel address ranges
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
    pc : dsa_slave_netdevice_event+0x130/0x3e4
    lr : raw_notifier_call_chain+0x50/0x6c
    Call trace:
     dsa_slave_netdevice_event+0x130/0x3e4
     raw_notifier_call_chain+0x50/0x6c
     call_netdevice_notifiers_info+0x54/0xa0
     __dev_close_many+0x50/0x130
     dev_close_many+0x84/0x120
     unregister_netdevice_many+0x130/0x710
     unregister_netdevice_queue+0x8c/0xd0
     unregister_netdev+0x20/0x30
     dpaa2_eth_remove+0x68/0x190
     fsl_mc_driver_remove+0x20/0x5c
     __device_release_driver+0x21c/0x220
     device_release_driver_internal+0xac/0xb0
     device_links_unbind_consumers+0xd4/0x100
     __device_release_driver+0x94/0x220
     device_release_driver+0x28/0x40
     bus_remove_device+0x118/0x124
     device_del+0x174/0x420
     fsl_mc_device_remove+0x24/0x40
     __fsl_mc_device_remove+0xc/0x20
     device_for_each_child+0x58/0xa0
     dprc_remove+0x90/0xb0
     fsl_mc_driver_remove+0x20/0x5c
     __device_release_driver+0x21c/0x220
     device_release_driver+0x28/0x40
     bus_remove_device+0x118/0x124
     device_del+0x174/0x420
     fsl_mc_bus_remove+0x80/0x100
     fsl_mc_bus_shutdown+0xc/0x1c
     platform_shutdown+0x20/0x30
     device_shutdown+0x154/0x330
     __do_sys_reboot+0x1cc/0x250
     __arm64_sys_reboot+0x20/0x30
     invoke_syscall.constprop.0+0x4c/0xe0
     do_el0_svc+0x4c/0x150
     el0_svc+0x24/0xb0
     el0t_64_sync_handler+0xa8/0xb0
     el0t_64_sync+0x178/0x17c
    
    It can be seen from the stack trace that the problem is that the
    deregistration of the master causes a dev_close(), which gets notified
    as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
    But dsa_switch_shutdown() has already run, and this has unregistered the
    DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
    call dev_close_many() on those slave interfaces, leading to the problem.
    
    The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
    dsa_switch_shutdown() was called seems improper. Unregistering the slave
    interfaces is unnecessary and unhelpful. Instead, after the slaves have
    stopped being uppers of the DSA master, we can now reset to NULL the
    master->dsa_ptr pointer, which will make DSA start ignoring all future
    notifier events on the master.
    
    Fixes: 0650bf52 ("net: dsa: be compatible with masters which unregister on shutdown")
    Reported-by: default avatarRafael Richter <rafael.richter@gin.de>
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    ee534378
dsa2.c 38.5 KB