bnx2: Close device if tx_timeout reset fails
Based on original patch and description from Flavio Leitner <fbl@redhat.com>

When bnx2_reset_task() is called, it will stop, (re)initialize and start the interface to restore the working condition. bnx2_init_nic() calls bnx2_reset_nic(), which resets the chip and then calls bnx2_free_skbs() to free all the skbs. The problem happens when bnx2_init_chip() fails, because bnx2_reset_nic() will just return, skipping the ring initializations in bnx2_init_all_rings(). Later, the reset task starts the interface again and the system crashes due to a NULL pointer access (no skb in the ring).

To fix it, we call dev_close() if bnx2_init_nic() fails.

One minor wrinkle to deal with is the cancel_work_sync() call in bnx2_close() to cancel bnx2_reset_task(). When bnx2_close() is reached from the reset task itself, the call will wait forever because the task is trying to cancel itself, and the workqueue will be stuck. Since bnx2_reset_task() holds the rtnl_lock() and checks for netif_running() before proceeding, there is no need to cancel bnx2_reset_task() in bnx2_close(), even if bnx2_close() and bnx2_reset_task() are running concurrently. The rtnl_lock() serializes the two calls.

We need to move the cancel_work_sync() call to bnx2_remove_one() to make sure the work is canceled before freeing the netdev struct.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
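
The control flow described in the message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: struct model_dev, its fields, and every model_* function are hypothetical stand-ins invented for illustration, and the locking is only indicated in comments. It shows the reset path checking the "running" state, closing the device when reinitialization fails, and the close path not trying to cancel the reset work.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names exist in the bnx2 driver. */
struct model_dev {
    bool running;         /* stands in for netif_running() */
    bool rings_ready;     /* true once the rings hold buffers again */
    bool chip_init_fails; /* knob to force the chip init step to fail */
};

/* Models the init path: buffers are freed first, and the rings are only
 * refilled if the chip init step succeeds. */
static int model_init_nic(struct model_dev *d)
{
    d->rings_ready = false;      /* reset frees every buffer */
    if (d->chip_init_fails)
        return -1;               /* early return: rings stay empty */
    d->rings_ready = true;
    return 0;
}

/* Models the close path after the fix: it marks the interface down and
 * does NOT wait for the reset work, so it is safe to call from that work. */
static void model_close(struct model_dev *d)
{
    d->running = false;
}

/* Models the reset task after the fix. Returns true if the interface was
 * restarted. In the real driver the rtnl lock would be held throughout. */
static bool model_reset_task(struct model_dev *d)
{
    if (!d->running)
        return false;            /* already closed: nothing to do */

    if (model_init_nic(d) != 0) {
        model_close(d);          /* give up instead of restarting */
        return false;
    }

    assert(d->rings_ready);      /* only restart with populated rings */
    return true;
}
```

In this model, a failed reinitialization leaves the device closed with empty rings, and a later reset attempt returns immediately because the device is no longer running.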