ibmvnic: Ensure login failure recovery is safe from other resets
If a login request fails, the recovery process should be protected against parallel resets. It is a known issue that freeing and registering CRQ's in quick succession can result in a failover CRQ from the VIOS. Processing a failover during login recovery is dangerous for two reasons: 1. This will result in two parallel initialization processes, this can cause serious issues during login. 2. It is possible that the failover CRQ is received but never executed. We get notified of a pending failover through a transport event CRQ. The reset is not performed until a INIT CRQ request is received. Previously, if CRQ init fails during login recovery, then the ibmvnic irq is freed and the login process returned error. If failover_pending is true (a transport event was received), then the ibmvnic device would never be able to process the reset since it cannot receive the CRQ_INIT request due to the irq being freed. This leaved the device in a inoperable state. Therefore, the login failure recovery process must be hardened against these possible issues. Possible failovers (due to quick CRQ free and init) must be avoided and any issues during re-initialization should be dealt with instead of being propagated up the stack. This logic is similar to that of ibmvnic_probe(). Fixes: dff515a3 ("ibmvnic: Harden device login requests") Signed-off-by: Nick Child <nnac123@linux.ibm.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230809221038.51296-5-nnac123@linux.ibm.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
Showing
Please register or sign in to comment