    nvme: Revert: Fix controller creation races with teardown flow · b63de840
    James Smart authored
    The indicated patch introduced a barrier in the sysfs_delete attribute
    for the controller that rejects the request if the controller isn't
    created. "Created" is defined as at least 1 call to nvme_start_ctrl().
    
    This is problematic in error-injection testing.  If an error occurs on
    the initial attempt to create an association and the controller enters
    reconnect attempts, the admin cannot delete the controller until
    either there is a successful association created or ctrl_loss_tmo
    times out.
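
    The behavior described above can be sketched as a small user-space
    model. This is not the kernel code: struct model_ctrl, the model_*
    functions, and the -EBUSY return value are all assumptions made for
    illustration. The model only captures the rule stated above: delete is
    refused until the start step has run at least once.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a controller; not the kernel's struct nvme_ctrl. */
enum model_state { MODEL_CONNECTING, MODEL_LIVE, MODEL_DELETED };

struct model_ctrl {
	enum model_state state;
	bool created;	/* set once the "start" step has run at least once */
};

/*
 * Models one association attempt. Only a successful attempt reaches the
 * "start" step (standing in for nvme_start_ctrl()) and marks the
 * controller as created; a failed attempt leaves it reconnecting.
 */
void model_connect(struct model_ctrl *ctrl, bool success)
{
	assert(ctrl != NULL);
	if (ctrl->state == MODEL_DELETED)
		return;
	if (success) {
		ctrl->state = MODEL_LIVE;
		ctrl->created = true;
	} else {
		ctrl->state = MODEL_CONNECTING;
	}
}

/*
 * Delete request with the barrier in place: rejected while the controller
 * has never been started. The error code is an assumption of this sketch.
 */
int model_delete_with_barrier(struct model_ctrl *ctrl)
{
	assert(ctrl != NULL);
	if (!ctrl->created)
		return -EBUSY;
	ctrl->state = MODEL_DELETED;
	return 0;
}

/* Delete request after the revert: no "created" check. */
int model_delete_reverted(struct model_ctrl *ctrl)
{
	assert(ctrl != NULL);
	ctrl->state = MODEL_DELETED;
	return 0;
}
```

    In this model, a controller whose first model_connect() call failed
    stays in MODEL_CONNECTING and model_delete_with_barrier() keeps
    refusing it, while model_delete_reverted() removes it immediately.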
    
    This issue is particularly hurtful when the "admin" is nvme-cli
    connecting to a discovery controller on behalf of auto-connect
    scripts.  With the FC transport, if the first connection attempt
    fails, the controller enters a normal reconnect state but returns
    control to the cli thread that created the controller.  In this
    scenario, the cli attempts to read the discovery log via ioctl. The
    read fails, so the cli treats the log as empty and proceeds to delete
    the discovery controller. The delete is rejected and the controller is
    left live. If the discovery controller's reconnect then succeeds,
    nothing ever deletes it, and it sits live doing nothing.
    
    Cc: <stable@vger.kernel.org> # v5.7+
    Fixes: ce151813 ("nvme: Fix controller creation races with teardown flow")
    Signed-off-by: James Smart <james.smart@broadcom.com>
    CC: Israel Rukshin <israelr@mellanox.com>
    CC: Max Gurtovoy <maxg@mellanox.com>
    CC: Christoph Hellwig <hch@lst.de>
    CC: Keith Busch <kbusch@kernel.org>
    CC: Sagi Grimberg <sagi@grimberg.me>
    Signed-off-by: Christoph Hellwig <hch@lst.de>