• Siddharth Kawar's avatar
    SUNRPC: fix shutdown of NFS TCP client socket · 943d045a
    Siddharth Kawar authored
    NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients
    reconnecting using the same local TCP port. Unique NFS operations are
    identified by the per-TCP connection set of XIDs. This prevents file
    corruption when non-idempotent NFS operations are retried.
    
    Currently, NFS client TCP connections are using different local TCP ports
    when reconnecting to NFS servers.
    
    After an NFS server initiates shutdown of the TCP connection, the NFS
    client's TCP socket is set to NULL after the socket state has reached
    TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse
    the same local port but fails with EADDRNOTAVAIL (99). This forces the
    socket to use a different local TCP port to reconnect to the remote NFS
    server.
    
    State Transition and Events:
    TCP_CLOSE_WAIT(8)
    TCP_LAST_ACK(9)
    connect(fail EADDRNOTAVAIL(99))
    TCP_CLOSE(7)
    bind on new port
    connect success
    
    dmesg excerpts showing reconnect switching from TCP local port of 926 to
    763 after commit 7c81e6a9:
    [13354.947854] NFS call  mkdir testW
    ...
    [13405.654781] RPC:       xs_tcp_state_change client 00000000037d0f03...
    [13405.654813] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
    [13405.654826] RPC:       xs_data_ready...
    [13405.654892] RPC:       xs_tcp_state_change client 00000000037d0f03...
    [13405.654895] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
    [13405.654899] RPC:       xs_tcp_state_change client 00000000037d0f03...
    [13405.654900] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
    [13405.654950] RPC:       xs_connect scheduled xprt 00000000037d0f03
    [13405.654975] RPC:       xs_bind 0.0.0.0:926: ok (0)
    [13405.654980] RPC:       worker connecting xprt 00000000037d0f03 via tcp
    			  to 10.101.6.228 (port 2049)
    [13405.654991] RPC:       00000000037d0f03 connect status 99 connected 0
    			  sock state 7
    [13405.655001] RPC:       xs_tcp_state_change client 00000000037d0f03...
    [13405.655002] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
    [13405.655024] RPC:       xs_connect scheduled xprt 00000000037d0f03
    [13405.655038] RPC:       xs_bind 0.0.0.0:763: ok (0)
    [13405.655041] RPC:       worker connecting xprt 00000000037d0f03 via tcp
    			  to 10.101.6.228 (port 2049)
    [13405.655065] RPC:       00000000037d0f03 connect status 115 connected 0
    			  sock state 2
    
    State Transition and Events with patch applied:
    TCP_CLOSE_WAIT(8)
    TCP_LAST_ACK(9)
    TCP_CLOSE(7)
    connect(reuse of port succeeds)
    
    dmesg excerpts showing reconnect on same TCP local port of 936 with patch
    applied:
    [  257.139935] NFS: mkdir(0:59/560857152), testQ
    [  257.139937] NFS call  mkdir testQ
    ...
    [  307.822702] RPC:       state 8 conn 1 dead 0 zapped 1 sk_shutdown 1
    [  307.822714] RPC:       xs_data_ready...
    [  307.822817] RPC:       xs_tcp_state_change client 00000000ce702f14...
    [  307.822821] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
    [  307.822825] RPC:       xs_tcp_state_change client 00000000ce702f14...
    [  307.822826] RPC:       state 9 conn 0 dead 0 zapped 1 sk_shutdown 3
    [  307.823606] RPC:       xs_tcp_state_change client 00000000ce702f14...
    [  307.823609] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
    [  307.823629] RPC:       xs_tcp_state_change client 00000000ce702f14...
    [  307.823632] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
    [  307.823676] RPC:       xs_connect scheduled xprt 00000000ce702f14
    [  307.823704] RPC:       xs_bind 0.0.0.0:936: ok (0)
    [  307.823709] RPC:       worker connecting xprt 00000000ce702f14 via tcp
    			  to 10.101.1.30 (port 2049)
    [  307.823748] RPC:       00000000ce702f14 connect status 115 connected 0
    			  sock state 2
    ...
    [  314.916193] RPC:       state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
    [  314.916251] RPC:       xs_connect scheduled xprt 00000000ce702f14
    [  314.916282] RPC:       xs_bind 0.0.0.0:936: ok (0)
    [  314.916292] RPC:       worker connecting xprt 00000000ce702f14 via tcp
    			  to 10.101.1.30 (port 2049)
    [  314.916342] RPC:       00000000ce702f14 connect status 115 connected 0
    			  sock state 2
    
    Fixes: 7c81e6a9 ("SUNRPC: Tweak TCP socket shutdown in the RPC client")
    Signed-off-by: default avatarSiddharth Rajendra Kawar <sikawar@microsoft.com>
    Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
    943d045a
xprtsock.c 83.9 KB