• Tuong Lien's avatar
    tipc: fix successful connect() but timed out · 5391a877
    Tuong Lien authored
    In commit 9546a0b7 ("tipc: fix wrong connect() return code"), we
    fixed the issue with the 'connect()' that returns zero even though the
    connecting has failed by waiting for the connection to be 'ESTABLISHED'
    really. However, the approach has one drawback in conjunction with our
    'lightweight' connection setup mechanism that the following scenario
    can happen:
    
              (server)                        (client)
    
       +- accept()|                      |             wait_for_conn()
       |          |                      |connect() -------+
       |          |<-------[SYN]---------|                 > sleeping
       |          |                      *CONNECTING       |
       |--------->*ESTABLISHED           |                 |
                  |--------[ACK]-------->*ESTABLISHED      > wakeup()
            send()|--------[DATA]------->|\                > wakeup()
            send()|--------[DATA]------->| |               > wakeup()
              .   .          .           . |-> recvq       .
              .   .          .           . |               .
            send()|--------[DATA]------->|/                > wakeup()
           close()|--------[FIN]-------->*DISCONNECTING    |
                  *DISCONNECTING         |                 |
                  |                      ~~~~~~~~~~~~~~~~~~> schedule()
                                                           | wait again
                                                           .
                                                           .
                                                           | ETIMEDOUT
    
    Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED'
    and the 'wait_for_conn()' process is woken up but not run. Meanwhile,
    the server starts to send a number of data following by a 'close()'
    shortly without waiting any response from the client, which then forces
    the client socket to be 'DISCONNECTING' immediately. When the wait
    process is switched to be running, it continues to wait until the timer
    expires because of the unexpected socket state. The client 'connect()'
    will finally get ‘-ETIMEDOUT’ and force to release the socket whereas
    there remains the messages in its receive queue.
    
    Obviously the issue would not happen if the server had some delay prior
    to its 'close()' (or the number of 'DATA' messages is large enough),
    but any kind of delay would make the connection setup/shutdown "heavy".
    We solve this by simply allowing the 'connect()' returns zero in this
    particular case. The socket is already 'DISCONNECTING', so any further
    write will get '-EPIPE' but the socket is still able to read the
    messages existing in its receive queue.
    
    Note: This solution doesn't break the previous one as it deals with a
    different situation that the socket state is 'DISCONNECTING' but has no
    error (i.e. sk->sk_err = 0).
    
    Fixes: 9546a0b7 ("tipc: fix wrong connect() return code")
    Acked-by: default avatarYing Xue <ying.xue@windriver.com>
    Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: default avatarTuong Lien <tuong.t.lien@dektech.com.au>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    5391a877
socket.c 102 KB