• Kuniyuki Iwashima's avatar
    tcp: Don't drop SYN+ACK for simultaneous connect(). · 23e89e8e
    Kuniyuki Iwashima authored
    RFC 9293 states that in the case of simultaneous connect(), the connection
    gets established when SYN+ACK is received. [0]
    
          TCP Peer A                                       TCP Peer B
    
      1.  CLOSED                                           CLOSED
      2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...
      3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT
      4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED
      5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
      6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK> <-- SYN-RECEIVED
      7.               ... <SEQ=100><ACK=301><CTL=SYN,ACK> --> ESTABLISHED
    
    However, since commit 0c24604b ("tcp: implement RFC 5961 4.2"), such a
    SYN+ACK is dropped in tcp_validate_incoming() and responded with Challenge
    ACK.
    
    For example, the write() syscall in the following packetdrill script fails
    with -EAGAIN, and wrong SNMP stats get incremented.
    
       0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
      +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
    
      +0 > S  0:0(0) <mss 1460,sackOK,TS val 1000 ecr 0,nop,wscale 8>
      +0 < S  0:0(0) win 1000 <mss 1000>
      +0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 3308134035 ecr 0,nop,wscale 8>
      +0 < S. 0:0(0) ack 1 win 1000
    
      +0 write(3, ..., 100) = 100
      +0 > P. 1:101(100) ack 1
    
      --
    
      # packetdrill cross-synack.pkt
      cross-synack.pkt:13: runtime error in write call: Expected result 100 but got -1 with errno 11 (Resource temporarily unavailable)
      # nstat
      ...
      TcpExtTCPChallengeACK           1                  0.0
      TcpExtTCPSYNChallenge           1                  0.0
    
    The problem is that bpf_skops_established() is triggered by the Challenge
    ACK instead of SYN+ACK.  This causes the bpf prog to miss the chance to
    check if the peer supports a TCP option that is expected to be exchanged
    in SYN and SYN+ACK.
    
    Let's accept a bare SYN+ACK for active-open TCP_SYN_RECV sockets to avoid
    such a situation.
    
    Note that tcp_ack_snd_check() in tcp_rcv_state_process() is skipped not to
    send an unnecessary ACK, but this could be a bit risky for net.git, so this
    targets for net-next.
    
    Link: https://www.rfc-editor.org/rfc/rfc9293.html#section-3.5-7 [0]
    Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
    Link: https://patch.msgid.link/20240710171246.87533-2-kuniyu@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
    23e89e8e
tcp_input.c 210 KB