    tcp: TCP connection times out if ICMP frag needed is delayed · 7d227cd2
    Sridhar Samudrala authored
    We are seeing an issue with TCP in handling an ICMP frag needed
    message that is received after net.ipv4.tcp_retries1 retransmits.
    The default value of tcp_retries1 is 3. So if the path MTU changes
    and the ICMP frag needed is lost for the first 3 retransmits, or if
    it is delayed until 3 retransmits are done, TCP doesn't update the
    MSS correctly and continues to retransmit the original message
    until it times out after tcp_retries2 retransmits.
    
    I am seeing this issue even with the latest 2.6.25.4 kernel.
    
    In tcp_retransmit_timer(), when the retransmits counter exceeds the
    tcp_retries1 value, the dst cache entry of the socket is reset.
    If we receive an ICMP frag needed message at this point, the
    dst entry gets updated with the new MTU, but the TCP socket's
    dst_cache entry remains NULL.
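
    The state described above can be sketched as a small userspace
    model. This is not kernel code: the model_* names, the struct
    layout, and the fixed threshold are assumptions made for
    illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model; every model_* name is hypothetical. */
struct model_route { int mtu; };        /* shared routing cache entry */
struct model_sock {
    struct model_route *dst_cache;      /* per-socket cached route */
    int retransmits;                    /* retransmit counter */
};

enum { MODEL_RETRIES1 = 3 };            /* default tcp_retries1 */

/* Retransmit timer: past retries1, drop the socket's cached route. */
static void model_retransmit_timer(struct model_sock *sk)
{
    sk->retransmits++;
    if (sk->retransmits > MODEL_RETRIES1)
        sk->dst_cache = NULL;
}

/* ICMP frag needed: only the routing entry learns the smaller MTU. */
static void model_icmp_frag_needed(struct model_route *rt, int new_mtu)
{
    assert(rt != NULL);
    if (new_mtu < rt->mtu)
        rt->mtu = new_mtu;
}
```

    After four timer expirations followed by an ICMP frag needed for
    MTU 1400, the route in this model holds the new MTU while the
    socket's dst_cache is still NULL.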
    
    So the next time we try to retransmit after the ICMP frag
    needed is received, tcp_retransmit_skb() gets called. Here the
    cur_mss value is calculated at the start of the routine with
    a NULL sk_dst_cache. Instead we should call tcp_current_mss() after
    rebuild_header(), which caches the dst entry with the updated MTU.
    Also, rebuild_header() should be called before tcp_fragment()
    so that the skb is fragmented if the MSS goes down.
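
    The effect of the reordering can be illustrated with a toy
    userspace model. It is a sketch, not the actual patch: the toy_*
    functions are hypothetical stand-ins for tcp_current_mss() and
    rebuild_header(), and the model assumes the MSS is only resynced
    with the MTU when the socket has a cached dst, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code; all toy_* names are hypothetical. */
struct toy_dst { int mtu; };
struct toy_sock {
    struct toy_dst *dst_cache;  /* NULL after the retries1 reset */
    int mss_cache;              /* MSS derived from the last cached route */
};

enum { TOY_HDR_LEN = 40 };      /* IPv4 + TCP headers, no options */

/* Stand-in for tcp_current_mss(): resyncs with the MTU only if a dst is cached. */
static int toy_current_mss(struct toy_sock *sk)
{
    if (sk->dst_cache)
        sk->mss_cache = sk->dst_cache->mtu - TOY_HDR_LEN;
    return sk->mss_cache;
}

/* Stand-in for rebuild_header(): re-attaches the route to the socket. */
static void toy_rebuild_header(struct toy_sock *sk, struct toy_dst *route)
{
    assert(route != NULL);
    if (!sk->dst_cache)
        sk->dst_cache = route;
}

/* Old order: MSS is computed while the socket's dst cache is still NULL. */
static int toy_retransmit_old(struct toy_sock *sk, struct toy_dst *route)
{
    int cur_mss = toy_current_mss(sk);
    toy_rebuild_header(sk, route);
    return cur_mss;
}

/* New order: rebuild the header first, then compute the MSS. */
static int toy_retransmit_new(struct toy_sock *sk, struct toy_dst *route)
{
    toy_rebuild_header(sk, route);
    return toy_current_mss(sk);
}
```

    With a route MTU of 1400 and a stale cached MSS of 1460, the old
    order returns 1460 in this model, while the new order returns 1360.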

    Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_output.c 73.7 KB