    tcp: fix receive window update in tcp_add_backlog() · 86bccd03
    Eric Dumazet authored
    We got reports from GKE customers of flows being reset by netfilter
    conntrack unless nf_conntrack_tcp_be_liberal is set to 1.
    
    Traces seemed to suggest that ACK packets were being dropped by the
    packet capture or, more likely, that ACKs were received in the
    wrong order.
    
     wscale=7, SYN and SYNACK not shown here.
    
     This ACK allows the sender to send 1871*128 bytes from seq 51359321 :
     New right edge of the window -> 51359321+1871*128=51598809
    
     09:17:23.389210 IP A > B: Flags [.], ack 51359321, win 1871, options [nop,nop,TS val 10 ecr 999], length 0
    
     09:17:23.389212 IP B > A: Flags [.], seq 51422681:51424089, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 1408
     09:17:23.389214 IP A > B: Flags [.], ack 51422681, win 1376, options [nop,nop,TS val 10 ecr 999], length 0
     09:17:23.389253 IP B > A: Flags [.], seq 51424089:51488857, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 64768
     09:17:23.389272 IP A > B: Flags [.], ack 51488857, win 859, options [nop,nop,TS val 10 ecr 999], length 0
     09:17:23.389275 IP B > A: Flags [.], seq 51488857:51521241, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
    
     Receiver now allows the sender to send 606*128=77568 bytes from seq 51521241 :
     New right edge of the window -> 51521241+606*128=51598809
    
     09:17:23.389296 IP A > B: Flags [.], ack 51521241, win 606, options [nop,nop,TS val 10 ecr 999], length 0
    
     09:17:23.389308 IP B > A: Flags [.], seq 51521241:51553625, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 32384
    
     It seems the sender exceeds RWIN allowance, since 51611353 > 51598809
    
     09:17:23.389346 IP B > A: Flags [.], seq 51553625:51611353, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 57728
     09:17:23.389356 IP B > A: Flags [.], seq 51611353:51618393, ack 1577, win 268, options [nop,nop,TS val 999 ecr 10], length 7040
    
     09:17:23.389367 IP A > B: Flags [.], ack 51611353, win 0, options [nop,nop,TS val 10 ecr 999], length 0
    
     netfilter conntrack is not happy and sends RST
    
     09:17:23.389389 IP A > B: Flags [R], seq 92176528, win 0, length 0
     09:17:23.389488 IP B > A: Flags [R], seq 174478967, win 0, length 0
    
     Now imagine the ACKs were delivered out of order and tcp_add_backlog() sets the window based on the wrong packet.
     New right edge of the window -> 51521241+859*128=51631193
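The right-edge arithmetic used throughout the trace can be reproduced with a small userspace sketch. This is illustrative only and is not kernel code: the helper names `rwin_right_edge()` and `exceeds_right_edge()` are hypothetical, and the sketch assumes the window field is scaled by the negotiated wscale (7 here) as stated above.

```c
#include <stdint.h>

/*
 * Right edge of the receive window advertised by an ACK:
 * the ACK number plus the 16-bit window field shifted by wscale.
 * Arithmetic is modulo 2^32, like TCP sequence numbers.
 */
static uint32_t rwin_right_edge(uint32_t ack, uint16_t win, unsigned int wscale)
{
	return ack + ((uint32_t)win << wscale);
}

/*
 * Wraparound-safe check: does a segment ending at end_seq go past
 * the advertised right edge?
 */
static int exceeds_right_edge(uint32_t end_seq, uint32_t edge)
{
	return (int32_t)(end_seq - edge) > 0;
}
```

With the values from the trace, `rwin_right_edge(51521241, 606, 7)` gives 51598809, so the segment ending at 51611353 exceeds the window. With the stale window value 859 the edge becomes 51631193, and the same segment appears to fit.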
    
    Normally TCP stack handles OOO packets just fine, but it
    turns out tcp_add_backlog() does not. It can update the window
    field of the aggregated packet even if the ACK sequence
    of the last received packet is too old.
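The kind of guard this calls for can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not the actual patch: `struct seg_meta`, `seq_before()` and `coalesce_ack_info()` are hypothetical names standing in for the per-packet state that tcp_add_backlog() merges into the aggregated (tail) packet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the ACK-related state of one segment. */
struct seg_meta {
	uint32_t ack_seq;	/* ACK number carried by the segment */
	uint16_t window;	/* raw (unscaled) window field */
};

/* Wraparound-safe "seq1 is older than seq2" comparison. */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Merge the ACK information of a newly received segment into the
 * aggregated tail. A segment whose ACK number is older than the one
 * already recorded must not overwrite the window field.
 * Returns true if the tail was updated.
 */
static bool coalesce_ack_info(struct seg_meta *tail, const struct seg_meta *skb)
{
	if (seq_before(skb->ack_seq, tail->ack_seq))
		return false;	/* stale ACK: keep tail's ack_seq and window */

	tail->ack_seq = skb->ack_seq;
	tail->window = skb->window;
	return true;
}
```

Applied to the trace, a late segment carrying ack 51488857 and win 859 would leave a tail that already holds ack 51521241 and win 606 unchanged, so the advertised right edge stays at 51598809.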
    
    Many thanks to Alexandre Ferrieux for independently reporting the issue
    and suggesting a fix.
    
    Fixes: 4f693b55 ("tcp: implement coalescing on backlog queue")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_ipv4.c 77.7 KB