    tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set · f3f32a35
    Salvatore Dipietro authored
    According to the tcp man page, setting TCP_NODELAY disables Nagle's algorithm
    so that packets are sent as soon as possible. However, the `tcp_push` function,
    where autocorking is evaluated, does not consider the `nonagle` value set by
    TCP_NODELAY. This can trigger unexpected corking of packets and induce delays.
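    
    For readers unfamiliar with the flag, the following is a minimal user-space
    sketch of how an application enables TCP_NODELAY and reads it back. The helper
    names `enable_tcp_nodelay` and `tcp_nodelay_is_set` are hypothetical and exist
    only for illustration; `setsockopt`, `getsockopt`, `IPPROTO_TCP` and
    `TCP_NODELAY` are the standard socket API.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: disable Nagle's algorithm on an existing TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_tcp_nodelay(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Hypothetical helper: read the flag back.
 * Returns 1 if TCP_NODELAY is set, 0 if it is clear, -1 on error. */
int tcp_nodelay_is_set(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

    An application would typically call `enable_tcp_nodelay(fd)` once, right
    after `socket()` or `accept()` returns the descriptor.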
    
    For example, suppose a server's reply is generated as two packets. If the
    first one is not transmitted on the wire quickly enough, the second packet can
    trigger autocorking in `tcp_push` and be delayed instead of being sent as soon
    as possible. The corked packet is then held until either additional packets
    arrive to be coalesced or an ACK arrives from the client. This interacts badly
    with a receiver that has TCP delayed ACKs enabled, adding 40ms of extra delay
    to completion times. It is not always possible to control which peers have
    delayed ACKs enabled, but it is possible to adjust when and how autocorking is
    triggered. This patch prevents autocorking if the TCP_NODELAY flag is set on
    the socket.
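    
    The change in behavior can be illustrated with a simplified user-space model
    of the corking decision. This is a sketch only, not the kernel code: the
    struct, its fields, the `MODEL_NAGLE_OFF` constant and both function names are
    assumptions made for illustration, and the real check in tcp.c evaluates
    additional conditions that are omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-socket "Nagle off" bit that
 * TCP_NODELAY sets in the socket's nonagle flags. */
#define MODEL_NAGLE_OFF 0x1

/* Simplified inputs to the corking decision (not the kernel's structs). */
struct model_push_state {
    bool autocorking_enabled;    /* models the tcp_autocorking sysctl */
    bool small_segment;          /* pending data is smaller than a full segment */
    bool earlier_data_in_flight; /* an earlier packet has not left the host yet */
    int  nonagle;                /* per-socket flags, see MODEL_NAGLE_OFF */
};

/* Behavior described as current: nonagle is ignored, so a TCP_NODELAY
 * socket can still have its small segment corked. */
bool model_should_cork_before(const struct model_push_state *s)
{
    return s->autocorking_enabled &&
           s->small_segment &&
           s->earlier_data_in_flight;
}

/* Behavior described for the patch: never cork when TCP_NODELAY is set,
 * otherwise fall back to the existing decision. */
bool model_should_cork_after(const struct model_push_state *s)
{
    if (s->nonagle & MODEL_NAGLE_OFF)
        return false;
    return model_should_cork_before(s);
}
```

    In this model, a socket with `nonagle = MODEL_NAGLE_OFF` that would have been
    corked before the patch is sent immediately after it, while sockets without
    the flag keep their existing autocorking behavior.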
    
    The patch was tested on an AWS c7g.2xlarge instance running Ubuntu 22.04 and
    Apache Tomcat 9.0.83, serving the basic servlet below:
    
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    
    public class HelloWorldServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
          throws ServletException, IOException {
            response.setContentType("text/html;charset=utf-8");
            OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8");
            String s = "a".repeat(3096);
            osw.write(s,0,s.length());
            osw.flush();
        }
    }
    
    Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS
    c6i.8xlarge instance. With the current autocorking behavior and TCP_NODELAY
    set, an additional 40ms of latency is observed at P99.99 and above. With the
    patch applied, no 40ms latencies occur. The patch has also been tested with
    the iperf and uperf benchmarks, and no regression was observed.
    
    # Without patch, tcp_autocorking=1 and TCP_NODELAY set on all sockets
    ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello
      ...
     50.000%    0.91ms
     75.000%    1.12ms
     90.000%    1.46ms
     99.000%    1.73ms
     99.900%    1.96ms
     99.990%   43.62ms   <<< 40+ ms extra latency
     99.999%   48.32ms
    100.000%   49.34ms
    
    # With patch
    ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.49.177:8080/hello/hello
      ...
     50.000%    0.89ms
     75.000%    1.13ms
     90.000%    1.44ms
     99.000%    1.67ms
     99.900%    1.78ms
     99.990%    2.27ms   <<< no 40+ ms extra latency
     99.999%    3.71ms
    100.000%    4.57ms
    
    Fixes: f54b3111 ("tcp: auto corking")
    Signed-off-by: Salvatore Dipietro <dipiets@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp.c 124 KB