    net-timestamp: TCP timestamping · 4ed2d765
    Willem de Bruijn authored
    TCP timestamping extends SO_TIMESTAMPING to bytestreams.
    
    Bytestreams do not have a 1:1 relationship between send() buffers and
    network packets. The feature interprets a send call on a bytestream as
    a request for a timestamp for the last byte in that send() buffer.
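A minimal userspace sketch of requesting transmit timestamps on a TCP socket. The flag names come from <linux/net_tstamp.h>; the helper names tx_tstamp_flags and enable_tx_tstamp are hypothetical, and choosing software timestamps is an assumption made for illustration.

```c
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Build the SO_TIMESTAMPING flag mask (hypothetical helper).
 * TX_SOFTWARE requests a timestamp at transmit, SOFTWARE enables
 * reporting of software timestamps, OPT_ID adds the ee_data ID. */
static unsigned int tx_tstamp_flags(int with_id)
{
	unsigned int flags = SOF_TIMESTAMPING_TX_SOFTWARE |
			     SOF_TIMESTAMPING_SOFTWARE;

	if (with_id)
		flags |= SOF_TIMESTAMPING_OPT_ID;
	return flags;
}

/* Enable timestamping on fd. Returns 0 on success, -1 with errno set. */
static int enable_tx_tstamp(int fd, int with_id)
{
	unsigned int flags = tx_tstamp_flags(with_id);

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

Once enabled, each send() on the socket is treated as one timestamp request, and the result is read from the socket error queue with recvmsg() and MSG_ERRQUEUE. With the ID option, the call should be made on a connected socket.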
    
    The choice corresponds to a request for a timestamp when all bytes in
    the buffer have been sent. That assumption depends on in-order kernel
    transmission. This is the common case. That said, it is possible to
    construct a traffic shaping tree that would result in reordering.
    The guarantee is strong, then, but not ironclad.
    
    This implementation supports send and sendpages (splice). GSO replaces
    one large packet with multiple smaller packets. This patch also copies
    the option into the correct smaller packet.
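The segment selection can be pictured with a simplified model: walk the segments in order and stop at the first one whose byte range covers the timestamped byte. This is a sketch, not the kernel code; gso_tstamp_segment and its parameters are hypothetical, and it assumes fixed-size segments and a target byte that is not before the start of the large packet.

```c
#include <stdint.h>

/* Simplified model of one GSO split: the large packet starts at sequence
 * number seq and is cut into nsegs segments of mss bytes each.
 * Returns the index of the segment that contains byte ts_seq, or -1 if
 * ts_seq lies beyond the last segment. Assumes ts_seq is not before seq. */
static int gso_tstamp_segment(uint32_t seq, uint32_t mss,
			      unsigned int nsegs, uint32_t ts_seq)
{
	uint64_t end = seq;	/* 64-bit so seq + mss cannot overflow */
	unsigned int i;

	for (i = 0; i < nsegs; i++) {
		end += mss;	/* one past this segment's last byte */
		if (ts_seq < end)
			return (int)i;
	}
	return -1;
}
```

Keeping this walk separate from the per-segment work means the common case, with no timestamp requested, only pays for the check that decides whether to run it.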
    
    This patch does not yet support timestamping on data in an initial TCP
    Fast Open SYN, because that takes a very different data path.
    
    If ID generation in ee_data is enabled, bytestream timestamps return a
    byte offset, instead of the packet counter for datagrams.
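A sender can predict the ID it should see for each send() by counting bytes. The sketch below assumes the offset is zero-based from the moment the ID option is enabled, that no data was outstanding at that moment, and that the ID is the offset of the last byte of the buffer; tx_id_tracker and tx_id_after_send are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender-side bookkeeping: total bytes passed to send()
 * since the ID option was enabled. */
struct tx_id_tracker {
	uint32_t bytes_sent;	/* wraps like the 32-bit ee_data field */
};

/* Record a send() of len bytes (len > 0) and return the ID expected
 * for it, assuming the ID is the offset of the buffer's last byte. */
static uint32_t tx_id_after_send(struct tx_id_tracker *t, size_t len)
{
	t->bytes_sent += (uint32_t)len;
	return t->bytes_sent - 1;
}
```

Under these assumptions, a 100-byte send followed by a 50-byte send would be expected to report IDs 99 and 149.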
    
    The implementation supports a single timestamp per packet. A new
    request silently replaces any earlier request still pending on the
    same packet. To avoid missing tstamps, flush the TCP queue by
    disabling Nagle, cork and autocork. Missing tstamps can be detected
    by offset when the ee_data ID is enabled.
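A sketch of the per-socket part of that advice, plus a trivial gap check. TCP_NODELAY and TCP_CORK are standard socket options; the helper names are hypothetical. Autocorking is not a socket option: it is controlled system-wide through the net.ipv4.tcp_autocorking sysctl. The gap check assumes one timestamp type is requested, so each ID is reported once.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Ask TCP to push each send() out promptly: disable Nagle and make sure
 * the socket is not corked. Returns 0 on success, -1 with errno set. */
static int tcp_disable_coalescing(int fd)
{
	int one = 1, zero = 0;

	if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)))
		return -1;
	return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &zero, sizeof(zero));
}

/* Return 1 if the ID read from the error queue is not the one expected
 * for the oldest outstanding send(), i.e. a timestamp went missing. */
static int tstamp_missing(uint32_t expected_id, uint32_t received_id)
{
	return received_id != expected_id;
}
```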
    
    Implementation details:
    
    - On GSO, the timestamping code can be included in the main loop. I
    moved it into its own loop to reduce the impact on the common case
    to a single branch.
    
    - To avoid leaking the absolute seqno to userspace, the offset
    returned in ee_data must always be relative. It is an offset between
    an skb and sk field. The first is always set (also for GSO & ACK).
    The second must also never be uninitialized. Only allow the ID
    option on sockets in the ESTABLISHED state, for which the seqno
    is available. Never reset it to zero (instead, move it to the
    current seqno when reenabling the option).
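The relative offset described in the last item can be modelled in a few lines. This is an illustrative model only, not the patch itself: the struct and function names are hypothetical stand-ins for the skb and sk fields, and the model assumes 32-bit sequence numbers.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two fields involved. */
struct model_sk  { uint32_t tskey; };	/* base seqno, set when the option is enabled */
struct model_skb { uint32_t tskey; };	/* absolute seqno of the timestamped byte */

/* (Re)enabling the ID option moves the base to the current seqno
 * instead of resetting it to zero. */
static void model_enable_id(struct model_sk *sk, uint32_t cur_seqno)
{
	sk->tskey = cur_seqno;
}

/* Value reported to userspace: always a difference, never the raw seqno.
 * Unsigned subtraction stays correct across 32-bit wraparound. */
static uint32_t model_ee_data(const struct model_sk *sk,
			      const struct model_skb *skb)
{
	return skb->tskey - sk->tskey;
}
```

Because only the difference is exposed, userspace sees a byte offset and learns nothing about the absolute sequence number, including when the sequence space wraps.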
    
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>