    tcp: uniform the set up of sockets after successful connection · 27204aaa
    Wei Wang authored
    Currently in the TCP code, the initialization sequence for cached
    metrics, congestion control, BPF, etc., after a successful connection
    differs between code paths. This introduces inconsistent behavior and
    is prone to bugs. The current call sequence is as follows:
    
    (1) for active case (tcp_finish_connect() case):
            tcp_mtup_init(sk);
            icsk->icsk_af_ops->rebuild_header(sk);
            tcp_init_metrics(sk);
            tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB);
            tcp_init_congestion_control(sk);
            tcp_init_buffer_space(sk);
    
    (2) for passive case (tcp_rcv_state_process() TCP_SYN_RECV case):
            icsk->icsk_af_ops->rebuild_header(sk);
            tcp_call_bpf(sk, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
            tcp_init_congestion_control(sk);
            tcp_mtup_init(sk);
            tcp_init_buffer_space(sk);
            tcp_init_metrics(sk);
    
    (3) for TFO passive case (tcp_fastopen_create_child()):
            inet_csk(child)->icsk_af_ops->rebuild_header(child);
            tcp_init_congestion_control(child);
            tcp_mtup_init(child);
            tcp_init_metrics(child);
            tcp_call_bpf(child, BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB);
            tcp_init_buffer_space(child);
    
    This commit unifies the above code paths so that they all use the
    following sequence:
            tcp_mtup_init(sk);
            icsk->icsk_af_ops->rebuild_header(sk);
            tcp_init_metrics(sk);
            tcp_call_bpf(sk, BPF_SOCK_OPS_ACTIVE/PASSIVE_ESTABLISHED_CB);
            tcp_init_congestion_control(sk);
            tcp_init_buffer_space(sk);
    This is the same sequence as case (1), the active case. We pick it
    because this order lets BPF override settings that come from the
    route, such as the congestion control module and the initial cwnd,
    and then lets the CC module see those settings.
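
The ordering argument above can be sketched in user space. The block below is a minimal mock, not kernel code: `struct mock_sock`, every `mock_*` function, the `MOCK_*` constants, and the concrete values (10, 20, "bbr") are hypothetical stand-ins chosen for illustration. It only shows that when the BPF hook runs after the metrics step and before congestion control init, the CC step observes whatever the hook wrote.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space stand-in for a socket; not the kernel's struct sock. */
struct mock_sock {
    const char *ca_name;       /* selected congestion control module */
    unsigned int init_cwnd;    /* initial congestion window */
    const char *cc_seen_name;  /* module name the CC init step observed */
    unsigned int cc_seen_cwnd; /* cwnd the CC init step observed */
    int bpf_op;                /* which "established" callback was fired */
    const char *trace[8];      /* names of the steps, in call order */
    int ntrace;
};

enum { MOCK_ACTIVE_ESTABLISHED = 1, MOCK_PASSIVE_ESTABLISHED = 2 };

/* Append a step name to the socket's call trace. */
static void record(struct mock_sock *sk, const char *step)
{
    assert(sk->ntrace < 8);
    sk->trace[sk->ntrace++] = step;
}

static void mock_mtup_init(struct mock_sock *sk) { record(sk, "mtup_init"); }

static void mock_rebuild_header(struct mock_sock *sk) { record(sk, "rebuild_header"); }

/* Pretend the route supplied an initial cwnd of 10 (assumed value). */
static void mock_init_metrics(struct mock_sock *sk)
{
    record(sk, "init_metrics");
    sk->init_cwnd = 10;
}

/* Pretend a BPF program overrides both the CC module and the cwnd. */
static void mock_call_bpf(struct mock_sock *sk, int op)
{
    record(sk, "call_bpf");
    sk->bpf_op = op;
    sk->ca_name = "bbr";
    sk->init_cwnd = 20;
}

/* The CC step snapshots the settings that are in effect when it runs. */
static void mock_init_congestion_control(struct mock_sock *sk)
{
    record(sk, "init_congestion_control");
    sk->cc_seen_name = sk->ca_name;
    sk->cc_seen_cwnd = sk->init_cwnd;
}

static void mock_init_buffer_space(struct mock_sock *sk) { record(sk, "init_buffer_space"); }

/* One shared helper that runs the steps in the unified order listed above. */
void mock_init_transfer(struct mock_sock *sk, int bpf_op)
{
    mock_mtup_init(sk);
    mock_rebuild_header(sk);
    mock_init_metrics(sk);
    mock_call_bpf(sk, bpf_op);
    mock_init_congestion_control(sk);
    mock_init_buffer_space(sk);
}

/* Print the recorded call order, one step per line. */
void mock_print_trace(const struct mock_sock *sk)
{
    for (int i = 0; i < sk->ntrace; i++)
        printf("%d: %s\n", i, sk->trace[i]);
}
```

A caller would zero-initialize a `struct mock_sock`, call `mock_init_transfer(&sk, MOCK_PASSIVE_ESTABLISHED)`, and then inspect `sk.trace` or `sk.cc_seen_cwnd`. Routing all three connection paths through one helper like this is one way to keep the order from drifting apart again; whether the actual patch is structured that way is not stated in the message above.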
    
    Suggested-by: Neal Cardwell <ncardwell@google.com>
    Tested-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: Wei Wang <weiwan@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Acked-by: Yuchung Cheng <ycheng@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_input.c 181 KB