    bpf: Add redirect_peer helper · 9aa1206e
    Daniel Borkmann authored
    Add an efficient ingress-to-ingress netns switch that can be used from tc BPF
    programs to redirect traffic from host ns ingress into a container veth
    device's ingress without having to go via the CPU backlog queue [0]. This can
    also be used for traffic between local containers, in which case the path via
    the CPU backlog queue only needs to be taken once, not twice. At a high level
    this borrows from ipvlan, which does a similar switch in
    __netif_receive_skb_core() and then iterates via another_round. This helps to
    reduce latency for the mentioned use cases.
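
    As a rough illustration of how a tc BPF program might call the new helper,
    here is a minimal sketch. It assumes the program is attached at tc ingress
    of a host-facing device and that HOST_VETH_IFINDEX (a hypothetical,
    hard-coded value) is the ifindex of the host-side veth whose peer sits in
    the container netns. The #else branch only provides stand-ins so the file
    also builds on a host compiler; it does not model kernel behavior.

```c
#ifdef __bpf__                      /* built with clang -target bpf */
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>
#else                               /* host build: assumed stand-ins, not kernel code */
struct __sk_buff { unsigned int ifindex; };
#define SEC(name)
#define TC_ACT_SHOT     2
#define TC_ACT_REDIRECT 7
/* Stub: accept any non-zero ifindex as long as flags is 0. */
static long bpf_redirect_peer(unsigned int ifindex, unsigned long long flags)
{
	return (ifindex != 0 && flags == 0) ? TC_ACT_REDIRECT : TC_ACT_SHOT;
}
#endif

/* Hypothetical ifindex of the host-side veth; a real program would
 * more likely look this up, for example from a BPF map. */
#define HOST_VETH_IFINDEX 42

/* Section name "tc" assumes a recent libbpf; older loaders use "classifier". */
SEC("tc")
int redirect_to_pod(struct __sk_buff *skb)
{
	(void)skb; /* no packet inspection in this sketch */

	/* Hand the packet to the ingress of the veth peer. The flags
	 * argument is reserved and has to be 0. */
	return bpf_redirect_peer(HOST_VETH_IFINDEX, 0);
}

char LICENSE[] SEC("license") = "GPL";
```

    The program simply returns the helper's result, so whatever verdict the
    helper produces is what tc acts on.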
    
    Pod to remote pod with redirect(), TCP_RR [1]:
    
      # percpu_netperf 10.217.1.33
              RT_LATENCY:         122.450         (per CPU:         122.666         122.401         122.333         122.401 )
            MEAN_LATENCY:         121.210         (per CPU:         121.100         121.260         121.320         121.160 )
          STDDEV_LATENCY:         120.040         (per CPU:         119.420         119.910         125.460         115.370 )
             MIN_LATENCY:          46.500         (per CPU:          47.000          47.000          47.000          45.000 )
             P50_LATENCY:         118.500         (per CPU:         118.000         119.000         118.000         119.000 )
             P90_LATENCY:         127.500         (per CPU:         127.000         128.000         127.000         128.000 )
             P99_LATENCY:         130.750         (per CPU:         131.000         131.000         129.000         132.000 )
    
        TRANSACTION_RATE:       32666.400         (per CPU:        8152.200        8169.842        8174.439        8169.897 )
    
    Pod to remote pod with redirect_peer(), TCP_RR:
    
      # percpu_netperf 10.217.1.33
              RT_LATENCY:          44.449         (per CPU:          43.767          43.127          45.279          45.622 )
            MEAN_LATENCY:          45.065         (per CPU:          44.030          45.530          45.190          45.510 )
          STDDEV_LATENCY:          84.823         (per CPU:          66.770          97.290          84.380          90.850 )
             MIN_LATENCY:          33.500         (per CPU:          33.000          33.000          34.000          34.000 )
             P50_LATENCY:          43.250         (per CPU:          43.000          43.000          43.000          44.000 )
             P90_LATENCY:          46.750         (per CPU:          46.000          47.000          47.000          47.000 )
             P99_LATENCY:          52.750         (per CPU:          51.000          54.000          53.000          53.000 )
    
        TRANSACTION_RATE:       90039.500         (per CPU:       22848.186       23187.089       22085.077       21919.130 )
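
    To put the two runs side by side, the aggregate figures can be compared
    with a few lines of arithmetic. The sketch below only copies the
    RT_LATENCY, P99_LATENCY and TRANSACTION_RATE totals from the output
    above; the struct and function names are made up for illustration.

```c
/* Aggregate values copied from the two percpu_netperf runs above. */
struct netperf_run {
	double rt_latency;
	double p99_latency;
	double transaction_rate;
};

const struct netperf_run with_redirect      = { 122.450, 130.750, 32666.400 };
const struct netperf_run with_redirect_peer = {  44.449,  52.750, 90039.500 };

/* How many times more transactions per second the new run achieves. */
double rate_speedup(const struct netperf_run *base,
		    const struct netperf_run *run)
{
	return run->transaction_rate / base->transaction_rate;
}

/* Relative drop of a latency value, in percent of the baseline. */
double latency_reduction_pct(double base, double now)
{
	return 100.0 * (base - now) / base;
}
```

    With these inputs, rate_speedup(&with_redirect, &with_redirect_peer)
    comes out at roughly 2.76, and latency_reduction_pct() on the RT_LATENCY
    values at roughly 64 percent.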
    
      [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf
      [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf
    
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net