1. 29 Apr, 2020 28 commits
  2. 28 Apr, 2020 4 commits
  3. 27 Apr, 2020 6 commits
  4. 26 Apr, 2020 2 commits
    • Alexei Starovoitov's avatar
      Merge branch 'cloudflare-prog' · f131bd3e
      Alexei Starovoitov authored
      Lorenz Bauer says:
      
      ====================
      We've been developing an in-house L4 load balancer based on XDP
      and TC for a while. Following Alexei's call for more up-to-date examples of
      production BPF in the kernel tree [1], Cloudflare is making this available
      under dual GPL-2.0 or BSD 3-clause terms.
      
      The code requires at least v5.3 to function correctly.
      
      1: https://lore.kernel.org/bpf/20200326210719.den5isqxntnoqhmv@ast-mbp/
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      f131bd3e
    • Lorenz Bauer's avatar
      selftests/bpf: Add cls_redirect classifier · 23458901
      Lorenz Bauer authored
      cls_redirect is a TC clsact based replacement for the glb-redirect iptables
      module available at [1]. It enables what GitHub calls "second chance"
      flows [2], similarly proposed by the Beamer paper [3]. In contrast to
      glb-redirect, it also supports migrating UDP flows as long as connected
      sockets are used. cls_redirect is in production at Cloudflare, as part of
      our own L4 load balancer.
      
      We have modified the encapsulation format slightly from glb-redirect:
      glbgue_chained_routing.private_data_type has been repurposed to form a
      version field and several flags. Both have been arranged in a way that
      a private_data_type value of zero matches the current glb-redirect
      behaviour. This means that cls_redirect will understand packets in
      glb-redirect format, but not vice versa.
      
      The test suite only covers basic features. For example, cls_redirect will
      correctly forward path MTU discovery packets, but this is not exercised.
      It is also possible to switch the encapsulation format to GRE on the last
      hop, which is also not tested.
      
      There are two major distinctions from glb-redirect: first, cls_redirect
      relies on receiving encapsulated packets directly from a router. This is
      because we don't have access to the neighbour tables from BPF, yet. See
      forward_to_next_hop for details. Second, cls_redirect performs decapsulation
      instead of using separate ipip and sit tunnel devices. This
      avoids issues with the sit tunnel [4] and makes deploying the classifier
      easier: decapsulated packets appear on the same interface, so existing
      firewall rules continue to work as expected.
      
      The code base started it's life on v4.19, so there are most likely still
      hold overs from old workarounds. In no particular order:
      
      - The function buf_off is required to defeat a clang optimization
        that leads to the verifier rejecting the program due to pointer
        arithmetic in the wrong order.
      
      - The function pkt_parse_ipv6 is force inlined, because it would
        otherwise be rejected due to returning a pointer to stack memory.
      
      - The functions fill_tuple and classify_tcp contain kludges, because
        we've run out of function arguments.
      
      - The logic in general is rather nested, due to verifier restrictions.
        I think this is either because the verifier loses track of constants
        on the stack, or because it can't track enum like variables.
      
      1: https://github.com/github/glb-director/tree/master/src/glb-redirect
      2: https://github.com/github/glb-director/blob/master/docs/development/second-chance-design.md
      3: https://www.usenix.org/conference/nsdi18/presentation/olteanu
      4: https://github.com/github/glb-director/issues/64Signed-off-by: default avatarLorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200424185556.7358-2-lmb@cloudflare.com
      23458901