    bpf, net: Rework cookie generator as per-cpu one · 92acdc58
    Daniel Borkmann authored
    With its use in BPF, the cookie generator can be called very frequently,
    in particular when used out of cgroup v2 hooks (e.g. connect / sendmsg)
    attached to the root cgroup, for example in v1/v2 mixed environments.
    When there is high churn on sockets in the system, there can be many
    parallel requests to the bpf_get_socket_cookie() and
    bpf_get_netns_cookie() helpers, which then cause contention on the
    atomic counter.
    
    As similarly done in f991bd2e ("fs: introduce a per-cpu last_ino
    allocator"), add a small helper library that both can use for the 64 bit
    counters. Given this can be called from different contexts, we also need
    to deal with potential nested calls even though in practice they are
    considered extremely rare. One idea, as suggested by Eric Dumazet, was
    to use a reverse counter for this situation since we don't expect 64 bit
    overflows anyway; that way, we can avoid bigger gaps in the 64 bit
    counter space compared to a purely batch-wise increase. Even on machines
    with a small number of cores (e.g. 4), the cookie generation time shrinks
    from min/max/med/avg (ns) of 22/50/40/38.9 down to 10/35/14/17.3 when run
    in parallel from multiple CPUs.
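The scheme described above can be illustrated with a minimal userspace sketch. This is not the kernel code from this commit: it assumes C11 atomics, uses thread-local storage as a stand-in for per-CPU data, and the names `cookie_gen`, `cookie_local`, `cookie_next` and the batch size `COOKIE_BATCH` are hypothetical, chosen only for illustration. The common path hands out values from a locally reserved batch and touches the shared forward counter once per batch; a nested call falls back to a shared counter that runs downward from the top of the 64 bit space.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed batch size for this sketch; must be a power of two. */
#define COOKIE_BATCH 4096u
static_assert((COOKIE_BATCH & (COOKIE_BATCH - 1)) == 0,
              "COOKIE_BATCH must be a power of two");

/* Shared state: one counter grows upward in batches, the other
 * counts downward from the top of the 64 bit space. */
struct cookie_gen {
    _Atomic uint64_t forward_last;
    _Atomic uint64_t reverse_last;
};

/* Per-thread state standing in for per-CPU data (an assumption of
 * this sketch). One static instance means the sketch supports a
 * single generator per thread. */
struct cookie_local {
    unsigned int nesting;
    uint64_t last;
};

static _Thread_local struct cookie_local cookie_local_state;

static uint64_t cookie_next(struct cookie_gen *gc)
{
    struct cookie_local *local = &cookie_local_state;
    uint64_t val;

    if (++local->nesting == 1) {
        /* Common path: consume from the locally reserved batch. */
        val = local->last;
        if ((val & (COOKIE_BATCH - 1)) == 0) {
            /* Batch used up (or first call): reserve the next one.
             * This is the only shared-counter access on this path. */
            val = atomic_fetch_add(&gc->forward_last, COOKIE_BATCH);
        }
        local->last = ++val;
    } else {
        /* Nested call: do not touch the half-updated local state,
         * take a value from the reverse counter instead. */
        val = atomic_fetch_sub(&gc->reverse_last, 1) - 1;
    }
    local->nesting--;
    return val;
}
```

With a zero-initialized `struct cookie_gen`, successive non-nested calls on one thread return 1, 2, 3, and so on, while a nested call returns values starting at `UINT64_MAX` and counting down. Because nested calls consume only a single value each, they do not burn a whole batch, which is the gap-avoidance argument made in the text.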
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Martin KaFai Lau <kafai@fb.com>
    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Link: https://lore.kernel.org/bpf/8a80b8d27d3c49f9a14e1d5213c19d8be87d1dc8.1601477936.git.daniel@iogearbox.net
reuseport_array.c 8.77 KB