    bpf: Introduce bpf_skb_ancestor_cgroup_id helper · 77236281
    Andrey Ignatov authored
    == Problem description ==
    
    It's useful to be able to identify the cgroup associated with an skb in
    TC so that a policy can be applied to this skb, and the existing
    bpf_skb_cgroup_id helper can help with this.
    
    In practice, though, the cgroup hierarchy and the hierarchy a policy is
    applied to don't map 1:1.
    
    Often a container has a corresponding cgroup but also many sub-cgroups
    inside it, e.g. because control of resources for its subsystems is
    delegated to the containerized application, or to separate the
    application inside the container from infra that belongs to the
    containerization system (e.g. sshd).
    
    At the same time it may be useful to apply a policy to the container as
    a whole.
    
    If multiple containers like this are run on a host (which is often the
    case) and many of them have sub-cgroups, it may not be possible to apply
    a per-container policy in TC with existing helpers such as
    bpf_skb_under_cgroup or bpf_skb_cgroup_id:
    
    * bpf_skb_cgroup_id returns the id of the immediate cgroup associated
      with the skb, i.e. if that is a sub-cgroup inside a container, it
      can't be used to identify the container's cgroup;
    
    * bpf_skb_under_cgroup can work with only one cgroup and doesn't scale,
      i.e. if there are N containers on a host and a policy has to be
      applied to M of them (0 <= M <= N), it requires M calls to
      bpf_skb_under_cgroup, and, if M changes, a new BPF program has to be
      rebuilt and loaded.
    
    == Solution ==
    
    The patch introduces a new helper, bpf_skb_ancestor_cgroup_id, that can
    be used to get the id of the cgroup v2 that is the ancestor, at a
    specified level of the cgroup hierarchy, of the cgroup associated with
    the skb.
    
    That way an admin can place all containers on one level of the cgroup
    hierarchy (which is good practice in general and already used in many
    configurations) and identify the specific cgroup on this level no matter
    which sub-cgroup the skb is associated with.
    
    E.g. if there is a cgroup hierarchy:
      root/
      root/container1/
      root/container1/app11/
      root/container1/app11/sub-app-a/
      root/container1/app12/
      root/container2/
      root/container2/app21/
      root/container2/app22/
      root/container2/app22/sub-app-b/
    
    then, given an skb associated with root/container1/app11/sub-app-a/,
    it's possible to get the ancestor at level 1, which is container1, and
    apply the policy for this container, or apply another policy if it's
    container2.
    
    Policies can be kept e.g. in a hash map where the key is a container
    cgroup id and the value is an action.
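
    As a rough user-space model of that idea (not BPF code), the sketch
    below keeps per-container actions in a small open-addressing hash table
    keyed by cgroup id. In an actual BPF program the table would presumably
    be a BPF_MAP_TYPE_HASH map queried with bpf_map_lookup_elem(); the
    action names, table size, hash function and the policy_set()/
    policy_get() functions are hypothetical and exist only for
    illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical actions; the commit does not define any. */
enum action { ACT_PASS = 0, ACT_DROP = 1, ACT_MARK = 2 };

#define SLOTS 16 /* illustrative size, must stay 2^4 to match slot_of() */

struct policy_map {
	uint64_t key[SLOTS];   /* container cgroup id; 0 marks an empty slot */
	enum action val[SLOTS];
};

/* Multiplicative hash: keep the top 4 bits, giving a slot in 0..15. */
static size_t slot_of(uint64_t id)
{
	return (size_t)((id * 0x9E3779B97F4A7C15ull) >> 60);
}

/* Insert or update an entry. Returns 0 on success, -1 if id is 0
 * (reserved as the empty marker) or the table is full. */
static int policy_set(struct policy_map *m, uint64_t id, enum action a)
{
	assert(m != NULL);
	if (id == 0)
		return -1;

	size_t s = slot_of(id);
	for (size_t i = 0; i < SLOTS; i++, s = (s + 1) % SLOTS) {
		if (m->key[s] == 0 || m->key[s] == id) {
			m->key[s] = id;
			m->val[s] = a;
			return 0;
		}
	}
	return -1;
}

/* Look up the action for a container cgroup id; fall back to dflt
 * when the id is 0 or has no entry. */
static enum action policy_get(const struct policy_map *m, uint64_t id,
			      enum action dflt)
{
	assert(m != NULL);
	if (id == 0)
		return dflt;

	size_t s = slot_of(id);
	for (size_t i = 0; i < SLOTS; i++, s = (s + 1) % SLOTS) {
		if (m->key[s] == id)
			return m->val[s];
		if (m->key[s] == 0)
			break; /* hit an empty slot: id is not present */
	}
	return dflt;
}
```

    The point of the model is that one lookup by the container's cgroup id
    replaces a per-container check, so adding or removing a container only
    changes map contents, not the program.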
    
    The levels at which container cgroups are created are usually known in
    advance, whereas the cgroup hierarchy inside a container may be hard to
    predict, especially when its creation is delegated to the containerized
    application.
    
    == Implementation details ==
    
    The helper gets the ancestor by walking parents up to the specified
    level.
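
    The parent walk can be modelled in plain user-space C as follows. This
    is only a sketch of the idea, not the kernel code: struct node stands in
    for struct cgroup, the numeric ids are made up, and ancestor_at() and
    ancestor_id() are hypothetical names. It assumes the root is level 0
    and that 0 is returned when no ancestor exists at the requested level.
    The nodes reproduce part of the example hierarchy from the Solution
    section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for a cgroup: only the fields the walk needs. */
struct node {
	const char *name;
	uint64_t id;               /* made-up id, for illustration only */
	int level;                 /* root is level 0 */
	const struct node *parent; /* NULL for the root */
};

/* Walk parent pointers until the requested level is reached.
 * Returns NULL if the level is negative or deeper than n itself. */
static const struct node *ancestor_at(const struct node *n, int ancestor_level)
{
	assert(n != NULL);
	if (ancestor_level < 0 || n->level < ancestor_level)
		return NULL;

	while (n && n->level > ancestor_level)
		n = n->parent;

	return (n && n->level == ancestor_level) ? n : NULL;
}

/* Id of the ancestor at the given level, or 0 if there is none. */
static uint64_t ancestor_id(const struct node *n, int ancestor_level)
{
	const struct node *a = ancestor_at(n, ancestor_level);

	return a ? a->id : 0;
}

/* Part of the example hierarchy from the commit message. */
static const struct node root       = { "root",        1, 0, NULL };
static const struct node container1 = { "container1", 10, 1, &root };
static const struct node app11      = { "app11",      11, 2, &container1 };
static const struct node sub_app_a  = { "sub-app-a",  12, 3, &app11 };
static const struct node container2 = { "container2", 20, 1, &root };
static const struct node app22      = { "app22",      22, 2, &container2 };
static const struct node sub_app_b  = { "sub-app-b",  23, 3, &app22 };
```

    With these made-up ids, ancestor_id(&sub_app_a, 1) yields 10
    (container1) and ancestor_id(&sub_app_b, 1) yields 20 (container2),
    while ancestor_id(&container1, 2) yields 0 because container1 has no
    ancestor at a level deeper than its own.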
    
    Another option would be to get a different kind of "id" from
    cgroup->ancestor_ids[level] and use it with idr_find() to get the struct
    cgroup for the ancestor. But that would require a radix lookup, which
    doesn't seem to be better (at least it's not obviously better).
    
    The format of the new helper's return value is the same as that of
    bpf_skb_cgroup_id.
    
    Signed-off-by: Andrey Ignatov <rdna@fb.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>