• Andrii Nakryiko's avatar
    bpf: add support for open-coded iterator loops · 06accc87
    Andrii Nakryiko authored
    Teach verifier about the concept of the open-coded (or inline) iterators.
    
    This patch adds generic iterator loop verification logic, new STACK_ITER
    stack slot type to contain iterator state, and necessary kfunc plumbing
    for iterator's constructor, destructor and next methods. Next patch
    implements first specific iterator (numbers iterator for implementing
    for() loop logic). Such split allows to have more focused commits for
    verifier logic and separate commit that we could point later to
    demonstrating  what does it take to add a new kind of iterator.
    
    Each kind of iterator has its own associated struct bpf_iter_<type>,
    where <type> denotes a specific type of iterator. struct bpf_iter_<type>
    state is supposed to live on BPF program stack, so there will be no way
    to change its size later on without breaking backwards compatibility, so
    choose wisely! But given this struct is specific to a given <type> of
    iterator, this allows a lot of flexibility: simple iterators could be
    fine with just one stack slot (8 bytes), like numbers iterator in the
    next patch, while some other more complicated iterators might need way
    more to keep their iterator state. Either way, such design allows to
    avoid runtime memory allocations, which otherwise would be necessary if
    we fixed on-the-stack size and it turned out to be too small for a given
    iterator implementation.
    
    The way BPF verifier logic is implemented, there are no artificial
    restrictions on a number of active iterators, it should work correctly
    using multiple active iterators at the same time. This also means you
    can have multiple nested iteration loops. struct bpf_iter_<type>
    reference can be safely passed to subprograms as well.
    
    General flow is easiest to demonstrate with a simple example using
    number iterator implemented in next patch. Here's the simplest possible
    loop:
    
      struct bpf_iter_num it;
      int *v;
    
      bpf_iter_num_new(&it, 2, 5);
      while ((v = bpf_iter_num_next(&it))) {
          bpf_printk("X = %d", *v);
      }
      bpf_iter_num_destroy(&it);
    
    Above snippet should output "X = 2", "X = 3", "X = 4". Note that 5 is
    exclusive and is not returned. This matches similar APIs (e.g., slices
    in Go or Rust) that implement a range of elements, where end index is
    non-inclusive.
    
    In the above example, we see a trio of function:
      - constructor, bpf_iter_num_new(), which initializes iterator state
      (struct bpf_iter_num it) on the stack. If any of the input arguments
      are invalid, constructor should make sure to still initialize it such
      that subsequent bpf_iter_num_next() calls will return NULL. I.e., on
      error, return error and construct empty iterator.
      - next method, bpf_iter_num_next(), which accepts pointer to iterator
      state and produces an element. Next method should always return
      a pointer. The contract between BPF verifier is that next method will
      always eventually return NULL when elements are exhausted. Once NULL is
      returned, subsequent next calls should keep returning NULL. In the
      case of numbers iterator, bpf_iter_num_next() returns a pointer to an int
      (storage for this integer is inside the iterator state itself),
      which can be dereferenced after corresponding NULL check.
      - once done with the iterator, it's mandated that user cleans up its
      state with the call to destructor, bpf_iter_num_destroy() in this
      case. Destructor frees up any resources and marks stack space used by
      struct bpf_iter_num as usable for something else.
    
    Any other iterator implementation will have to implement at least these
    three methods. It is enforced that for any given type of iterator only
    applicable constructor/destructor/next are callable. I.e., verifier
    ensures you can't pass number iterator state into, say, cgroup
    iterator's next method.
    
    It is important to keep the naming pattern consistent to be able to
    create generic macros to help with BPF iter usability. E.g., one
    of the follow up patches adds generic bpf_for_each() macro to bpf_misc.h
    in selftests, which allows to utilize iterator "trio" nicely without
    having to code the above somewhat tedious loop explicitly every time.
    This is enforced at kfunc registration point by one of the previous
    patches in this series.
    
    At the implementation level, iterator state tracking for verification
    purposes is very similar to dynptr. We add STACK_ITER stack slot type,
    reserve necessary number of slots, depending on
    sizeof(struct bpf_iter_<type>), and keep track of necessary extra state
    in the "main" slot, which is marked with non-zero ref_obj_id. Other
    slots are also marked as STACK_ITER, but have zero ref_obj_id. This is
    simpler than having a separate "is_first_slot" flag.
    
    Another big distinction is that STACK_ITER is *always refcounted*, which
    simplifies implementation without sacrificing usability. So no need for
    extra "iter_id", no need to anticipate reuse of STACK_ITER slots for new
    constructors, etc. Keeping it simple here.
    
    As far as the verification logic goes, there are two extensive comments:
    in process_iter_next_call() and iter_active_depths_differ() explaining
    some important and sometimes subtle aspects. Please refer to them for
    details.
    
    But from 10,000-foot point of view, next methods are the points of
    forking a verification state, which are conceptually similar to what
    verifier is doing when validating conditional jump. We branch out at
    a `call bpf_iter_<type>_next` instruction and simulate two outcomes:
    NULL (iteration is done) and non-NULL (new element is returned). NULL is
    simulated first and is supposed to reach exit without looping. After
    that non-NULL case is validated and it either reaches exit (for trivial
    examples with no real loop), or reaches another `call bpf_iter_<type>_next`
    instruction with the state equivalent to already (partially) validated
    one. State equivalency at that point means we technically are going to
    be looping forever without "breaking out" out of established "state
    envelope" (i.e., subsequent iterations don't add any new knowledge or
    constraints to the verifier state, so running 1, 2, 10, or a million of
    them doesn't matter). But taking into account the contract stating that
    iterator next method *has to* return NULL eventually, we can conclude
    that loop body is safe and will eventually terminate. Given we validated
    logic outside of the loop (NULL case), and concluded that loop body is
    safe (though potentially looping many times), verifier can claim safety
    of the overall program logic.
    
    The rest of the patch is necessary plumbing for state tracking, marking,
    validation, and necessary further kfunc plumbing to allow implementing
    iterator constructor, destructor, and next methods.
    Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20230308184121.1165081-4-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    06accc87
verifier.c 549 KB