• John Fastabend's avatar
    bpf: Track subprog poke descriptors correctly and fix use-after-free · f263a814
    John Fastabend authored
    Subprograms are calling map_poke_track(), but on program release there is no
    hook to call map_poke_untrack(). However, on program release, the aux memory
    (and poke descriptor table) is freed even though we still have a reference to
    it in the element list of the map aux data. When we run map_poke_run(), we then
    end up accessing free'd memory, triggering KASAN in prog_array_map_poke_run():
    
      [...]
      [  402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e
      [  402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337
      [  402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G          I       5.12.0+ #399
      [  402.824715] Call Trace:
      [  402.824719]  dump_stack+0x93/0xc2
      [  402.824727]  print_address_description.constprop.0+0x1a/0x140
      [  402.824736]  ? prog_array_map_poke_run+0xc2/0x34e
      [  402.824740]  ? prog_array_map_poke_run+0xc2/0x34e
      [  402.824744]  kasan_report.cold+0x7c/0xd8
      [  402.824752]  ? prog_array_map_poke_run+0xc2/0x34e
      [  402.824757]  prog_array_map_poke_run+0xc2/0x34e
      [  402.824765]  bpf_fd_array_map_update_elem+0x124/0x1a0
      [...]
    
    The elements concerned are walked as follows:
    
        for (i = 0; i < elem->aux->size_poke_tab; i++) {
               poke = &elem->aux->poke_tab[i];
        [...]
    
    The access to size_poke_tab is a 4 byte read, verified by checking offsets
    in the KASAN dump:
    
      [  402.825004] The buggy address belongs to the object at ffff8881905a7800
                     which belongs to the cache kmalloc-1k of size 1024
      [  402.825008] The buggy address is located 320 bytes inside of
                     1024-byte region [ffff8881905a7800, ffff8881905a7c00)
    
    The pahole output of bpf_prog_aux:
    
      struct bpf_prog_aux {
        [...]
        /* --- cacheline 5 boundary (320 bytes) --- */
        u32                        size_poke_tab;        /*   320     4 */
        [...]
    
    In general, subprograms do not necessarily manage their own data structures.
    For example, BTF func_info and linfo are just pointers to the main program
    structure. This allows reference counting and cleanup to be done on the latter
    which simplifies their management a bit. The aux->poke_tab struct, however,
    did not follow this logic. The initial proposed fix for this use-after-free
    bug further embedded poke data tracking into the subprogram with proper
    reference counting. However, Daniel and Alexei questioned why we were treating
    these objects special; I agree, its unnecessary. The fix here removes the per
    subprogram poke table allocation and map tracking and instead simply points
    the aux->poke_tab pointer at the main programs poke table. This way, map
    tracking is simplified to the main program and we do not need to manage them
    per subprogram.
    
    This also means, bpf_prog_free_deferred(), which unwinds the program reference
    counting and kfrees objects, needs to ensure that we don't try to double free
    the poke_tab when free'ing the subprog structures. This is easily solved by
    NULL'ing the poke_tab pointer. The second detail is to ensure that per
    subprogram JIT logic only does fixups on poke_tab[] entries it owns. To do
    this, we add a pointer in the poke structure to point at the subprogram value
    so JITs can easily check while walking the poke_tab structure if the current
    entry belongs to the current program. The aux pointer is stable and therefore
    suitable for such comparison. On the jit_subprogs() error path, we omit
    cleaning up the poke->aux field because these are only ever referenced from
    the JIT side, but on error we will never make it to the JIT, so its fine to
    leave them dangling. Removing these pointers would complicate the error path
    for no reason. However, we do need to untrack all poke descriptors from the
    main program as otherwise they could race with the freeing of JIT memory from
    the subprograms. Lastly, a748c697 ("bpf: propagate poke descriptors to
    subprograms") had an off-by-one on the subprogram instruction index range
    check as it was testing 'insn_idx >= subprog_start && insn_idx <= subprog_end'.
    However, subprog_end is the next subprogram's start instruction.
    
    Fixes: a748c697 ("bpf: propagate poke descriptors to subprograms")
    Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Co-developed-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20210707223848.14580-2-john.fastabend@gmail.com
    f263a814
verifier.c 386 KB