    bpf: Refcount task stack in bpf_get_task_stack · 06ab134c
    Dave Marchevsky authored
    On x86 the struct pt_regs * grabbed by task_pt_regs() points to an
    offset into task->stack. The pt_regs are later dereferenced in
    __bpf_get_stack (e.g. by the user_mode() check). This can cause a fault
    if the task in question exits while bpf_get_task_stack is executing, as
    warned by task_stack_page's comment:
    
    * When accessing the stack of a non-current task that might exit, use
    * try_get_task_stack() instead.  task_stack_page will return a pointer
    * that could get freed out from under you.
    
    Take the comment's advice and use try_get_task_stack() and
    put_task_stack() to hold a reference on task->stack, or bail early if
    the refcount is already 0. Incrementing stack_refcount ensures the
    task's stack sticks around while we're using its data.
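
    The get-or-bail pattern described above can be sketched in plain
    userspace C. This is only a model under stated assumptions, not the
    kernel change itself: struct model_task, model_try_get_stack(),
    model_put_stack() and model_read_stack() are hypothetical stand-ins
    for task_struct, try_get_task_stack(), put_task_stack() and the
    helper that reads the stack, and the -1 return is a placeholder for
    whatever error code the real helper uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a task and its stack (not kernel code). */
struct model_task {
    atomic_int stack_refcount; /* 0 means the stack is freed or being freed */
    long *stack;               /* models task->stack */
};

/* Take a reference only if the count is still nonzero (increment-if-not-zero). */
static bool model_try_get_stack(struct model_task *t)
{
    int old = atomic_load(&t->stack_refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded and the nonzero check repeats. */
        if (atomic_compare_exchange_weak(&t->stack_refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; whoever drops the last one frees the stack. */
static void model_put_stack(struct model_task *t)
{
    int old = atomic_fetch_sub(&t->stack_refcount, 1);

    assert(old > 0);
    if (old == 1) {
        free(t->stack);
        t->stack = NULL;
    }
}

/* Reader shaped like the fixed helper: bail early, else read under the reference. */
static long model_read_stack(struct model_task *t, size_t idx)
{
    long val;

    if (!model_try_get_stack(t))
        return -1; /* placeholder error value: stack already gone */

    val = t->stack[idx]; /* safe: our reference keeps the stack alive */
    model_put_stack(t);
    return val;
}
```

    In this model the owner holds the initial reference and drops it on
    exit. A reader that arrives after that point sees a zero refcount and
    gets the error value instead of touching freed memory.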
    
    I noticed this bug while testing a bpf task iter similar to
    bpf_iter_task_stack in selftests, except that mine grabbed the user
    stack. It crashed intermittently, producing dumps like:
    
      BUG: unable to handle page fault for address: 0000000000003fe0
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      RIP: 0010:__bpf_get_stack+0xd0/0x230
      <snip...>
      Call Trace:
      bpf_prog_0a2be35c092cb190_get_task_stacks+0x5d/0x3ec
      bpf_iter_run_prog+0x24/0x81
      __task_seq_show+0x58/0x80
      bpf_seq_read+0xf7/0x3d0
      vfs_read+0x91/0x140
      ksys_read+0x59/0xd0
      do_syscall_64+0x48/0x120
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
    
    Fixes: fa28dcb8 ("bpf: Introduce helper bpf_get_task_stack()")
    Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Acked-by: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/bpf/20210401000747.3648767-1-davemarchevsky@fb.com
stackmap.c 18.5 KB