    set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds · 2863643f
    Ran Xiaokai authored
    In copy_process(), non-root users with the CAP_SYS_RESOURCE or
    CAP_SYS_ADMIN capability get the PF_NPROC_EXCEEDED flag cleared even
    when rlimit(RLIMIT_NPROC) is exceeded. Add the same capability check
    logic to set_user().
    
    Align the permission checks in copy_process() and set_user(). In
    copy_process() CAP_SYS_RESOURCE or CAP_SYS_ADMIN capable users will be
    able to circumvent and clear the PF_NPROC_EXCEEDED flag whereas they
    aren't able to do the same in set_user(). There's no obvious logic to this
    and trying to unearth the reason in the thread didn't go anywhere.
    
    The gist seems to be that this code wants to make sure that a program
    can't successfully exec if it has gone through a set*id() transition
    while exceeding its RLIMIT_NPROC.
    A capable but non-INIT_USER caller getting PF_NPROC_EXCEEDED set during
    a set*id() transition wouldn't be able to exec right away if they still
    exceed their RLIMIT_NPROC at the time of exec. So their exec would fail
    in fs/exec.c:
    
            if ((current->flags & PF_NPROC_EXCEEDED) &&
                is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
                    retval = -EAGAIN;
                    goto out_ret;
            }
    
    However, if the caller were to fork() right after the set*id()
    transition but before the exec while still exceeding their RLIMIT_NPROC
    then they would get PF_NPROC_EXCEEDED cleared (while the child would
    inherit it):
    
            retval = -EAGAIN;
            if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) {
                    if (p->real_cred->user != INIT_USER &&
                        !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
                            goto bad_fork_free;
            }
            current->flags &= ~PF_NPROC_EXCEEDED;
    
    which means a subsequent exec by the capable caller would now succeed
    even though they could still exceed their RLIMIT_NPROC limit. This seems
    inconsistent. Allow a CAP_SYS_ADMIN or CAP_SYS_RESOURCE capable user to
    avoid PF_NPROC_EXCEEDED as they already can in copy_process().
    
    Cc: peterz@infradead.org, tglx@linutronix.de, linux-kernel@vger.kernel.org, Ran Xiaokai <ran.xiaokai@zte.com.cn>
    
    Link: https://lore.kernel.org/r/20210728072629.530435-1-ran.xiaokai@zte.com.cn
    Cc: Neil Brown <neilb@suse.de>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: James Morris <jamorris@linux.microsoft.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
    Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>