    sched: Stop PF_NO_SETAFFINITY from being inherited by various init system threads · a8ea6fc9
    Frederic Weisbecker authored
    Commit:
    
      00b89fe0 ("sched: Make the idle task quack like a per-CPU kthread")
    
    ... added PF_KTHREAD | PF_NO_SETAFFINITY to the idle kernel threads.
    
    Unfortunately these properties are inherited by the children of init/0
    through kernel_thread() calls: init/1 and kthreadd. This has several
    side effects:
    
    1) kthreadd affinity cannot be reset from userspace anymore. Also,
       PF_NO_SETAFFINITY propagates to all kthreadd children, including
       the unbound kthreads. Therefore it is no longer possible to override
       the affinity of any of them. Here is an example of a warning reported
       by rcutorture:
    
    		WARNING: CPU: 0 PID: 116 at kernel/rcu/tree_nocb.h:1306 rcu_bind_current_to_nocb+0x31/0x40
    		Call Trace:
    		 rcu_torture_fwd_prog+0x62/0x730
    		 kthread+0x122/0x140
    		 ret_from_fork+0x22/0x30
    
    2) init/1 ends with an exec(), which clears both PF_KTHREAD and
       PF_NO_SETAFFINITY, so we are fine once kernel_init() escapes to
       userspace. But until then, no initcall or init code can
       successfully call sched_setaffinity() on init/1.
    
       Also, PF_KTHREAD looks legitimate on init/1 before it calls exec(),
       but we had better be careful about introducing unknown side effects.
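    
    The userspace view of side effect 1) can be probed with a small helper.
    This is only a sketch: probe_setaffinity() is a hypothetical name, and
    the expectation that the kernel rejects PF_NO_SETAFFINITY tasks with
    EINVAL is an assumption, not something stated in this commit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: read the current CPU mask of a task and try to
 * apply the very same mask again. Returns 0 on success, or the errno
 * value of the first call that failed.
 *
 * Re-applying an unchanged mask does not alter the task's placement,
 * so a failure here only tells us that the task's affinity cannot be
 * set at all (or that we lack permission to do so).
 */
int probe_setaffinity(pid_t pid)
{
	cpu_set_t mask;

	CPU_ZERO(&mask);

	/* pid == 0 means "the calling thread" for both syscalls. */
	if (sched_getaffinity(pid, sizeof(mask), &mask) != 0)
		return errno;

	if (sched_setaffinity(pid, sizeof(mask), &mask) != 0)
		return errno;

	return 0;
}
```

    Calling probe_setaffinity(0) on an ordinary userspace task should
    return 0. Probing kthreadd (usually PID 2) as root on an affected
    kernel would presumably return an error instead, under the
    assumption stated above.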
    
    One way to solve the PF_NO_SETAFFINITY issue is to not inherit this flag
    in copy_process() at all. The cases where it matters are:
    
    * fork_idle(): explicitly sets the flag already.
    * fork() syscalls: userspace tasks, which shouldn't be affected by this.
    * create_io_thread(): the callers explicitly assign the flag to the
                          newly created tasks.
    * kernel_thread():
    	- Fixes the issues on init/1 and kthreadd.
    	- Fixes the issues on kthreadd children.
    	- Usermode helpers created by an unbound workqueue: this shouldn't
    	  matter. In the worst case it gives userspace more control over
    	  the affinity of these short-lived tasks, although this can
    	  already be tuned through the inherited unbound workqueue affinity.
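    
    The effect of not inheriting the flag can be illustrated with a small
    userspace model. This is a simplified sketch, not the actual fork.c
    change: struct task_model and the copy_flags_*() helpers are
    hypothetical, the flag values are assumed to mirror
    include/linux/sched.h, and every other flag that copy_process()
    handles is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits, assumed to match include/linux/sched.h (illustrative only). */
#define PF_IDLE           0x00000002u
#define PF_KTHREAD        0x00200000u
#define PF_NO_SETAFFINITY 0x04000000u

/* Hypothetical stand-in for the flags word of a task_struct. */
struct task_model {
	unsigned int flags;
};

/* Model of the old behaviour: the child keeps PF_NO_SETAFFINITY. */
struct task_model copy_flags_before_fix(const struct task_model *parent)
{
	struct task_model child = *parent;

	child.flags &= ~PF_IDLE;
	return child;
}

/* Model of the new behaviour: PF_NO_SETAFFINITY is never inherited. */
struct task_model copy_flags_after_fix(const struct task_model *parent)
{
	struct task_model child = *parent;

	child.flags &= ~(PF_IDLE | PF_NO_SETAFFINITY);
	return child;
}

/* Model of fork_idle(): the flag is set again explicitly after the copy. */
struct task_model fork_idle_model(const struct task_model *parent)
{
	struct task_model idle = copy_flags_after_fix(parent);

	idle.flags |= PF_IDLE | PF_NO_SETAFFINITY;
	return idle;
}

/* Model of the affinity check: the flag forbids changing the affinity. */
bool may_set_affinity(const struct task_model *t)
{
	return !(t->flags & PF_NO_SETAFFINITY);
}
```

    In this model, a child copied from the idle task with
    copy_flags_after_fix() keeps PF_KTHREAD but can have its affinity
    changed again, and so can its own children, while idle tasks created
    through fork_idle_model() stay pinned.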
    
    Fixes: 00b89fe0 ("sched: Make the idle task quack like a per-CPU kthread")
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Link: https://lore.kernel.org/r/20210525235849.441842-1-frederic@kernel.org
fork.c 75.6 KB