    sched: Fix race in task_call_func() · 91dabf33
    Peter Zijlstra authored
    There is a very narrow race between schedule() and task_call_func().
    
      CPU0						CPU1
    
      __schedule()
        rq_lock();
        prev_state = READ_ONCE(prev->__state);
        if (... && prev_state) {
          deactivate_task(rq, prev, ...)
            prev->on_rq = 0;
    
    						task_call_func()
    						  raw_spin_lock_irqsave(p->pi_lock);
    						  state = READ_ONCE(p->__state);
    						  smp_rmb();
    						  if (... || p->on_rq) // false!!!
    						    rq = __task_rq_lock()
    
    						  ret = func();
    
        next = pick_next_task();
        rq = context_switch(prev, next)
          prepare_lock_switch()
            spin_release(&__rq_lockp(rq)->dep_map...)
    
    So while the task is on its way out, it still holds rq->lock for a
    little while, and right then task_call_func() comes in and figures it
    doesn't need rq->lock anymore (because the task is already dequeued --
    but still running there) and then the __set_task_frozen() thing observes
    the task is still holding rq->lock and yells murder.
    
    Avoid this by waiting for p->on_cpu to get cleared, which guarantees
    the task is fully finished on the old CPU.
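
    As a rough illustration only, the decision can be modelled in user
    space with C11 atomics. This is a sketch under stated assumptions, not
    the kernel change itself: struct task_model, task_needs_rq_lock_model(),
    model_selfcheck() and the state values are hypothetical names invented
    here, the acquire fence merely stands in for smp_rmb(), and the
    "state == TASK_RUNNING" test is an assumed placeholder for the elided
    "..." condition in the diagram above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's definitions. */
#define TASK_RUNNING        0u
#define TASK_INTERRUPTIBLE  1u

/* Hypothetical user-space stand-in for the three fields in the diagram. */
struct task_model {
	_Atomic unsigned int state;  /* models p->__state */
	_Atomic int on_rq;           /* models p->on_rq   */
	_Atomic int on_cpu;          /* models p->on_cpu  */
};

/*
 * Returns true when the caller should take the runqueue lock.
 * Returns false only once on_cpu has been observed clear, i.e. the
 * task is no longer executing on its old CPU.
 */
static bool task_needs_rq_lock_model(struct task_model *p)
{
	unsigned int state =
		atomic_load_explicit(&p->state, memory_order_relaxed);

	/* Stand-in for smp_rmb(): order the state load before on_rq. */
	atomic_thread_fence(memory_order_acquire);

	/* Assumed placeholder for the elided "..." condition. */
	if (state == TASK_RUNNING)
		return true;

	if (atomic_load_explicit(&p->on_rq, memory_order_relaxed))
		return true;

	/*
	 * Dequeued but possibly still running: spin until on_cpu clears.
	 * In this model another thread would have to clear it.
	 */
	while (atomic_load_explicit(&p->on_cpu, memory_order_acquire))
		;

	return false;
}

/* Small usage example covering the cases that do not need to spin. */
static void model_selfcheck(void)
{
	struct task_model running = { .state = TASK_RUNNING };
	struct task_model queued  = { .state = TASK_INTERRUPTIBLE, .on_rq = 1 };
	struct task_model blocked = { .state = TASK_INTERRUPTIBLE };

	assert(task_needs_rq_lock_model(&running));
	assert(task_needs_rq_lock_model(&queued));
	assert(!task_needs_rq_lock_model(&blocked));
}
```

    The point of the model is the ordering of the checks: the "no lock
    needed" answer is only given after on_cpu reads as zero, so the window
    where prev is dequeued but has not yet released rq->lock can no longer
    be mistaken for a fully blocked task.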
    
    ( While arguably the fixes tag is 'wrong' -- none of the previous
      task_call_func() users appears to care for this case. )
    
    Fixes: f5d39b02 ("freezer,sched: Rewrite core freezer logic")
    Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Link: https://lkml.kernel.org/r/Y1kdRNNfUeAU+FNl@hirez.programming.kicks-ass.net