    workqueue: cond_resched() after processing each work item · b22ce278
    Tejun Heo authored
    If !PREEMPT, a kworker running work items back to back can hog the CPU.
    This becomes dangerous when a self-requeueing work item which is
    waiting for something to happen races against stop_machine.  Such a
    self-requeueing work item would requeue itself indefinitely, hogging
    the kworker and the CPU it's running on, while stop_machine would wait
    for that CPU to enter stop_machine while preventing anything else from
    happening on all other CPUs.  The two would deadlock.
    
    Jamie Liu reports that this deadlock scenario exists around
    scsi_requeue_run_queue() and libata port multiplier support, where one
    port may exclude command processing from other ports.  With the right
    timing, scsi_requeue_run_queue() can end up requeueing itself trying
    to execute an IO which is asked to be retried while another device has
    exclusive access, which in turn can't make forward progress due to
    stop_machine.
    
    Fix it by invoking cond_resched() after executing each work item.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Jamie Liu <jamieliu@google.com>
    References: http://thread.gmane.org/gmane.linux.kernel/1552567
    Cc: stable@vger.kernel.org
    --
     kernel/workqueue.c |    9 +++++++++
     1 file changed, 9 insertions(+)
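
The idea behind the fix can be illustrated with a minimal userspace sketch. This is not the kernel code: struct work_item, struct work_queue, queue_item(), dequeue(), requeue_self() and run_worker() are hypothetical names invented for illustration, and sched_yield() stands in for the kernel's cond_resched(). The worker offers the CPU to other tasks after every work item, so even an item that requeues itself forever cannot monopolize the CPU.

```c
#include <assert.h>
#include <sched.h>   /* sched_yield(): userspace stand-in for cond_resched() */
#include <stddef.h>

struct work_queue;

/* Hypothetical miniature work item; not the kernel's struct work_struct. */
struct work_item {
    void (*fn)(struct work_item *self);
    struct work_queue *wq;      /* queue this item requeues itself on */
    struct work_item *next;
};

/* Hypothetical FIFO of pending work items. */
struct work_queue {
    struct work_item *head;
    struct work_item *tail;
    unsigned long yields;       /* times the worker offered up the CPU */
};

/* Append an item to the tail of the queue. */
static void queue_item(struct work_queue *q, struct work_item *w)
{
    w->next = NULL;
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
}

/* Remove and return the item at the head, or NULL if the queue is empty. */
static struct work_item *dequeue(struct work_queue *q)
{
    struct work_item *w = q->head;

    if (w) {
        q->head = w->next;
        if (!q->head)
            q->tail = NULL;
        w->next = NULL;
    }
    return w;
}

/* A work item that always puts itself back on its queue. */
static void requeue_self(struct work_item *self)
{
    queue_item(self->wq, self);
}

/*
 * Execute at most max_items work items and return how many ran.
 * The yield after each item mirrors the commit's approach: without it,
 * a self-requeueing item would keep this loop busy with no scheduling
 * point in between.
 */
static unsigned long run_worker(struct work_queue *q, unsigned long max_items)
{
    unsigned long done = 0;
    struct work_item *w;

    while (done < max_items && (w = dequeue(q)) != NULL) {
        assert(w->fn != NULL);
        w->fn(w);
        done++;
        sched_yield();          /* assumption: models cond_resched() */
        q->yields++;
    }
    return done;
}
```

The max_items bound exists only so the sketch terminates; a real kworker loops for as long as work is pending, which is why the scheduling point after each item matters.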