    torture: Fix hang during kthread shutdown phase · d52d3a2b
    Joel Fernandes (Google) authored
    During rcutorture shutdown, the rcu_torture_cleanup() function calls
    torture_cleanup_begin(), which sets the fullstop global variable to
    FULLSTOP_RMMOD. This causes the rcutorture threads for readers and
    fakewriters to exit all of their "while" loops and start shutting down.
    
    They then call torture_kthread_stopping(), which in turn waits for
    kthread_stop() to be called.  However, rcu_torture_cleanup() has
    not yet called kthread_stop() on those threads, and before it gets a
    chance to do so, multiple instances of torture_kthread_stopping() invoke
    schedule_timeout_interruptible(1) in a tight loop.  Tracing confirms that
    TIMER_SOFTIRQ can then continuously execute timer callbacks.  If that
    TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that
    task might never invoke kthread_stop().
    
    This commit improves this situation by increasing the timeout passed to
    schedule_timeout_interruptible() from one jiffy to 1/20th of a second.
    This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus
    allowing rcu_torture_cleanup() to carry out the needed kthread_stop()
    invocations.  Testing has shown 100 runs of TREE07 passing reliably,
    as opposed to the tens-of-percent failure rates seen beforehand.
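
    The effect of the longer timeout can be estimated with a little
    arithmetic. The following is a rough userspace sketch, not kernel code:
    the helper names (stop_poll_timeout, wakeups_per_sec, report) and the
    thread count are hypothetical, and the model assumes each waiting thread
    wakes exactly once per timeout, ignoring scheduling latency.

```c
#include <stdio.h>

/* Timeout in jiffies for 1/20th of a second (integer division, like HZ / 20). */
unsigned long stop_poll_timeout(unsigned long hz)
{
	return hz / 20;
}

/*
 * Approximate timer wakeups per second for one thread that sleeps
 * timeout_jiffies on every pass of its wait loop (assumes timeout_jiffies > 0).
 */
unsigned long wakeups_per_sec(unsigned long hz, unsigned long timeout_jiffies)
{
	return hz / timeout_jiffies;
}

/* Print the estimated aggregate wakeup rate before and after the change. */
void report(unsigned long hz, unsigned long nthreads)
{
	unsigned long before = nthreads * wakeups_per_sec(hz, 1);
	unsigned long after = nthreads * wakeups_per_sec(hz, stop_poll_timeout(hz));

	printf("HZ=%lu threads=%lu: ~%lu wakeups/s before, ~%lu after\n",
	       hz, nthreads, before, after);
}
```

    Under these assumptions, report(1000, 16) estimates roughly 16000 timer
    wakeups per second with the one-jiffy timeout and roughly 320 with the
    HZ / 20 timeout, which illustrates why the timer softirq load drops.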
    
    Cc: Paul McKenney <paulmck@kernel.org>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
    Cc: <stable@vger.kernel.org> # 6.0.x
    Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
    Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
torture.c 25.4 KB