• Thomas Pfaff's avatar
    genirq: Synchronize interrupt thread startup · 8707898e
    Thomas Pfaff authored
    A kernel hang can be observed when running setserial in a loop on a kernel
    with force threaded interrupts. The sequence of events is:
    
       setserial
         open("/dev/ttyXXX")
           request_irq()
         do_stuff()
          -> serial interrupt
             -> wake(irq_thread)
    	      desc->threads_active++;
         close()
           free_irq()
             kthread_stop(irq_thread)
         synchronize_irq() <- hangs because desc->threads_active != 0
    
    The thread is created in request_irq() and woken up, but does not get on a
    CPU to reach the actual thread function, which would handle the pending
    wake-up. kthread_stop() sets the should stop condition which makes the
    thread immediately exit, which in turn leaves the stale threads_active
    count around.
    
    This problem was introduced with commit 519cc865, which addressed a
    interrupt sharing issue in the PCIe code.
    
    Before that commit free_irq() invoked synchronize_irq(), which waits for
    the hard interrupt handler and also for associated threads to complete.
    
    To address the PCIe issue synchronize_irq() was replaced with
    __synchronize_hardirq(), which only waits for the hard interrupt handler to
    complete, but not for threaded handlers.
    
    This was done under the assumption, that the interrupt thread already
    reached the thread function and waits for a wake-up, which is guaranteed to
    be handled before acting on the stop condition. The problematic case, that
    the thread would not reach the thread function, was obviously overlooked.
    
    Make sure that the interrupt thread is really started and reaches
    thread_fn() before returning from __setup_irq().
    
    This utilizes the existing wait queue in the interrupt descriptor. The
    wait queue is unused for non-shared interrupts. For shared interrupts the
    usage might cause a spurious wake-up of a waiter in synchronize_irq() or the
    completion of a threaded handler might cause a spurious wake-up of the
    waiter for the ready flag. Both are harmless and have no functional impact.
    
    [ tglx: Amended changelog ]
    
    Fixes: 519cc865 ("genirq: Synchronize only with single thread on free_irq()")
    Signed-off-by: default avatarThomas Pfaff <tpfaff@pcs.com>
    Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Reviewed-by: default avatarMarc Zyngier <maz@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/552fe7b4-9224-b183-bb87-a8f36d335690@pcs.com
    8707898e
internals.h 14.3 KB