    generic-ipi: Fix deadlock in __smp_call_function_single · 27c379f7
    Heiko Carstens authored
    Just got my 6 way machine to a state where cpu 0 is in an
    endless loop within __smp_call_function_single.
    All other cpus are idle.
    
    The call trace on cpu 0 looks like this:
    
     __smp_call_function_single
     scheduler_tick
     update_process_times
     tick_sched_timer
     __run_hrtimer
     hrtimer_interrupt
     clock_comparator_work
     do_extint
     ext_int_handler
     ----> timer irq
     cpu_idle
    
    __smp_call_function_single() got called from nohz_balancer_kick()
    (inlined) with the remote cpu being 1, wait being 0 and the per
    cpu variable remote_sched_softirq_cb (call_single_data) of the
    current cpu (0).
    
    Then it loops forever when it tries to grab the lock of the
    call_single_data, since it is already locked and enqueued on cpu 0.
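    
    A minimal userspace sketch of why that loop cannot end, assuming the
    lock is a flag bit that the caller spins on until it is cleared. The
    names csd_model and csd_lock_bounded are hypothetical, and the spin
    is bounded here only so that the example terminates:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CSD_FLAG_LOCK 0x01u  /* models the "in use" bit of a call_single_data */

/* Hypothetical stand-in for the per cpu call_single_data. */
struct csd_model {
    unsigned int flags;
};

/*
 * Bounded stand-in for the lock loop: spin while the lock bit is set.
 * Returns true if the lock was taken within max_spins iterations.
 * An unbounded version of this loop never returns when the bit stays set.
 */
static bool csd_lock_bounded(struct csd_model *csd, unsigned long max_spins)
{
    assert(csd != NULL);
    for (unsigned long i = 0; i < max_spins; i++) {
        if (!(csd->flags & CSD_FLAG_LOCK)) {
            csd->flags |= CSD_FLAG_LOCK;
            return true;
        }
        /*
         * In the scenario above nothing clears the bit: the IPI that
         * would release it is pending on the same cpu that is spinning.
         */
    }
    return false;
}
```

    With the bit already set, as on cpu 0 in the trace, the bounded
    version gives up; the unbounded one would spin forever.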
    
    My theory of how this could have happened: for some reason the
    scheduler decided to call __smp_call_function_single() on its own
    cpu, and sent an IPI to itself. The interrupt stays pending
    since IRQs are disabled. If the hypervisor then schedules the
    cpu away, it might happen that upon rescheduling both the IPI and
    the timer IRQ are pending. When interrupts are enabled again,
    it depends on which one gets delivered first.
    If the timer interrupt gets delivered first, we end up with the
    local deadlock as seen in the call trace above.
    
    Let's make __smp_call_function_single() check if the target cpu is
    the current cpu and execute the function immediately just like
    smp_call_function_single does. That should prevent at least the
    scenario described here.
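    
    The idea can be sketched in plain userspace C. This is a model
    under stated assumptions, not the actual change to smp.c: the
    names call_model, call_single_model and bump are hypothetical, and
    a boolean stands in for locking the data and sending the IPI:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for call_single_data. */
struct call_model {
    void (*func)(void *info);
    void *info;
    bool queued;  /* models "locked and enqueued for a remote cpu" */
};

/*
 * Returns true if func ran immediately on the calling cpu,
 * false if the request was queued for a different cpu.
 */
static bool call_single_model(int this_cpu, int target_cpu,
                              struct call_model *data)
{
    assert(data != NULL && data->func != NULL);
    if (target_cpu == this_cpu) {
        /* Local target: run directly, never lock or enqueue. */
        data->func(data->info);
        return true;
    }
    /* Remote target: take the lock and queue (modelled by a flag). */
    data->queued = true;
    return false;
}

/* Example callback: increments the int that info points to. */
static void bump(void *info)
{
    (*(int *)info)++;
}
```

    Because the local case never marks the data as queued, a later
    call from the timer interrupt on the same cpu finds it unlocked.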
    
    It might also be that the scheduler is not supposed to call
    __smp_call_function_single with the remote cpu being the current
    cpu, but that is a different issue.
    
    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Jens Axboe <jaxboe@fusionio.com>
    Cc: Venkatesh Pallipadi <venki@google.com>
    Cc: Suresh Siddha <suresh.b.siddha@intel.com>
    LKML-Reference: <20100910114729.GB2827@osiris.boeblingen.de.ibm.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
smp.c 13.6 KB