    [PATCH] del_timer() vs. mod_timer() SMP race · 8c2bfc5b
    Benjamin Herrenschmidt authored
    
    
    We just spent some days fighting a rare race in one of the distros that backported
    some of timer.c from 2.6 to 2.4 (though they missed a bit).
    
    The actual race we found didn't happen in 2.6 _but_ code inspection showed that a
    similar race is still present in 2.6, explanation below:
    
    Code removing a timer from a list (run_timers() or del_timer()) takes that CPU's
    list lock, does list_del(), then sets timer->base = NULL.
    
    It is mandatory that this timer->base = NULL is visible to other CPUs only after
    the list_del() is complete. If not, mod_timer() could see it as NULL and thus take
    its own CPU's list lock rather than the lock of the CPU whose list the timer is
    being removed from, so the list_add() in mod_timer() could race with the
    list_del() from run_timers() or del_timer().
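
To make the lock selection concrete, here is a minimal single-threaded sketch of the decision mod_timer() makes, as described above. It is not the kernel code: the struct layouts and the helper name base_to_lock() are hypothetical, and the real function handles more cases.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU timer list and its lock. */
struct timer_base {
    int cpu;
};

/* Hypothetical, reduced timer: only the field relevant to the race. */
struct timer {
    struct timer_base *base;   /* NULL means "not on any CPU's list" */
};

/*
 * Simplified model of the choice described above: if the timer still
 * appears to be queued on some CPU, that CPU's lock must be held;
 * if base reads as NULL, only the local CPU's lock is taken.
 */
static struct timer_base *base_to_lock(const struct timer *t,
                                       struct timer_base *local)
{
    struct timer_base *old = t->base;

    assert(local != NULL);
    return old ? old : local;
}
```

If the NULL store becomes visible before the list_del() stores, the second branch is taken while the remover is still unlinking the entry under a different lock, which is the window the commit message describes.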
    
    Our race happened with run_timers(), which _DOES_ contain a proper smp_wmb() in the
    right spot in 2.6, but didn't in the "backport" we were fighting with.
    
    However, del_timer() doesn't have such a barrier, and thus is subject to this race in
    2.6 as well. This patch fixes it.
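
As a rough userspace illustration of the ordering the fix enforces, the sketch below unlinks the entry, issues a barrier, and only then clears the base pointer. It assumes C11 atomics, with atomic_thread_fence(memory_order_release) standing in for the kernel's smp_wmb(); the types and the names timer_attach() and timer_detach() are hypothetical, and locking is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node {
    struct list_node *prev, *next;
};

struct timer_base {
    struct list_node head;
};

struct timer {
    struct list_node entry;
    _Atomic(struct timer_base *) base;
};

static void list_init(struct list_node *h)
{
    h->prev = h->next = h;
}

static void list_add(struct list_node *n, struct list_node *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void list_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Queue the timer on a base (caller would hold that base's lock). */
static void timer_attach(struct timer *t, struct timer_base *b)
{
    list_add(&t->entry, &b->head);
    atomic_store_explicit(&t->base, b, memory_order_release);
}

/* Remove the timer: unlink first, barrier, then publish base = NULL. */
static void timer_detach(struct timer *t)
{
    assert(atomic_load_explicit(&t->base, memory_order_relaxed) != NULL);

    list_del(&t->entry);
    /* Stand-in for smp_wmb(): the list stores above must not be
     * reordered after the NULL store below. In this C11 model a
     * reader would pair it with an acquire load of t->base. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&t->base, NULL, memory_order_relaxed);
}
```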
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>
timer.c 42.5 KB