• john stultz's avatar
    Fix time() inconsistencies caused by intermediate xtime_cache values being read · 2a4027a4
    john stultz authored
    Currently with 2.6.32-longterm, its possible for time() to occasionally
    return values one second earlier then the previous time() call.
    
    This happens because update_xtime_cache() does:
    	xtime_cache = xtime;
    	timespec_add_ns(&xtime_cache, nsec);
    
    Its possible that xtime is 1sec,999msecs, and nsecs is 1ms, resulting in
    a xtime_cache that is 2sec,0ms.
    
    get_seconds() (which is used by sys_time()) does not take the
    xtime_lock, which is ok as the xtime.tv_sec value is a long and can be
    atomically read safely.
    
    The problem occurs the next call to update_xtime_cache() if xtime has
    not increased:
    	/* This sets xtime_cache back to 1sec, 999msec */
    	xtime_cache = xtime; 
    	/* get_seconds, calls here, and sees a 1second inconsistency */
    	timespec_add_ns(&xtime_cache, nsec);
    
    
    In order to resolve this, we could add locking to get_seconds(), but it
    needs to be lock free, as it is called from the machine check handler,
    opening a possible deadlock.
    
    So instead, this patch introduces an intermediate value for the
    calculations, so that we only assign xtime_cache once with the correct
    time, using ACCESS_ONCE to make sure the compiler doesn't optimize out
    any intermediate values.
    
    The xtime_cache manipulations were removed with 2.6.35, so that kernel
    and later do not need this change.
    
    In 2.6.33 and 2.6.34 the logarithmic accumulation should make it so
    xtime is updated each tick, so it is unlikely that two updates to
    xtime_cache could occur while the difference between xtime and
    xtime_cache crosses the second boundary. However, the paranoid might
    want to pull this into 2.6.33/34-longterm just to be sure.
    
    Thanks to Stephen for helping finally narrow down the root cause and
    many hours of help with testing and validation. Also thanks to Max,
    Andi, Eric and Paul for review of earlier attempts and helping clarify
    what is possible with regard to out of order execution.
    Acked-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: default avatarJohn Stultz <johnstul@us.ibm.com>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
    2a4027a4
timekeeping.c 22.9 KB