• Vik Heyndrickx's avatar
    sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems · 245d83e3
    Vik Heyndrickx authored
    commit 20878232 upstream.
    
    Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
    have no load at all.
    
    Uptime and /proc/loadavg on all systems with kernels released during the
    last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
    minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
    idle systems, but the way the kernel calculates this value prevents it
    from getting lower than the mentioned values.
    
    Likewise but not as obviously noticeable, a fully loaded system with no
    processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
    (multiplied by number of cores).
    
    Once the (old) load becomes 93 or higher, it mathematically can never
    get lower than 93, even when the active (load) remains 0 forever.
    This results in the strange 0.00, 0.01, 0.05 uptime values on idle
    systems.  Note: 93/2048 = 0.0454..., which rounds up to 0.05.
    
    It is not correct to add a 0.5 rounding (=1024/2048) here, since the
    result from this function is fed back into the next iteration again,
    so the result of that +0.5 rounding value then gets multiplied by
    (2048-2037), and then rounded again, so there is a virtual "ghost"
    load created, next to the old and active load terms.
    
    By changing the way the internally kept value is rounded, that internal
    value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
    increasing load, the internally kept load value is rounded up, when the
    load is decreasing, the load value is rounded down.
    
    The modified code was tested on nohz=off and nohz kernels. It was tested
    on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
    tested on single, dual, and octal cores system. It was tested on virtual
    hosts and bare hardware. No unwanted effects have been observed, and the
    problems that the patch intended to fix were indeed gone.
    Tested-by: default avatarDamien Wyart <damien.wyart@free.fr>
    Signed-off-by: default avatarVik Heyndrickx <vik.heyndrickx@veribox.net>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Doug Smythies <dsmythies@telus.net>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Fixes: 0f004f5a ("sched: Cure more NO_HZ load average woes")
    Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    [bwh: Backported to 3.2: adjust filename]
    Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
    245d83e3
sched.c 242 KB