    procfs: do not overflow get_{idle,iowait}_time for nohz · 2a95ea6c
    Michal Hocko authored
    Since commit a25cac51 ("proc: Consider NO_HZ when printing idle and
    iowait times") we are reporting idle/io_wait time also while a CPU is
    tickless.  We rely on get_{idle,iowait}_time functions to retrieve
    proper data.
    
    These functions, however, use usecs_to_cputime to translate a time in
    microseconds to cputime64_t.  This is just an alias for usecs_to_jiffies,
    which reduces the data type from u64 to unsigned int and also checks
    whether the given parameter overflows jiffies_to_usecs(MAX_JIFFY_OFFSET),
    returning MAX_JIFFY_OFFSET in that case.
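
The two effects described above (truncation to unsigned int, then the clamp) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: every `model_*` / `MODEL_*` name is hypothetical, HZ is fixed at 300, the clamp threshold is the 1431649781 value quoted in this message, MAX_JIFFY_OFFSET is taken as (LONG_MAX >> 1) - 1 on a 64-bit build, and the conversion itself is simplified to a plain multiply and divide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this model only: HZ=300, 64-bit long. */
#define MODEL_HZ               300u
#define MODEL_USEC_PER_SEC     1000000u
/* Threshold quoted in the commit message for CONFIG_HZ_300. */
#define MODEL_CLAMP_USECS      1431649781u
/* (LONG_MAX >> 1) - 1 with a 64-bit long. */
#define MODEL_MAX_JIFFY_OFFSET ((uint64_t)((INT64_MAX >> 1) - 1))

/* Simplified stand-in for usecs_to_jiffies: 32-bit input, clamped. */
static uint64_t model_usecs_to_jiffies(unsigned int u)
{
    if (u > MODEL_CLAMP_USECS)
        return MODEL_MAX_JIFFY_OFFSET;
    return ((uint64_t)u * MODEL_HZ) / MODEL_USEC_PER_SEC;
}

/* Old call pattern: a u64 microsecond counter is silently narrowed. */
static uint64_t model_idle_jiffies_old(uint64_t idle_usecs)
{
    return model_usecs_to_jiffies((unsigned int)idle_usecs);
}

/* Walks through the three regimes of the model. */
static void model_selfcheck(void)
{
    /* 1s of idle time converts normally. */
    assert(model_idle_jiffies_old(1000000ULL) == 300);
    /* 2000s is above the threshold: the result is stuck at the clamp. */
    assert(model_idle_jiffies_old(2000000000ULL) == MODEL_MAX_JIFFY_OFFSET);
    /* Past 2^32 usecs the value wraps and looks like 1s again. */
    assert(model_idle_jiffies_old(4294967296ULL + 1000000ULL) == 300);
}
```

In this model, every input between the threshold and 2^32 usecs maps to the same output, which is why a reader of the counter would see no change at all during that window.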
    
    The point at which we overflow depends on CONFIG_HZ, but for
    CONFIG_HZ_300 in particular it is quite low (1431649781), so we get
    MAX_JIFFY_OFFSET for >3000s, until we overflow unsigned int.  For
    reference, CONFIG_HZ_100 has an overflow window of around 20s,
    CONFIG_HZ_250 ~8s and CONFIG_HZ_1000 ~2s.
    
    This resulted in a bug where people saw [h]top going mad, reporting 100%
    CPU usage even though there was basically no CPU load.  The reason was
    simply that /proc/stat stopped reporting idle/io_wait changes (and
    reported MAX_JIFFY_OFFSET instead), so the only changes happening were
    in user and system time.
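
Why a frozen idle counter shows up as 100% usage can be sketched as follows. This is an assumed, simplified version of how a monitoring tool might derive a busy percentage from two /proc/stat samples; it is not taken from top or htop, and the `cpu_sample` struct and `busy_percent` function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the per-CPU counters read from /proc/stat. */
struct cpu_sample {
    uint64_t user;
    uint64_t system;
    uint64_t idle;
    uint64_t iowait;
};

/* Busy share of the interval between two samples, in percent. */
static unsigned int busy_percent(const struct cpu_sample *prev,
                                 const struct cpu_sample *cur)
{
    assert(prev && cur);

    uint64_t busy = (cur->user - prev->user) + (cur->system - prev->system);
    uint64_t idle = (cur->idle - prev->idle) + (cur->iowait - prev->iowait);
    uint64_t total = busy + idle;

    if (total == 0)
        return 0; /* no ticks elapsed, nothing to report */
    return (unsigned int)(busy * 100 / total);
}
```

With a healthy idle counter, 3 busy ticks out of 300 give 1%. If idle and iowait stop changing, the same 3 busy ticks are the whole interval as far as the tool can tell, and the result is 100%.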
    
    Let's use nsecs_to_jiffies64 instead, which doesn't reduce the precision
    to a 32-bit type and is much more appropriate for cumulative time values
    (unlike usecs_to_jiffies, which is intended for timeout calculations).
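
For contrast, a 64-bit conversion in the spirit of nsecs_to_jiffies64 might look like the sketch below. It is a userspace model under assumptions, not the kernel implementation: the `model_*` / `MODEL_*` names are hypothetical, HZ is fixed at 300, and the division is split into whole seconds and a remainder purely so the intermediate product cannot overflow in this model.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this model only: HZ=300. */
#define MODEL_HZ           300ULL
#define MODEL_NSEC_PER_SEC 1000000000ULL

/* 64-bit in, 64-bit out: no narrowing and no clamp. */
static uint64_t model_nsecs_to_jiffies64(uint64_t nsecs)
{
    uint64_t secs = nsecs / MODEL_NSEC_PER_SEC;
    uint64_t rem  = nsecs % MODEL_NSEC_PER_SEC;

    /* rem < 1e9, so rem * MODEL_HZ stays far below 2^64. */
    return secs * MODEL_HZ + (rem * MODEL_HZ) / MODEL_NSEC_PER_SEC;
}

/* Hypothetical caller: scale a microsecond counter to nanoseconds first. */
static uint64_t model_idle_jiffies_new(uint64_t idle_usecs)
{
    assert(idle_usecs <= UINT64_MAX / 1000ULL);
    return model_nsecs_to_jiffies64(1000ULL * idle_usecs);
}
```

In this model 2000s of idle time yields 600000 jiffies, and a counter just past 2^32 usecs keeps growing (1288790 jiffies for 2^32 + 1000000 usecs) rather than being clamped or wrapping.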
    
    Signed-off-by: Michal Hocko <mhocko@suse.cz>
    Tested-by: Artem S. Tashkinov <t.artem@mailcity.com>
    Cc: Dave Jones <davej@redhat.com>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
stat.c 5.42 KB