    [PATCH] sched: add local load metrics · 1ec43096
    Andrew Morton authored
    From: Nick Piggin <piggin@cyberone.com.au>
    
    This patch removes the per-runqueue arrays of NR_CPUS entries.  Each time
    we check a remote CPU's load we also check its nr_running anyway, so
    introduce cpu_load, which is the load of the local runqueue and is kept
    updated in the timer tick.  Put cpu_load and nr_running in the same
    cacheline.
    
    This has the additional benefit of keeping cpu_load consistent across all
    CPUs and more up to date.  It is sampled better too, being updated once per
    timer tick.
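
The idea described above can be sketched in a few lines of C.  This is an illustration only, not the code from the patch: the names rq_sketch, LOAD_SHIFT, tick_update_load and remote_load are hypothetical, and both the halving average and the "take the lower value" read are assumptions made for the example.  It shows a load value stored next to nr_running, refreshed once per tick by the owning CPU, and read by other CPUs together with nr_running.

```c
#include <assert.h>

/* Hypothetical fixed-point scale: one runnable task counts as 1 << LOAD_SHIFT. */
#define LOAD_SHIFT 7

/* Hypothetical runqueue fragment: the two fields sit next to each other so a
 * remote reader is likely to find both in one cacheline. */
struct rq_sketch {
    unsigned long nr_running; /* tasks currently runnable on this CPU */
    unsigned long cpu_load;   /* smoothed local load, updated every tick */
};

/* Called once per timer tick on the owning CPU.  Assumed smoothing: average
 * the previous value with the instantaneous load. */
static void tick_update_load(struct rq_sketch *rq)
{
    assert(rq);
    unsigned long now = rq->nr_running << LOAD_SHIFT;
    rq->cpu_load = (rq->cpu_load + now) / 2;
}

/* What a remote CPU might compute: it reads cpu_load and nr_running together
 * and, as an assumed conservative policy, uses the smaller of the two. */
static unsigned long remote_load(const struct rq_sketch *rq)
{
    assert(rq);
    unsigned long now = rq->nr_running << LOAD_SHIFT;
    return rq->cpu_load < now ? rq->cpu_load : now;
}
```

Because each CPU writes only its own cpu_load, every other CPU sees the same value for it, rather than each keeping a private and possibly stale copy per remote CPU.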
    
    This shouldn't make much difference in scheduling behaviour, but all
    benchmarks are either as good or better on the 16-way NUMAQ: hackbench,
    reaim, volanomark are about the same, tbench and dbench are maybe a bit
    better.  kernbench is about one percent better.
    
    John reckons it isn't a big deal, but it does save 4K per CPU or 2MB total
    on his big systems, so I figure it must be a bit kinder on the caches.  I
    think it is just nicer in general anyway.
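
The memory figures above can be checked with simple arithmetic.  The sketch below assumes, purely for illustration, that the removed array held 1024 four-byte entries per runqueue (which gives 4K) and that a "big system" has 512 runqueues (2MB divided by 4K); neither number is stated in the message, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes used by one per-runqueue array with one entry per possible CPU. */
static size_t per_rq_array_bytes(size_t nr_cpus_max, size_t entry_size)
{
    /* Guard against overflow in the multiplication. */
    assert(entry_size == 0 || nr_cpus_max <= SIZE_MAX / entry_size);
    return nr_cpus_max * entry_size;
}

/* Total bytes saved when that array is dropped from every runqueue. */
static size_t total_saved_bytes(size_t runqueues, size_t per_rq_bytes)
{
    assert(per_rq_bytes == 0 || runqueues <= SIZE_MAX / per_rq_bytes);
    return runqueues * per_rq_bytes;
}
```

Under those assumptions, per_rq_array_bytes(1024, 4) is 4096 bytes and total_saved_bytes(512, 4096) is 2097152 bytes, matching the 4K and 2MB quoted.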
sched.c 89.5 KB