Commit db1b1fef authored by Jack Steiner, committed by Linus Torvalds

[PATCH] sched: reduce overhead of calc_load

Currently, count_active_tasks() calls both nr_running() &
nr_uninterruptible().  Each of these functions does a "for_each_cpu" & reads
values from the runqueue of each cpu.  Although this is not a lot of
instructions, each runqueue may be located on a different node.  Depending on
the architecture, a unique TLB entry may be required to access each
runqueue.

Since there may be more runqueues than cpu TLB entries, a scan of all
runqueues can thrash the TLB.  Each memory reference incurs a TLB miss &
refill.

In addition, the runqueue cacheline that contains nr_running &
nr_uninterruptible may be evicted from the cache between the two passes.
This causes unnecessary cache misses.

Combining nr_running() & nr_uninterruptible() into a single function,
nr_active(), substantially reduces the TLB & cache misses on large systems.
This should have no measurable effect on smaller systems.
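
To make the change in access pattern concrete, here is a minimal userspace sketch of the two shapes. It is not kernel code: `struct rq`, the `runqueues` array, `NR_CPUS`, the sample counter values and the `sum_*` helper names are all hypothetical stand-ins for the per-cpu runqueues and for nr_running(), nr_uninterruptible() and nr_active().

```c
#define NR_CPUS 4	/* hypothetical cpu count, for illustration only */

/* Stand-in for a per-cpu runqueue; only the two counters matter here. */
struct rq {
	unsigned long nr_running;
	unsigned long nr_uninterruptible;
};

/* Made-up sample values, one entry per cpu. */
struct rq runqueues[NR_CPUS] = {
	{ 3, 1 }, { 0, 2 }, { 5, 0 }, { 1, 1 },
};

/* Old shape, first walk: touches every runqueue once. */
unsigned long sum_running(void)
{
	unsigned long i, sum = 0;

	for (i = 0; i < NR_CPUS; i++)
		sum += runqueues[i].nr_running;
	return sum;
}

/* Old shape, second walk: touches every runqueue again. */
unsigned long sum_uninterruptible(void)
{
	unsigned long i, sum = 0;

	for (i = 0; i < NR_CPUS; i++)
		sum += runqueues[i].nr_uninterruptible;
	return sum;
}

/* New shape: one walk, both counters read during the same visit. */
unsigned long sum_active(void)
{
	unsigned long i, running = 0, uninterruptible = 0;

	for (i = 0; i < NR_CPUS; i++) {
		running += runqueues[i].nr_running;
		uninterruptible += runqueues[i].nr_uninterruptible;
	}
	return running + uninterruptible;
}
```

Both shapes return the same total. The difference is that the single walk visits each runqueue once instead of twice, which is where the TLB and cache savings described above would come from on a machine with many nodes.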

On a 128p IA64 system running a memory stress workload, the new function
reduced the overhead of calc_load() from 605 usec/call to 324 usec/call.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 3055adda
@@ -100,6 +100,7 @@ DECLARE_PER_CPU(unsigned long, process_counts);
 extern int nr_processes(void);
 extern unsigned long nr_running(void);
 extern unsigned long nr_uninterruptible(void);
+extern unsigned long nr_active(void);
 extern unsigned long nr_iowait(void);
 #include <linux/time.h>
......
@@ -1658,6 +1658,21 @@ unsigned long nr_iowait(void)
 	return sum;
 }
+unsigned long nr_active(void)
+{
+	unsigned long i, running = 0, uninterruptible = 0;
+	for_each_online_cpu(i) {
+		running += cpu_rq(i)->nr_running;
+		uninterruptible += cpu_rq(i)->nr_uninterruptible;
+	}
+	if (unlikely((long)uninterruptible < 0))
+		uninterruptible = 0;
+	return running + uninterruptible;
+}
 #ifdef CONFIG_SMP
 /*
......
@@ -825,7 +825,7 @@ void update_process_times(int user_tick)
  */
 static unsigned long count_active_tasks(void)
 {
-	return (nr_running() + nr_uninterruptible()) * FIXED_1;
+	return nr_active() * FIXED_1;
 }
 /*
......
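
Two details of the patch are easy to miss: the clamp on the uninterruptible sum, and the multiplication by FIXED_1. The sketch below isolates both in plain userspace C. It assumes the sum can transiently read as negative because the per-cpu counters are read without locking (a task may plausibly go to sleep on one cpu and be woken on another). It also assumes FIXED_1 is `1 << FSHIFT` with FSHIFT equal to 11, as I recall from the kernel headers of that era; these definitions do not appear in the diff. The helper names `clamp_and_sum` and `scale_active` are hypothetical.

```c
/* Assumed fixed-point definitions; restated here, not taken from the diff. */
#define FSHIFT	11			/* bits of fractional precision */
#define FIXED_1	(1UL << FSHIFT)		/* 1.0 in fixed point */

/*
 * Mirrors the tail of nr_active(): if the unsigned sum of the
 * uninterruptible counters reads as negative when viewed as a
 * signed long, treat it as zero before adding the running count.
 */
unsigned long clamp_and_sum(unsigned long running,
			    unsigned long uninterruptible)
{
	if ((long)uninterruptible < 0)
		uninterruptible = 0;
	return running + uninterruptible;
}

/* Mirrors count_active_tasks(): scale the task count to fixed point. */
unsigned long scale_active(unsigned long active)
{
	return active * FIXED_1;
}
```

Without the clamp, a transiently "negative" sum would show up as a huge unsigned value and inflate the result handed to calc_load(). With it, the worst case is a slight undercount for one sample.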