    [PATCH] scheduler infrastructure · f221af36
    Andrew Morton authored
    From: Ingo Molnar <mingo@elte.hu>
    
    the attached scheduler patch (against test2-mm2) adds the scheduling
    infrastructure items discussed on lkml. I got good feedback - and while I
    don't expect it to solve all problems, it does solve a number of bad ones:
    
     - test_starve.c code from David Mosberger
    
     - thud.c making the system unusable due to unfairness
    
     - fair/accurate sleep average based on a finegrained clock
    
     - audio skipping way too easily
    
    other changes in sched-test2-mm2-A3:
    
     - ia64 sched_clock() code, from David Mosberger.
    
     - migration thread startup without relying on implicit scheduling
       behavior. While the current 2.6 code is correct (due to the cpu-up code
       adding CPUs one by one), it's also fragile - and this code cannot
       be carried over into the 2.4 backports. So adding this method would
       clean up the startup and would make it easier to have 2.4 backports.
    
    and here's the original changelog for the scheduler changes:
    
     - cycle accuracy (nanosec resolution) timekeeping within the scheduler.
       This fixes a number of audio artifacts (skipping) I've reproduced. I
       don't think we can get away without going cycle accuracy - reading the
       cycle counter adds some overhead, but it's acceptable. The first
       nanosec-accuracy patch was done by Mike Galbraith - this patch is
       different but similar in nature. I went further in also changing the
       sleep_avg to be of nanosec resolution.
    
     - more finegrained timeslices: there's now a timeslice 'sub unit' of 50
       usecs (TIMESLICE_GRANULARITY) - CPU hogs on the same priority level
       will roundrobin with this unit. This change is intended to make gaming
       latencies shorter.
    
     - include scheduling latency in sleep bonus calculation. This change
       extends the sleep-average calculation to the period of time a task
       spends on the runqueue without being scheduled, right after
       wakeup. Note that tasks that were preempted (i.e. not woken up) and are
       still on the runqueue do not get this benefit. This change closes one
       of the last holes in the dynamic priority estimation; it should result
       in interactive tasks getting more priority under heavy load. This
       change also fixes the test-starve.c testcase from David Mosberger.
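The sleep-bonus rule in the last item can be illustrated with a small user-space model. This is only a sketch, not the code from the patch: `struct task_model`, `MAX_SLEEP_AVG_NS`, and the `model_*` helpers are hypothetical names, and the decay of the sleep average while a task runs is left out. It shows that a task woken from sleep is credited both for the time it slept and for the time it then waited on the runqueue, while a preempted task earns nothing for waiting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on the accumulated sleep average (1 second, in ns). */
#define MAX_SLEEP_AVG_NS 1000000000ULL

/* Hypothetical per-task bookkeeping; all times are in nanoseconds. */
struct task_model {
    uint64_t sleep_avg_ns;  /* accumulated sleep credit, <= MAX_SLEEP_AVG_NS */
    uint64_t timestamp_ns;  /* time of the last state change */
    int woken_up;           /* 1 if enqueued by a wakeup, 0 if preempted */
};

/* Add delta_ns to the sleep average, saturating at the cap. */
static void credit_sleep(struct task_model *t, uint64_t delta_ns)
{
    uint64_t room = MAX_SLEEP_AVG_NS - t->sleep_avg_ns;

    t->sleep_avg_ns += delta_ns < room ? delta_ns : room;
}

/* Task wakes at now_ns: credit the time it slept and mark it as woken. */
static void model_wakeup(struct task_model *t, uint64_t now_ns)
{
    assert(now_ns >= t->timestamp_ns);
    credit_sleep(t, now_ns - t->timestamp_ns);
    t->timestamp_ns = now_ns;
    t->woken_up = 1;
}

/* Task is preempted at now_ns: it stays runnable but was not woken up. */
static void model_preempt(struct task_model *t, uint64_t now_ns)
{
    t->timestamp_ns = now_ns;
    t->woken_up = 0;
}

/* Task gets the CPU at now_ns: runqueue wait counts only after a wakeup. */
static void model_schedule_in(struct task_model *t, uint64_t now_ns)
{
    assert(now_ns >= t->timestamp_ns);
    if (t->woken_up)
        credit_sleep(t, now_ns - t->timestamp_ns);
    t->timestamp_ns = now_ns;
    t->woken_up = 0;
}
```

In this model, a task that sleeps for 5 msecs and then waits 2 msecs on the runqueue ends up with 7 msecs of credit; if it is later preempted and waits again, the credit stays unchanged.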
    
    
    The TSC-based scheduler clock is disabled on ia32 NUMA platforms.  (ie. 
    platforms that have unsynched TSC for sure.) Those platforms should provide
    the proper code to rely on the TSC in a global way.  (no such infrastructure
    exists at the moment - the monotonic TSC-based clock doesn't deal with TSC
    offsets either, as far as I can tell.)
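The fallback described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the sched_clock() from the patch: `struct clock_model` and `model_sched_clock()` are hypothetical names, and the inputs (cycle count, counter frequency, tick count, tick rate) are passed in explicitly instead of being read from hardware. When the cycle counter cannot be trusted, the clock degrades to tick resolution; otherwise cycles are converted to nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical clock inputs, supplied by the caller in this model. */
struct clock_model {
    int tsc_usable;       /* 0 on platforms with unsynchronized TSCs */
    uint64_t tsc_cycles;  /* current cycle counter reading */
    uint64_t cpu_khz;     /* cycle counter frequency in kHz */
    uint64_t jiffies;     /* timer ticks since boot */
    uint64_t hz;          /* timer ticks per second */
};

/* Return the scheduler timestamp in nanoseconds. */
static uint64_t model_sched_clock(const struct clock_model *c)
{
    if (!c->tsc_usable) {
        /* Tick-based fallback: resolution is one timer tick. */
        assert(c->hz > 0);
        return c->jiffies * (1000000000ULL / c->hz);
    }

    assert(c->cpu_khz > 0);
    /*
     * cycles / kHz gives milliseconds. Split into whole milliseconds and
     * a remainder so the multiplication by 1e6 does not overflow early.
     */
    return (c->tsc_cycles / c->cpu_khz) * 1000000ULL
         + (c->tsc_cycles % c->cpu_khz) * 1000000ULL / c->cpu_khz;
}
```

With a 2 GHz counter (`cpu_khz` of 2000000), 3000 cycles map to 1500 nsecs; with the counter marked unusable and `hz` set to 1000, the same call returns whole milliseconds only.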
sched.c 66.4 KB