1. 02 Aug, 2009 5 commits
    • Peter Zijlstra's avatar
      sched: Fix cgroup smp fairness · a5004278
      Peter Zijlstra authored
      Commit ec4e0e2f ("fix
      inconsistency when redistribute per-cpu tg->cfs_rq shares")
      broke cgroup smp fairness.
      
      In order to avoid starvation of newly placed tasks, we never
      quite set the share of an empty cpu group-task to 0, but
      instead we set it as if there's a single NICE-0 task present.
      
      If however we actually set this in cfs_rq[cpu]->shares, that
      means the total shares for that group will be slightly inflated
      every time we balance, causing the observed unfairness.
      
      Fix this by setting cfs_rq[cpu]->shares to 0 but actually
      setting the effective weight of the related se to the inflated
      number.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1248696557.6987.1615.camel@twins>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      a5004278
    • Ingo Molnar's avatar
      Merge branch 'sched/urgent' into sched/core · 8e9ed8b0
      Ingo Molnar authored
      Merge reason: avoid upcoming patch conflict.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      8e9ed8b0
    • Gregory Haskins's avatar
      sched: Fix race in cpupri introduced by cpumask_var changes · 07903af1
      Gregory Haskins authored
      Background:
      
      Several race conditions in the scheduler have cropped up
      recently, which Steven and I have tracked down using ftrace.
      The most recent one turns out to be a race in how the scheduler
      determines a suitable migration target for RT tasks, introduced
      recently with commit:
      
          commit 68e74568
          Date:   Tue Nov 25 02:35:13 2008 +1030
      
              sched: convert struct cpupri_vec cpumask_var_t.
      
      The original design of cpupri allowed lockless readers to
      quickly determine a best-estimate target.  Races between the
      pri_active bitmap and the vec->mask were handled in the
      original code because we would detect and return "0" when this
      occured.  The design was predicated on the *effective*
      atomicity (*) of caching the result of cpus_and() between the
      cpus_allowed and the vec->mask.
      
      Commit 68e74568 changed the behavior such that vec->mask is
      accessed multiple times.  This introduces a subtle race, the
      result of which means we can have a result that returns "1",
      but with an empty bitmap.
      
      *) yes, we know cpus_and() is not a locked operator across the
         entire composite array, but it is implicitly atomic on a
         per-word basis which is all the design required to work.
      
      Implementation:
      
      Rather than forgoing the lockless design, or reverting to a
      stack-based cpumask_t, we simply check for when the race has
      been encountered and continue processing in the event that the
      race is hit.  This renders the removal race as if the priority
      bit had been atomically cleared as well, and allows the
      algorithm to execute correctly.
      Signed-off-by: default avatarGregory Haskins <ghaskins@novell.com>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: Steven Rostedt <srostedt@redhat.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20090730145728.25226.92769.stgit@dev.haskins.net>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      07903af1
    • Peter Zijlstra's avatar
      sched: Fix latencytop and sleep profiling vs group scheduling · e414314c
      Peter Zijlstra authored
      The latencytop and sleep accounting code assumes that any
      scheduler entity represents a task, this is not so.
      
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      e414314c
    • Frederic Weisbecker's avatar
      sched: Fix cond_resched_lock() in !CONFIG_PREEMPT · 716a4234
      Frederic Weisbecker authored
      The might_sleep() test inside cond_resched_lock() assumes the
      spinlock is held and then preemption is disabled. This is true
      with CONFIG_PREEMPT but the preempt_count() doesn't change
      otherwise.
      
      Check by starting from the appropriate preempt offset depending
      on the config.
      Reported-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1248458723-12146-1-git-send-email-fweisbec@gmail.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      716a4234
  2. 01 Aug, 2009 1 commit
  3. 31 Jul, 2009 19 commits
  4. 30 Jul, 2009 15 commits