Commit 2f950354 authored by Peter Zijlstra, committed by Ingo Molnar

sched/fair: Fix fairness issue on migration

Pavan reported that in the presence of very light tasks (or cgroups)
the placement of migrated tasks can cause severe fairness issues.

The problem is that enqueue_entity() places the task before it updates
time, thereby it can place the task far in the past (remember that
light tasks will shoot virtual time forward at a high speed, so in
relation to the pre-existing light task, we can land far in the past).
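
To put a rough number on "shoot virtual time forward", here is a minimal sketch. It assumes the usual CFS scaling, in which virtual runtime advances by the executed time multiplied by NICE_0_LOAD / weight. The helper name vruntime_delta() is hypothetical, and the weights 1024 (nice 0) and 15 (nice 19) are assumed from the kernel's nice-to-weight table; neither appears in this commit.

```c
#include <stdint.h>

/* Assumption: load weight of a nice-0 task, used as the reference weight. */
#define NICE_0_LOAD 1024ULL

/*
 * Hypothetical helper (not a kernel function): virtual time charged for
 * delta_exec_ns of real execution by an entity with the given load weight.
 * The lighter the entity, the faster its virtual time advances.
 */
static uint64_t vruntime_delta(uint64_t delta_exec_ns, uint64_t weight)
{
	return delta_exec_ns * NICE_0_LOAD / weight;
}
```

Under these assumptions, 1 ms of execution (1000000 ns) costs 1000000 ns of virtual time at weight 1024 but 68266666 ns at weight 15, so a queue holding only a very light task sees its virtual clock run roughly 68 times faster than wall time.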

This is done because update_curr() needs the current task, and we
might be placing the current task.

The obvious solution is to differentiate between the current and any
other task; placing the current before we update time, and placing any
other task after, such that !curr tasks end up at the current moment
in time, and not in the past.
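
The ordering argument can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel code: struct toy_rq, struct toy_se and all toy_* functions are hypothetical, and toy_update_curr() stands in for update_curr() by folding not-yet-accounted virtual time into min_vruntime. It does not model the charge that update_curr() applies to the current task's own vruntime.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy run queue (hypothetical): only what is needed to show the ordering. */
struct toy_rq {
	uint64_t min_vruntime;	/* virtual clock as of the last update */
	uint64_t pending;	/* virtual time not yet folded into the clock */
};

/* Toy entity (hypothetical): vruntime is relative while it migrates. */
struct toy_se {
	uint64_t vruntime;
};

/* Stand-in for update_curr(): bring the queue's virtual clock up to date. */
static void toy_update_curr(struct toy_rq *rq)
{
	rq->min_vruntime += rq->pending;
	rq->pending = 0;
}

/* Old order: renormalise against a stale clock, then update time. */
static void toy_enqueue_old(struct toy_rq *rq, struct toy_se *se)
{
	se->vruntime += rq->min_vruntime;
	toy_update_curr(rq);
}

/* New order: current renormalises before the update, any other task after. */
static void toy_enqueue_new(struct toy_rq *rq, struct toy_se *se, bool is_curr)
{
	if (is_curr)
		se->vruntime += rq->min_vruntime;
	toy_update_curr(rq);
	if (!is_curr)
		se->vruntime += rq->min_vruntime;
}

/* How far behind the queue's virtual clock the entity ended up. */
static uint64_t toy_lag(const struct toy_rq *rq, const struct toy_se *se)
{
	return rq->min_vruntime - se->vruntime;
}
```

With min_vruntime = 1000, pending = 68 and an incoming entity whose relative vruntime is 0, toy_enqueue_old() leaves the entity 68 units in the past, while toy_enqueue_new() with is_curr = false leaves it at the current moment (lag 0).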

This commit re-introduces the previously reverted commit:

  3a47d512 ("sched/fair: Fix fairness issue on migration")

... which is now safe to do, after we've also fixed another
underlying bug first, in:

  sched/fair: Prepare to fix fairness problems on migration

and cleaned up other details in the migration code:

  sched/core: Kill sched_class::task_waking

Reported-by: Pavan Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent 59efa0ba
@@ -3288,17 +3288,27 @@ static inline void check_schedstat_required(void)
 static void
 enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
+	bool renorm = !(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED);
+	bool curr = cfs_rq->curr == se;
+
 	/*
-	 * Update the normalized vruntime before updating min_vruntime
-	 * through calling update_curr().
+	 * If we're the current task, we must renormalise before calling
+	 * update_curr().
 	 */
-	if (!(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED))
+	if (renorm && curr)
 		se->vruntime += cfs_rq->min_vruntime;
 
+	update_curr(cfs_rq);
+
 	/*
-	 * Update run-time statistics of the 'current'.
+	 * Otherwise, renormalise after, such that we're placed at the current
+	 * moment in time, instead of some random moment in the past. Being
+	 * placed in the past could significantly boost this task to the
+	 * fairness detriment of existing tasks.
 	 */
-	update_curr(cfs_rq);
+	if (renorm && !curr)
+		se->vruntime += cfs_rq->min_vruntime;
+
 	enqueue_entity_load_avg(cfs_rq, se);
 	account_entity_enqueue(cfs_rq, se);
 	update_cfs_shares(cfs_rq);
@@ -3314,7 +3324,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		update_stats_enqueue(cfs_rq, se);
 		check_spread(cfs_rq, se);
 	}
-	if (se != cfs_rq->curr)
+	if (!curr)
 		__enqueue_entity(cfs_rq, se);
 	se->on_rq = 1;
 