    sched/fair: Disable runtime_enabled on dying rq · 0e59bdae
    Kirill Tkhai authored
    We kill rq->rd on the CPU_DOWN_PREPARE stage:
    
    	cpuset_cpu_inactive -> cpuset_update_active_cpus -> partition_sched_domains ->
    	-> cpu_attach_domain -> rq_attach_root -> set_rq_offline
    
    This unthrottles all throttled cfs_rqs.
    
    But the cpu is still able to call schedule() till
    
    	take_cpu_down->__cpu_disable()
    
    is called from stop_machine.
    
    In this case the tasks from the just-unthrottled cfs_rqs are pickable
    in the standard scheduler way, and they are picked by the dying cpu.
    The cfs_rqs become throttled again, and migrate_tasks()
    in migration_call skips their tasks (one more unthrottle
    in migrate_tasks()->CPU_DYING does not happen, because rq->rd
    is already NULL).
    
    This patch sets runtime_enabled to zero. This guarantees that the
    runtime is not accounted, so the cfs_rqs won't exceed the given
    cfs_rq->runtime_remaining = 1, and their tasks will be pickable
    in migrate_tasks(). runtime_enabled is recalculated
    when the rq becomes online again.
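
    The effect described above can be illustrated with a small userspace
    model. This is only a sketch, not the kernel code: struct toy_cfs_rq
    and every toy_* function are hypothetical stand-ins, and only the
    field names runtime_enabled and runtime_remaining come from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cfs_rq; only the fields named in the text. */
struct toy_cfs_rq {
	bool runtime_enabled;   /* is bandwidth accounting active? */
	long runtime_remaining; /* budget left before throttling */
	bool throttled;
};

/* Charge delta_exec to the queue; throttle it once the budget is used up. */
static void toy_account_runtime(struct toy_cfs_rq *cfs_rq, long delta_exec)
{
	assert(delta_exec >= 0);
	if (!cfs_rq->runtime_enabled)
		return; /* accounting disabled: the budget is never consumed */
	cfs_rq->runtime_remaining -= delta_exec;
	if (cfs_rq->runtime_remaining <= 0)
		cfs_rq->throttled = true;
}

/*
 * Model of the unthrottle done when the rq goes offline.
 * disable_runtime == false models the old behaviour,
 * disable_runtime == true models the behaviour this patch describes.
 */
static void toy_unthrottle_offline(struct toy_cfs_rq *cfs_rq,
				   bool disable_runtime)
{
	cfs_rq->runtime_remaining = 1;
	if (disable_runtime)
		cfs_rq->runtime_enabled = false;
	cfs_rq->throttled = false;
}

/*
 * The dying cpu still runs a task for delta_exec after the unthrottle.
 * Returns true if the queue ends up throttled again, i.e. if the
 * migration step would skip its tasks.
 */
static bool toy_throttled_after_offline(bool disable_runtime, long delta_exec)
{
	struct toy_cfs_rq q = {
		.runtime_enabled = true,
		.runtime_remaining = 0,
		.throttled = true,
	};

	toy_unthrottle_offline(&q, disable_runtime);
	toy_account_runtime(&q, delta_exec);
	return q.throttled;
}
```

    In this model, without the change the queue is throttled again as soon
    as the dying cpu runs one of its tasks; with runtime_enabled cleared
    it stays unthrottled no matter how long the task runs.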
    
    Ben Segall also noticed that we always enable runtime in
    tg_set_cfs_bandwidth(). Actually, we should do that for online
    cpus only. To prevent races with unthrottle_offline_cfs_rqs()
    we take the get_online_cpus() lock.
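
    The online-only rule can be sketched in the same toy style. This is
    an illustration under assumptions, not the kernel implementation:
    struct toy_cpu and toy_set_bandwidth() are hypothetical names, and
    the hotplug lock that keeps the online set stable is deliberately
    not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu state: online flag plus the accounting switch. */
struct toy_cpu {
	bool online;
	bool runtime_enabled;
};

/*
 * Apply a new quota setting, touching online cpus only.
 * Offline cpus keep runtime_enabled as it is, so a cpu that is going
 * down does not get accounting switched back on behind its back.
 * Returns the number of cpus that were updated.
 */
static int toy_set_bandwidth(struct toy_cpu *cpus, size_t nr_cpus,
			     bool quota_enabled)
{
	int updated = 0;

	for (size_t i = 0; i < nr_cpus; i++) {
		if (!cpus[i].online)
			continue; /* skip offline cpus */
		cpus[i].runtime_enabled = quota_enabled;
		updated++;
	}
	return updated;
}
```

    In the real code the loop has to run with cpu hotplug held off, which
    is the role the text gives to get_online_cpus(); the sketch assumes
    the online flags do not change while it runs.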
    
    Reviewed-by: Ben Segall <bsegall@google.com>
    Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
    CC: Konstantin Khorenko <khorenko@parallels.com>
    CC: Paul Turner <pjt@google.com>
    CC: Mike Galbraith <umgwanakikbuti@gmail.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/1403684382.3462.42.camel@tkhai
    Signed-off-by: Ingo Molnar <mingo@kernel.org>