1. 29 Oct, 2019 9 commits
  2. 21 Oct, 2019 11 commits
    • sched/fair: Rework find_idlest_group() · 57abff06
      Vincent Guittot authored
      The slow wake up path computes per sched_group statistics to select the
      idlest group, which is quite similar to what load_balance() does to
      select the busiest group. Rework find_idlest_group() to classify the
      sched_group and select the idlest one following the same steps as
      load_balance().
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-12-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Optimize find_idlest_group() · fc1273f4
      Vincent Guittot authored
      find_idlest_group() now reads CPU's load_avg in two different ways.
      
      Consolidate the function to read and use load_avg only once, and simplify
      the algorithm to only look for the group with the lowest load_avg.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-11-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Use load instead of runnable load in wakeup path · 11f10e54
      Vincent Guittot authored
      Runnable load was originally introduced to take into account the case where
      blocked load biases the wake up path, which may end up selecting an
      overloaded CPU with a large number of runnable tasks instead of an
      underutilized CPU with a huge blocked load.

      The wake up path now starts looking for idle CPUs before comparing
      runnable load, and it's worth aligning the wake up path with the
      load_balance() logic.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-10-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Use utilization to select misfit task · c63be7be
      Vincent Guittot authored
      Utilization is used to detect a misfit task, but load is then used to
      select the task on the CPU, which can lead to selecting a small task with
      a high weight instead of the task that triggered the misfit migration.

      When selecting the misfit task, check that the task can't fit the CPU's
      capacity instead of using the load.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Acked-by: Valentin Schneider <valentin.schneider@arm.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-9-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Spread out tasks evenly when not overloaded · 2ab4092f
      Vincent Guittot authored
      When there is only one CPU per group, using the idle CPUs to evenly spread
      tasks doesn't make sense and nr_running is a better metric.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-8-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Use load instead of runnable load in load_balance() · b0fb1eb4
      Vincent Guittot authored
      'runnable load' was originally introduced to take into account the case
      where blocked load biases the load balance decision, which was selecting
      underutilized groups with huge blocked load while other groups were
      overloaded.
      
      The load is now only used when groups are overloaded. In this case,
      it's worth being conservative and taking into account the sleeping
      tasks that might wake up on the CPU.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-7-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Use rq->nr_running when balancing load · 5e23e474
      Vincent Guittot authored
      CFS load_balance() only takes care of CFS tasks, whereas CPUs can also be
      used by other scheduling classes. Typically, a CFS task preempted by an RT
      or deadline task will not get a chance to be pulled by another CPU because
      load_balance() doesn't take tasks from other classes into account.
      Add the sum of nr_running to the statistics and use it to detect such
      situations.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-6-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Rework load_balance() · 0b0695f2
      Vincent Guittot authored
      The load_balance() algorithm contains some heuristics which have become
      meaningless since the rework of the scheduler's metrics like the
      introduction of PELT.
      
      Furthermore, load is an ill-suited metric for solving certain task
      placement imbalance scenarios.
      
      For instance, in the presence of idle CPUs, we should simply try to get at
      least one task per CPU, whereas the current load-based algorithm can actually
      leave idle CPUs alone simply because the load is somewhat balanced.
      
      The current algorithm ends up creating virtual and meaningless values like
      avg_load_per_task, or tweaking the state of a group to make it look
      overloaded when it is not, in order to try to migrate tasks.
      
      load_balance() should better qualify the imbalance of the group and clearly
      define what has to be moved to fix this imbalance.
      
      The type of sched_group has been extended to better reflect the type of
      imbalance. We now have:
      
      	group_has_spare
      	group_fully_busy
      	group_misfit_task
      	group_asym_packing
      	group_imbalanced
      	group_overloaded
      
      Based on the type of sched_group, load_balance() now sets what it wants to
      move in order to fix the imbalance. It can be some load as before, but also
      some utilization, a number of tasks or a specific type of task:
      
      	migrate_task
      	migrate_util
      	migrate_load
      	migrate_misfit
      
      This new load_balance() algorithm fixes several long-standing wrong task
      placement issues:

       - the one-task-per-CPU case on asymmetric systems
       - CFS tasks preempted by tasks of other scheduling classes
       - tasks not spread evenly across groups with spare capacity
      
      Also, the load balance decisions have been consolidated into the three
      functions below after removing the few bypasses and hacks of the current
      code:

       - update_sd_pick_busiest() selects the busiest sched_group.
       - find_busiest_group() checks whether there is an imbalance between the
         local and the busiest group.
       - calculate_imbalance() decides what has to be moved.
      
      Finally, the now unused field total_running of struct sd_lb_stats has been
      removed.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: riel@surriel.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-5-git-send-email-vincent.guittot@linaro.org
      [ Small readability and spelling updates. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Remove meaningless imbalance calculation · fcf0553d
      Vincent Guittot authored
      Clean up load_balance() and remove meaningless calculations and fields
      before adding the new algorithm.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Acked-by: Rik van Riel <riel@surriel.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-4-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Rename sg_lb_stats::sum_nr_running to sum_h_nr_running · a3498347
      Vincent Guittot authored
      Rename sum_nr_running to sum_h_nr_running because it effectively tracks
      cfs->h_nr_running so we can use sum_nr_running to track rq->nr_running
      when needed.
      
      There are no functional changes.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
      Acked-by: Rik van Riel <riel@surriel.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: srikar@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-3-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Clean up asym packing · 490ba971
      Vincent Guittot authored
      Clean up asym packing to follow the default load balance behavior:
      
      - classify the group by creating a group_asym_packing field.
      - calculate the imbalance in calculate_imbalance() instead of bypassing it.
      
      We no longer need to test the same conditions twice to detect asym packing,
      and the imbalance calculation is consolidated in calculate_imbalance().

      There are no functional changes.
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Acked-by: Rik van Riel <riel@surriel.com>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Morten.Rasmussen@arm.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: hdanton@sina.com
      Cc: parth@linux.ibm.com
      Cc: pauld@redhat.com
      Cc: quentin.perret@arm.com
      Cc: srikar@linux.vnet.ibm.com
      Cc: valentin.schneider@arm.com
      Link: https://lkml.kernel.org/r/1571405198-27570-2-git-send-email-vincent.guittot@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 17 Oct, 2019 1 commit
  4. 09 Oct, 2019 4 commits
    • sched/cputime: Spare a seqcount lock/unlock cycle on context switch · 8d495477
      Frederic Weisbecker authored
      On context switch we are locking the vtime seqcount of the scheduling-out
      task twice:
      
       * On vtime_task_switch_common(), when we flush the pending vtime through
         vtime_account_system()
      
       * On arch_vtime_task_switch() to reset the vtime state.
      
      This is pointless as these actions can be performed without the need
      to unlock/lock in the middle. The reason these steps are separated is to
      consolidate a very small amount of common code between
      CONFIG_VIRT_CPU_ACCOUNTING_GEN and CONFIG_VIRT_CPU_ACCOUNTING_NATIVE.
      
      Performance in this fast path is definitely a priority over artificial
      code factorization, so split the task switch code between GEN and
      NATIVE and share the parts that can run under a single seqcount
      locked block.
      
      As a side effect, vtime_account_idle() becomes included in the seqcount
      protection. This happens to be a welcome preparation in order to
      properly support kcpustat under vtime in the future and fetch
      CPUTIME_IDLE without race.
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
      Link: https://lkml.kernel.org/r/20191003161745.28464-3-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/cputime: Rename vtime_account_system() to vtime_account_kernel() · f83eeb1a
      Frederic Weisbecker authored
      vtime_account_system() decides if we need to account the time to the
      system (__vtime_account_system()) or to the guest (vtime_account_guest()).
      
      So this function is a misnomer as we are on a higher level than
      "system". All we know when we call that function is that we are
      accounting kernel cputime. Whether it belongs to guest or system time
      is a lower level detail.
      
      Rename this function to vtime_account_kernel(). This will clarify things
      and avoid too many underscored vtime_account_system() versions.
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
      Link: https://lkml.kernel.org/r/20191003161745.28464-2-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/vtime: Fix guest/system mis-accounting on task switch · 68e7a4d6
      Frederic Weisbecker authored
      vtime_account_system() assumes that the target task to account cputime
      to is always the current task. This is most often true indeed except on
      task switch where we call:
      
      	vtime_common_task_switch(prev)
      		vtime_account_system(prev)
      
      Here prev is the scheduling-out task where we account the cputime to. It
      doesn't match current that is already the scheduling-in task at this
      stage of the context switch.
      
      So we end up checking the wrong task flags to determine if we are
      accounting guest or system time to the previous task.
      
      As a result the wrong task is used to check if the target is running in
      guest mode. We may then spuriously account or leak either system or
      guest time on task switch.
      
      Fix this assumption and also make vtime_guest_enter/exit() use the
      task passed as a parameter, to avoid similar issues in the future.
      Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Fixes: 2a42eb95 ("sched/cputime: Accumulate vtime on top of nsec clocksource")
      Link: https://lkml.kernel.org/r/20190925214242.21873-1-frederic@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision · 4929a4e6
      Xuewei Zhang authored
      The quota/period ratio is used to ensure a child task group won't get
      more bandwidth than the parent task group, and is calculated as:
      
        normalized_cfs_quota() = [(quota_us << 20) / period_us]
      
      If the quota/period ratio is changed during this scaling due to
      precision loss, it will cause inconsistency between the parent and child
      task groups.
      
      See below example:
      
      A userspace container manager (kubelet) does three operations:
      
       1) Create a parent cgroup, set quota to 1,000us and period to 10,000us.
       2) Create a few children cgroups.
       3) Set quota to 1,000us and period to 10,000us on a child cgroup.
      
      These operations are expected to succeed. However, if the scaling of
      147/128 happens before step 3, quota and period of the parent cgroup
      will be changed:
      
        new_quota: 1148437ns,   1148us
       new_period: 11484375ns, 11484us
      
      And when step 3 comes in, the ratio of the child cgroup will be
      104857, which is larger than the parent cgroup's ratio (104821),
      so the operation fails.
      
      Scaling them by a factor of 2 will fix the problem.
      Tested-by: Phil Auld <pauld@redhat.com>
      Signed-off-by: Xuewei Zhang <xueweiz@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Phil Auld <pauld@redhat.com>
      Cc: Anton Blanchard <anton@ozlabs.org>
      Cc: Ben Segall <bsegall@google.com>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Fixes: 2e8e1922 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup")
      Link: https://lkml.kernel.org/r/20191004001243.140897-1-xueweiz@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  5. 01 Oct, 2019 1 commit
    • membarrier: Fix RCU locking bug caused by faulty merge · 73956fc0
      Peter Zijlstra authored
      The following commit:
      
        227a4aad ("sched/membarrier: Fix p->mm->membarrier_state racy load")
      
      got fat-fingered by me when merging it with other patches. It was meant to
      move the RCU section out of the for loop but ended up doing so only
      partially, leaving a superfluous rcu_read_lock() inside, causing havoc.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Kirill Tkhai <tkhai@yandex.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King - ARM Linux admin <linux@armlinux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-tip-commits@vger.kernel.org
      Fixes: 227a4aad ("sched/membarrier: Fix p->mm->membarrier_state racy load")
      Link: https://lkml.kernel.org/r/20191001085033.GP4519@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 30 Sep, 2019 14 commits
    • Linux 5.4-rc1 · 54ecb8f7
      Linus Torvalds authored
    • Merge tag 'for-5.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · bb48a591
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A bunch of fixes that accumulated in recent weeks, mostly material for
        stable.
      
        Summary:
      
         - fix for regression from 5.3 that prevented using balance convert
           with a single profile
      
         - qgroup fixes: rescan race, accounting leak with multiple writers,
           potential leak after io failure recovery
      
         - fix for use after free in relocation (reported by KASAN)
      
         - other error handling fixups"
      
      * tag 'for-5.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: qgroup: Fix reserved data space leak if we have multiple reserve calls
        btrfs: qgroup: Fix the wrong target io_tree when freeing reserved data space
        btrfs: Fix a regression which we can't convert to SINGLE profile
        btrfs: relocation: fix use-after-free on dead relocation roots
        Btrfs: fix race setting up and completing qgroup rescan workers
        Btrfs: fix missing error return if writeback for extent buffer never started
        btrfs: adjust dirty_metadata_bytes after writeback failure of extent buffer
        Btrfs: fix selftests failure due to uninitialized i_mode in test inodes
    • Merge tag 'csky-for-linus-5.4-rc1' of git://github.com/c-sky/csky-linux · 80b29b6b
      Linus Torvalds authored
      Pull csky updates from Guo Ren:
       "This round of csky subsystem just some fixups:
      
         - Fix mb() synchronization problem
      
         - Fix dma_alloc_coherent with PAGE_SO attribute
      
         - Fix cache_op failed when cross memory ZONEs
      
         - Optimize arch_sync_dma_for_cpu/device with dma_inv_range
      
         - Fix ioremap function losing
      
         - Fix arch_get_unmapped_area() implementation
      
         - Fix defer cache flush for 610
      
         - Support kernel non-aligned access
      
         - Fix 610 vipt cache flush mechanism
      
         - Fix add zero_fp fixup perf backtrace panic
      
         - Move static keyword to the front of declaration
      
         - Fix csky_pmu.max_period assignment
      
         - Use generic free_initrd_mem()
      
         - entry: Remove unneeded need_resched() loop"
      
      * tag 'csky-for-linus-5.4-rc1' of git://github.com/c-sky/csky-linux:
        csky: Move static keyword to the front of declaration
        csky: entry: Remove unneeded need_resched() loop
        csky: Fixup csky_pmu.max_period assignment
        csky: Fixup add zero_fp fixup perf backtrace panic
        csky: Use generic free_initrd_mem()
        csky: Fixup 610 vipt cache flush mechanism
        csky: Support kernel non-aligned access
        csky: Fixup defer cache flush for 610
        csky: Fixup arch_get_unmapped_area() implementation
        csky: Fixup ioremap function losing
        csky: Optimize arch_sync_dma_for_cpu/device with dma_inv_range
        csky/dma: Fixup cache_op failed when cross memory ZONEs
        csky: Fixup dma_alloc_coherent with PAGE_SO attribute
        csky: Fixup mb() synchronization problem
    • Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · cef0aa0c
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A few fixes that have trickled in through the merge window:
      
         - Video fixes for OMAP due to panel-dpi driver removal
      
         - Clock fixes for OMAP that broke no-idle quirks + nfsroot on DRA7
      
         - Fixing arch version on ASpeed ast2500
      
         - Two fixes for reset handling on ARM SCMI"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        ARM: aspeed: ast2500 is ARMv6K
        reset: reset-scmi: add missing handle initialisation
        firmware: arm_scmi: reset: fix reset_state assignment in scmi_domain_reset
        bus: ti-sysc: Remove unpaired sysc_clkdm_deny_idle()
        ARM: dts: logicpd-som-lv: Fix i2c2 and i2c3 Pin mux
        ARM: dts: am3517-evm: Fix missing video
        ARM: dts: logicpd-torpedo-baseboard: Fix missing video
        ARM: omap2plus_defconfig: Fix missing video
        bus: ti-sysc: Fix handling of invalid clocks
        bus: ti-sysc: Fix clock handling for no-idle quirks
    • Merge tag 'trace-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · cf4f493b
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "A few more tracing fixes:
      
         - Fix a buffer overflow by checking nr_args correctly in probes
      
         - Fix a warning that is reported by clang
      
         - Fix a possible memory leak in error path of filter processing
      
         - Fix the selftest that checks for failures, but wasn't failing
      
         - Minor clean up on call site output of a memory trace event"
      
      * tag 'trace-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        selftests/ftrace: Fix same probe error test
        mm, tracing: Print symbol name for call_site in trace events
        tracing: Have error path in predicate_parse() free its allocated memory
        tracing: Fix clang -Wint-in-bool-context warnings in IF_ASSIGN macro
        tracing/probe: Fix to check the difference of nr_args before adding probe
      cf4f493b
    • Linus Torvalds's avatar
      Merge tag 'mmc-v5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · c710364f
      Linus Torvalds authored
      Pull more MMC updates from Ulf Hansson:
       "A couple more updates/fixes for MMC:
      
         - sdhci-pci: Add Genesys Logic GL975x support
      
         - sdhci-tegra: Recover loss in throughput for DMA
      
         - sdhci-of-esdhc: Fix DMA bug"
      
      * tag 'mmc-v5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: host: sdhci-pci: Add Genesys Logic GL975x support
        mmc: tegra: Implement ->set_dma_mask()
        mmc: sdhci: Let drivers define their DMA mask
        mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence
        mmc: sdhci: improve ADMA error reporting
      c710364f
    • Krzysztof Wilczynski's avatar
      csky: Move static keyword to the front of declaration · 9af032a3
      Krzysztof Wilczynski authored
Move the static keyword to the front of the declaration of
csky_pmu_of_device_ids, and resolve the following compiler
warning that can be seen when building with warnings
enabled (W=1):
      
      arch/csky/kernel/perf_event.c:1340:1: warning:
        ‘static’ is not at beginning of declaration [-Wold-style-declaration]
      Signed-off-by: default avatarKrzysztof Wilczynski <kw@linux.com>
      Signed-off-by: default avatarGuo Ren <guoren@kernel.org>
      9af032a3
    • Valentin Schneider's avatar
      csky: entry: Remove unneeded need_resched() loop · a2139d3b
      Valentin Schneider authored
      Since the enabling and disabling of IRQs within preempt_schedule_irq()
      is contained in a need_resched() loop, we don't need the outer arch
      code loop.
      Signed-off-by: default avatarValentin Schneider <valentin.schneider@arm.com>
      Signed-off-by: default avatarGuo Ren <guoren@kernel.org>
      a2139d3b
    • Linus Torvalds's avatar
      Merge tag 'char-misc-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 97f9a3c4
      Linus Torvalds authored
      Pull Documentation/process update from Greg KH:
       "Here are two small Documentation/process/embargoed-hardware-issues.rst
        file updates that missed my previous char/misc pull request.
      
        The first one adds an Intel representative for the process, and the
        second one cleans up the text a bit more when it comes to how the
        disclosure rules work, as it was a bit confusing to some companies"
      
      * tag 'char-misc-5.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        Documentation/process: Clarify disclosure rules
        Documentation/process: Volunteer as the ambassador for Intel
      97f9a3c4
    • Linus Torvalds's avatar
      Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 1eb80d6f
      Linus Torvalds authored
      Pull more vfs updates from Al Viro:
       "A couple of misc patches"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        afs dynroot: switch to simple_dir_operations
        fs/handle.c - fix up kerneldoc
      1eb80d6f
    • Linus Torvalds's avatar
      Merge tag '5.4-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 7edee522
      Linus Torvalds authored
      Pull more cifs updates from Steve French:
       "Fixes from the recent SMB3 Test events and Storage Developer
        Conference (held the last two weeks).
      
        Here are nine smb3 patches including an important patch for debugging
        traces with wireshark, with three patches marked for stable.
      
         Additional fixes from last week to better handle some newly discovered
         reparse points, a fix to the create/mkdir path for setting the mode
         more atomically (in the SMB3 Create security descriptor context), and one
         for path name processing are still being tested, so they are not
         included here"
      
      * tag '5.4-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        CIFS: Fix oplock handling for SMB 2.1+ protocols
        smb3: missing ACL related flags
        smb3: pass mode bits into create calls
        smb3: Add missing reparse tags
        CIFS: fix max ea value size
        fs/cifs/sess.c: Remove set but not used variable 'capabilities'
        fs/cifs/smb2pdu.c: Make SMB2_notify_init static
        smb3: fix leak in "open on server" perf counter
        smb3: allow decryption keys to be dumped by admin for debugging
      7edee522
    • Mao Han's avatar
      csky: Fixup csky_pmu.max_period assignment · 3a09d8e2
      Mao Han authored
       The csky_pmu.max_period has type u64, but on C-SKY BIT() can only
       return a 32-bit unsigned long. The initialization of max_period
       will therefore be incorrect when count_width is bigger than 32.
       
       Use BIT_ULL() instead.
      Signed-off-by: default avatarMao Han <han_mao@c-sky.com>
      Signed-off-by: default avatarGuo Ren <ren_guo@c-sky.com>
      3a09d8e2
    • Guo Ren's avatar
      csky: Fixup add zero_fp fixup perf backtrace panic · 48ede51f
      Guo Ren authored
       We need to set fp to zero so that backtrace knows where the frame
       chain ends. This patch fixes a perf callchain panic caused by
       backtrace not knowing the end of the fp chain.
      Signed-off-by: default avatarGuo Ren <ren_guo@c-sky.com>
      Reported-by: default avatarMao Han <han_mao@c-sky.com>
      48ede51f
    • Mike Rapoport's avatar
      csky: Use generic free_initrd_mem() · fdbdcddc
      Mike Rapoport authored
      The csky implementation of free_initrd_mem() is an open-coded version of
      free_reserved_area() without poisoning.
      
      Remove it and make csky use the generic version of free_initrd_mem().
      Signed-off-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: default avatarGuo Ren <guoren@kernel.org>
      fdbdcddc