  1. 12 Jan, 2023 5 commits
    • workqueue: Unbind kworkers before sending them to exit() · e02b9312
      It has been reported that isolated CPUs can suffer from interference due to
      per-CPU kworkers waking up just to die.
      
      A surge of workqueue activity during initial setup of a latency-sensitive
      application (refresh_vm_stats() being one of the culprits) can cause extra
      per-CPU kworkers to be spawned. Then, said latency-sensitive task can be
      running merrily on an isolated CPU only to be interrupted sometime later by
      a kworker marked for death (cf. IDLE_WORKER_TIMEOUT, 5 minutes after last
      kworker activity).
      
      Prevent this by affining kworkers to the wq_unbound_cpumask (which doesn't
      contain isolated CPUs, cf. HK_TYPE_WQ) before waking them up after marking
      them with WORKER_DIE.
      
      Changing the affinity does require a sleepable context; leverage the newly
      introduced pool->idle_cull_work to get that.
      
      Remove dying workers from pool->workers and keep track of them in a
      separate list. This intentionally prevents for_each_pool_worker() from
      iterating over workers that are marked for death.
      
      Rename destroy_worker() to set_worker_dying() to better reflect its
      effects and relationship with wake_dying_workers(). (A simplified
      userspace sketch of this sequence follows the entry.)
      Signed-off-by: Valentin Schneider <vschneid@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
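
      Editor's note: the sequence above (mark the worker with WORKER_DIE,
      rebind it from a sleepable context, then wake it so it exits on a
      housekeeping CPU) can be illustrated with a small userspace analogue.
      This is a hedged sketch, not the kernel code: the flag, condvar, and
      CPU number are invented, and pthread_setaffinity_np() merely stands in
      for the kernel's affinity change to wq_unbound_cpumask.

        /* Build with: gcc -pthread unbind_then_wake.c */
        #define _GNU_SOURCE
        #include <pthread.h>
        #include <sched.h>
        #include <stdatomic.h>

        static atomic_int dying;                 /* stands in for WORKER_DIE */
        static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
        static pthread_cond_t  cv  = PTHREAD_COND_INITIALIZER;

        static void *worker(void *arg)
        {
            pthread_mutex_lock(&mtx);
            while (!atomic_load(&dying))         /* "idle" until told to die */
                pthread_cond_wait(&cv, &mtx);
            pthread_mutex_unlock(&mtx);
            return NULL;                         /* exits on a housekeeping CPU */
        }

        int main(void)
        {
            pthread_t t;
            cpu_set_t housekeeping;

            pthread_create(&t, NULL, worker, NULL);

            /* 1. Mark the worker as dying. */
            atomic_store(&dying, 1);

            /* 2. Sleepable context: rebind it to housekeeping CPU 0
             *    (the kernel uses wq_unbound_cpumask, cf. HK_TYPE_WQ). */
            CPU_ZERO(&housekeeping);
            CPU_SET(0, &housekeeping);
            pthread_setaffinity_np(t, sizeof(housekeeping), &housekeeping);

            /* 3. Only now wake it up. */
            pthread_mutex_lock(&mtx);
            pthread_cond_broadcast(&cv);
            pthread_mutex_unlock(&mtx);

            pthread_join(t, NULL);
            return 0;
        }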
    • workqueue: Don't hold any lock while rcuwait'ing for !POOL_MANAGER_ACTIVE · 9ab03be4
      put_unbound_pool() currently passes wq_manager_inactive() as the exit
      condition to rcuwait_wait_event(), which grabs pool->lock to check for
      
        pool->flags & POOL_MANAGER_ACTIVE
      
      A later patch will require destroy_worker() to be invoked with
      wq_pool_attach_mutex held, which needs to be acquired before
      pool->lock. A mutex cannot be acquired within rcuwait_wait_event(), as
      it could clobber the task state set by rcuwait_wait_event().
      
      Instead, restructure the waiting logic to acquire any necessary lock
      outside of rcuwait_wait_event() (see the sketch after this entry).
      
      Since further work cannot be inserted into unbound pwqs that have reached
      ->refcnt==0, this is bound to make forward progress as eventually the
      worklist will be drained and need_more_worker(pool) will remain false,
      preventing any worker from stealing the manager position from us.
      Suggested-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Valentin Schneider <vschneid@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
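
      Editor's note: a rough userspace sketch of the shape of this fix. The
      point is that the wait predicate takes no lock: the flag is published
      by the other side and read locklessly by the waiter. The names and the
      yield loop are invented; the kernel uses rcuwait and pool->lock rather
      than C11 atomics and sched_yield().

        /* Build with: gcc -pthread wait_without_lock.c */
        #include <pthread.h>
        #include <sched.h>
        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdio.h>

        static atomic_bool manager_active = true; /* ~POOL_MANAGER_ACTIVE */

        /* Waiter: the predicate is a lockless load, so no mutex can clobber
         * the waiter's sleep state (the hazard described above for
         * rcuwait_wait_event()). */
        static void *waiter(void *arg)
        {
            while (atomic_load(&manager_active))
                sched_yield();
            puts("manager gone; safe to tear the pool down");
            return NULL;
        }

        /* Manager: does its locked work, then publishes the flag. */
        static void *manager(void *arg)
        {
            /* ... work that would be done under pool->lock ... */
            atomic_store(&manager_active, false);
            return NULL;
        }

        int main(void)
        {
            pthread_t w, m;
            pthread_create(&w, waiter, NULL);
            pthread_create(&m, manager, NULL);
            pthread_join(m, NULL);
            pthread_join(w, NULL);
            return 0;
        }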
    • workqueue: Convert the idle_timer to a timer + work_struct · 3f959aa3
      A later patch will require a sleepable context in the idle worker timeout
      function. Converting worker_pool.idle_timer to a delayed_work gives us just
      that; however, this would imply turning all idle_timer expiries into
      scheduler events (waking up a worker to handle the dwork).
      
      Instead, implement a "custom dwork" where the timer callback does some
      extra checks before queueing the associated work (see the sketch after
      this entry).
      
      No change in functionality intended.
      Signed-off-by: Valentin Schneider <vschneid@redhat.com>
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
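
      Editor's note: the "custom dwork" pattern (a cheap timer callback that
      queues the heavyweight work only when a check passes) can be sketched
      in userspace as below. The names too_many_idle and cull_work are
      invented stand-ins for the kernel's idle-worker check and
      idle_cull_work; this is illustrative, not the kernel implementation.

        /* Build with: gcc -pthread custom_dwork.c (add -lrt on older glibc) */
        #include <pthread.h>
        #include <semaphore.h>
        #include <signal.h>
        #include <stdatomic.h>
        #include <stdio.h>
        #include <time.h>

        static atomic_int too_many_idle;   /* condition the timer checks  */
        static sem_t cull_work;            /* stand-in for idle_cull_work */

        /* Timer callback: cheap check only; queues the work when needed,
         * so most expiries wake nobody up. */
        static void idle_timer_fn(union sigval sv)
        {
            if (atomic_load(&too_many_idle))
                sem_post(&cull_work);      /* the "queue_work()" step */
        }

        static void *cull_worker(void *arg)
        {
            sem_wait(&cull_work);          /* sleepable context */
            puts("culling idle workers (may sleep here)");
            return NULL;
        }

        int main(void)
        {
            pthread_t worker;
            timer_t timer;
            struct sigevent sev = {
                .sigev_notify = SIGEV_THREAD,
                .sigev_notify_function = idle_timer_fn,
            };
            struct itimerspec oneshot = { .it_value.tv_sec = 1 };

            sem_init(&cull_work, 0, 0);
            pthread_create(&worker, NULL, cull_worker, NULL);

            atomic_store(&too_many_idle, 1);   /* pretend idles piled up */
            timer_create(CLOCK_MONOTONIC, &sev, &timer);
            timer_settime(timer, 0, &oneshot, NULL);

            pthread_join(worker, NULL);
            return 0;
        }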
    • workqueue: Factorize unbind/rebind_workers() logic · 793777bc
      Later patches will reuse this code, so move it into reusable functions.
      Signed-off-by: Valentin Schneider <vschneid@redhat.com>
      Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
    • workqueue: Protects wq_unbound_cpumask with wq_pool_attach_mutex · 99c621ef
      When unbind_workers() reads wq_unbound_cpumask to set the affinity of
      freshly-unbound kworkers, it only holds wq_pool_attach_mutex. This isn't
      sufficient as wq_unbound_cpumask is only protected by wq_pool_mutex.
      
      Also protect wq_unbound_cpumask with wq_pool_attach_mutex, removing the
      need for the temporary saved_cpumask (see the sketch after this entry).
      
      Fixes: 10a5a651 ("workqueue: Restrict kworker in the offline CPU pool running on housekeeping CPUs")
      Reported-by: Valentin Schneider <vschneid@redhat.com>
      Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
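
      Editor's note: the resulting locking rule, in a hypothetical userspace
      sketch with invented names: a writer of the shared mask holds both
      mutexes, so a reader holding either one of them sees a stable value.
      This shows the rule's shape only; it is not the workqueue code.

        #include <pthread.h>
        #include <stdio.h>

        static pthread_mutex_t pool_mutex   = PTHREAD_MUTEX_INITIALIZER;
        static pthread_mutex_t attach_mutex = PTHREAD_MUTEX_INITIALIZER;
        static unsigned long unbound_mask;  /* ~wq_unbound_cpumask */

        /* Writer: takes both locks, so either lock alone is enough for
         * readers such as unbind_workers(), which holds only
         * wq_pool_attach_mutex. */
        static void set_unbound_mask(unsigned long mask)
        {
            pthread_mutex_lock(&pool_mutex);
            pthread_mutex_lock(&attach_mutex);
            unbound_mask = mask;
            pthread_mutex_unlock(&attach_mutex);
            pthread_mutex_unlock(&pool_mutex);
        }

        /* Reader holding only the "attach" lock. */
        static unsigned long read_unbound_mask(void)
        {
            pthread_mutex_lock(&attach_mutex);
            unsigned long mask = unbound_mask;
            pthread_mutex_unlock(&attach_mutex);
            return mask;
        }

        int main(void)
        {
            set_unbound_mask(0x3);
            printf("mask=%#lx\n", read_unbound_mask());
            return 0;
        }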
  2. 07 Jan, 2023 1 commit
    • workqueue: Make show_pwq() use run-length encoding · c76feb0d
      The show_pwq() function dumps out a pool_workqueue structure's activity,
      including the pending work-queue handlers:
      
       Showing busy workqueues and worker pools:
       workqueue events: flags=0x0
         pwq 0: cpus=0 node=0 flags=0x1 nice=0 active=10/256 refcnt=11
           in-flight: 7:test_work_func, 64:test_work_func, 249:test_work_func
           pending: test_work_func, test_work_func, test_work_func1, test_work_func1, test_work_func1, test_work_func1, test_work_func1
      
      When large systems are facing certain types of hang conditions, it is not
      unusual for this "pending" list to contain runs of hundreds of identical
      function names.  This "wall of text" is difficult to read, and worse yet,
      it can be interleaved with other output such as stack traces.
      
      Therefore, make show_pwq() use run-length encoding so that the above
      printout instead looks like this:
      
       Showing busy workqueues and worker pools:
       workqueue events: flags=0x0
         pwq 0: cpus=0 node=0 flags=0x1 nice=0 active=10/256 refcnt=11
           in-flight: 7:test_work_func, 64:test_work_func, 249:test_work_func
           pending: 2*test_work_func, 5*test_work_func1
      
      When no comma would be printed, including the WORK_STRUCT_LINKED case,
      a new run is started unconditionally.
      
      This output is more readable, places less stress on the hardware,
      firmware, and software on the console-log path, and reduces interference
      with other output. (A runnable sketch of the encoding follows this entry.)
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
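
      Editor's note: a standalone, runnable illustration of run-length
      encoding applied to the sample "pending" list above. This is not the
      kernel's show_pwq() code; it only reproduces the "N*name" output
      format shown in the entry.

        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            const char *pending[] = {
                "test_work_func",  "test_work_func",
                "test_work_func1", "test_work_func1", "test_work_func1",
                "test_work_func1", "test_work_func1",
            };
            int n = sizeof(pending) / sizeof(pending[0]);

            printf("pending:");
            for (int i = 0; i < n; ) {
                int run = 1;
                while (i + run < n && !strcmp(pending[i], pending[i + run]))
                    run++;                 /* extend the current run */
                if (run > 1)
                    printf("%s %d*%s", i ? "," : "", run, pending[i]);
                else
                    printf("%s %s", i ? "," : "", pending[i]);
                i += run;
            }
            putchar('\n');  /* -> pending: 2*test_work_func, 5*test_work_func1 */
            return 0;
        }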