    workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup · 98516b60
    Tejun Heo authored Feb 03, 2016
    BugLink: http://bugs.launchpad.net/bugs/1553179
    
    commit d6e022f1 upstream.
    
    When looking up the pool_workqueue to use for an unbound workqueue,
    workqueue assumes that the target CPU is always bound to a valid NUMA
    node.  However, currently, when a CPU goes offline, the mapping is
    destroyed and cpu_to_node() returns NUMA_NO_NODE.
    
    This has always been broken but wasn't triggered often enough before
    874bbfe6 ("workqueue: make sure delayed work run in local cpu").
    After that commit, workqueue forcefully assigns the local CPU to
    delayed work items that have no explicit target CPU, to fix a different
    issue.  This widens the window in which a CPU can go offline while a
    delayed work item is pending, causing delayed work items to be dispatched
    with the target CPU set to an already offlined CPU.  The resulting
    NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
    NULL pool_workqueue and thus crash.
    
    While 874bbfe6 has been reverted for a different reason making the
    bug less visible again, it can still happen.  Fix it by mapping
    NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
    This is a temporary workaround.  The long term solution is keeping CPU
    -> NODE mapping stable across CPU off/online cycles which is being
    worked on.
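
The lookup described above can be sketched as a small userspace model. This is an illustration, not the kernel patch itself: the struct layouts are heavily simplified, `MAX_NODES` is a hypothetical constant, the field names `dfl_pwq` and `numa_pwq_tbl` are assumed rather than taken from this commit message, and the locking/RCU checks the real function performs are omitted.

```c
#include <stddef.h>

#define NUMA_NO_NODE (-1)  /* value returned for a CPU with no node mapping */
#define MAX_NODES    4     /* hypothetical table size, for this model only */

/* Simplified stand-ins for the kernel structures (assumed layout). */
struct pool_workqueue {
    int id;
};

struct workqueue_struct {
    struct pool_workqueue *dfl_pwq;                 /* default pool_workqueue */
    struct pool_workqueue *numa_pwq_tbl[MAX_NODES]; /* per-node pool_workqueues */
};

/*
 * Return the pool_workqueue to use for @node.  If the target CPU went
 * offline and its node mapping was destroyed, @node is NUMA_NO_NODE;
 * fall back to the default pool_workqueue instead of indexing the
 * per-node table with an invalid node.
 */
static struct pool_workqueue *
unbound_pwq_by_node(struct workqueue_struct *wq, int node)
{
    if (node == NUMA_NO_NODE)
        return wq->dfl_pwq;

    return wq->numa_pwq_tbl[node];
}
```

In this model, a lookup with `NUMA_NO_NODE` yields the default pool_workqueue, while a valid node still selects its per-node entry, so the caller never receives a pointer obtained from an out-of-range table index.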
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
    Cc: Tang Chen <tangchen@cn.fujitsu.com>
    Cc: Rafael J. Wysocki <rafael@kernel.org>
    Cc: Len Brown <len.brown@intel.com>
    Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
    Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com
    
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Tim Gardner <tim.gardner@canonical.com>