    cpuset: break kernfs active protection in cpuset_write_resmask() · 76bb5ab8
    Tejun Heo authored
    Writing to either the "cpuset.cpus" or the "cpuset.mems" file
    flushes cpuset_hotplug_work so that cpu or memory hotunplug doesn't
    end up migrating tasks off a cpuset after new resources are added
    to it.
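
    Before this patch, that flush ran directly in the write handler,
    with the kernfs active reference for the file still held.  A
    condensed view of the relevant lines (the surrounding setup and
    update logic is elided):

      static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
                                          char *buf, size_t nbytes, loff_t off)
      {
              struct cpuset *cs = css_cs(of_css(of));

              /* wait for pending hotplug processing; note this runs
               * while the kernfs active ref for this file is held */
              flush_work(&cpuset_hotplug_work);

              mutex_lock(&cpuset_mutex);
              /* ... validate and apply the new mask ... */
              mutex_unlock(&cpuset_mutex);
              return nbytes;
      }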
    
    As cpuset_hotplug_work calls into cgroup core via
    cgroup_transfer_tasks(), this flushing adds a dependency on cgroup
    core locking from cpuset_write_resmask().  This used to be okay
    because cgroup interface files were protected by a different mutex;
    however, 8353da1f ("cgroup: remove cgroup_tree_mutex") simplified
    the cgroup core locking and this dependency became a deadlock
    hazard - cgroup file removal, performed under the cgroup core lock,
    tries to drain an on-going file operation, which in turn is trying
    to flush cpuset_hotplug_work, which is blocked on the same cgroup
    core lock.
    
    The locking simplification was done because kernfs added a much
    easier way to deal with circular dependencies involving kernfs
    active protection.  Let's use the same strategy in cpuset and break
    active protection in cpuset_write_resmask().  While it isn't the
    prettiest, this is a very rare, likely unique, situation which also
    goes away on the unified hierarchy.
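
    Concretely, the flush is bracketed by a css reference and a
    kernfs_break_active_protection() / kernfs_unbreak_active_protection()
    pair.  A condensed sketch of the resulting function follows; the
    actual cpus/mems update logic and error handling are elided.

      static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
                                          char *buf, size_t nbytes, loff_t off)
      {
              struct cpuset *cs = css_cs(of_css(of));
              int retval = -ENODEV;

              /*
               * Pin @cs and drop kernfs active protection so that
               * flushing cpuset_hotplug_work can't deadlock against
               * cgroup file removal draining this very operation.
               * Losing the protection is okay: @cs's onlineness is
               * rechecked under cpuset_mutex below.
               */
              css_get(&cs->css);
              kernfs_break_active_protection(of->kn);
              flush_work(&cpuset_hotplug_work);

              mutex_lock(&cpuset_mutex);
              if (!is_cpuset_online(cs))
                      goto out_unlock;

              /* ... apply the new "cpus"/"mems" configuration ... */

      out_unlock:
              mutex_unlock(&cpuset_mutex);
              kernfs_unbreak_active_protection(of->kn);
              css_put(&cs->css);
              return retval ?: nbytes;
      }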
    
    The commands to trigger the deadlock warning without the patch and the
    lockdep output follow.
    
     localhost:/ # mount -t cgroup -o cpuset xxx /cpuset
     localhost:/ # mkdir /cpuset/tmp
     localhost:/ # echo 1 > /cpuset/tmp/cpuset.cpus
     localhost:/ # echo 0 > /cpuset/tmp/cpuset.mems
     localhost:/ # echo $$ > /cpuset/tmp/tasks
     localhost:/ # echo 0 > /sys/devices/system/cpu/cpu1/online
    
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      3.16.0-rc1-0.1-default+ #7 Not tainted
      -------------------------------------------------------
      kworker/1:0/32649 is trying to acquire lock:
       (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
    
      but task is already holding lock:
       (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
    
      which lock already depends on the new lock.
    
      the existing dependency chain (in reverse order) is:
    
      -> #2 (cpuset_hotplug_work){+.+...}:
      ...
      -> #1 (s_active#175){++++.+}:
      ...
      -> #0 (cgroup_mutex){+.+.+.}:
      ...
    
      other info that might help us debug this:
    
      Chain exists of:
        cgroup_mutex --> s_active#175 --> cpuset_hotplug_work
    
       Possible unsafe locking scenario:
    
    	 CPU0                    CPU1
    	 ----                    ----
        lock(cpuset_hotplug_work);
    				 lock(s_active#175);
    				 lock(cpuset_hotplug_work);
        lock(cgroup_mutex);
    
       *** DEADLOCK ***
    
      2 locks held by kworker/1:0/32649:
       #0:  ("events"){.+.+.+}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
       #1:  (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
    
      stack backtrace:
      CPU: 1 PID: 32649 Comm: kworker/1:0 Not tainted 3.16.0-rc1-0.1-default+ #7
     ...
      Call Trace:
       [<ffffffff815a5f78>] dump_stack+0x72/0x8a
       [<ffffffff810c263f>] print_circular_bug+0x10f/0x120
       [<ffffffff810c481e>] check_prev_add+0x43e/0x4b0
       [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
       [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
       [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
       [<ffffffff815aa13f>] mutex_lock_nested+0x6f/0x380
       [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
       [<ffffffff811129c0>] hotplug_update_tasks_insane+0x110/0x1d0
       [<ffffffff81112bbd>] cpuset_hotplug_update_tasks+0x13d/0x180
       [<ffffffff811148ec>] cpuset_hotplug_workfn+0x18c/0x630
       [<ffffffff810854d4>] process_one_work+0x254/0x520
       [<ffffffff810875dd>] worker_thread+0x13d/0x3d0
       [<ffffffff8108e0c8>] kthread+0xf8/0x100
       [<ffffffff815acaec>] ret_from_fork+0x7c/0xb0
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Li Zefan <lizefan@huawei.com>
    Tested-by: Li Zefan <lizefan@huawei.com>