• Tejun Heo's avatar
    cgroup: implement cgroup v2 thread support · 8cfd8147
    Tejun Heo authored
    This patch implements cgroup v2 thread support.  The goal of the
    thread mode is supporting hierarchical accounting and control at
    thread granularity while staying inside the resource domain model
    which allows coordination across different resource controllers and
    handling of anonymous resource consumptions.
    
    A cgroup is always created as a domain and can be made threaded by
    writing to the "cgroup.type" file.  When a cgroup becomes threaded, it
    becomes a member of a threaded subtree which is anchored at the
    closest ancestor which isn't threaded.
    
    The threads of the processes which are in a threaded subtree can be
    placed anywhere without being restricted by process granularity or
    no-internal-process constraint.  Note that the threads aren't allowed
    to escape to a different threaded subtree.  To be used inside a
    threaded subtree, a controller should explicitly support threaded mode
    and be able to handle internal competition in the way which is
    appropriate for the resource.
    
    The root of a threaded subtree, the nearest ancestor which isn't
    threaded, is called the threaded domain and serves as the resource
    domain for the whole subtree.  This is the last cgroup where domain
    controllers are operational and where all the domain-level resource
    consumptions in the subtree are accounted.  This allows threaded
    controllers to operate at thread granularity when requested while
    staying inside the scope of system-level resource distribution.
    
    As the root cgroup is exempt from the no-internal-process constraint,
    it can serve as both a threaded domain and a parent to normal cgroups,
    so, unlike non-root cgroups, the root cgroup can have both domain and
    threaded children.
    
    Internally, in a threaded subtree, each css_set has its ->dom_cset
    pointing to a matching css_set which belongs to the threaded domain.
    This ensures that thread root level cgroup_subsys_state for all
    threaded controllers are readily accessible for domain-level
    operations.
    
    This patch enables threaded mode for the pids and perf_events
    controllers.  Neither has to worry about domain-level resource
    consumptions and it's enough to simply set the flag.
    
    For more details on the interface and behavior of the thread mode,
    please refer to the section 2-2-2 in Documentation/cgroup-v2.txt added
    by this patch.
    
    v5: - Dropped silly no-op ->dom_cgrp init from cgroup_create().
          Spotted by Waiman.
        - Documentation updated as suggested by Waiman.
        - cgroup.type content slightly reformatted.
        - Mark the debug controller threaded.
    
    v4: - Updated to the general idea of marking specific cgroups
          domain/threaded as suggested by PeterZ.
    
    v3: - Dropped "join" and always make mixed children join the parent's
          threaded subtree.
    
    v2: - After discussions with Waiman, support for mixed thread mode is
          added.  This should address the issue that Peter pointed out
          where any nesting should be avoided for thread subtrees while
          coexisting with other domain cgroups.
        - Enabling / disabling thread mode now piggy backs on the existing
          control mask update mechanism.
        - Bug fixes and cleanup.
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Cc: Waiman Long <longman@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    8cfd8147
cgroup-v2.txt 68.9 KB