Merge branch 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo: "Documentation updates and the addition of cgroup_parse_float() which will be used by new controllers including blk-iocost" * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1: convert docs to ReST and rename to *.rst cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS cgroup: add cgroup_parse_float()

Merge branch 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo: "Documentation updates and the addition of cgroup_parse_float() which will be used by new controllers including blk-iocost" * 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: docs: cgroup-v1: convert docs to ReST and rename to *.rst cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS cgroup: add cgroup_parse_float()
92c1d652 · Linus Torvalds · df2a40f5 · 99c8b231 · 92c1d652 · 92c1d652
Commit 92c1d652 authored Jul 08, 2019 by Linus Torvalds
37 changed files
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -705,6 +705,12 @@ Conventions
  informational files on the root cgroup which end up showing global
  information available elsewhere shouldn't exist.

+- The default time unit is microseconds.  If a different unit is ever
+  used, an explicit unit suffix must be present.
+
+- A parts-per quantity should use a percentage decimal with at least
+  two digit fractional part - e.g. 13.40.
+
 - If a controller implements weight based resource distribution, its
  interface file should be named "weight" and have the range [1,
  10000] with 100 as the default.  The values are chosen to allow

--- a/Documentation/admin-guide/hw-vuln/l1tf.rst
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -241,7 +241,7 @@ Guest mitigation mechanisms
   For further information about confining guests to a single or to a group
   of cores consult the cpusets documentation:

-   https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt
+   https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.rst

 .. _interrupt_isolation:


--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4084,7 +4084,7 @@

 	relax_domain_level=
 			[KNL, SMP] Set scheduler's default relax_domain_level.
-			See Documentation/cgroup-v1/cpusets.txt.
+			See Documentation/cgroup-v1/cpusets.rst.

 	reserve=	[KNL,BUGS] Force kernel to ignore I/O ports or memory
 			Format: <base1>,<size1>[,<base2>,<size2>,...]
@@ -4594,7 +4594,7 @@
 	swapaccount=[0|1]
 			[KNL] Enable accounting of swap in memory resource
 			controller if no parameter or 1 is given or disable
-			it if 0 is given (See Documentation/cgroup-v1/memory.txt)
+			it if 0 is given (See Documentation/cgroup-v1/memory.rst)

 	swiotlb=	[ARM,IA-64,PPC,MIPS,X86]
 			Format: { <int> | force | noforce }

--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -15,7 +15,7 @@ document attempts to describe the concepts and APIs of the 2.6 memory policy
 support.

 Memory policies should not be confused with cpusets
-(``Documentation/cgroup-v1/cpusets.txt``)
+(``Documentation/cgroup-v1/cpusets.rst``)
 which is an administrative mechanism for restricting the nodes from which
 memory may be allocated by a set of processes. Memory policies are a
 programming interface that a NUMA-aware application can take advantage of.  When

--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -539,7 +539,7 @@ As for cgroups-v1 (blkio controller), the exact set of stat files
 created, and kept up-to-date by bfq, depends on whether
 CONFIG_DEBUG_BLK_CGROUP is set. If it is set, then bfq creates all
 the stat files documented in
-Documentation/cgroup-v1/blkio-controller.txt. If, instead,
+Documentation/cgroup-v1/blkio-controller.rst. If, instead,
 CONFIG_DEBUG_BLK_CGROUP is not set, then bfq creates only the files
 blkio.bfq.io_service_bytes
 blkio.bfq.io_service_bytes_recursive

--- a/Documentation/cgroup-v1/blkio-controller.txt
+++ b/Documentation/cgroup-v1/blkio-controller.txt
-				Block IO Controller
-				===================
+===================
+Block IO Controller
+===================
+
 Overview
 ========
 cgroup subsys "blkio" implements the block io controller. There seems to be
@@ -17,24 +19,27 @@ HOWTO
 =====
 Throttling/Upper Limit policy
 -----------------------------
- Enable Block IO controller
+- Enable Block IO controller::
+
 	CONFIG_BLK_CGROUP=y

- Enable throttling in block layer
+- Enable throttling in block layer::
+
 	CONFIG_BLK_DEV_THROTTLING=y

- Mount blkio controller (see cgroups.txt, Why are cgroups needed?)
+- Mount blkio controller (see cgroups.txt, Why are cgroups needed?)::
+
        mount -t cgroup -o blkio none /sys/fs/cgroup/blkio

 - Specify a bandwidth rate on particular device for root group. The format
-  for policy is "<major>:<minor>  <bytes_per_second>".
+  for policy is "<major>:<minor>  <bytes_per_second>"::

        echo "8:16  1048576" > /sys/fs/cgroup/blkio/blkio.throttle.read_bps_device

  Above will put a limit of 1MB/second on reads happening for root group
  on device having major/minor number 8:16.

- Run dd to read a file and see if rate is throttled to 1MB/s or not.
+- Run dd to read a file and see if rate is throttled to 1MB/s or not::

        # dd iflag=direct if=/mnt/common/zerofile of=/dev/null bs=4K count=1024
        1024+0 records in
@@ -51,7 +56,7 @@ throttling's hierarchy support is enabled iff "sane_behavior" is
 enabled from cgroup side, which currently is a development option and
 not publicly available.

-If somebody created a hierarchy like as follows.
+If somebody created a hierarchy like as follows::

 			root
 			/  \
@@ -66,7 +71,7 @@ directly generated by tasks in that cgroup.

 Throttling without "sane_behavior" enabled from cgroup side will
 practically treat all groups at same level as if it looks like the
-following.
+following::

 				pivot
 			     /  /   \  \
@@ -99,27 +104,31 @@ Proportional weight policy files
 	  These rules override the default value of group weight as specified
 	  by blkio.weight.

-	  Following is the format.
+	  Following is the format::
+
+	    # echo dev_maj:dev_minor weight > blkio.weight_device
+
+	  Configure weight=300 on /dev/sdb (8:16) in this cgroup::
+
+	    # echo 8:16 300 > blkio.weight_device
+	    # cat blkio.weight_device
+	    dev     weight
+	    8:16    300
+
+	  Configure weight=500 on /dev/sda (8:0) in this cgroup::

-	  # echo dev_maj:dev_minor weight > blkio.weight_device
-	  Configure weight=300 on /dev/sdb (8:16) in this cgroup
-	  # echo 8:16 300 > blkio.weight_device
-	  # cat blkio.weight_device
-	  dev     weight
-	  8:16    300
+	    # echo 8:0 500 > blkio.weight_device
+	    # cat blkio.weight_device
+	    dev     weight
+	    8:0     500
+	    8:16    300

-	  Configure weight=500 on /dev/sda (8:0) in this cgroup
-	  # echo 8:0 500 > blkio.weight_device
-	  # cat blkio.weight_device
-	  dev     weight
-	  8:0     500
-	  8:16    300
+	  Remove specific weight for /dev/sda in this cgroup::

-	  Remove specific weight for /dev/sda in this cgroup
-	  # echo 8:0 0 > blkio.weight_device
-	  # cat blkio.weight_device
-	  dev     weight
-	  8:16    300
+	    # echo 8:0 0 > blkio.weight_device
+	    # cat blkio.weight_device
+	    dev     weight
+	    8:16    300

 - blkio.leaf_weight[_device]
 	- Equivalents of blkio.weight[_device] for the purpose of
@@ -244,30 +253,30 @@ Throttling/Upper limit policy files
 - blkio.throttle.read_bps_device
 	- Specifies upper limit on READ rate from the device. IO rate is
 	  specified in bytes per second. Rules are per device. Following is
-	  the format.
+	  the format::

-  echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.read_bps_device
+	    echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.read_bps_device

 - blkio.throttle.write_bps_device
 	- Specifies upper limit on WRITE rate to the device. IO rate is
 	  specified in bytes per second. Rules are per device. Following is
-	  the format.
+	  the format::

-  echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.write_bps_device
+	    echo "<major>:<minor>  <rate_bytes_per_second>" > /cgrp/blkio.throttle.write_bps_device

 - blkio.throttle.read_iops_device
 	- Specifies upper limit on READ rate from the device. IO rate is
 	  specified in IO per second. Rules are per device. Following is
-	  the format.
+	  the format::

-  echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.read_iops_device
+	   echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.read_iops_device

 - blkio.throttle.write_iops_device
 	- Specifies upper limit on WRITE rate to the device. IO rate is
 	  specified in io per second. Rules are per device. Following is
-	  the format.
+	  the format::

-  echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.write_iops_device
+	    echo "<major>:<minor>  <rate_io_per_second>" > /cgrp/blkio.throttle.write_iops_device

 Note: If both BW and IOPS rules are specified for a device, then IO is
      subjected to both the constraints.

--- a/Documentation/cgroup-v1/cgroups.txt
+++ b/Documentation/cgroup-v1/cgroups.txt
--- a/Documentation/cgroup-v1/cpuacct.txt
+++ b/Documentation/cgroup-v1/cpuacct.txt
+=========================
 CPU Accounting Controller
-------------------------
+=========================

 The CPU accounting controller is used to group tasks using cgroups and
 account the CPU usage of these groups of tasks.
@@ -8,9 +9,9 @@ The CPU accounting controller supports multi-hierarchy groups. An accounting
 group accumulates the CPU usage of all of its child groups and the tasks
 directly present in its group.

-Accounting groups can be created by first mounting the cgroup filesystem.
+Accounting groups can be created by first mounting the cgroup filesystem::

-# mount -t cgroup -ocpuacct none /sys/fs/cgroup
+  # mount -t cgroup -ocpuacct none /sys/fs/cgroup

 With the above step, the initial or the parent accounting group becomes
 visible at /sys/fs/cgroup. At bootup, this group includes all the tasks in
@@ -19,11 +20,11 @@ the system. /sys/fs/cgroup/tasks lists the tasks in this cgroup.
 by this group which is essentially the CPU time obtained by all the tasks
 in the system.

-New accounting groups can be created under the parent group /sys/fs/cgroup.
+New accounting groups can be created under the parent group /sys/fs/cgroup::

-# cd /sys/fs/cgroup
-# mkdir g1
-# echo $$ > g1/tasks
+  # cd /sys/fs/cgroup
+  # mkdir g1
+  # echo $$ > g1/tasks

 The above steps create a new group g1 and move the current shell
 process (bash) into it. CPU time consumed by this bash and its children

--- a/Documentation/cgroup-v1/cpusets.txt
+++ b/Documentation/cgroup-v1/cpusets.txt
--- a/Documentation/cgroup-v1/devices.txt
+++ b/Documentation/cgroup-v1/devices.txt
+===========================
 Device Whitelist Controller
+===========================

-1. Description:
+1. Description
+==============

 Implement a cgroup to track and enforce open and mknod restrictions
 on device files.  A device cgroup associates a device access
@@ -16,24 +19,26 @@ devices from the whitelist or add new entries.  A child cgroup can
 never receive a device access which is denied by its parent.

 2. User Interface
+=================

 An entry is added using devices.allow, and removed using
-devices.deny.  For instance
+devices.deny.  For instance::

 	echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

 allows cgroup 1 to read and mknod the device usually known as
-/dev/null.  Doing
+/dev/null.  Doing::

 	echo a > /sys/fs/cgroup/1/devices.deny

-will remove the default 'a *:* rwm' entry. Doing
+will remove the default 'a *:* rwm' entry. Doing::

 	echo a > /sys/fs/cgroup/1/devices.allow

 will add the 'a *:* rwm' entry to the whitelist.

 3. Security
+===========

 Any task can move itself between cgroups.  This clearly won't
 suffice, but we can decide the best way to adequately restrict
@@ -50,6 +55,7 @@ A cgroup may not be granted more permissions than the cgroup's
 parent has.

 4. Hierarchy
+============

 device cgroups maintain hierarchy by making sure a cgroup never has more
 access permissions than its parent.  Every time an entry is written to
@@ -58,7 +64,8 @@ from their whitelist and all the locally set whitelist entries will be
 re-evaluated.  In case one of the locally set whitelist entries would provide
 more access than the cgroup's parent, it'll be removed from the whitelist.

-Example:
+Example::
+
      A
     / \
        B
@@ -67,10 +74,12 @@ Example:
    A            allow		"b 8:* rwm", "c 116:1 rw"
    B            deny		"c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

-If a device is denied in group A:
+If a device is denied in group A::
+
 	# echo "c 116:* r" > A/devices.deny
+
 it'll propagate down and after revalidating B's entries, the whitelist entry
-"c 116:2 rwm" will be removed:
+"c 116:2 rwm" will be removed::

    group        whitelist entries                        denied devices
    A            all                                      "b 8:* rwm", "c 116:* rw"
@@ -79,7 +88,8 @@ it'll propagate down and after revalidating B's entries, the whitelist entry
 In case parent's exceptions change and local exceptions are not allowed
 anymore, they'll be deleted.

-Notice that new whitelist entries will not be propagated:
+Notice that new whitelist entries will not be propagated::
+
      A
     / \
        B
@@ -88,24 +98,30 @@ Notice that new whitelist entries will not be propagated:
    A            "c 1:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

-when adding "c *:3 rwm":
+when adding ``c *:3 rwm``::
+
 	# echo "c *:3 rwm" >A/devices.allow

-the result:
+the result::
+
    group        whitelist entries                        denied devices
    A            "c *:3 rwm", "c 1:5 r"                   all the rest
    B            "c 1:3 rwm", "c 1:5 r"                   all the rest

-but now it'll be possible to add new entries to B:
+but now it'll be possible to add new entries to B::
+
 	# echo "c 2:3 rwm" >B/devices.allow
 	# echo "c 50:3 r" >B/devices.allow
-or even
+
+or even::
+
 	# echo "c *:3 rwm" >B/devices.allow

 Allowing or denying all by writing 'a' to devices.allow or devices.deny will
 not be possible once the device cgroups has children.

 4.1 Hierarchy (internal implementation)
+---------------------------------------

 device cgroups is implemented internally using a behavior (ALLOW, DENY) and a
 list of exceptions.  The internal state is controlled using the same user

--- a/Documentation/cgroup-v1/freezer-subsystem.txt
+++ b/Documentation/cgroup-v1/freezer-subsystem.txt
+==============
+Cgroup Freezer
+==============
+
 The cgroup freezer is useful to batch job management system which start
 and stop sets of tasks in order to schedule the resources of a machine
 according to the desires of a system administrator. This sort of program
@@ -23,7 +27,7 @@ blocked, or ignored it can be seen by waiting or ptracing parent tasks.
 SIGCONT is especially unsuitable since it can be caught by the task. Any
 programs designed to watch for SIGSTOP and SIGCONT could be broken by
 attempting to use SIGSTOP and SIGCONT to stop and resume tasks. We can
-demonstrate this problem using nested bash shells:
+demonstrate this problem using nested bash shells::

 	$ echo $$
 	16644
@@ -93,19 +97,19 @@ The following cgroupfs files are created by cgroup freezer.
 The root cgroup is non-freezable and the above interface files don't
 exist.

-* Examples of usage :
+* Examples of usage::

   # mkdir /sys/fs/cgroup/freezer
   # mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer
   # mkdir /sys/fs/cgroup/freezer/0
   # echo $some_pid > /sys/fs/cgroup/freezer/0/tasks

-to get status of the freezer subsystem :
+to get status of the freezer subsystem::

   # cat /sys/fs/cgroup/freezer/0/freezer.state
   THAWED

-to freeze all tasks in the container :
+to freeze all tasks in the container::

   # echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state
@@ -113,7 +117,7 @@ to freeze all tasks in the container :
   # cat /sys/fs/cgroup/freezer/0/freezer.state
   FROZEN

-to unfreeze all tasks in the container :
+to unfreeze all tasks in the container::

   # echo THAWED > /sys/fs/cgroup/freezer/0/freezer.state
   # cat /sys/fs/cgroup/freezer/0/freezer.state

--- a/Documentation/cgroup-v1/hugetlb.txt
+++ b/Documentation/cgroup-v1/hugetlb.txt
+==================
 HugeTLB Controller
-------------------
+==================

 The HugeTLB controller allows to limit the HugeTLB usage per control group and
 enforces the controller limit during page fault. Since HugeTLB doesn't
@@ -16,16 +17,16 @@ With the above step, the initial or the parent HugeTLB group becomes
 visible at /sys/fs/cgroup. At bootup, this group includes all the tasks in
 the system. /sys/fs/cgroup/tasks lists the tasks in this cgroup.

-New groups can be created under the parent group /sys/fs/cgroup.
+New groups can be created under the parent group /sys/fs/cgroup::

-# cd /sys/fs/cgroup
-# mkdir g1
-# echo $$ > g1/tasks
+  # cd /sys/fs/cgroup
+  # mkdir g1
+  # echo $$ > g1/tasks

 The above steps create a new group g1 and move the current shell
 process (bash) into it.

-Brief summary of control files
+Brief summary of control files::

 hugetlb.<hugepagesize>.limit_in_bytes     # set/show limit of "hugepagesize" hugetlb usage
 hugetlb.<hugepagesize>.max_usage_in_bytes # show max "hugepagesize" hugetlb  usage recorded
@@ -33,17 +34,17 @@ Brief summary of control files
 hugetlb.<hugepagesize>.failcnt		   # show the number of allocation failure due to HugeTLB limit

 For a system supporting three hugepage sizes (64k, 32M and 1G), the control
-files include:
-
-hugetlb.1GB.limit_in_bytes
-hugetlb.1GB.max_usage_in_bytes
-hugetlb.1GB.usage_in_bytes
-hugetlb.1GB.failcnt
-hugetlb.64KB.limit_in_bytes
-hugetlb.64KB.max_usage_in_bytes
-hugetlb.64KB.usage_in_bytes
-hugetlb.64KB.failcnt
-hugetlb.32MB.limit_in_bytes
-hugetlb.32MB.max_usage_in_bytes
-hugetlb.32MB.usage_in_bytes
-hugetlb.32MB.failcnt
+files include::
+
+  hugetlb.1GB.limit_in_bytes
+  hugetlb.1GB.max_usage_in_bytes
+  hugetlb.1GB.usage_in_bytes
+  hugetlb.1GB.failcnt
+  hugetlb.64KB.limit_in_bytes
+  hugetlb.64KB.max_usage_in_bytes
+  hugetlb.64KB.usage_in_bytes
+  hugetlb.64KB.failcnt
+  hugetlb.32MB.limit_in_bytes
+  hugetlb.32MB.max_usage_in_bytes
+  hugetlb.32MB.usage_in_bytes
+  hugetlb.32MB.failcnt
--- a/Documentation/cgroup-v1/index.rst
+++ b/Documentation/cgroup-v1/index.rst
+:orphan:
+
+========================
+Control Groups version 1
+========================
+
+.. toctree::
+    :maxdepth: 1
+
+    cgroups
+
+    blkio-controller
+    cpuacct
+    cpusets
+    devices
+    freezer-subsystem
+    hugetlb
+    memcg_test
+    memory
+    net_cls
+    net_prio
+    pids
+    rdma
+
+.. only::  subproject and html
+
+   Indices
+   =======
+
+   * :ref:`genindex`
--- a/Documentation/cgroup-v1/memcg_test.txt
+++ b/Documentation/cgroup-v1/memcg_test.txt
--- a/Documentation/cgroup-v1/memory.txt
+++ b/Documentation/cgroup-v1/memory.txt
--- a/Documentation/cgroup-v1/net_cls.txt
+++ b/Documentation/cgroup-v1/net_cls.txt
+=========================
 Network classifier cgroup
-------------------------
+=========================

 The Network classifier cgroup provides an interface to
 tag network packets with a class identifier (classid).
@@ -17,23 +18,27 @@ values is 0xAAAABBBB; AAAA is the major handle number and BBBB
 is the minor handle number.
 Reading net_cls.classid yields a decimal result.

-Example:
-mkdir /sys/fs/cgroup/net_cls
-mount -t cgroup -onet_cls net_cls /sys/fs/cgroup/net_cls
-mkdir /sys/fs/cgroup/net_cls/0
-echo 0x100001 >  /sys/fs/cgroup/net_cls/0/net_cls.classid
-	- setting a 10:1 handle.
+Example::

-cat /sys/fs/cgroup/net_cls/0/net_cls.classid
-1048577
+	mkdir /sys/fs/cgroup/net_cls
+	mount -t cgroup -onet_cls net_cls /sys/fs/cgroup/net_cls
+	mkdir /sys/fs/cgroup/net_cls/0
+	echo 0x100001 >  /sys/fs/cgroup/net_cls/0/net_cls.classid

-configuring tc:
-tc qdisc add dev eth0 root handle 10: htb
+- setting a 10:1 handle::

-tc class add dev eth0 parent 10: classid 10:1 htb rate 40mbit
- - creating traffic class 10:1
+	cat /sys/fs/cgroup/net_cls/0/net_cls.classid
+	1048577

-tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup
+- configuring tc::

-configuring iptables, basic example:
-iptables -A OUTPUT -m cgroup ! --cgroup 0x100001 -j DROP
+	tc qdisc add dev eth0 root handle 10: htb
+	tc class add dev eth0 parent 10: classid 10:1 htb rate 40mbit
+
+- creating traffic class 10:1::
+
+	tc filter add dev eth0 parent 10: protocol ip prio 10 handle 1: cgroup
+
+configuring iptables, basic example::
+
+	iptables -A OUTPUT -m cgroup ! --cgroup 0x100001 -j DROP
--- a/Documentation/cgroup-v1/net_prio.txt
+++ b/Documentation/cgroup-v1/net_prio.txt
+=======================
 Network priority cgroup
-------------------------
+=======================

 The Network priority cgroup provides an interface to allow an administrator to
 dynamically set the priority of network traffic generated by various
@@ -14,9 +15,9 @@ SO_PRIORITY socket option.  This however, is not always possible because:

 This cgroup allows an administrator to assign a process to a group which defines
 the priority of egress traffic on a given interface. Network priority groups can
-be created by first mounting the cgroup filesystem.
+be created by first mounting the cgroup filesystem::

-# mount -t cgroup -onet_prio none /sys/fs/cgroup/net_prio
+	# mount -t cgroup -onet_prio none /sys/fs/cgroup/net_prio

 With the above step, the initial group acting as the parent accounting group
 becomes visible at '/sys/fs/cgroup/net_prio'.  This group includes all tasks in
@@ -25,17 +26,18 @@ the system. '/sys/fs/cgroup/net_prio/tasks' lists the tasks in this cgroup.
 Each net_prio cgroup contains two files that are subsystem specific

 net_prio.prioidx
-This file is read-only, and is simply informative.  It contains a unique integer
-value that the kernel uses as an internal representation of this cgroup.
+  This file is read-only, and is simply informative.  It contains a unique
+  integer value that the kernel uses as an internal representation of this
+  cgroup.

 net_prio.ifpriomap
-This file contains a map of the priorities assigned to traffic originating from
-processes in this group and egressing the system on various interfaces. It
-contains a list of tuples in the form <ifname priority>.  Contents of this file
-can be modified by echoing a string into the file using the same tuple format.
-for example:
+  This file contains a map of the priorities assigned to traffic originating
+  from processes in this group and egressing the system on various interfaces.
+  It contains a list of tuples in the form <ifname priority>.  Contents of this
+  file can be modified by echoing a string into the file using the same tuple
+  format. For example::

-echo "eth0 5" > /sys/fs/cgroups/net_prio/iscsi/net_prio.ifpriomap
+	echo "eth0 5" > /sys/fs/cgroups/net_prio/iscsi/net_prio.ifpriomap

 This command would force any traffic originating from processes belonging to the
 iscsi net_prio cgroup and egressing on interface eth0 to have the priority of

--- a/Documentation/cgroup-v1/pids.txt
+++ b/Documentation/cgroup-v1/pids.txt
-						   Process Number Controller
-						   =========================
+=========================
+Process Number Controller
+=========================

 Abstract
 --------
@@ -34,55 +35,58 @@ pids.current tracks all child cgroup hierarchies, so parent/pids.current is a
 superset of parent/child/pids.current.

 The pids.events file contains event counters:
+
  - max: Number of times fork failed because limit was hit.

 Example
 -------

-First, we mount the pids controller:
-# mkdir -p /sys/fs/cgroup/pids
-# mount -t cgroup -o pids none /sys/fs/cgroup/pids
+First, we mount the pids controller::
+
+	# mkdir -p /sys/fs/cgroup/pids
+	# mount -t cgroup -o pids none /sys/fs/cgroup/pids
+
+Then we create a hierarchy, set limits and attach processes to it::

-Then we create a hierarchy, set limits and attach processes to it:
-# mkdir -p /sys/fs/cgroup/pids/parent/child
-# echo 2 > /sys/fs/cgroup/pids/parent/pids.max
-# echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
-# cat /sys/fs/cgroup/pids/parent/pids.current
-2
-#
+	# mkdir -p /sys/fs/cgroup/pids/parent/child
+	# echo 2 > /sys/fs/cgroup/pids/parent/pids.max
+	# echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
+	# cat /sys/fs/cgroup/pids/parent/pids.current
+	2
+	#

 It should be noted that attempts to overcome the set limit (2 in this case) will
-fail:
+fail::

-# cat /sys/fs/cgroup/pids/parent/pids.current
-2
-# ( /bin/echo "Here's some processes for you." | cat )
-sh: fork: Resource temporary unavailable
-#
+	# cat /sys/fs/cgroup/pids/parent/pids.current
+	2
+	# ( /bin/echo "Here's some processes for you." | cat )
+	sh: fork: Resource temporary unavailable
+	#

 Even if we migrate to a child cgroup (which doesn't have a set limit), we will
 not be able to overcome the most stringent limit in the hierarchy (in this case,
-parent's):
-
-# echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
-# cat /sys/fs/cgroup/pids/parent/pids.current
-2
-# cat /sys/fs/cgroup/pids/parent/child/pids.current
-2
-# cat /sys/fs/cgroup/pids/parent/child/pids.max
-max
-# ( /bin/echo "Here's some processes for you." | cat )
-sh: fork: Resource temporary unavailable
-#
+parent's)::
+
+	# echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
+	# cat /sys/fs/cgroup/pids/parent/pids.current
+	2
+	# cat /sys/fs/cgroup/pids/parent/child/pids.current
+	2
+	# cat /sys/fs/cgroup/pids/parent/child/pids.max
+	max
+	# ( /bin/echo "Here's some processes for you." | cat )
+	sh: fork: Resource temporary unavailable
+	#

 We can set a limit that is smaller than pids.current, which will stop any new
 processes from being forked at all (note that the shell itself counts towards
-pids.current):
-
-# echo 1 > /sys/fs/cgroup/pids/parent/pids.max
-# /bin/echo "We can't even spawn a single process now."
-sh: fork: Resource temporary unavailable
-# echo 0 > /sys/fs/cgroup/pids/parent/pids.max
-# /bin/echo "We can't even spawn a single process now."
-sh: fork: Resource temporary unavailable
-#
+pids.current)::
+
+	# echo 1 > /sys/fs/cgroup/pids/parent/pids.max
+	# /bin/echo "We can't even spawn a single process now."
+	sh: fork: Resource temporary unavailable
+	# echo 0 > /sys/fs/cgroup/pids/parent/pids.max
+	# /bin/echo "We can't even spawn a single process now."
+	sh: fork: Resource temporary unavailable
+	#
--- a/Documentation/cgroup-v1/rdma.txt
+++ b/Documentation/cgroup-v1/rdma.txt
-				RDMA Controller
-				----------------
+===============
+RDMA Controller
+===============

-Contents
--------
+.. Contents

-1. Overview
-  1-1. What is RDMA controller?
-  1-2. Why RDMA controller needed?
-  1-3. How is RDMA controller implemented?
-2. Usage Examples
+   1. Overview
+     1-1. What is RDMA controller?
+     1-2. Why RDMA controller needed?
+     1-3. How is RDMA controller implemented?
+   2. Usage Examples

 1. Overview
+===========

 1-1. What is RDMA controller?
 -----------------------------
@@ -83,27 +84,34 @@ what is configured by user for a given cgroup and what is supported by
 IB device.

 Following resources can be accounted by rdma controller.
+
+  ==========    =============================
  hca_handle	Maximum number of HCA Handles
  hca_object 	Maximum number of HCA Objects
+  ==========    =============================

 2. Usage Examples
-----------------
-
-(a) Configure resource limit:
-echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
-echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max
-
-(b) Query resource limit:
-cat /sys/fs/cgroup/rdma/2/rdma.max
-#Output:
-mlx4_0 hca_handle=2 hca_object=2000
-ocrdma1 hca_handle=3 hca_object=max
-
-(c) Query current usage:
-cat /sys/fs/cgroup/rdma/2/rdma.current
-#Output:
-mlx4_0 hca_handle=1 hca_object=20
-ocrdma1 hca_handle=1 hca_object=23
-
-(d) Delete resource limit:
-echo echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
+=================
+
+(a) Configure resource limit::
+
+	echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
+	echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max
+
+(b) Query resource limit::
+
+	cat /sys/fs/cgroup/rdma/2/rdma.max
+	#Output:
+	mlx4_0 hca_handle=2 hca_object=2000
+	ocrdma1 hca_handle=3 hca_object=max
+
+(c) Query current usage::
+
+	cat /sys/fs/cgroup/rdma/2/rdma.current
+	#Output:
+	mlx4_0 hca_handle=1 hca_object=20
+	ocrdma1 hca_handle=1 hca_object=23
+
+(d) Delete resource limit::
+
+	echo echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
--- a/Documentation/filesystems/tmpfs.txt
+++ b/Documentation/filesystems/tmpfs.txt
@@ -98,7 +98,7 @@ A memory policy with a valid NodeList will be saved, as specified, for
 use at file creation time.  When a task allocates a file in the file
 system, the mount option memory policy will be applied with a NodeList,
 if any, modified by the calling task's cpuset constraints
-[See Documentation/cgroup-v1/cpusets.txt] and any optional flags, listed
+[See Documentation/cgroup-v1/cpusets.rst] and any optional flags, listed
 below.  If the resulting NodeLists is the empty set, the effective memory
 policy for the file will revert to "default" policy.


--- a/Documentation/scheduler/sched-deadline.txt
+++ b/Documentation/scheduler/sched-deadline.txt
@@ -652,7 +652,7 @@ CONTENTS

 -deadline tasks cannot have an affinity mask smaller that the entire
 root_domain they are created on. However, affinities can be specified
- through the cpuset facility (Documentation/cgroup-v1/cpusets.txt).
+ through the cpuset facility (Documentation/cgroup-v1/cpusets.rst).

 5.1 SCHED_DEADLINE and cpusets HOWTO
 ------------------------------------

--- a/Documentation/scheduler/sched-design-CFS.txt
+++ b/Documentation/scheduler/sched-design-CFS.txt
@@ -215,7 +215,7 @@ SCHED_BATCH) tasks.

   These options need CONFIG_CGROUPS to be defined, and let the administrator
   create arbitrary groups of tasks, using the "cgroup" pseudo filesystem.  See
-   Documentation/cgroup-v1/cgroups.txt for more information about this filesystem.
+   Documentation/cgroup-v1/cgroups.rst for more information about this filesystem.

 When CONFIG_FAIR_GROUP_SCHED is defined, a "cpu.shares" file is created for each
 group created using the pseudo filesystem.  See example steps below to create

--- a/Documentation/scheduler/sched-rt-group.txt
+++ b/Documentation/scheduler/sched-rt-group.txt
@@ -133,7 +133,7 @@ This uses the cgroup virtual file system and "<cgroup>/cpu.rt_runtime_us"
 to control the CPU time reserved for each control group.

 For more information on working with control groups, you should read
-Documentation/cgroup-v1/cgroups.txt as well.
+Documentation/cgroup-v1/cgroups.rst as well.

 Group settings are checked against the following limits in order to keep the
 configuration schedulable:

--- a/Documentation/vm/numa.rst
+++ b/Documentation/vm/numa.rst
@@ -67,7 +67,7 @@ nodes.  Each emulated node will manage a fraction of the underlying cells'
 physical memory.  NUMA emluation is useful for testing NUMA kernel and
 application features on non-NUMA platforms, and as a sort of memory resource
 management mechanism when used together with cpusets.
-[see Documentation/cgroup-v1/cpusets.txt]
+[see Documentation/cgroup-v1/cpusets.rst]

 For each node with memory, Linux constructs an independent memory management
 subsystem, complete with its own free page lists, in-use page lists, usage
@@ -114,7 +114,7 @@ allocation behavior using Linux NUMA memory policy. [see

 System administrators can restrict the CPUs and nodes' memories that a non-
 privileged user can specify in the scheduling or NUMA commands and functions
-using control groups and CPUsets.  [see Documentation/cgroup-v1/cpusets.txt]
+using control groups and CPUsets.  [see Documentation/cgroup-v1/cpusets.rst]

 On architectures that do not hide memoryless nodes, Linux will include only
 zones [nodes] with memory in the zonelists.  This means that for a memoryless

--- a/Documentation/vm/page_migration.rst
+++ b/Documentation/vm/page_migration.rst
@@ -41,7 +41,7 @@ locations.
 Larger installations usually partition the system using cpusets into
 sections of nodes. Paul Jackson has equipped cpusets with the ability to
 move pages when a task is moved to another cpuset (See
-Documentation/cgroup-v1/cpusets.txt).
+Documentation/cgroup-v1/cpusets.rst).
 Cpusets allows the automation of process locality. If a task is moved to
 a new cpuset then also all its pages are moved with it so that the
 performance of the process does not sink dramatically. Also the pages

--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -98,7 +98,7 @@ Memory Control Group Interaction
 --------------------------------

 The unevictable LRU facility interacts with the memory control group [aka
-memory controller; see Documentation/cgroup-v1/memory.txt] by extending the
+memory controller; see Documentation/cgroup-v1/memory.rst] by extending the
 lru_list enum.

 The memory controller data structure automatically gets a per-zone unevictable

--- a/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
+++ b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
@@ -15,7 +15,7 @@ assign them to cpusets and their attached tasks.  This is a way of limiting the
 amount of system memory that are available to a certain class of tasks.

 For more information on the features of cpusets, see
-Documentation/cgroup-v1/cpusets.txt.
+Documentation/cgroup-v1/cpusets.rst.
 There are a number of different configurations you can use for your needs.  For
 more information on the numa=fake command line option and its various ways of
 configuring fake nodes, see Documentation/x86/x86_64/boot-options.txt.
@@ -40,7 +40,7 @@ A machine may be split as follows with "numa=fake=4*512," as reported by dmesg::
 	On node 3 totalpages: 131072

 Now following the instructions for mounting the cpusets filesystem from
-Documentation/cgroup-v1/cpusets.txt, you can assign fake nodes (i.e. contiguous memory
+Documentation/cgroup-v1/cpusets.rst, you can assign fake nodes (i.e. contiguous memory
 address spaces) to individual cpusets::

 	[root@xroads /]# mkdir exampleset

--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4122,7 +4122,7 @@ W:	http://www.bullopensource.org/cpuset/
 W:	http://oss.sgi.com/projects/cpusets/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
 S:	Maintained
-F:	Documentation/cgroup-v1/cpusets.txt
+F:	Documentation/cgroup-v1/cpusets.rst
 F:	include/linux/cpuset.h
 F:	kernel/cgroup/cpuset.c


--- a/block/Kconfig
+++ b/block/Kconfig
@@ -89,7 +89,7 @@ config BLK_DEV_THROTTLING
 	one needs to mount and use blkio cgroup controller for creating
 	cgroups and specifying per device IO rate policies.

-	See Documentation/cgroup-v1/blkio-controller.txt for more information.
+	See Documentation/cgroup-v1/blkio-controller.rst for more information.

 config BLK_DEV_THROTTLING_LOW
 	bool "Block throttling .low limit interface support (EXPERIMENTAL)"

--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -624,7 +624,7 @@ struct cftype {

 /*
 * Control Group subsystem type.
- * See Documentation/cgroup-v1/cgroups.txt for details
+ * See Documentation/cgroup-v1/cgroups.rst for details
 */
 struct cgroup_subsys {
 	struct cgroup_subsys_state *(*css_alloc)(struct cgroup_subsys_state *parent_css);

--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -131,6 +131,8 @@ void cgroup_free(struct task_struct *p);
 int cgroup_init_early(void);
 int cgroup_init(void);

+int cgroup_parse_float(const char *input, unsigned dec_shift, s64 *v);
+
 /*
 * Iteration helpers and macros.
 */

--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -785,7 +785,7 @@ union bpf_attr {
 * 		based on a user-provided identifier for all traffic coming from
 * 		the tasks belonging to the related cgroup. See also the related
 * 		kernel documentation, available from the Linux sources in file
- * 		*Documentation/cgroup-v1/net_cls.txt*.
+ * 		*Documentation/cgroup-v1/net_cls.rst*.
 *
 * 		The Linux kernel has two versions for cgroups: there are
 * 		cgroups v1 and cgroups v2. Both are available to users, who can

--- a/init/Kconfig
+++ b/init/Kconfig
@@ -850,7 +850,7 @@ config BLK_CGROUP
 	CONFIG_CFQ_GROUP_IOSCHED=y; for enabling throttling policy, set
 	CONFIG_BLK_DEV_THROTTLING=y.

-	See Documentation/cgroup-v1/blkio-controller.txt for more information.
+	See Documentation/cgroup-v1/blkio-controller.rst for more information.

 config DEBUG_BLK_CGROUP
 	bool "IO controller debugging"

--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6240,6 +6240,48 @@ struct cgroup *cgroup_get_from_fd(int fd)
 }
 EXPORT_SYMBOL_GPL(cgroup_get_from_fd);

+static u64 power_of_ten(int power)
+{
+	u64 v = 1;
+	while (power--)
+		v *= 10;
+	return v;
+}
+
+/**
+ * cgroup_parse_float - parse a floating number
+ * @input: input string
+ * @dec_shift: number of decimal digits to shift
+ * @v: output
+ *
+ * Parse a decimal floating point number in @input and store the result in
+ * @v with decimal point right shifted @dec_shift times.  For example, if
+ * @input is "12.3456" and @dec_shift is 3, *@v will be set to 12345.
+ * Returns 0 on success, -errno otherwise.
+ *
+ * There's nothing cgroup specific about this function except that it's
+ * currently the only user.
+ */
+int cgroup_parse_float(const char *input, unsigned dec_shift, s64 *v)
+{
+	s64 whole, frac = 0;
+	int fstart = 0, fend = 0, flen;
+
+	if (!sscanf(input, "%lld.%n%lld%n", &whole, &fstart, &frac, &fend))
+		return -EINVAL;
+	if (frac < 0)
+		return -EINVAL;
+
+	flen = fend > fstart ? fend - fstart : 0;
+	if (flen < dec_shift)
+		frac *= power_of_ten(dec_shift - flen);
+	else
+		frac = DIV_ROUND_CLOSEST_ULL(frac, power_of_ten(flen - dec_shift));
+
+	*v = whole * power_of_ten(dec_shift) + frac;
+	return 0;
+}
+
 /*
 * sock->sk_cgrp_data handling.  For more info, see sock_cgroup_data
 * definition in cgroup-defs.h.
@@ -6402,4 +6444,5 @@ static int __init cgroup_sysfs_init(void)
 	return sysfs_create_group(kernel_kobj, &cgroup_sysfs_attr_group);
 }
 subsys_initcall(cgroup_sysfs_init);
+
 #endif /* CONFIG_SYSFS */
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -729,7 +729,7 @@ static inline int nr_cpusets(void)
 * load balancing domains (sched domains) as specified by that partial
 * partition.
 *
- * See "What is sched_load_balance" in Documentation/cgroup-v1/cpusets.txt
+ * See "What is sched_load_balance" in Documentation/cgroup-v1/cpusets.rst
 * for a background explanation of this.
 *
 * Does not return errors, on the theory that the callers of this

--- a/security/device_cgroup.c
+++ b/security/device_cgroup.c
@@ -509,7 +509,7 @@ static inline int may_allow_all(struct dev_cgroup *parent)
 * This is one of the three key functions for hierarchy implementation.
 * This function is responsible for re-evaluating all the cgroup's active
 * exceptions due to a parent's exception change.
- * Refer to Documentation/cgroup-v1/devices.txt for more details.
+ * Refer to Documentation/cgroup-v1/devices.rst for more details.
 */
 static void revalidate_active_exceptions(struct dev_cgroup *devcg)
 {

--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -785,7 +785,7 @@ union bpf_attr {
 * 		based on a user-provided identifier for all traffic coming from
 * 		the tasks belonging to the related cgroup. See also the related
 * 		kernel documentation, available from the Linux sources in file
- * 		*Documentation/cgroup-v1/net_cls.txt*.
+ * 		*Documentation/cgroup-v1/net_cls.rst*.
 *
 * 		The Linux kernel has two versions for cgroups: there are
 * 		cgroups v1 and cgroups v2. Both are available to users, who can