Commit c3123552 authored by Mauro Carvalho Chehab

docs: accounting: convert to ReST

Rename the accounting documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
parent a36d0538
==================
Control Groupstats
==================

Control Groupstats is inspired by the discussion at
http://lkml.org/lkml/2007/4/11/187 and implements per cgroup statistics as
suggested by Andrew Morton in http://lkml.org/lkml/2007/4/11/263.
@@ -19,9 +23,9 @@ about tasks blocked on I/O. If CONFIG_TASK_DELAY_ACCT is disabled, this
information will not be available.

To extract cgroup statistics a utility very similar to getdelays.c
has been developed, the sample output of the utility is shown below::

  ~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup/a"
  sleeping 1, blocked 0, running 1, stopped 0, uninterruptible 0
  ~/balbir/cgroupstats # ./getdelays -C "/sys/fs/cgroup"
  sleeping 155, blocked 0, running 1, stopped 0, uninterruptible 2
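
As a rough illustration of consuming that output, the sketch below turns one summary line into a dictionary of counts. The function name `parse_cgroupstats` is hypothetical and not part of getdelays; the sketch only assumes the comma-separated "name count" layout shown in the sample above.

```python
def parse_cgroupstats(line):
    """Parse 'sleeping 1, blocked 0, ...' into {'sleeping': 1, 'blocked': 0, ...}."""
    counts = {}
    for part in line.split(","):
        # Each part looks like "sleeping 155": a state name and a task count.
        name, value = part.split()
        counts[name] = int(value)
    return counts


# Sample line copied from the getdelays -C output above.
sample = "sleeping 155, blocked 0, running 1, stopped 0, uninterruptible 2"
stats = parse_cgroupstats(sample)
print(stats["sleeping"], stats["uninterruptible"])
```

Summing the values gives the number of tasks the utility reported for the cgroup.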
================
Delay accounting
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available e.g. a
@@ -39,7 +40,9 @@ in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

     include/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
It will generally be in the form of counters returning the cumulative
delay seen for cpu, sync block I/O, swapin, memory reclaim etc.
@@ -61,13 +64,16 @@ also serves as an example of using the taskstats interface.
Usage
-----

Compile the kernel with::

        CONFIG_TASK_DELAY_ACCT=y
        CONFIG_TASKSTATS=y

Delay accounting is enabled by default at boot up.
To disable, add::

   nodelayacct

to the kernel boot options. The rest of the instructions
below assume this has not been done.
@@ -78,40 +84,43 @@ The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command::

        getdelays [-t tgid] [-p pid] [-c cmd...]

Get delays, since system boot, for pid 10::

        # ./getdelays -p 10
        (output similar to next case)

Get sum of delays, since system boot, for all pids with tgid 5::

        # ./getdelays -t 5

        CPU     count   real total      virtual total   delay total
                7876    92005750        100000000       24001500
        IO      count   delay total
                0       0
        SWAP    count   delay total
                0       0
        RECLAIM count   delay total
                0       0

Get delays seen in executing a given simple command::

  # ./getdelays -c ls /

  bin   data1  data3  data5  dev  home  media  opt   root  srv        sys  usr
  boot  data2  data4  data6  etc  lib   mnt    proc  sbin  subdomain  tmp  var

  CPU     count   real total      virtual total   delay total
          6       4000250         4000000         0
  IO      count   delay total
          0       0
  SWAP    count   delay total
          0       0
  RECLAIM count   delay total
          0       0
:orphan:

==========
Accounting
==========

.. toctree::
   :maxdepth: 1

   cgroupstats
   delay-accounting
   psi
   taskstats
   taskstats-struct
@@ -35,14 +35,14 @@ Pressure interface
Pressure information for each resource is exported through the
respective file in /proc/pressure/ -- cpu, memory, and io.

The format for CPU is as such::

        some avg10=0.00 avg60=0.00 avg300=0.00 total=0

and for memory and IO::

        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
        full avg10=0.00 avg60=0.00 avg300=0.00 total=0

The "some" line indicates the share of time in which at least some
tasks are stalled on a given resource.
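
To make the line format concrete, here is a minimal parsing sketch. It assumes only the `key=value` layout shown above; the helper name `parse_pressure` is hypothetical, and treating `total` as an integer and the averages as floats is a choice made for this sketch.

```python
def parse_pressure(text):
    """Parse the contents of a pressure file into a nested dict.

    Returns e.g. {'some': {'avg10': 0.0, 'avg60': 0.0, 'avg300': 0.0, 'total': 0}}.
    """
    result = {}
    for line in text.strip().splitlines():
        # First token is the line kind ("some" or "full"), the rest are key=value.
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


# Sample text in the memory/IO format shown above.
sample = (
    "some avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
print(parse_pressure(sample)["some"]["avg10"])
```

On a kernel that provides the interface, the same function could be fed the text read from one of the files under /proc/pressure/.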
@@ -77,9 +77,9 @@ To register a trigger user has to open psi interface file under
/proc/pressure/ representing the resource to be monitored and write the
desired threshold and time window. The open file descriptor should be
used to wait for trigger events using select(), poll() or epoll().
The following format is used::

        <some|full> <stall amount in us> <time window in us>
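
The trigger string can also be assembled programmatically. The sketch below is illustrative only: `format_trigger` is a hypothetical helper, and its argument checks are sanity checks chosen for this example, not a statement of what the kernel accepts or rejects.

```python
def format_trigger(kind, stall_us, window_us):
    """Build a trigger string '<some|full> <stall in us> <window in us>'."""
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    if stall_us <= 0 or window_us <= 0:
        # Assumption of this sketch: both durations are positive microsecond counts.
        raise ValueError("durations must be positive")
    return f"{kind} {int(stall_us)} {int(window_us)}"


# 150 ms stall threshold within a 1 s window, expressed in microseconds.
print(format_trigger("some", 150 * 1000, 1000 * 1000))
```

The resulting string is what would be written to the opened pressure file before waiting on its file descriptor.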

For example writing "some 150000 1000000" into /proc/pressure/memory
would add 150ms threshold for partial memory stall measured within
@@ -115,18 +115,20 @@ trigger is closed.
Userspace monitor usage example
===============================

::

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <poll.h>
  #include <string.h>
  #include <unistd.h>

  /*
   * Monitor memory partial stall with 1s tracking window size
   * and 150ms threshold.
   */
  int main() {
        const char trig[] = "some 150000 1000000";
        struct pollfd fds;
        int n;
@@ -165,7 +167,7 @@ int main() {
        }

        return 0;
  }

Cgroup2 interface
=================
......
====================
The struct taskstats
====================

This document contains an explanation of the struct taskstats fields.
@@ -10,16 +11,24 @@ There are three different groups of fields in the struct taskstats:
    the common fields and basic accounting fields are collected for
    delivery at do_exit() of a task.
2) Delay accounting fields
    These fields are placed between::

        /* Delay accounting fields start */

    and::

        /* Delay accounting fields end */

    Their values are collected if CONFIG_TASK_DELAY_ACCT is set.
3) Extended accounting fields
    These fields are placed between::

        /* Extended accounting fields start */

    and::

        /* Extended accounting fields end */

    Their values are collected if CONFIG_TASK_XACCT is set.

4) Per-task and per-thread context switch count statistics
@@ -31,31 +40,33 @@ There are three different groups of fields in the struct taskstats:
Future extension should add fields to the end of the taskstats struct, and
should not change the relative position of each field within the struct.
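
One consequence of the append-only rule is that a reader can always decode the leading fields, whatever the struct version. The sketch below reads the `__u16 version` field that the struct listing in this document places first. The helper name `taskstats_version` is hypothetical, and decoding in host byte order from offset 0 of a raw payload is an assumption of this sketch.

```python
import struct


def taskstats_version(payload):
    """Return the leading __u16 version field of a raw struct taskstats buffer."""
    # "=H" is an unsigned 16-bit integer in native byte order, no padding.
    (version,) = struct.unpack_from("=H", payload, 0)
    return version


# Fabricated buffer for illustration only: version 8 followed by zero bytes.
fake_payload = struct.pack("=H", 8) + bytes(30)
print(taskstats_version(fake_payload))
```

A consumer could branch on the returned value to decide how many of the later fields it can expect to find.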

::

  struct taskstats {

1) Common and basic accounting fields::

        /* The version number of this struct. This field is always set to
         * TASKSTATS_VERSION, which is defined in <linux/taskstats.h>.
         * Each time the struct is changed, the value should be incremented.
         */
        __u16   version;

        /* The exit code of a task. */
        __u32   ac_exitcode;            /* Exit status */

        /* The accounting flags of a task as defined in <linux/acct.h>
         * Defined values are AFORK, ASU, ACOMPAT, ACORE, and AXSIG.
         */
        __u8    ac_flag;                /* Record flags */

        /* The value of task_nice() of a task. */
        __u8    ac_nice;                /* task_nice */

        /* The name of the command that started this task. */
        char    ac_comm[TS_COMM_LEN];   /* Command name */

        /* The scheduling discipline as set in task->policy field. */
        __u8    ac_sched;               /* Scheduling discipline */

        __u8    ac_pad[3];
@@ -64,26 +75,27 @@ struct taskstats {
        __u32   ac_pid;                 /* Process ID */
        __u32   ac_ppid;                /* Parent process ID */

        /* The time when a task begins, in [secs] since 1970. */
        __u32   ac_btime;               /* Begin time [sec since 1970] */

        /* The elapsed time of a task, in [usec]. */
        __u64   ac_etime;               /* Elapsed time [usec] */

        /* The user CPU time of a task, in [usec]. */
        __u64   ac_utime;               /* User CPU time [usec] */

        /* The system CPU time of a task, in [usec]. */
        __u64   ac_stime;               /* System CPU time [usec] */

        /* The minor page fault count of a task, as set in task->min_flt. */
        __u64   ac_minflt;              /* Minor Page Fault Count */

        /* The major page fault count of a task, as set in task->maj_flt. */
        __u64   ac_majflt;              /* Major Page Fault Count */


2) Delay accounting fields::

        /* Delay accounting fields start
         *
         * All values, until the comment "Delay accounting fields end" are
@@ -134,7 +146,8 @@ struct taskstats {
        /* version 1 ends here */


3) Extended accounting fields::

        /* Extended accounting fields start */

        /* Accumulated RSS usage in duration of a task, in MBytes-usecs.
@@ -145,15 +158,15 @@ struct taskstats {
         */
        __u64   coremem;                /* accumulated RSS usage in MB-usec */

        /* Accumulated virtual memory usage in duration of a task.
         * Same as acct_rss_mem1 above except that we keep track of VM usage.
         */
        __u64   virtmem;                /* accumulated VM usage in MB-usec */

        /* High watermark of RSS usage in duration of a task, in KBytes. */
        __u64   hiwater_rss;            /* High-watermark of RSS usage */

        /* High watermark of VM usage in duration of a task, in KBytes. */
        __u64   hiwater_vm;             /* High-water virtual memory usage */

        /* The following four fields are I/O statistics of a task. */
@@ -164,17 +177,23 @@ struct taskstats {

        /* Extended accounting fields end */

4) Per-task and per-thread statistics::

        __u64   nvcsw;                  /* Context voluntary switch counter */
        __u64   nivcsw;                 /* Context involuntary switch counter */

5) Time accounting for SMT machines::

        __u64   ac_utimescaled;         /* utime scaled on frequency etc */
        __u64   ac_stimescaled;         /* stime scaled on frequency etc */
        __u64   cpu_scaled_run_real_total; /* scaled cpu_run_real_total */

6) Extended delay accounting fields for memory reclaim::

        /* Delay waiting for memory reclaim */
        __u64   freepages_count;
        __u64   freepages_delay_total;

::

  }
=============================
Per-task statistics interface
=============================

Taskstats is a netlink-based interface for sending per-task and
@@ -65,7 +66,7 @@ taskstats.h file.
The data exchanged between user and kernel space is a netlink message belonging
to the NETLINK_GENERIC family and using the netlink attributes interface.
The messages are in the format::

    +----------+- - -+-------------+-------------------+
    | nlmsghdr | Pad |  genlmsghdr | taskstats payload |
@@ -167,15 +168,13 @@ extended and the number of cpus grows large.
To avoid losing statistics, userspace should do one or more of the following:

- increase the receive buffer sizes for the netlink sockets opened by
  listeners to receive exit data.

- create more listeners and reduce the number of cpus being listened to by
  each listener. In the extreme case, there could be one listener for each cpu.
  Users may also consider setting the cpu affinity of the listener to the subset
  of cpus to which it listens, especially if they are listening to just one cpu.

Despite these measures, if the userspace receives ENOBUFS error messages
indicating overflow of receive buffers, it should take measures to handle the
loss of data.
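
The first measure and the ENOBUFS check can be sketched as follows. This is a minimal illustration, not taskstats-specific code: the helper names are hypothetical, the functions accept any socket object, and the effective buffer size is whatever the operating system grants, which may differ from the requested value.

```python
import errno
import socket


def grow_receive_buffer(sock, size):
    """Request a larger receive buffer and return the size the OS reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def is_overflow(exc):
    """True if an exception from a receive call signals ENOBUFS."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOBUFS
```

A listener would call `grow_receive_buffer()` once on its netlink socket after creating it, and wrap its receive loop in a `try`/`except OSError` block that uses `is_overflow()` to decide when to record that exit data may have been dropped.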
@@ -1014,7 +1014,7 @@ All time durations are in microseconds.
        A read-only nested-key file which exists on non-root cgroups.

        Shows pressure stall information for CPU. See
        Documentation/accounting/psi.rst for details.


Memory
@@ -1355,7 +1355,7 @@ PAGE_SIZE multiple when read back.
        A read-only nested-key file which exists on non-root cgroups.

        Shows pressure stall information for memory. See
        Documentation/accounting/psi.rst for details.


Usage Guidelines
@@ -1498,7 +1498,7 @@ IO Interface Files
        A read-only nested-key file which exists on non-root cgroups.

        Shows pressure stall information for IO. See
        Documentation/accounting/psi.rst for details.


Writeback
......
@@ -550,7 +550,7 @@ config PSI
          have cpu.pressure, memory.pressure, and io.pressure files,
          which aggregate pressure stalls for the grouped tasks only.

          For more details see Documentation/accounting/psi.rst.

          Say N if unsure.
......