Commit 6f38be8f authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'docs-5.17' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "This isn't a hugely busy cycle for documentation, but a few
  significant things still showed up:

   - A documentation section for ARC processors

   - Reworked and enhanced KUnit documentation

   - The ability to pick your own theme for HTML builds; if the default
     "Read the Docs" theme isn't ugly enough for you, you can now pick
     an uglier one.

   - More Chinese translation work

  Plus the usual assortment of fixes and cleanups"

* tag 'docs-5.17' of git://git.lwn.net/linux: (53 commits)
  scripts: sphinx-pre-install: Fix ctex support on Debian
  docs: discourage use of list tables
  docs: 5.Posting.rst: describe Fixes: and Link: tags
  Documentation: kgdb: Replace deprecated remotebaud
  docs: automarkup.py: Fix invalid HTML link output and broken URI fragments
  Documentation: refer to config RANDOMIZE_BASE for kernel address-space randomization
  Documentation: kgdb: properly capitalize the MAGIC_SYSRQ config
  docs/zh_CN: Update and fix a couple of typos
  scripts: sphinx-pre-install: add required ctex dependency
  Documentation: KUnit: Restyled Frequently Asked Questions
  Documentation: KUnit: Restyle Test Style and Nomenclature page
  Documentation: KUnit: Rework writing page to focus on writing tests
  Documentation: kunit: Reorganize documentation related to running tests
  Documentation: KUnit: Added KUnit Architecture
  Documentation: KUnit: Rewrite getting started
  Documentation: KUnit: Rewrite main page
  docs/zh_CN: Add zh_CN/accounting/delay-accounting.rst
  Documentation/sphinx: fix typos of "its"
  docs/zh_CN: Add sched-domains translation
  doc: fs: remove bdev_try_to_free_page related doc
  ...
parents 1be5bdf8 87d6576d
......@@ -19,6 +19,8 @@ endif
SPHINXBUILD = sphinx-build
SPHINXOPTS =
SPHINXDIRS = .
DOCS_THEME =
DOCS_CSS =
_SPHINXDIRS = $(sort $(patsubst $(srctree)/Documentation/%/index.rst,%,$(wildcard $(srctree)/Documentation/*/index.rst)))
SPHINX_CONF = conf.py
PAPER =
......@@ -84,7 +86,10 @@ quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
-D version=$(KERNELVERSION) -D release=$(KERNELRELEASE) \
$(ALLSPHINXOPTS) \
$(abspath $(srctree)/$(src)/$5) \
$(abspath $(BUILDDIR)/$3/$4)
$(abspath $(BUILDDIR)/$3/$4) && \
if [ "x$(DOCS_CSS)" != "x" ]; then \
cp $(if $(patsubst /%,,$(DOCS_CSS)),$(abspath $(srctree)/$(DOCS_CSS)),$(DOCS_CSS)) $(BUILDDIR)/$3/_static/; \
fi
htmldocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
......@@ -154,4 +159,8 @@ dochelp:
@echo ' make SPHINX_CONF={conf-file} [target] use *additional* sphinx-build'
@echo ' configuration. This is e.g. useful to build with nit-picking config.'
@echo
@echo ' make DOCS_THEME={sphinx-theme} selects a different Sphinx theme.'
@echo
@echo ' make DOCS_CSS={a .css file} adds a DOCS_CSS override file for html/epub output.'
@echo
@echo ' Default location for the generated documents is Documentation/output'
......@@ -468,7 +468,7 @@ Spectre variant 2
before invoking any firmware code to prevent Spectre variant 2 exploits
using the firmware.
Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y
Using kernel address space randomization (CONFIG_RANDOMIZE_BASE=y
and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
attacks on the kernel generally more difficult.
......
.. SPDX-License-Identifier: GPL-2.0
Linux kernel for ARC processors
*******************************
Other sources of information
############################
Below are some resources where more information can be found on
ARC processors and relevant open source projects.
- `<https://embarc.org>`_ - Community portal for open source on ARC.
Good place to start to find relevant FOSS projects, toolchain releases,
news items and more.
- `<https://github.com/foss-for-synopsys-dwc-arc-processors>`_ -
Home for all development activities regarding open source projects for
ARC processors. Some of the projects are forks of various upstream projects,
where "work in progress" is hosted prior to submission to upstream projects.
Other projects are developed by Synopsys and made available to community
as open source for use on ARC Processors.
- `Official Synopsys ARC Processors website
<https://www.synopsys.com/designware-ip/processor-solutions.html>`_ -
location, with access to some IP documentation (`Programmer's Reference
Manual, AKA PRM for ARC HS processors
<https://www.synopsys.com/dw/doc.php/ds/cc/programmers-reference-manual-ARC-HS.pdf>`_)
and free versions of some commercial tools (`Free nSIM
<https://www.synopsys.com/cgi-bin/dwarcnsim/req1.cgi>`_ and
`MetaWare Light Edition <https://www.synopsys.com/cgi-bin/arcmwtk_lite/reg1.cgi>`_).
Please note though, registration is required to access both the documentation and
the tools.
Important note on ARC processors configurability
################################################
ARC processors are highly configurable and several configurable options
are supported in Linux. Some options are transparent to software
(i.e cache geometries, some can be detected at runtime and configured
and used accordingly, while some need to be explicitly selected or configured
in the kernel's configuration utility (AKA "make menuconfig").
However not all configurable options are supported when an ARC processor
is to run Linux. SoC design teams should refer to "Appendix E:
Configuration for ARC Linux" in the ARC HS Databook for configurability
guidelines.
Following these guidelines and selecting valid configuration options
up front is critical to help prevent any unwanted issues during
SoC bringup and software development in general.
Building the Linux kernel for ARC processors
############################################
The process of kernel building for ARC processors is the same as for any other
architecture and could be done in 2 ways:
- Cross-compilation: process of compiling for ARC targets on a development
host with a different processor architecture (generally x86_64/amd64).
- Native compilation: process of compiling for ARC on a ARC platform
(hardware board or a simulator like QEMU) with complete development environment
(GNU toolchain, dtc, make etc) installed on the platform.
In both cases, up-to-date GNU toolchain for ARC for the host is needed.
Synopsys offers prebuilt toolchain releases which can be used for this purpose,
available from:
- Synopsys GNU toolchain releases:
`<https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/releases>`_
- Linux kernel compilers collection:
`<https://mirrors.edge.kernel.org/pub/tools/crosstool>`_
- Bootlin's toolchain collection: `<https://toolchains.bootlin.com>`_
Once the toolchain is installed in the system, make sure its "bin" folder
is added in your ``PATH`` environment variable. Then set ``ARCH=arc`` &
``CROSS_COMPILE=arc-linux`` (or whatever matches installed ARC toolchain prefix)
and then as usual ``make defconfig && make``.
This will produce "vmlinux" file in the root of the kernel source tree
usable for loading on the target system via JTAG.
If you need to get an image usable with U-Boot bootloader,
type ``make uImage`` and ``uImage`` will be produced in ``arch/arc/boot``
folder.
.. SPDX-License-Identifier: GPL-2.0
.. kernel-feat:: $srctree/Documentation/features arc
===================
ARC architecture
===================
.. toctree::
:maxdepth: 1
arc
features
.. only:: subproject and html
Indices
=======
* :ref:`genindex`
......@@ -9,6 +9,7 @@ implementation.
.. toctree::
:maxdepth: 2
arc/index
arm/index
arm64/index
ia64/index
......
......@@ -208,16 +208,86 @@ highlight_language = 'none'
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
# The Read the Docs theme is available from
# - https://github.com/snide/sphinx_rtd_theme
# - https://pypi.python.org/pypi/sphinx_rtd_theme
# - python-sphinx-rtd-theme package (on Debian)
try:
import sphinx_rtd_theme
html_theme = 'sphinx_rtd_theme'
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
except ImportError:
sys.stderr.write('Warning: The Sphinx \'sphinx_rtd_theme\' HTML theme was not found. Make sure you have the theme installed to produce pretty HTML output. Falling back to the default theme.\n')
# Default theme
html_theme = 'sphinx_rtd_theme'
html_css_files = []
if "DOCS_THEME" in os.environ:
html_theme = os.environ["DOCS_THEME"]
if html_theme == 'sphinx_rtd_theme' or html_theme == 'sphinx_rtd_dark_mode':
# Read the Docs theme
try:
import sphinx_rtd_theme
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_css_files = [
'theme_overrides.css',
]
# Read the Docs dark mode override theme
if html_theme == 'sphinx_rtd_dark_mode':
try:
import sphinx_rtd_dark_mode
extensions.append('sphinx_rtd_dark_mode')
except ImportError:
html_theme == 'sphinx_rtd_theme'
if html_theme == 'sphinx_rtd_theme':
# Add color-specific RTD normal mode
html_css_files.append('theme_rtd_colors.css')
except ImportError:
html_theme = 'classic'
if "DOCS_CSS" in os.environ:
css = os.environ["DOCS_CSS"].split(" ")
for l in css:
html_css_files.append(l)
if major <= 1 and minor < 8:
html_context = {
'css_files': [],
}
for l in html_css_files:
html_context['css_files'].append('_static/' + l)
if html_theme == 'classic':
html_theme_options = {
'rightsidebar': False,
'stickysidebar': True,
'collapsiblesidebar': True,
'externalrefs': False,
'footerbgcolor': "white",
'footertextcolor': "white",
'sidebarbgcolor': "white",
'sidebarbtncolor': "black",
'sidebartextcolor': "black",
'sidebarlinkcolor': "#686bff",
'relbarbgcolor': "#133f52",
'relbartextcolor': "white",
'relbarlinkcolor': "white",
'bgcolor': "white",
'textcolor': "black",
'headbgcolor': "#f2f2f2",
'headtextcolor': "#20435c",
'headlinkcolor': "#c60f0f",
'linkcolor': "#355f7c",
'visitedlinkcolor': "#355f7c",
'codebgcolor': "#3f3f3f",
'codetextcolor': "white",
'bodyfont': "serif",
'headfont': "sans-serif",
}
sys.stderr.write("Using %s theme\n" % html_theme)
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
......@@ -246,20 +316,8 @@ except ImportError:
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['sphinx-static']
html_css_files = [
'theme_overrides.css',
]
if major <= 1 and minor < 8:
html_context = {
'css_files': [
'_static/theme_overrides.css',
],
}
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
......
......@@ -32,6 +32,7 @@ Documentation/dev-tools/testing-overview.rst
kgdb
kselftest
kunit/index
ktap
.. only:: subproject and html
......
......@@ -402,7 +402,7 @@ This is a quick example of how to use kdb.
2. Enter the kernel debugger manually or by waiting for an oops or
fault. There are several ways you can enter the kernel debugger
manually; all involve using the :kbd:`SysRq-G`, which means you must have
enabled ``CONFIG_MAGIC_SysRq=y`` in your kernel config.
enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config.
- When logged in as root or with a super user session you can run::
......@@ -461,7 +461,7 @@ This is a quick example of how to use kdb with a keyboard.
2. Enter the kernel debugger manually or by waiting for an oops or
fault. There are several ways you can enter the kernel debugger
manually; all involve using the :kbd:`SysRq-G`, which means you must have
enabled ``CONFIG_MAGIC_SysRq=y`` in your kernel config.
enabled ``CONFIG_MAGIC_SYSRQ=y`` in your kernel config.
- When logged in as root or with a super user session you can run::
......@@ -557,7 +557,7 @@ Connecting with gdb to a serial port
Example (using a directly connected port)::
% gdb ./vmlinux
(gdb) set remotebaud 115200
(gdb) set serial baud 115200
(gdb) target remote /dev/ttyS0
......
This diff is collapsed.
.. SPDX-License-Identifier: GPL-2.0
==================
KUnit Architecture
==================
The KUnit architecture can be divided into two parts:
- Kernel testing library
- kunit_tool (Command line test harness)
In-Kernel Testing Framework
===========================
The kernel testing library supports KUnit tests written in C using
KUnit. KUnit tests are kernel code. KUnit does several things:
- Organizes tests
- Reports test results
- Provides test utilities
Test Cases
----------
The fundamental unit in KUnit is the test case. The KUnit test cases are
grouped into KUnit suites. A KUnit test case is a function with type
signature ``void (*)(struct kunit *test)``.
These test case functions are wrapped in a struct called
``struct kunit_case``. For code, see:
.. kernel-doc:: include/kunit/test.h
:identifiers: kunit_case
.. note:
``generate_params`` is optional for non-parameterized tests.
Each KUnit test case gets a ``struct kunit`` context
object passed to it that tracks a running test. The KUnit assertion
macros and other KUnit utilities use the ``struct kunit`` context
object. As an exception, there are two fields:
- ``->priv``: The setup functions can use it to store arbitrary test
user data.
- ``->param_value``: It contains the parameter value which can be
retrieved in the parameterized tests.
Test Suites
-----------
A KUnit suite includes a collection of test cases. The KUnit suites
are represented by the ``struct kunit_suite``. For example:
.. code-block:: c
static struct kunit_case example_test_cases[] = {
KUNIT_CASE(example_test_foo),
KUNIT_CASE(example_test_bar),
KUNIT_CASE(example_test_baz),
{}
};
static struct kunit_suite example_test_suite = {
.name = "example",
.init = example_test_init,
.exit = example_test_exit,
.test_cases = example_test_cases,
};
kunit_test_suite(example_test_suite);
In the above example, the test suite ``example_test_suite``, runs the
test cases ``example_test_foo``, ``example_test_bar``, and
``example_test_baz``. Before running the test, the ``example_test_init``
is called and after running the test, ``example_test_exit`` is called.
The ``kunit_test_suite(example_test_suite)`` registers the test suite
with the KUnit test framework.
Executor
--------
The KUnit executor can list and run built-in KUnit tests on boot.
The Test suites are stored in a linker section
called ``.kunit_test_suites``. For code, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/vmlinux.lds.h?h=v5.15#n945.
The linker section consists of an array of pointers to
``struct kunit_suite``, and is populated by the ``kunit_test_suites()``
macro. To run all tests compiled into the kernel, the KUnit executor
iterates over the linker section array.
.. kernel-figure:: kunit_suitememorydiagram.svg
:alt: KUnit Suite Memory
KUnit Suite Memory Diagram
On the kernel boot, the KUnit executor uses the start and end addresses
of this section to iterate over and run all tests. For code, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/kunit/executor.c
When built as a module, the ``kunit_test_suites()`` macro defines a
``module_init()`` function, which runs all the tests in the compilation
unit instead of utilizing the executor.
In KUnit tests, some error classes do not affect other tests
or parts of the kernel, each KUnit case executes in a separate thread
context. For code, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/kunit/try-catch.c?h=v5.15#n58
Assertion Macros
----------------
KUnit tests verify state using expectations/assertions.
All expectations/assertions are formatted as:
``KUNIT_{EXPECT|ASSERT}_<op>[_MSG](kunit, property[, message])``
- ``{EXPECT|ASSERT}`` determines whether the check is an assertion or an
expectation.
- For an expectation, if the check fails, marks the test as failed
and logs the failure.
- An assertion, on failure, causes the test case to terminate
immediately.
- Assertions call function:
``void __noreturn kunit_abort(struct kunit *)``.
- ``kunit_abort`` calls function:
``void __noreturn kunit_try_catch_throw(struct kunit_try_catch *try_catch)``.
- ``kunit_try_catch_throw`` calls function:
``void complete_and_exit(struct completion *, long) __noreturn;``
and terminates the special thread context.
- ``<op>`` denotes a check with options: ``TRUE`` (supplied property
has the boolean value “true”), ``EQ`` (two supplied properties are
equal), ``NOT_ERR_OR_NULL`` (supplied pointer is not null and does not
contain an “err” value).
- ``[_MSG]`` prints a custom message on failure.
Test Result Reporting
---------------------
KUnit prints test results in KTAP format. KTAP is based on TAP14, see:
https://github.com/isaacs/testanything.github.io/blob/tap14/tap-version-14-specification.md.
KTAP (yet to be standardized format) works with KUnit and Kselftest.
The KUnit executor prints KTAP results to dmesg, and debugfs
(if configured).
Parameterized Tests
-------------------
Each KUnit parameterized test is associated with a collection of
parameters. The test is invoked multiple times, once for each parameter
value and the parameter is stored in the ``param_value`` field.
The test case includes a ``KUNIT_CASE_PARAM()`` macro that accepts a
generator function.
The generator function is passed the previous parameter and returns the next
parameter. It also provides a macro to generate common-case generators based on
arrays.
For code, see:
.. kernel-doc:: include/kunit/test.h
:identifiers: KUNIT_ARRAY_PARAM
kunit_tool (Command Line Test Harness)
======================================
kunit_tool is a Python script ``(tools/testing/kunit/kunit.py)``
that can be used to configure, build, exec, parse and run (runs other
commands in order) test results. You can either run KUnit tests using
kunit_tool or can include KUnit in kernel and parse manually.
- ``configure`` command generates the kernel ``.config`` from a
``.kunitconfig`` file (and any architecture-specific options).
For some architectures, additional config options are specified in the
``qemu_config`` Python script
(For example: ``tools/testing/kunit/qemu_configs/powerpc.py``).
It parses both the existing ``.config`` and the ``.kunitconfig`` files
and ensures that ``.config`` is a superset of ``.kunitconfig``.
If this is not the case, it will combine the two and run
``make olddefconfig`` to regenerate the ``.config`` file. It then
verifies that ``.config`` is now a superset. This checks if all
Kconfig dependencies are correctly specified in ``.kunitconfig``.
``kunit_config.py`` includes the parsing Kconfigs code. The code which
runs ``make olddefconfig`` is a part of ``kunit_kernel.py``. You can
invoke this command via: ``./tools/testing/kunit/kunit.py config`` and
generate a ``.config`` file.
- ``build`` runs ``make`` on the kernel tree with required options
(depends on the architecture and some options, for example: build_dir)
and reports any errors.
To build a KUnit kernel from the current ``.config``, you can use the
``build`` argument: ``./tools/testing/kunit/kunit.py build``.
- ``exec`` command executes kernel results either directly (using
User-mode Linux configuration), or via an emulator such
as QEMU. It reads results from the log via standard
output (stdout), and passes them to ``parse`` to be parsed.
If you already have built a kernel with built-in KUnit tests,
you can run the kernel and display the test results with the ``exec``
argument: ``./tools/testing/kunit/kunit.py exec``.
- ``parse`` extracts the KTAP output from a kernel log, parses
the test results, and prints a summary. For failed tests, any
diagnostic output will be included.
......@@ -4,56 +4,55 @@
Frequently Asked Questions
==========================
How is this different from Autotest, kselftest, etc?
====================================================
How is this different from Autotest, kselftest, and so on?
==========================================================
KUnit is a unit testing framework. Autotest, kselftest (and some others) are
not.
A `unit test <https://martinfowler.com/bliki/UnitTest.html>`_ is supposed to
test a single unit of code in isolation, hence the name. A unit test should be
the finest granularity of testing and as such should allow all possible code
paths to be tested in the code under test; this is only possible if the code
under test is very small and does not have any external dependencies outside of
test a single unit of code in isolation and hence the name *unit test*. A unit
test should be the finest granularity of testing and should allow all possible
code paths to be tested in the code under test. This is only possible if the
code under test is small and does not have any external dependencies outside of
the test's control like hardware.
There are no testing frameworks currently available for the kernel that do not
require installing the kernel on a test machine or in a VM and all require
tests to be written in userspace and run on the kernel under test; this is true
for Autotest, kselftest, and some others, disqualifying any of them from being
considered unit testing frameworks.
require installing the kernel on a test machine or in a virtual machine. All
testing frameworks require tests to be written in userspace and run on the
kernel under test. This is true for Autotest, kselftest, and some others,
disqualifying any of them from being considered unit testing frameworks.
Does KUnit support running on architectures other than UML?
===========================================================
Yes, well, mostly.
Yes, mostly.
For the most part, the KUnit core framework (what you use to write the tests)
can compile to any architecture; it compiles like just another part of the
For the most part, the KUnit core framework (what we use to write the tests)
can compile to any architecture. It compiles like just another part of the
kernel and runs when the kernel boots, or when built as a module, when the
module is loaded. However, there is some infrastructure,
like the KUnit Wrapper (``tools/testing/kunit/kunit.py``) that does not support
other architectures.
module is loaded. However, there is infrastructure, like the KUnit Wrapper
(``tools/testing/kunit/kunit.py``) that does not support other architectures.
In short, this means that, yes, you can run KUnit on other architectures, but
it might require more work than using KUnit on UML.
In short, yes, you can run KUnit on other architectures, but it might require
more work than using KUnit on UML.
For more information, see :ref:`kunit-on-non-uml`.
What is the difference between a unit test and these other kinds of tests?
==========================================================================
What is the difference between a unit test and other kinds of tests?
====================================================================
Most existing tests for the Linux kernel would be categorized as an integration
test, or an end-to-end test.
- A unit test is supposed to test a single unit of code in isolation, hence the
name. A unit test should be the finest granularity of testing and as such
should allow all possible code paths to be tested in the code under test; this
is only possible if the code under test is very small and does not have any
external dependencies outside of the test's control like hardware.
- A unit test is supposed to test a single unit of code in isolation. A unit
test should be the finest granularity of testing and, as such, allows all
possible code paths to be tested in the code under test. This is only possible
if the code under test is small and does not have any external dependencies
outside of the test's control like hardware.
- An integration test tests the interaction between a minimal set of components,
usually just two or three. For example, someone might write an integration
test to test the interaction between a driver and a piece of hardware, or to
test the interaction between the userspace libraries the kernel provides and
the kernel itself; however, one of these tests would probably not test the
the kernel itself. However, one of these tests would probably not test the
entire kernel along with hardware interactions and interactions with the
userspace.
- An end-to-end test usually tests the entire system from the perspective of the
......@@ -62,26 +61,26 @@ test, or an end-to-end test.
hardware with a production userspace and then trying to exercise some behavior
that depends on interactions between the hardware, the kernel, and userspace.
KUnit isn't working, what should I do?
======================================
KUnit is not working, what should I do?
=======================================
Unfortunately, there are a number of things which can break, but here are some
things to try.
1. Try running ``./tools/testing/kunit/kunit.py run`` with the ``--raw_output``
1. Run ``./tools/testing/kunit/kunit.py run`` with the ``--raw_output``
parameter. This might show details or error messages hidden by the kunit_tool
parser.
2. Instead of running ``kunit.py run``, try running ``kunit.py config``,
``kunit.py build``, and ``kunit.py exec`` independently. This can help track
down where an issue is occurring. (If you think the parser is at fault, you
can run it manually against stdin or a file with ``kunit.py parse``.)
3. Running the UML kernel directly can often reveal issues or error messages
kunit_tool ignores. This should be as simple as running ``./vmlinux`` after
building the UML kernel (e.g., by using ``kunit.py build``). Note that UML
has some unusual requirements (such as the host having a tmpfs filesystem
mounted), and has had issues in the past when built statically and the host
has KASLR enabled. (On older host kernels, you may need to run ``setarch
`uname -m` -R ./vmlinux`` to disable KASLR.)
can run it manually against ``stdin`` or a file with ``kunit.py parse``.)
3. Running the UML kernel directly can often reveal issues or error messages,
``kunit_tool`` ignores. This should be as simple as running ``./vmlinux``
after building the UML kernel (for example, by using ``kunit.py build``).
Note that UML has some unusual requirements (such as the host having a tmpfs
filesystem mounted), and has had issues in the past when built statically and
the host has KASLR enabled. (On older host kernels, you may need to run
``setarch `uname -m` -R ./vmlinux`` to disable KASLR.)
4. Make sure the kernel .config has ``CONFIG_KUNIT=y`` and at least one test
(e.g. ``CONFIG_KUNIT_EXAMPLE_TEST=y``). kunit_tool will keep its .config
around, so you can see what config was used after running ``kunit.py run``.
......
.. SPDX-License-Identifier: GPL-2.0
=========================================
KUnit - Unit Testing for the Linux Kernel
=========================================
=================================
KUnit - Linux Kernel Unit Testing
=================================
.. toctree::
:maxdepth: 2
:caption: Contents:
start
architecture
run_wrapper
run_manual
usage
kunit-tool
api/index
......@@ -16,82 +20,94 @@ KUnit - Unit Testing for the Linux Kernel
tips
running_tips
What is KUnit?
==============
KUnit is a lightweight unit testing framework for the Linux kernel.
KUnit is heavily inspired by JUnit, Python's unittest.mock, and
Googletest/Googlemock for C++. KUnit provides facilities for defining unit test
cases, grouping related test cases into test suites, providing common
infrastructure for running tests, and much more.
KUnit consists of a kernel component, which provides a set of macros for easily
writing unit tests. Tests written against KUnit will run on kernel boot if
built-in, or when loaded if built as a module. These tests write out results to
the kernel log in `TAP <https://testanything.org/>`_ format.
To make running these tests (and reading the results) easier, KUnit offers
:doc:`kunit_tool <kunit-tool>`, which builds a `User Mode Linux
<http://user-mode-linux.sourceforge.net>`_ kernel, runs it, and parses the test
results. This provides a quick way of running KUnit tests during development,
without requiring a virtual machine or separate hardware.
Get started now: Documentation/dev-tools/kunit/start.rst
Why KUnit?
==========
A unit test is supposed to test a single unit of code in isolation, hence the
name. A unit test should be the finest granularity of testing and as such should
allow all possible code paths to be tested in the code under test; this is only
possible if the code under test is very small and does not have any external
dependencies outside of the test's control like hardware.
KUnit provides a common framework for unit tests within the kernel.
KUnit tests can be run on most architectures, and most tests are architecture
independent. All built-in KUnit tests run on kernel startup. Alternatively,
KUnit and KUnit tests can be built as modules and tests will run when the test
module is loaded.
.. note::
KUnit can also run tests without needing a virtual machine or actual
hardware under User Mode Linux. User Mode Linux is a Linux architecture,
like ARM or x86, which compiles the kernel as a Linux executable. KUnit
can be used with UML either by building with ``ARCH=um`` (like any other
architecture), or by using :doc:`kunit_tool <kunit-tool>`.
KUnit is fast. Excluding build time, from invocation to completion KUnit can run
several dozen tests in only 10 to 20 seconds; this might not sound like a big
deal to some people, but having such fast and easy to run tests fundamentally
changes the way you go about testing and even writing code in the first place.
Linus himself said in his `git talk at Google
<https://gist.github.com/lorn/1272686/revisions#diff-53c65572127855f1b003db4064a94573R874>`_:
"... a lot of people seem to think that performance is about doing the
same thing, just doing it faster, and that is not true. That is not what
performance is all about. If you can do something really fast, really
well, people will start using it differently."
In this context Linus was talking about branching and merging,
but this point also applies to testing. If your tests are slow, unreliable, are
difficult to write, and require a special setup or special hardware to run,
then you wait a lot longer to write tests, and you wait a lot longer to run
tests; this means that tests are likely to break, unlikely to test a lot of
things, and are unlikely to be rerun once they pass. If your tests are really
fast, you run them all the time, every time you make a change, and every time
someone sends you some code. Why trust that someone ran all their tests
correctly on every change when you can just run them yourself in less time than
it takes to read their test log?
This section details the kernel unit testing framework.
Introduction
============
KUnit (Kernel unit testing framework) provides a common framework for
unit tests within the Linux kernel. Using KUnit, you can define groups
of test cases called test suites. The tests either run on kernel boot
if built-in, or load as a module. KUnit automatically flags and reports
failed test cases in the kernel log. The test results appear in `TAP
(Test Anything Protocol) format <https://testanything.org/>`_. It is inspired by
JUnit, Python’s unittest.mock, and GoogleTest/GoogleMock (C++ unit testing
framework).
KUnit tests are part of the kernel, written in the C (programming)
language, and test parts of the Kernel implementation (example: a C
language function). Excluding build time, from invocation to
completion, KUnit can run around 100 tests in less than 10 seconds.
KUnit can test any kernel component, for example: file system, system
calls, memory management, device drivers and so on.
KUnit follows the white-box testing approach. The test has access to
internal system functionality. KUnit runs in kernel space and is not
restricted to things exposed to user-space.
In addition, KUnit has kunit_tool, a script (``tools/testing/kunit/kunit.py``)
that configures the Linux kernel, runs KUnit tests under QEMU or UML (`User Mode
Linux <http://user-mode-linux.sourceforge.net/>`_), parses the test results and
displays them in a user friendly manner.
Features
--------
- Provides a framework for writing unit tests.
- Runs tests on any kernel architecture.
- Runs a test in milliseconds.
Prerequisites
-------------
- Any Linux kernel compatible hardware.
- For Kernel under test, Linux kernel version 5.5 or greater.
Unit Testing
============
A unit test tests a single unit of code in isolation. A unit test is the finest
granularity of testing and allows all possible code paths to be tested in the
code under test. This is possible if the code under test is small and does not
have any external dependencies outside of the test's control like hardware.
Write Unit Tests
----------------
To write good unit tests, there is a simple but powerful pattern:
Arrange-Act-Assert. This is a great way to structure test cases and
defines an order of operations.
- Arrange inputs and targets: At the start of the test, arrange the data
that allows a function to work. Example: initialize a statement or
object.
- Act on the target behavior: Call your function/code under test.
- Assert expected outcome: Verify that the result (or resulting state) is as
expected.
Unit Testing Advantages
-----------------------
- Increases testing speed and development in the long run.
- Detects bugs at initial stage and therefore decreases bug fix cost
compared to acceptance testing.
- Improves code quality.
- Encourages writing testable code.
How do I use it?
================
* Documentation/dev-tools/kunit/start.rst - for new users of KUnit
* Documentation/dev-tools/kunit/tips.rst - for short examples of best practices
* Documentation/dev-tools/kunit/usage.rst - for a more detailed explanation of KUnit features
* Documentation/dev-tools/kunit/api/index.rst - for the list of KUnit APIs used for testing
* Documentation/dev-tools/kunit/kunit-tool.rst - for more information on the kunit_tool helper script
* Documentation/dev-tools/kunit/faq.rst - for answers to some common questions about KUnit
* Documentation/dev-tools/kunit/start.rst - for KUnit new users.
* Documentation/dev-tools/kunit/architecture.rst - KUnit architecture.
* Documentation/dev-tools/kunit/run_wrapper.rst - run kunit_tool.
* Documentation/dev-tools/kunit/run_manual.rst - run tests without kunit_tool.
* Documentation/dev-tools/kunit/usage.rst - write tests.
* Documentation/dev-tools/kunit/tips.rst - best practices with
examples.
* Documentation/dev-tools/kunit/api/index.rst - KUnit APIs
used for testing.
* Documentation/dev-tools/kunit/kunit-tool.rst - kunit_tool helper
script.
* Documentation/dev-tools/kunit/faq.rst - KUnit common questions and
answers.
<?xml version="1.0" encoding="UTF-8"?>
<svg width="796.93" height="555.73" version="1.1" viewBox="0 0 796.93 555.73" xmlns="http://www.w3.org/2000/svg">
<g transform="translate(-13.724 -17.943)">
<g fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a">
<rect x="323.56" y="18.443" width="115.75" height="41.331"/>
<rect x="323.56" y="463.09" width="115.75" height="41.331"/>
<rect x="323.56" y="531.84" width="115.75" height="41.331"/>
<rect x="323.56" y="88.931" width="115.75" height="74.231"/>
</g>
<g>
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(0 -258.6)">
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(0 -217.27)">
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(0 -175.94)">
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(0 -134.61)">
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(0 -41.331)">
<rect x="323.56" y="421.76" width="115.75" height="41.331" fill="#b9dbc6" stroke="#1a1a1a"/>
<text x="328.00888" y="446.61826" fill="#000000" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="328.00888" y="446.61826" font-family="monospace" font-size="16px">kunit_suite</tspan></text>
</g>
<g transform="translate(3.4459e-5 -.71088)">
<rect x="502.19" y="143.16" width="201.13" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/>
<text x="512.02319" y="168.02026" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="512.02319" y="168.02026" font-family="monospace">_kunit_suites_start</tspan></text>
</g>
<g transform="translate(3.0518e-5 -3.1753)">
<rect x="502.19" y="445.69" width="201.13" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/>
<text x="521.61694" y="470.54846" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="521.61694" y="470.54846" font-family="monospace">_kunit_suites_end</tspan></text>
</g>
<rect x="14.224" y="277.78" width="134.47" height="41.331" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/>
<text x="32.062176" y="304.41287" font-family="sans-serif" font-size="16px" style="line-height:1.25" xml:space="preserve"><tspan x="32.062176" y="304.41287" font-family="monospace">.init.data</tspan></text>
<g transform="translate(217.98 145.12)" stroke="#1a1a1a">
<circle cx="149.97" cy="373.01" r="3.4012"/>
<circle cx="163.46" cy="373.01" r="3.4012"/>
<circle cx="176.95" cy="373.01" r="3.4012"/>
</g>
<g transform="translate(217.98 -298.66)" stroke="#1a1a1a">
<circle cx="149.97" cy="373.01" r="3.4012"/>
<circle cx="163.46" cy="373.01" r="3.4012"/>
<circle cx="176.95" cy="373.01" r="3.4012"/>
</g>
<g stroke="#1a1a1a">
<rect x="323.56" y="328.49" width="115.75" height="51.549" fill="#b9dbc6"/>
<g transform="translate(217.98 -18.75)">
<circle cx="149.97" cy="373.01" r="3.4012"/>
<circle cx="163.46" cy="373.01" r="3.4012"/>
<circle cx="176.95" cy="373.01" r="3.4012"/>
</g>
</g>
<g transform="scale(1.0933 .9147)" stroke-width="32.937" aria-label="{">
<path d="m275.49 545.57c-35.836-8.432-47.43-24.769-47.957-64.821v-88.536c-0.527-44.795-10.54-57.97-49.538-67.456 38.998-10.013 49.011-23.715 49.538-67.983v-88.536c0.527-40.052 12.121-56.389 47.957-64.821v-5.797c-65.348 0-85.901 17.391-86.955 73.253v93.806c-0.527 36.89-10.013 50.065-44.795 59.551 34.782 10.013 44.268 23.188 44.795 60.078v93.279c1.581 56.389 21.607 73.78 86.955 73.78z"/>
</g>
<g transform="scale(1.1071 .90325)" stroke-width="14.44" aria-label="{">
<path d="m461.46 443.55c-15.711-3.6967-20.794-10.859-21.025-28.418v-38.815c-0.23104-19.639-4.6209-25.415-21.718-29.574 17.097-4.3898 21.487-10.397 21.718-29.805v-38.815c0.23105-17.559 5.314-24.722 21.025-28.418v-2.5415c-28.649 0-37.66 7.6244-38.122 32.115v41.126c-0.23105 16.173-4.3898 21.949-19.639 26.108 15.249 4.3898 19.408 10.166 19.639 26.339v40.895c0.69313 24.722 9.4728 32.346 38.122 32.346z"/>
</g>
<path d="m449.55 161.84v2.5h49.504v-2.5z" color="#000000" style="-inkscape-stroke:none"/>
<g fill-rule="evenodd">
<path d="m443.78 163.09 8.65-5v10z" color="#000000" stroke-width="1pt" style="-inkscape-stroke:none"/>
<path d="m453.1 156.94-10.648 6.1543 0.99804 0.57812 9.6504 5.5781zm-1.334 2.3125v7.6856l-6.6504-3.8438z" color="#000000" style="-inkscape-stroke:none"/>
</g>
<path d="m449.55 461.91v2.5h49.504v-2.5z" color="#000000" style="-inkscape-stroke:none"/>
<g fill-rule="evenodd">
<path d="m443.78 463.16 8.65-5v10z" color="#000000" stroke-width="1pt" style="-inkscape-stroke:none"/>
<path d="m453.1 457-10.648 6.1562 0.99804 0.57617 9.6504 5.5781zm-1.334 2.3125v7.6856l-6.6504-3.8438z" color="#000000" style="-inkscape-stroke:none"/>
</g>
<rect x="515.64" y="223.9" width="294.52" height="178.49" fill="#dad4d4" fill-opacity=".91765" stroke="#1a1a1a"/>
<text x="523.33319" y="262.52542" font-family="monospace" font-size="14.667px" style="line-height:1.25" xml:space="preserve"><tspan x="523.33319" y="262.52542"><tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit_suite {</tspan><tspan x="523.33319" y="280.8588"><tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold"> const char</tspan> name[<tspan fill="#ff00ff" font-size="14.667px">256</tspan>];</tspan><tspan x="523.33319" y="299.19217"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">int</tspan> (*init)(<tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit *);</tspan><tspan x="523.33319" y="317.52554"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">void</tspan> (*exit)(<tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit *);</tspan><tspan x="523.33319" y="335.85892"> <tspan fill="#008000" font-family="monospace" font-size="14.667px" font-weight="bold">struct</tspan> kunit_case *test_cases;</tspan><tspan x="523.33319" y="354.19229"> ...</tspan><tspan x="523.33319" y="372.52567">};</tspan></text>
</g>
</svg>
.. SPDX-License-Identifier: GPL-2.0
============================
Run Tests without kunit_tool
============================
If we do not want to use kunit_tool (For example: we want to integrate
with other systems, or run tests on real hardware), we can
include KUnit in any kernel, read out results, and parse manually.
.. note:: KUnit is not designed for use in a production system. It is
possible that tests may reduce the stability or security of
the system.
Configure the Kernel
====================
KUnit tests can run without kunit_tool. This can be useful, if:
- We have an existing kernel configuration to test.
- Need to run on real hardware (or using an emulator/VM kunit_tool
does not support).
- Wish to integrate with some existing testing systems.
KUnit is configured with the ``CONFIG_KUNIT`` option, and individual
tests can also be built by enabling their config options in our
``.config``. KUnit tests usually (but don't always) have config options
ending in ``_KUNIT_TEST``. Most tests can either be built as a module,
or be built into the kernel.
.. note ::
We can enable the ``KUNIT_ALL_TESTS`` config option to
automatically enable all tests with satisfied dependencies. This is
a good way of quickly testing everything applicable to the current
config.
Once we have built our kernel (and/or modules), it is simple to run
the tests. If the tests are built-in, they will run automatically on the
kernel boot. The results will be written to the kernel log (``dmesg``)
in TAP format.
If the tests are built as modules, they will run when the module is
loaded.
.. code-block :: bash
# modprobe example-test
The results will appear in TAP format in ``dmesg``.
.. note ::
If ``CONFIG_KUNIT_DEBUGFS`` is enabled, KUnit test results will
be accessible from the ``debugfs`` filesystem (if mounted).
They will be in ``/sys/kernel/debug/kunit/<test_suite>/results``, in
TAP format.
.. SPDX-License-Identifier: GPL-2.0
=========================
Run Tests with kunit_tool
=========================
We can either run KUnit tests using kunit_tool or can run tests
manually, and then use kunit_tool to parse the results. To run tests
manually, see: Documentation/dev-tools/kunit/run_manual.rst.
As long as we can build the kernel, we can run KUnit.
kunit_tool is a Python script which configures and builds a kernel, runs
tests, and formats the test results.
Run command:
.. code-block::
./tools/testing/kunit/kunit.py run
We should see the following:
.. code-block::
Generating .config...
Building KUnit kernel...
Starting KUnit kernel...
We may want to use the following options:
.. code-block::
./tools/testing/kunit/kunit.py run --timeout=30 --jobs=`nproc --all
- ``--timeout`` sets a maximum amount of time for tests to run.
- ``--jobs`` sets the number of threads to build the kernel.
kunit_tool will generate a ``.kunitconfig`` with a default
configuration, if no other ``.kunitconfig`` file exists
(in the build directory). In addition, it verifies that the
generated ``.config`` file contains the ``CONFIG`` options in the
``.kunitconfig``.
It is also possible to pass a separate ``.kunitconfig`` fragment to
kunit_tool. This is useful if we have several different groups of
tests we want to run independently, or if we want to use pre-defined
test configs for certain subsystems.
To use a different ``.kunitconfig`` file (such as one
provided to test a particular subsystem), pass it as an option:
.. code-block::
./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4/.kunitconfig
To view kunit_tool flags (optional command-line arguments), run:
.. code-block::
./tools/testing/kunit/kunit.py run --help
Create a ``.kunitconfig`` File
===============================
If we want to run a specific set of tests (rather than those listed
in the KUnit ``defconfig``), we can provide Kconfig options in the
``.kunitconfig`` file. For default .kunitconfig, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/kunit/configs/default.config.
A ``.kunitconfig`` is a ``minconfig`` (a .config
generated by running ``make savedefconfig``), used for running a
specific set of tests. This file contains the regular Kernel configs
with specific test targets. The ``.kunitconfig`` also
contains any other config options required by the tests (For example:
dependencies for features under tests, configs that enable/disable
certain code blocks, arch configs and so on).
To create a ``.kunitconfig``, using the KUnit ``defconfig``:
.. code-block::
cd $PATH_TO_LINUX_REPO
cp tools/testing/kunit/configs/default.config .kunit/.kunitconfig
We can then add any other Kconfig options. For example:
.. code-block::
CONFIG_LIST_KUNIT_TEST=y
kunit_tool ensures that all config options in ``.kunitconfig`` are
set in the kernel ``.config`` before running the tests. It warns if we
have not included the options dependencies.
.. note:: Removing something from the ``.kunitconfig`` will
not rebuild the ``.config file``. The configuration is only
updated if the ``.kunitconfig`` is not a subset of ``.config``.
This means that we can use other tools
(For example: ``make menuconfig``) to adjust other config options.
The build dir needs to be set for ``make menuconfig`` to
work, therefore by default use ``make O=.kunit menuconfig``.
Configure, Build, and Run Tests
===============================
If we want to make manual changes to the KUnit build process, we
can run part of the KUnit build process independently.
When running kunit_tool, from a ``.kunitconfig``, we can generate a
``.config`` by using the ``config`` argument:
.. code-block::
./tools/testing/kunit/kunit.py config
To build a KUnit kernel from the current ``.config``, we can use the
``build`` argument:
.. code-block::
./tools/testing/kunit/kunit.py build
If we already have built UML kernel with built-in KUnit tests, we
can run the kernel, and display the test results with the ``exec``
argument:
.. code-block::
./tools/testing/kunit/kunit.py exec
The ``run`` command discussed in section: **Run Tests with kunit_tool**,
is equivalent to running the above three commands in sequence.
Parse Test Results
==================
KUnit tests output displays results in TAP (Test Anything Protocol)
format. When running tests, kunit_tool parses this output and prints
a summary. To see the raw test results in TAP format, we can pass the
``--raw_output`` argument:
.. code-block::
./tools/testing/kunit/kunit.py run --raw_output
If we have KUnit results in the raw TAP format, we can parse them and
print the human-readable summary with the ``parse`` command for
kunit_tool. This accepts a filename for an argument, or will read from
standard input.
.. code-block:: bash
# Reading from a file
./tools/testing/kunit/kunit.py parse /var/log/dmesg
# Reading from stdin
dmesg | ./tools/testing/kunit/kunit.py parse
Run Selected Test Suites
========================
By passing a bash style glob filter to the ``exec`` or ``run``
commands, we can run a subset of the tests built into a kernel . For
example: if we only want to run KUnit resource tests, use:
.. code-block::
./tools/testing/kunit/kunit.py run 'kunit-resource*'
This uses the standard glob format with wildcard characters.
Run Tests on qemu
=================
kunit_tool supports running tests on qemu as well as
via UML. To run tests on qemu, by default it requires two flags:
- ``--arch``: Selects a configs collection (Kconfig, qemu config options
and so on), that allow KUnit tests to be run on the specified
architecture in a minimal way. The architecture argument is same as
the option name passed to the ``ARCH`` variable used by Kbuild.
Not all architectures currently support this flag, but we can use
``--qemu_config`` to handle it. If ``um`` is passed (or this flag
is ignored), the tests will run via UML. Non-UML architectures,
for example: i386, x86_64, arm and so on; run on qemu.
- ``--cross_compile``: Specifies the Kbuild toolchain. It passes the
same argument as passed to the ``CROSS_COMPILE`` variable used by
Kbuild. As a reminder, this will be the prefix for the toolchain
binaries such as GCC. For example:
- ``sparc64-linux-gnu`` if we have the sparc toolchain installed on
our system.
- ``$HOME/toolchains/microblaze/gcc-9.2.0-nolibc/microblaze-linux/bin/microblaze-linux``
if we have downloaded the microblaze toolchain from the 0-day
website to a directory in our home directory called toolchains.
If we want to run KUnit tests on an architecture not supported by
the ``--arch`` flag, or want to run KUnit tests on qemu using a
non-default configuration; then we can write our own``QemuConfig``.
These ``QemuConfigs`` are written in Python. They have an import line
``from..qemu_config import QemuArchParams`` at the top of the file.
The file must contain a variable called ``QEMU_ARCH`` that has an
instance of ``QemuArchParams`` assigned to it. See example in:
``tools/testing/kunit/qemu_configs/x86_64.py``.
Once we have a ``QemuConfig``, we can pass it into kunit_tool,
using the ``--qemu_config`` flag. When used, this flag replaces the
``--arch`` flag. For example: using
``tools/testing/kunit/qemu_configs/x86_64.py``, the invocation appear
as
.. code-block:: bash
./tools/testing/kunit/kunit.py run \
--timeout=60 \
--jobs=12 \
--qemu_config=./tools/testing/kunit/qemu_configs/x86_64.py
To run existing KUnit tests on non-UML architectures, see:
Documentation/dev-tools/kunit/non_uml.rst.
Command-Line Arguments
======================
kunit_tool has a number of other command-line arguments which can
be useful for our test environment. Below the most commonly used
command line arguments:
- ``--help``: Lists all available options. To list common options,
place ``--help`` before the command. To list options specific to that
command, place ``--help`` after the command.
.. note:: Different commands (``config``, ``build``, ``run``, etc)
have different supported options.
- ``--build_dir``: Specifies kunit_tool build directory. It includes
the ``.kunitconfig``, ``.config`` files and compiled kernel.
- ``--make_options``: Specifies additional options to pass to make, when
compiling a kernel (using ``build`` or ``run`` commands). For example:
to enable compiler warnings, we can pass ``--make_options W=1``.
- ``--alltests``: Builds a UML kernel with all config options enabled
using ``make allyesconfig``. This allows us to run as many tests as
possible.
.. note:: It is slow and prone to breakage as new options are
added or modified. Instead, enable all tests
which have satisfied dependencies by adding
``CONFIG_KUNIT_ALL_TESTS=y`` to your ``.kunitconfig``.
This diff is collapsed.
......@@ -4,37 +4,36 @@
Test Style and Nomenclature
===========================
To make finding, writing, and using KUnit tests as simple as possible, it's
To make finding, writing, and using KUnit tests as simple as possible, it is
strongly encouraged that they are named and written according to the guidelines
below. While it's possible to write KUnit tests which do not follow these rules,
below. While it is possible to write KUnit tests which do not follow these rules,
they may break some tooling, may conflict with other tests, and may not be run
automatically by testing systems.
It's recommended that you only deviate from these guidelines when:
It is recommended that you only deviate from these guidelines when:
1. Porting tests to KUnit which are already known with an existing name, or
2. Writing tests which would cause serious problems if automatically run (e.g.,
non-deterministically producing false positives or negatives, or taking an
extremely long time to run).
1. Porting tests to KUnit which are already known with an existing name.
2. Writing tests which would cause serious problems if automatically run. For
example, non-deterministically producing false positives or negatives, or
taking a long time to run.
Subsystems, Suites, and Tests
=============================
In order to make tests as easy to find as possible, they're grouped into suites
and subsystems. A test suite is a group of tests which test a related area of
the kernel, and a subsystem is a set of test suites which test different parts
of the same kernel subsystem or driver.
To make tests easy to find, they are grouped into suites and subsystems. A test
suite is a group of tests which test a related area of the kernel. A subsystem
is a set of test suites which test different parts of a kernel subsystem
or a driver.
Subsystems
----------
Every test suite must belong to a subsystem. A subsystem is a collection of one
or more KUnit test suites which test the same driver or part of the kernel. A
rule of thumb is that a test subsystem should match a single kernel module. If
the code being tested can't be compiled as a module, in many cases the subsystem
should correspond to a directory in the source tree or an entry in the
MAINTAINERS file. If unsure, follow the conventions set by tests in similar
areas.
test subsystem should match a single kernel module. If the code being tested
cannot be compiled as a module, in many cases the subsystem should correspond to
a directory in the source tree or an entry in the ``MAINTAINERS`` file. If
unsure, follow the conventions set by tests in similar areas.
Test subsystems should be named after the code being tested, either after the
module (wherever possible), or after the directory or files being tested. Test
......@@ -42,9 +41,8 @@ subsystems should be named to avoid ambiguity where necessary.
If a test subsystem name has multiple components, they should be separated by
underscores. *Do not* include "test" or "kunit" directly in the subsystem name
unless you are actually testing other tests or the kunit framework itself.
Example subsystems could be:
unless we are actually testing other tests or the kunit framework itself. For
example, subsystems could be called:
``ext4``
Matches the module and filesystem name.
......@@ -56,48 +54,46 @@ Example subsystems could be:
Has several components (``snd``, ``hda``, ``codec``, ``hdmi``) separated by
underscores. Matches the module name.
Avoid names like these:
Avoid names as shown in examples below:
``linear-ranges``
Names should use underscores, not dashes, to separate words. Prefer
``linear_ranges``.
``qos-kunit-test``
As well as using underscores, this name should not have "kunit-test" as a
suffix, and ``qos`` is ambiguous as a subsystem name. ``power_qos`` would be a
better name.
This name should use underscores, and not have "kunit-test" as a
suffix. ``qos`` is also ambiguous as a subsystem name, because several parts
of the kernel have a ``qos`` subsystem. ``power_qos`` would be a better name.
``pc_parallel_port``
The corresponding module name is ``parport_pc``, so this subsystem should also
be named ``parport_pc``.
.. note::
The KUnit API and tools do not explicitly know about subsystems. They're
simply a way of categorising test suites and naming modules which
provides a simple, consistent way for humans to find and run tests. This
may change in the future, though.
The KUnit API and tools do not explicitly know about subsystems. They are
a way of categorizing test suites and naming modules which provides a
simple, consistent way for humans to find and run tests. This may change
in the future.
Suites
------
KUnit tests are grouped into test suites, which cover a specific area of
functionality being tested. Test suites can have shared initialisation and
shutdown code which is run for all tests in the suite.
Not all subsystems will need to be split into multiple test suites (e.g. simple drivers).
functionality being tested. Test suites can have shared initialization and
shutdown code which is run for all tests in the suite. Not all subsystems need
to be split into multiple test suites (for example, simple drivers).
Test suites are named after the subsystem they are part of. If a subsystem
contains several suites, the specific area under test should be appended to the
subsystem name, separated by an underscore.
In the event that there are multiple types of test using KUnit within a
subsystem (e.g., both unit tests and integration tests), they should be put into
separate suites, with the type of test as the last element in the suite name.
Unless these tests are actually present, avoid using ``_test``, ``_unittest`` or
similar in the suite name.
subsystem (for example, both unit tests and integration tests), they should be
put into separate suites, with the type of test as the last element in the suite
name. Unless these tests are actually present, avoid using ``_test``, ``_unittest``
or similar in the suite name.
The full test suite name (including the subsystem name) should be specified as
the ``.name`` member of the ``kunit_suite`` struct, and forms the base for the
module name (see below).
Example test suites could include:
module name. For example, test suites could include:
``ext4_inode``
Part of the ``ext4`` subsystem, testing the ``inode`` area.
......@@ -109,26 +105,27 @@ Example test suites could include:
The ``kasan`` subsystem has only one suite, so the suite name is the same as
the subsystem name.
Avoid names like:
Avoid names, for example:
``ext4_ext4_inode``
There's no reason to state the subsystem twice.
There is no reason to state the subsystem twice.
``property_entry``
The suite name is ambiguous without the subsystem name.
``kasan_integration_test``
Because there is only one suite in the ``kasan`` subsystem, the suite should
just be called ``kasan``. There's no need to redundantly add
``integration_test``. Should a separate test suite with, for example, unit
tests be added, then that suite could be named ``kasan_unittest`` or similar.
just be called as ``kasan``. Do not redundantly add
``integration_test``. It should be a separate test suite. For example, if the
unit tests are added, then that suite could be named as ``kasan_unittest`` or
similar.
Test Cases
----------
Individual tests consist of a single function which tests a constrained
codepath, property, or function. In the test output, individual tests' results
will show up as subtests of the suite's results.
codepath, property, or function. In the test output, an individual test's
results will show up as subtests of the suite's results.
Tests should be named after what they're testing. This is often the name of the
Tests should be named after what they are testing. This is often the name of the
function being tested, with a description of the input or codepath being tested.
As tests are C functions, they should be named and written in accordance with
the kernel coding style.
......@@ -136,7 +133,7 @@ the kernel coding style.
.. note::
As tests are themselves functions, their names cannot conflict with
other C identifiers in the kernel. This may require some creative
naming. It's a good idea to make your test functions `static` to avoid
naming. It is a good idea to make your test functions `static` to avoid
polluting the global namespace.
Example test names include:
......@@ -162,16 +159,16 @@ This Kconfig entry must:
* be named ``CONFIG_<name>_KUNIT_TEST``: where <name> is the name of the test
suite.
* be listed either alongside the config entries for the driver/subsystem being
tested, or be under [Kernel Hacking][Kernel Testing and Coverage]
* depend on ``CONFIG_KUNIT``
tested, or be under [Kernel Hacking]->[Kernel Testing and Coverage]
* depend on ``CONFIG_KUNIT``.
* be visible only if ``CONFIG_KUNIT_ALL_TESTS`` is not enabled.
* have a default value of ``CONFIG_KUNIT_ALL_TESTS``.
* have a brief description of KUnit in the help text
* have a brief description of KUnit in the help text.
Unless there's a specific reason not to (e.g. the test is unable to be built as
a module), Kconfig entries for tests should be tristate.
If we are not able to meet above conditions (for example, the test is unable to
be built as a module), Kconfig entries for tests should be tristate.
An example Kconfig entry:
For example, a Kconfig entry might look like:
.. code-block:: none
......@@ -182,8 +179,8 @@ An example Kconfig entry:
help
This builds unit tests for foo.
For more information on KUnit and unit tests in general, please refer
to the KUnit documentation in Documentation/dev-tools/kunit/.
For more information on KUnit and unit tests in general,
please refer to the KUnit documentation in Documentation/dev-tools/kunit/.
If unsure, say N.
......
This diff is collapsed.
......@@ -138,6 +138,17 @@ To pass extra options to Sphinx, you can use the ``SPHINXOPTS`` make
variable. For example, use ``make SPHINXOPTS=-v htmldocs`` to get more verbose
output.
It is also possible to pass an extra DOCS_CSS overlay file, in order to customize
the html layout, by using the ``DOCS_CSS`` make variable.
By default, the build will try to use the Read the Docs sphinx theme:
https://github.com/readthedocs/sphinx_rtd_theme
If the theme is not available, it will fall-back to the classic one.
The Sphinx theme can be overridden by using the ``DOCS_THEME`` make variable.
To remove the generated documentation, run ``make cleandocs``.
Writing Documentation
......@@ -250,12 +261,11 @@ please feel free to remove it.
list tables
-----------
We recommend the use of *list table* formats. The *list table* formats are
double-stage lists. Compared to the ASCII-art they might not be as
comfortable for
readers of the text files. Their advantage is that they are easy to
create or modify and that the diff of a modification is much more meaningful,
because it is limited to the modified content.
The list-table formats can be useful for tables that are not easily laid
out in the usual Sphinx ASCII-art formats. These formats are nearly
impossible for readers of the plain-text documents to understand, though,
and should be avoided in the absence of a strong justification for their
use.
The ``flat-table`` is a double-stage list similar to the ``list-table`` with
some additional features:
......
......@@ -169,7 +169,6 @@ prototypes::
int (*show_options)(struct seq_file *, struct dentry *);
ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
locking rules:
All may block [not true, see below]
......@@ -194,7 +193,6 @@ umount_begin: no
show_options: no (namespace_sem)
quota_read: no (see below)
quota_write: no (see below)
bdev_try_to_free_page: no (see below)
====================== ============ ========================
->statfs() has s_umount (shared) when called by ustat(2) (native or
......@@ -210,9 +208,6 @@ dqio_sem) (unless an admin really wants to screw up something and
writes to quota files with quotas on). For other details about locking
see also dquot_operations section.
->bdev_try_to_free_page is called from the ->releasepage handler of
the block device inode. See there for more details.
file_system_type
================
......
.. SPDX-License-Identifier: GPL-2.0
==
===
RDS
===
......
......@@ -197,14 +197,29 @@ the build process, for example, or editor backup files) in the patch. The
file "dontdiff" in the Documentation directory can help in this regard;
pass it to diff with the "-X" option.
The tags mentioned above are used to describe how various developers have
been associated with the development of this patch. They are described in
detail in
the :ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
document; what follows here is a brief summary. Each of these lines has
the format:
The tags already briefly mentioned above are used to provide insights how
the patch came into being. They are described in detail in the
:ref:`Documentation/process/submitting-patches.rst <submittingpatches>`
document; what follows here is a brief summary.
::
One tag is used to refer to earlier commits which introduced problems fixed by
the patch::
Fixes: 1f2e3d4c5b6a ("The first line of the commit specified by the first 12 characters of its SHA-1 ID")
Another tag is used for linking web pages with additional backgrounds or
details, for example a report about a bug fixed by the patch or a document
with a specification implemented by the patch::
Link: https://example.com/somewhere.html optional-other-stuff
Many maintainers when applying a patch also add this tag to link to the
latest public review posting of the patch; often this is automatically done
by tools like b4 or a git hook like the one described in
'Documentation/maintainer/configure-git.rst'.
A third kind of tag is used to document who was involved in the development of
the patch. Each of these uses this format::
tag: Full Name <email address> optional-other-stuff
......
......@@ -271,25 +271,6 @@ least a notification of the change, so that some information makes its way
into the manual pages. User-space API changes should also be copied to
linux-api@vger.kernel.org.
For small patches you may want to CC the Trivial Patch Monkey
trivial@kernel.org which collects "trivial" patches. Have a look
into the MAINTAINERS file for its current manager.
Trivial patches must qualify for one of the following rules:
- Spelling fixes in documentation
- Spelling fixes for errors which could break :manpage:`grep(1)`
- Warning fixes (cluttering with useless warnings is bad)
- Compilation fixes (only if they are actually correct)
- Runtime fixes (only if they actually fix things)
- Removing use of deprecated functions/macros
- Contact detail and documentation fixes
- Non-portable code replaced by portable code (even in arch-specific,
since people copy, as long as it's trivial)
- Any fix by the author/maintainer of the file (ie. patch monkey
in re-transmission mode)
No MIME, no links, no compression, no attachments. Just plain text
-------------------------------------------------------------------
......
......@@ -74,7 +74,6 @@ Quota, period and burst are managed within the cpu subsystem via cgroupfs.
to cgroup v1. For cgroup v2, see
:ref:`Documentation/admin-guide/cgroup-v2.rst <cgroup-v2-cpu>`.
- cpu.cfs_quota_us: the total available run-time within a period (in
- cpu.cfs_quota_us: run-time replenished within a period (in microseconds)
- cpu.cfs_period_us: the length of a period (in microseconds)
- cpu.stat: exports throttling statistics [explained further below]
......@@ -135,7 +134,7 @@ cpu.stat:
of the group have been throttled.
- nr_bursts: Number of periods burst occurs.
- burst_time: Cumulative wall-time (in nanoseconds) that any CPUs has used
above quota in respective periods
above quota in respective periods.
This interface is read-only.
......@@ -238,7 +237,7 @@ Examples
additionally, in case accumulation has been done.
With 50ms period, 20ms quota will be equivalent to 40% of 1 CPU.
And 10ms burst will be equivalent to 20% of 1 CPU.
And 10ms burst will be equivalent to 20% of 1 CPU::
# echo 20000 > cpu.cfs_quota_us /* quota = 20ms */
# echo 50000 > cpu.cfs_period_us /* period = 50ms */
......
......@@ -81,8 +81,7 @@ of the kernel, gaining the protection of the kernel's strict memory
permissions as described above.
For variables that are initialized once at ``__init`` time, these can
be marked with the (new and under development) ``__ro_after_init``
attribute.
be marked with the ``__ro_after_init`` attribute.
What remains are variables that are updated rarely (e.g. GDT). These
will need another infrastructure (similar to the temporary exceptions
......
/* -*- coding: utf-8; mode: css -*-
*
* Sphinx HTML theme customization: read the doc
*
* Please don't add any color definition here, as the theme should
* work for both normal and dark modes.
*/
/* Improve contrast and increase size for easier reading. */
body {
font-family: serif;
color: black;
font-size: 100%;
}
......@@ -16,17 +16,8 @@ h1, h2, .rst-content .toctree-wrapper p.caption, h3, h4, h5, h6, legend {
font-family: sans-serif;
}
.wy-menu-vertical li.current a {
color: #505050;
}
.wy-menu-vertical li.on a, .wy-menu-vertical li.current > a {
color: #303030;
}
div[class^="highlight"] pre {
font-family: monospace;
color: black;
font-size: 100%;
}
......@@ -104,13 +95,10 @@ div[class^="highlight"] pre {
/* Menu selection and keystrokes */
span.menuselection {
color: blue;
font-family: "Courier New", Courier, monospace
}
code.kbd, code.kbd span {
color: white;
background-color: darkblue;
font-weight: bold;
font-family: "Courier New", Courier, monospace
}
......
/* -*- coding: utf-8; mode: css -*-
*
* Sphinx HTML theme customization: color settings for RTD (non-dark) theme
*
*/
/* Improve contrast and increase size for easier reading. */
body {
color: black;
}
.wy-menu-vertical li.current a {
color: #505050;
}
.wy-menu-vertical li.on a, .wy-menu-vertical li.current > a {
color: #303030;
}
div[class^="highlight"] pre {
color: black;
}
@media screen {
/* Menu selection and keystrokes */
span.menuselection {
color: blue;
}
code.kbd, code.kbd span {
color: white;
background-color: darkblue;
}
}
......@@ -271,19 +271,30 @@ def get_c_namespace(app, docname):
def auto_markup(app, doctree, name):
global c_namespace
c_namespace = get_c_namespace(app, name)
def text_but_not_a_reference(node):
# The nodes.literal test catches ``literal text``, its purpose is to
# avoid adding cross-references to functions that have been explicitly
# marked with cc:func:.
if not isinstance(node, nodes.Text) or isinstance(node.parent, nodes.literal):
return False
child_of_reference = False
parent = node.parent
while parent:
if isinstance(parent, nodes.Referential):
child_of_reference = True
break
parent = parent.parent
return not child_of_reference
#
# This loop could eventually be improved on. Someday maybe we
# want a proper tree traversal with a lot of awareness of which
# kinds of nodes to prune. But this works well for now.
#
# The nodes.literal test catches ``literal text``, its purpose is to
# avoid adding cross-references to functions that have been explicitly
# marked with cc:func:.
#
for para in doctree.traverse(nodes.paragraph):
for node in para.traverse(nodes.Text):
if not isinstance(node.parent, nodes.literal):
node.parent.replace(node, markup_refs(name, app, node))
for node in para.traverse(condition=text_but_not_a_reference):
node.parent.replace(node, markup_refs(name, app, node))
def setup(app):
app.connect('doctree-resolved', auto_markup)
......
......@@ -104,7 +104,7 @@ class KernelCmd(Directive):
return nodeList
def runCmd(self, cmd, **kwargs):
u"""Run command ``cmd`` and return it's stdout as unicode."""
u"""Run command ``cmd`` and return its stdout as unicode."""
try:
proc = subprocess.Popen(
......
......@@ -106,7 +106,7 @@ class KernelFeat(Directive):
return nodeList
def runCmd(self, cmd, **kwargs):
u"""Run command ``cmd`` and return it's stdout as unicode."""
u"""Run command ``cmd`` and return its stdout as unicode."""
try:
proc = subprocess.Popen(
......
......@@ -131,9 +131,7 @@ Ftrace Histogram Options
Since it is too long to write a histogram action as a string for per-event
action option, there are tree-style options under per-event 'hist' subkey
for the histogram actions. For the detail of the each parameter,
please read the event histogram document [3]_.
.. [3] See :ref:`Documentation/trace/histogram.rst <histogram>`
please read the event histogram document (Documentation/trace/histogram.rst)
ftrace.[instance.INSTANCE.]event.GROUP.EVENT.hist.[N.]keys = KEY1[, KEY2[...]]
Set histogram key parameters. (Mandatory)
......
......@@ -22,13 +22,14 @@ Linux PCI总线子系统
:numbered:
pci
Todolist:
pciebus-howto
pci-iov-howto
msi-howto
sysfs-pci
Todolist:
acpi-info
pci-error-recovery
pcieaer-howto
......
This diff is collapsed.
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/PCI/pci-iov-howto.rst
:翻译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
:校译:
.. _cn_pci-iov-howto:
==========================
PCI Express I/O 虚拟化指南
==========================
:版权: |copy| 2009 Intel Corporation
:作者: - Yu Zhao <yu.zhao@intel.com>
- Donald Dutile <ddutile@redhat.com>
概述
====
什么是SR-IOV
------------
单根I/O虚拟化(SR-IOV)是一种PCI Express扩展功能,它使一个物理设备显示为多个
虚拟设备。物理设备被称为物理功能(PF),而虚拟设备被称为虚拟功能(VF)。VF的分
配可以由PF通过封装在该功能中的寄存器动态控制。默认情况下,该功能未被启用,PF表
现为传统的PCIe设备。一旦开启,每个VF的PCI配置空间都可以通过自己的总线、设备和
功能编号(路由ID)来访问。而且每个VF也有PCI内存空间,用于映射其寄存器集。VF设
备驱动程序对寄存器集进行操作,这样它就可以发挥功能,并作为一个真正的现有PCI设备
出现。
使用指南
========
我怎样才能启用SR-IOV功能
------------------------
有多种方法可用于SR-IOV的启用。在第一种方法中,设备驱动(PF驱动)将通过SR-IOV
核心提供的API控制功能的启用和禁用。如果硬件具有SR-IOV能力,加载其PF驱动器将启
用它和与PF相关的所有VF。一些PF驱动需要设置一个模块参数,以确定要启用的VF的数量。
在第二种方法中,对sysfs文件sriov_numvfs的写入将启用和禁用与PCIe PF相关的VF。
这种方法实现了每个PF的VF启用/禁用值,而第一种方法则适用于同一设备的所有PF。此外,
PCI SRIOV核心支持确保启用/禁用操作是有效的,以减少同一检查在多个驱动程序中的重
复,例如,如果启用VF,检查numvfs == 0,确保numvfs <= totalvfs。
第二种方法是对新的/未来的VF设备的推荐方法。
我怎样才能使用虚拟功能
----------------------
在内核中,VF被视为热插拔的PCI设备,所以它们应该能够以与真正的PCI设备相同的方式
工作。VF需要的设备驱动与普通PCI设备的驱动相同。
开发者指南
==========
SR-IOV API
----------
用来开启SR-IOV功能:
(a) 对于第一种方法,在驱动程序中::
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
nr_virtfn'是要启用的VF的编号。
(b) 对于第二种方法,从sysfs::
echo 'nr_virtfn' > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs
用来关闭SR-IOV功能:
(a) 对于第一种方法,在驱动程序中::
void pci_disable_sriov(struct pci_dev *dev);
(b) 对于第二种方法,从sysfs::
echo 0 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_numvfs
要想通过主机上的兼容驱动启用自动探测VF,在启用SR-IOV功能之前运行下面的命令。这
是默认的行为。
::
echo 1 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
要禁止主机上的兼容驱动自动探测VF,请在启用SR-IOV功能之前运行以下命令。更新这个
入口不会影响已经被探测的VF。
::
echo 0 > \
/sys/bus/pci/devices/<DOMAIN:BUS:DEVICE.FUNCTION>/sriov_drivers_autoprobe
用例
----
下面的代码演示了SR-IOV API的用法
::
static int dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
pci_enable_sriov(dev, NR_VIRTFN);
...
return 0;
}
static void dev_remove(struct pci_dev *dev)
{
pci_disable_sriov(dev);
...
}
static int dev_suspend(struct pci_dev *dev, pm_message_t state)
{
...
return 0;
}
static int dev_resume(struct pci_dev *dev)
{
...
return 0;
}
static void dev_shutdown(struct pci_dev *dev)
{
...
}
static int dev_sriov_configure(struct pci_dev *dev, int numvfs)
{
if (numvfs > 0) {
...
pci_enable_sriov(dev, numvfs);
...
return numvfs;
}
if (numvfs == 0) {
....
pci_disable_sriov(dev);
...
return 0;
}
}
static struct pci_driver dev_driver = {
.name = "SR-IOV Physical Function driver",
.id_table = dev_id_table,
.probe = dev_probe,
.remove = dev_remove,
.suspend = dev_suspend,
.resume = dev_resume,
.shutdown = dev_shutdown,
.sriov_configure = dev_sriov_configure,
};
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/PCI/pciebus-howto.rst
:翻译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
:校译:
.. _cn_pciebus-howto:
===========================
PCI Express端口总线驱动指南
===========================
:作者: Tom L Nguyen tom.l.nguyen@intel.com 11/03/2004
:版权: |copy| 2004 Intel Corporation
关于本指南
==========
本指南介绍了PCI Express端口总线驱动程序的基本知识,并提供了如何使服务驱
动程序在PCI Express端口总线驱动程序中注册/取消注册的介绍。
什么是PCI Express端口总线驱动程序
=================================
一个PCI Express端口是一个逻辑的PCI-PCI桥结构。有两种类型的PCI Express端
口:根端口和交换端口。根端口从PCI Express根综合体发起一个PCI Express链接,
交换端口将PCI Express链接连接到内部逻辑PCI总线。交换机端口,其二级总线代表
交换机的内部路由逻辑,被称为交换机的上行端口。交换机的下行端口是从交换机的内部
路由总线桥接到代表来自PCI Express交换机的下游PCI Express链接的总线。
一个PCI Express端口可以提供多达四个不同的功能,在本文中被称为服务,这取决于
其端口类型。PCI Express端口的服务包括本地热拔插支持(HP)、电源管理事件支持(PME)、
高级错误报告支持(AER)和虚拟通道支持(VC)。这些服务可以由一个复杂的驱动程序
处理,也可以单独分布并由相应的服务驱动程序处理。
为什么要使用PCI Express端口总线驱动程序?
=========================================
在现有的Linux内核中,Linux设备驱动模型允许一个物理设备只由一个驱动处理。
PCI Express端口是一个具有多个不同服务的PCI-PCI桥设备。为了保持一个干净和简
单的解决方案,每个服务都可以有自己的软件服务驱动。在这种情况下,几个服务驱动将
竞争一个PCI-PCI桥设备。例如,如果PCI Express根端口的本机热拔插服务驱动程序
首先被加载,它就会要求一个PCI-PCI桥根端口。因此,内核不会为该根端口加载其他服
务驱动。换句话说,使用当前的驱动模型,不可能让多个服务驱动同时加载并运行在
PCI-PCI桥设备上。
为了使多个服务驱动程序同时运行,需要有一个PCI Express端口总线驱动程序,它管
理所有填充的PCI Express端口,并根据需要将所有提供的服务请求分配给相应的服务
驱动程序。下面列出了使用PCI Express端口总线驱动程序的一些关键优势:
- 允许在一个PCI-PCI桥接端口设备上同时运行多个服务驱动。
- 允许以独立的分阶段方式实施服务驱动程序。
- 允许一个服务驱动程序在多个PCI-PCI桥接端口设备上运行。
- 管理和分配PCI-PCI桥接端口设备的资源给要求的服务驱动程序。
配置PCI Express端口总线驱动程序与服务驱动程序
=============================================
将PCI Express端口总线驱动支持纳入内核
-------------------------------------
包括PCI Express端口总线驱动程序取决于内核配置中是否包含PCI Express支持。当内核
中的PCI Express支持被启用时,内核将自动包含PCI Express端口总线驱动程序作为内核
驱动程序。
启用服务驱动支持
----------------
PCI设备驱动是基于Linux设备驱动模型实现的。所有的服务驱动都是PCI设备驱动。如上所述,
一旦内核加载了PCI Express端口总线驱动程序,就不可能再加载任何服务驱动程序。为了满
足PCI Express端口总线驱动程序模型,需要对现有的服务驱动程序进行一些最小的改变,其
对现有的服务驱动程序的功能没有影响。
服务驱动程序需要使用下面所示的两个API,将其服务注册到PCI Express端口总线驱动程
序中(见第5.2.1和5.2.2节)。在调用这些API之前,服务驱动程序必须初始化头文件
/include/linux/pcieport_if.h中的pcie_port_service_driver数据结构。如果不这
样做,将导致身份不匹配,从而使PCI Express端口总线驱动程序无法加载服务驱动程序。
pcie_port_service_register
~~~~~~~~~~~~~~~~~~~~~~~~~~
::
int pcie_port_service_register(struct pcie_port_service_driver *new)
这个API取代了Linux驱动模型的 pci_register_driver API。一个服务驱动应该总是在模
块启动时调用 pcie_port_service_register。请注意,在服务驱动被加载后,诸如
pci_enable_device(dev) 和 pci_set_master(dev) 的调用不再需要,因为这些调用由
PCI端口总线驱动执行。
pcie_port_service_unregister
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::
void pcie_port_service_unregister(struct pcie_port_service_driver *new)
pcie_port_service_unregister取代了Linux驱动模型的pci_unregister_driver。当一
个模块退出时,它总是被服务驱动调用。
示例代码
~~~~~~~~
下面是服务驱动代码示例,用于初始化端口服务的驱动程序数据结构。
::
static struct pcie_port_service_id service_id[] = { {
.vendor = PCI_ANY_ID,
.device = PCI_ANY_ID,
.port_type = PCIE_RC_PORT,
.service_type = PCIE_PORT_SERVICE_AER,
}, { /* end: all zeroes */ }
};
static struct pcie_port_service_driver root_aerdrv = {
.name = (char *)device_name,
.id_table = &service_id[0],
.probe = aerdrv_load,
.remove = aerdrv_unload,
.suspend = aerdrv_suspend,
.resume = aerdrv_resume,
};
下面是一个注册/取消注册服务驱动的示例代码。
::
static int __init aerdrv_service_init(void)
{
int retval = 0;
retval = pcie_port_service_register(&root_aerdrv);
if (!retval) {
/*
* FIX ME
*/
}
return retval;
}
static void __exit aerdrv_service_exit(void)
{
pcie_port_service_unregister(&root_aerdrv);
}
module_init(aerdrv_service_init);
module_exit(aerdrv_service_exit);
可能的资源冲突
==============
由于PCI-PCI桥接端口设备的所有服务驱动被允许同时运行,下面列出了一些可能的资源冲突和
建议的解决方案。
MSI 和 MSI-X 向量资源
---------------------
一旦设备上的MSI或MSI-X中断被启用,它就会一直保持这种模式,直到它们再次被禁用。由于同
一个PCI-PCI桥接端口的服务驱动程序共享同一个物理设备,如果一个单独的服务驱动程序启用或
禁用MSI/MSI-X模式,可能会导致不可预知的行为。
为了避免这种情况,所有的服务驱动程序都不允许在其设备上切换中断模式。PCI Express端口
总线驱动程序负责确定中断模式,这对服务驱动程序来说应该是透明的。服务驱动程序只需要知道
分配给结构体pcie_device的字段irq的向量IRQ,当PCI Express端口总线驱动程序探测每
个服务驱动程序时,它被传入。服务驱动应该使用(struct pcie_device*)dev->irq来调用
request_irq/free_irq。此外,中断模式被存储在struct pcie_device的interrupt_mode
字段中。
PCI内存/IO映射的区域
--------------------
PCI Express电源管理(PME)、高级错误报告(AER)、热插拔(HP)和虚拟通道(VC)的服务
驱动程序访问PCI Express端口的PCI配置空间。在所有情况下,访问的寄存器是相互独立的。这
个补丁假定所有的服务驱动程序都会表现良好,不会覆盖其他服务驱动程序的配置设置。
PCI配置寄存器
-------------
每个服务驱动都在自己的功能结构体上运行PCI配置操作,除了PCI Express功能结构体,其中根控制
寄存器和设备控制寄存器是在PME和AER之间共享。这个补丁假定所有的服务驱动都会表现良好,不会
覆盖其他服务驱动的配置设置。
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/PCI/sysfs-pci.rst
:翻译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
:校译:
========================
通过sysfs访问PCI设备资源
========================
sysfs,通常挂载在/sys,在支持它的平台上提供对PCI资源的访问。例如,一个特定的总线可能看起
来像这样::
/sys/devices/pci0000:17
|-- 0000:17:00.0
| |-- class
| |-- config
| |-- device
| |-- enable
| |-- irq
| |-- local_cpus
| |-- remove
| |-- resource
| |-- resource0
| |-- resource1
| |-- resource2
| |-- revision
| |-- rom
| |-- subsystem_device
| |-- subsystem_vendor
| `-- vendor
`-- ...
最上面的元素描述了PCI域和总线号码。在这种情况下,域号是0000,总线号是17(两个值都是十六进制)。
这个总线在0号插槽中包含一个单一功能的设备。为了方便起见,我们复制了域和总线的编号。在设备目录
下有几个文件,每个文件都有自己的功能。
=================== =====================================================
文件 功能
=================== =====================================================
class PCI级别 (ascii, ro)
config PCI配置空间 (binary, rw)
device PCI设备 (ascii, ro)
enable 设备是否被启用 (ascii, rw)
irq IRQ编号 (ascii, ro)
local_cpus 临近CPU掩码(cpumask, ro)
remove 从内核的列表中删除设备 (ascii, wo)
resource PCI资源主机地址 (ascii, ro)
resource0..N PCI资源N,如果存在的话 (binary, mmap, rw\ [1]_)
resource0_wc..N_wc PCI WC映射资源N,如果可预取的话 (binary, mmap)
revision PCI修订版 (ascii, ro)
rom PCI ROM资源,如果存在的话 (binary, ro)
subsystem_device PCI子系统设备 (ascii, ro)
subsystem_vendor PCI子系统供应商 (ascii, ro)
vendor PCI供应商 (ascii, ro)
=================== =====================================================
::
ro - 只读文件
rw - 文件是可读和可写的
wo - 只写文件
mmap - 文件是可移动的
ascii - 文件包含ascii文本
binary - 文件包含二进制数据
cpumask - 文件包含一个cpumask类型的
.. [1] rw 仅适用于 IORESOURCE_IO(I/O 端口)区域
只读文件是信息性的,对它们的写入将被忽略,但 "rom "文件除外。可写文件可以用来在设备上执
行操作(例如,改变配置空间,分离设备)。 mmapable文件可以通过偏移量为0的文件的mmap获得,
可以用来从用户空间进行实际的设备编程。注意,有些平台不支持某些资源的mmapping,所以一定要
检查任何尝试的mmap的返回值。其中最值得注意的是I/O端口资源,它也提供读/写访问。
enable "文件提供了一个计数器,表明设备已经被启用了多少次。如果'enable'文件目前返回'4',
而一个'1'被呼入它,它将返回'5'。向它呼入一个'0'会减少计数。不过,即使它返回到0,一些初始
化可能也不会被逆转。
rom "文件很特别,因为它提供了对设备ROM文件的只读访问,如果有的话。然而,它在默认情况下是
禁用的,所以应用程序应该在尝试读取调用之前将字符串 "1 "写入该文件以启用它,并在访问之后将
"0 "写入该文件以禁用它。请注意,设备必须被启用,才能成功返回数据。如果驱动没有被绑定到设备
上,可以使用上面提到的 "enable "文件将其启用。
remove "文件是用来移除PCI设备的,通过向该文件写入一个非零的整数。这并不涉及任何形式的热插
拔功能,例如关闭设备的电源。该设备被从内核的PCI设备列表中移除,它的sysfs目录被移除,并且该
设备将被从任何连接到它的驱动程序中移除。移除PCI根总线是不允许的。
通过sysfs访问原有资源
---------------------
如果底层平台支持的话,传统的I/O端口和ISA内存资源也会在sysfs中提供。它们位于PCI类的层次结构
中,例如::
/sys/class/pci_bus/0000:17/
|-- bridge -> ../../../devices/pci0000:17
|-- cpuaffinity
|-- legacy_io
`-- legacy_mem
legacy_io文件是一个读/写文件,可以被应用程序用来做传统的端口I/O。应用程序应该打开该文件,寻
找所需的端口(例如0x3e8),并进行1、2或4字节的读或写。legacy_mem文件应该被mmapped,其偏移
量与所需的内存偏移量相对应,例如0xa0000用于VGA帧缓冲器。然后,应用程序可以简单地解除引用返回
的指针(当然是在检查了错误之后)来访问遗留内存空间。
支持新平台上的PCI访问
---------------------
为了支持上述的PCI资源映射,Linux平台代码最好定义ARCH_GENERIC_PCI_MMAP_RESOURCE并使用该
功能的通用实现。为了支持通过/proc/bus/pci中的文件实现mmap()的历史接口,平台也可以设置
HAVE_PCI_MMAP。
另外,设置了 HAVE_PCI_MMAP 的平台可以提供他们自己的 pci_mmap_page_range() 实现,而不是定
义 ARCH_GENERIC_PCI_MMAP_RESOURCE。
支持PCI资源的写组合映射的平台必须定义arch_can_pci_mmap_wc(),当写组合被允许时,在运行时应
评估为非零。支持I/O资源映射的平台同样定义arch_can_pci_mmap_io()。
遗留资源由HAVE_PCI_LEGACY定义保护。希望支持遗留功能的平台应该定义它并提供 pci_legacy_read,
pci_legacy_write 和 pci_mmap_legacy_page_range 函数。
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/accounting/delay-accounting.rst
:Translator: Yang Yang <yang.yang29@zte.com.cn>
========
延时计数
========
任务在等待某些内核资源可用时,会造成延时。例如一个可运行的任务可能会等待
一个空闲CPU来运行。
基于每任务的延时计数功能度量由以下情况造成的任务延时:
a) 等待一个CPU(任务为可运行)
b) 完成由该任务发起的块I/O同步请求
c) 页面交换
d) 内存回收
并将这些统计信息通过taskstats接口提供给用户空间。
这些延时信息为适当的调整任务CPU优先级、io优先级、rss限制提供反馈。重要任务
长期延时,表示可能需要提高其相关优先级。
通过使用taskstats接口,本功能还可提供一个线程组(对应传统Unix进程)所有任务
(或线程)的总延时统计信息。此类汇总往往是需要的,由内核来完成更加高效。
用户空间的实体,特别是资源管理程序,可将延时统计信息汇总到任意组中。为实现
这一点,任务的延时统计信息在其生命周期内和退出时皆可获取,从而确保可进行
连续、完整的监控。
接口
----
延时计数使用taskstats接口,该接口由本目录另一个单独的文档详细描述。Taskstats
向用户态返回一个通用数据结构,对应每pid或每tgid的统计信息。延时计数功能填写
该数据结构的特定字段。见
include/linux/taskstats.h
其描述了延时计数相关字段。系统通常以计数器形式返回 CPU、同步块 I/O、交换、内存
回收等的累积延时。
取任务某计数器两个连续读数的差值,将得到任务在该时间间隔内等待对应资源的总延时。
当任务退出时,内核会将包含每任务的统计信息发送给用户空间,而无需额外的命令。
若其为线程组最后一个退出的任务,内核还会发送每tgid的统计信息。更多详细信息见
taskstats接口的描述。
tools/accounting目录中的用户空间程序getdelays.c提供了一些简单的命令,用以显示
延时统计信息。其也是使用taskstats接口的示例。
用法
----
使用以下配置编译内核::
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASKSTATS=y
延时计数在启动时默认关闭。
若需开启,在启动参数中增加::
delayacct
本文后续的说明基于延时计数已开启。也可在系统运行时,使用sysctl的
kernel.task_delayacct进行开关。注意,只有在启用延时计数后启动的
任务才会有相关信息。
系统启动后,使用类似getdelays.c的工具获取任务或线程组(tgid)的延时信息。
getdelays命令的一般格式::
getdelays [-t tgid] [-p pid] [-c cmd...]
获取pid为10的任务从系统启动后的延时信息::
# ./getdelays -p 10
(输出信息和下例相似)
获取所有tgid为5的任务从系统启动后的总延时信息::
# ./getdelays -t 5
CPU count real total virtual total delay total
7876 92005750 100000000 24001500
IO count delay total
0 0
SWAP count delay total
0 0
RECLAIM count delay total
0 0
获取指定简单命令运行时的延时信息::
# ./getdelays -c ls /
bin data1 data3 data5 dev home media opt root srv sys usr
boot data2 data4 data6 etc lib mnt proc sbin subdomain tmp var
CPU count real total virtual total delay total
6 4000250 4000000 0
IO count delay total
0 0
SWAP count delay total
0 0
RECLAIM count delay total
0 0
......@@ -15,11 +15,11 @@
.. toctree::
:maxdepth: 1
delay-accounting
psi
taskstats
Todolist:
cgroupstats
delay-accounting
taskstats
taskstats-struct
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/accounting/taskstats.rst
:Translator: Yang Yang <yang.yang29@zte.com.cn>
================
每任务的统计接口
================
Taskstats是一个基于netlink的接口,用于从内核向用户空间发送每任务及每进程的
统计信息。
Taskstats设计目的:
- 在任务生命周期内和退出时高效的提供统计信息
- 统一不同计数子系统的接口
- 支持未来计数系统的扩展
术语
----
“pid”、“tid”、“任务”互换使用,用于描述由struct task_struct定义的标准
Linux任务。“每pid的统计数据”等价于“每任务的统计数据”。
“tgid”、“进程”、“线程组”互换使用,用于描述共享mm_struct的任务集,
也就是传统的Unix进程。尽管使用了tgid这个词,即使一个任务是线程组组长,
对它的处理也没有什么不同。只要一个进程还有任何归属它的任务,它就被认为
活着。
用法
----
为了在任务生命周期内获得统计信息,用户空间需打开一个单播的netlink套接字
(NETLINK_GENERIC族)然后发送指定pid或tgid的命令。响应消息中包含单个
任务的统计信息(若指定了pid)或进程所有任务汇总的统计信息(若指定了tgid)。
为了在任务退出时获取统计信息,用户空间的监听者发送一个指定cpu掩码的注册命令。
cpu掩码内的cpu上有任务退出时,每pid的统计信息将发送给注册成功的监听者。使用
cpu掩码可以限制一个监听者收到的数据,并有助于对netlink接口进行流量控制,后文
将进行更详细的解释。
如果正在退出的任务是线程组中最后一个退出的线程,额外一条包含每tgid统计信息的
记录也将发送给用户空间。后者包含线程组中所有线程(包括过去和现在)的每pid统计
信息总和。
getdelays.c是一个简单的示例,用以演示如何使用taskstats接口获取延迟统计信息。
用户可注册cpu掩码、发送命令和处理响应、监听每tid/tgid退出数据、将收到的数据
写入文件、通过增大接收缓冲区进行基本的流量控制。
接口
----
内核用户接口封装在include/linux/taskstats.h。
为避免本文档随着接口的演进而过期,本文仅给出当前版本的概要。当本文与taskstats.h
不一致时,以taskstats.h为准。
struct taskstats是每pid和每tgid数据共用的计数结构体。它是版本化的,可在内核新增
计数子系统时进行扩展。taskstats.h中定义了各字段及语义。
用户、内核空间的数据交换是属于NETLINK_GENERIC族的netlink消息,使用netlink属性
接口。消息格式如下::
+----------+- - -+-------------+-------------------+
| nlmsghdr | Pad | genlmsghdr | taskstats payload |
+----------+- - -+-------------+-------------------+
Taskstats载荷有三种类型:
1. 命令:由用户发送给内核。获取指定pid/tgid数据的命令包含一个类型为
TASKSTATS_CMD_ATTR_PID/TGID的属性,该属性包含u32的pid或tgid载荷。
pid/tgid指示用户空间要统计的任务/进程。
注册/注销获取指定cpu集上退出数据的命令包含一个类型为
TASKSTATS_CMD_ATTR_REGISTER/DEREGISTER_CPUMASK的属性,该属性包含cpu掩码载荷。
cpu掩码是以ascii码表示,用逗号分隔的cpu范围。例如若需监听1,2,3,5,7,8号cpu的
退出数据,cpu掩码表示为"1-3,5,7-8"。若用户空间在关闭监听套接字前忘了注销监听
的cpu集,随着时间的推移,内核会清理此监听集。但是,出于提效的目的,建议明确
执行注销。
2. 命令的应答:内核发出应答用户空间的命令。载荷有三类属性:
a) TASKSTATS_TYPE_AGGR_PID/TGID: 本属性不包含载荷,用以指示其后为被统计对象
的pig/tgid。
b) TASKSTATS_TYPE_PID/TGID:本属性的载荷为pig/tgid,其统计信息将被返回。
c) TASKSTATS_TYPE_STATS:本属性的载荷为一个struct taskstats实例。每pid和
每tgid统计信息共用该结构体。
3. 内核会在任务退出时发送新消息。其载荷包含一系列以下类型的属性:
a) TASKSTATS_TYPE_AGGR_PID:指示其后两个属性为pid+stats。
b) TASKSTATS_TYPE_PID:包含退出任务的pid。
c) TASKSTATS_TYPE_STATS:包含退出任务的每pid统计信息
d) TASKSTATS_TYPE_AGGR_TGID:指示其后两个属性为tgid+stats。
e) TASKSTATS_TYPE_TGID:包含任务所属进程的tgid
f) TASKSTATS_TYPE_STATS:包含退出任务所属进程的每tgid统计信息
每tgid的统计
------------
除了每任务的统计信息,taskstats还提供每进程的统计信息,因为资源管理通常以进程
粒度完成,并且仅在用户空间聚合任务统计信息效率低下且可能不准确(缺乏原子性)。
然而,除了每任务统计信息,在内核中维护每进程统计信息存在额外的时间和空间开销。
为解决此问题,taskstats代码将退出任务的统计信息累积到进程范围的数据结构中。
当进程最后一个任务退出时,累积的进程级数据也会发送到用户空间(与每任务数据一起)。
当用户查询每tgid数据时,内核将指定线程组中所有活动线程的统计信息相加,并添加到
该线程组的累积总数(含之前退出的线程)。
扩展taskstats
-------------
有两种方法可在未来修改内核扩展taskstats接口,以导出更多的每任务/进程统计信息:
1. 在现有struct taskstats末尾增加字段。该结构体中的版本号确保了向后兼容性。
用户空间将仅使用与其版本对应的结构体字段。
2. 定义单独的统计结构体并使用netlink属性接口返回对应的数据。由于用户空间独立
处理每个netlink属性,所以总是可以忽略其不理解类型的属性(因为使用了旧版本接口)。
在1.和2.之间进行选择,属于权衡灵活性和开销的问题。若仅需增加少数字段,那么1.是
首选方法,因为内核和用户空间无需承担处理新netlink属性的开销。但若新字段过多的
扩展现有结构体,导致不同的用户空间计数程序不必要的接收大型结构体,而对结构体
字段并不感兴趣,那么2.是值得的。
Taskstats的流量控制
-------------------
当退出任务数速率变大,监听者可能跟不上内核发送每tid/tgid退出数据的速率,而导致
数据丢失。taskstats结构体变大、cpu数量上升,都会导致这种可能性增加。
为避免统计信息丢失,用户空间应执行以下操作中至少一项:
- 增大监听者用于接收退出数据的netlink套接字接收缓存区。
- 创建更多的监听者,减少每个监听者监听的cpu数量。极端情况下可为每个cpu创建
一个监听者。用户还可考虑将监听者的cpu亲和性设置为监听cpu的子集,特别是当他们
仅监听一个cpu。
尽管采取了这些措施,若用户空间仍收到指示接收缓存区溢出的ENOBUFS错误消息,
则应采取其他措施处理数据丢失。
......@@ -133,7 +133,7 @@ Linux内核5.x版本 <http://kernel.org/>
即使只升级一个小版本,也不要跳过此步骤。每个版本中都会添加新的配置选项,
如果配置文件没有按预定设置,就会出现奇怪的问题。如果您想以最少的工作量
将现有配置升级到新版本,请使用 ``makeoldconfig`` ,它只会询问您新配置
将现有配置升级到新版本,请使用 ``make oldconfig`` ,它只会询问您新配置
选项的答案。
- 其他配置命令包括::
......@@ -161,7 +161,7 @@ Linux内核5.x版本 <http://kernel.org/>
"make ${PLATFORM}_defconfig"
使用arch/$arch/configs/${PLATFORM}_defconfig中
的默认选项值创建一个./.config文件。
用“makehelp”来获取您体系架构中所有可用平台的列表。
用“make help”来获取您体系架构中所有可用平台的列表。
"make allyesconfig"
通过尽可能将选项值设置为“y”,创建一个
......@@ -197,9 +197,10 @@ Linux内核5.x版本 <http://kernel.org/>
"make localyesconfig" 与localmodconfig类似,只是它会将所有模块选项转换
为内置(=y)。你可以同时通过LMC_KEEP保留模块。
"make kvmconfig" 为kvm客体内核支持启用其他选项。
"make kvm_guest.config"
为kvm客户机内核支持启用其他选项。
"make xenconfig" 为xen dom0客体内核支持启用其他选项。
"make xen.config" 为xen dom0客户机内核支持启用其他选项。
"make tinyconfig" 配置尽可能小的内核。
......@@ -229,7 +230,7 @@ Linux内核5.x版本 <http://kernel.org/>
请注意,您仍然可以使用此内核运行a.out用户程序。
- 执行 ``make`` 来创建压缩内核映像。如果您安装了lilo以适配内核makefile,
那么也可以进行 ``makeinstall`` ,但是您可能需要先检查特定的lilo设置。
那么也可以进行 ``make install`` ,但是您可能需要先检查特定的lilo设置。
实际安装必须以root身份执行,但任何正常构建都不需要。
无须徒然使用root身份。
......
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/admin-guide/cputopology.rst
:翻译:
唐艺舟 Tang Yizhou <tangyeechou@gmail.com>
==========================
如何通过sysfs将CPU拓扑导出
==========================
CPU拓扑信息通过sysfs导出。显示的项(属性)和某些架构的/proc/cpuinfo输出相似。它们位于
/sys/devices/system/cpu/cpuX/topology/。请阅读ABI文件:
Documentation/ABI/stable/sysfs-devices-system-cpu。
drivers/base/topology.c是体系结构中性的,它导出了这些属性。然而,die、cluster、book、
draw这些层次结构相关的文件仅在体系结构提供了下文描述的宏的条件下被创建。
对于支持这个特性的体系结构,它必须在include/asm-XXX/topology.h中定义这些宏中的一部分::
#define topology_physical_package_id(cpu)
#define topology_die_id(cpu)
#define topology_cluster_id(cpu)
#define topology_core_id(cpu)
#define topology_book_id(cpu)
#define topology_drawer_id(cpu)
#define topology_sibling_cpumask(cpu)
#define topology_core_cpumask(cpu)
#define topology_cluster_cpumask(cpu)
#define topology_die_cpumask(cpu)
#define topology_book_cpumask(cpu)
#define topology_drawer_cpumask(cpu)
``**_id macros`` 的类型是int。
``**_cpumask macros`` 的类型是 ``(const) struct cpumask *`` 。后者和恰当的
``**_siblings`` sysfs属性对应(除了topology_sibling_cpumask(),它和thread_siblings
对应)。
为了在所有体系结构上保持一致,include/linux/topology.h提供了上述所有宏的默认定义,以防
它们未在include/asm-XXX/topology.h中定义:
1) topology_physical_package_id: -1
2) topology_die_id: -1
3) topology_cluster_id: -1
4) topology_core_id: 0
5) topology_book_id: -1
6) topology_drawer_id: -1
7) topology_sibling_cpumask: 仅入参CPU
8) topology_core_cpumask: 仅入参CPU
9) topology_cluster_cpumask: 仅入参CPU
10) topology_die_cpumask: 仅入参CPU
11) topology_book_cpumask: 仅入参CPU
12) topology_drawer_cpumask: 仅入参CPU
此外,CPU拓扑信息由/sys/devices/system/cpu提供,包含下述文件。输出对应的内部数据源放在
方括号("[]")中。
=========== ==================================================================
kernel_max: 内核配置允许的最大CPU下标值。[NR_CPUS-1]
offline: 由于热插拔移除或者超过内核允许的CPU上限(上文描述的kernel_max)
导致未上线的CPU。[~cpu_online_mask + cpus >= NR_CPUS]
online: 在线的CPU,可供调度使用。[cpu_online_mask]
possible: 已被分配资源的CPU,如果它们CPU实际存在,可以上线。
[cpu_possible_mask]
present: 被系统识别实际存在的CPU。[cpu_present_mask]
=========== ==================================================================
上述输出的格式和cpulist_parse()兼容[参见 <linux/cpumask.h>]。下面给些例子。
在本例中,系统中有64个CPU,但是CPU 32-63超过了kernel_max值,因为NR_CPUS配置项是32,
取值范围被限制为0..31。此外注意CPU2和4-31未上线,但是可以上线,因为它们同时存在于
present和possible::
kernel_max: 31
offline: 2,4-31,32-63
online: 0-1,3
possible: 0-31
present: 0-31
在本例中,NR_CPUS配置项是128,但内核启动时设置possible_cpus=144。系统中有4个CPU,
CPU2被手动设置下线(也是唯一一个可以上线的CPU)::
kernel_max: 127
offline: 2,4-127,128-143
online: 0-1,3
possible: 0-127
present: 0-3
阅读Documentation/core-api/cpu_hotplug.rst可了解开机参数possible_cpus=NUM,同时还
可以了解各种cpumask的信息。
......@@ -65,6 +65,7 @@ Todolist:
clearing-warn-once
cpu-load
cputopology
lockup-watchdogs
unicode
sysrq
......@@ -84,7 +85,6 @@ Todolist:
cgroup-v1/index
cgroup-v2
cifs/index
cputopology
dell_rbu
device-mapper/index
edid
......
......@@ -7,7 +7,9 @@
司延腾 Yanteng Si <siyanteng@loongson.cn>
.. _cn_core.rst:
:校译:
唐艺舟 Tang Yizhou <tangyeechou@gmail.com>
====================================
CPUFreq核心和CPUFreq通知器的通用说明
......@@ -29,10 +31,10 @@ CPUFreq核心和CPUFreq通知器的通用说明
======================
cpufreq核心代码位于drivers/cpufreq/cpufreq.c中。这些cpufreq代码为CPUFreq架构的驱
动程序(那些操作硬件切换频率的代码)以及 "通知器 "提供了一个标准化的接口。
这些是设备驱动程序或需要了解策略变化的其它内核部分(如 ACPI 热量管理)或所有频率更改(除
计时代码外),甚至需要强制确定速度限制的通知器(如 ARM 架构上的 LCD 驱动程序)
此外, 内核 "常数" loops_per_jiffy会根据频率变化而更新。
动程序(那些执行硬件频率切换的代码)以及 "通知器" 提供了一个标准化的接口。
包括设备驱动程序;需要了解策略变化(如 ACPI 热量管理),或所有频率变化(如计时代码),
甚至需要强制限制为指定频率(如 ARM 架构上的 LCD 驱动程序)的其它内核组件
此外,内核 "常数" loops_per_jiffy 会根据频率变化而更新。
cpufreq策略的引用计数由 cpufreq_cpu_get 和 cpufreq_cpu_put 来完成,以确保 cpufreq 驱
动程序被正确地注册到核心中,并且驱动程序在 cpufreq_put_cpu 被调用之前不会被卸载。这也保证
......@@ -41,7 +43,7 @@ cpufreq策略的引用计数由 cpufreq_cpu_get 和 cpufreq_cpu_put 来完成,
2. CPUFreq 通知器
====================
CPUFreq通知器符合标准的内核通知器接口。
CPUFreq通知器遵循标准的内核通知器接口。
关于通知器的细节请参阅 linux/include/linux/notifier.h。
这里有两个不同的CPUfreq通知器 - 策略通知器和转换通知器。
......@@ -69,20 +71,20 @@ CPUFreq通知器符合标准的内核通知器接口。
第三个参数是一个包含如下值的结构体cpufreq_freqs:
===== ====================
cpu 受影响cpu的编号
====== ===============================
policy 指向struct cpufreq_policy的指针
old 旧频率
new 新频率
flags cpufreq驱动的标志
===== ====================
====== ===============================
3. 含有Operating Performance Point (OPP)的CPUFreq表的生成
==================================================================
关于OPP的细节请参阅 Documentation/power/opp.rst
dev_pm_opp_init_cpufreq_table -
这个功能提供了一个随时可用的转换程序,用来将OPP层关于可用频率的内部信息翻译成一种容易提供给
cpufreq的格式。
这个函数提供了一个随时可用的转换例程,用来将OPP层关于可用频率的内部信息翻译成一种
cpufreq易于处理的格式。
.. Warning::
......
......@@ -8,13 +8,15 @@
司延腾 Yanteng Si <siyanteng@loongson.cn>
.. _cn_cpufreq-stats.rst:
:校译:
唐艺舟 Tang Yizhou <tangyeechou@gmail.com>
==========================================
sysfs CPUFreq Stats的一般说明
==========================================
用户信息
为使用者准备的信息
作者: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
......@@ -29,17 +31,16 @@ sysfs CPUFreq Stats的一般说明
1. 简介
===============
cpufreq-stats是一个为每个CPU提供CPU频率统计的驱动。
这些统计数据在/sysfs中以一堆只读接口的形式提供。这个接口(在配置好后)将出现在
/sysfs(<sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/)中cpufreq下的一个单
独的目录中,提供给每个CPU。
各种统计数据将在此目录下形成只读文件。
cpufreq-stats是一种为每个CPU提供CPU频率统计的驱动。
这些统计数据以/sysfs中一系列只读接口的形式呈现。cpufreq-stats接口(若已配置)将为每个CPU生成
/sysfs(<sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/)中cpufreq目录下的stats目录。
各项统计数据将在stats目录下形成对应的只读文件。
此驱动是独立于任何可能运行在你所用CPU上的特定cpufreq_driver而设计的。因此,它将与所有
cpufreq_driver一起工作。
此驱动是以独立于任何可能运行在你所用CPU上的特定cpufreq_driver的方式设计的。因此,它将能和任何
cpufreq_driver协同工作。
2. 提供的统计数据(举例说明)
2. 已提供的统计数据(有例子)
=====================================
cpufreq stats提供了以下统计数据(在下面详细解释)。
......@@ -48,8 +49,8 @@ cpufreq stats提供了以下统计数据(在下面详细解释)。
- total_trans
- trans_table
所有的统计数据将从统计驱动被载入的时间(或统计被重置的时间)开始,到某一统计数据被读取的时间为止。
显然,统计驱动不会有任何关于统计驱动载入之前的频率转换信息。
所有统计数据来自以下时间范围:从统计驱动被加载的时间(或统计数据被重置的时间)开始,到某一统计数据被读取的时间为止。
显然,统计驱动不会保存它被加载之前的任何频率转换信息。
::
......@@ -64,14 +65,14 @@ cpufreq stats提供了以下统计数据(在下面详细解释)。
- **reset**
只写属性,可用于重置统计计数器。这对于评估不同调节器的系统行为非常有用,且无需重启。
只写属性,可用于重置统计计数器。这对于评估不同调节器的系统行为非常有用,且无需重启。
- **time_in_state**
项给出了这个CPU所支持的每个频率所花费的时间。cat输出的每一行都会有"<frequency>
<time>"对,表示这个CPU在<frequency>上花费了<time>个usertime单位的时间。这里的
usertime单位是10mS(类似于/proc中输出的其他时间)。
文件给出了在本CPU支持的每个频率上分别花费的时间。cat输出的每一行都是一个"<frequency>
<time>"对,表示这个CPU在<frequency>上花费了<time>个usertime单位的时间。输出的每一行对应
一个CPU支持的频率。这里usertime单位是10mS(类似于/proc导出的其它时间)。
::
......@@ -85,7 +86,7 @@ usertime单位是10mS(类似于/proc中输出的其他时间)。
- **total_trans**
给出了这个CPU上频率转换的总次数。cat的输出将有一个单一的计数,这就是频率转换的总数。
此文件给出了这个CPU频率转换的总次数。cat的输出是一个计数值,它就是频率转换的总次数。
::
......@@ -94,10 +95,10 @@ usertime单位是10mS(类似于/proc中输出的其他时间)。
- **trans_table**
这将提供所有CPU频率转换的细粒度信息。这里的cat输出是一个二维矩阵,其中一个条目<i, j>(第
本文件提供所有CPU频率转换的细粒度信息。这里的cat输出是一个二维矩阵,其中一个条目<i, j>(第
i行,第j列)代表从Freq_i到Freq_j的转换次数。Freq_i行和Freq_j列遵循驱动最初提供给cpufreq
的频率表的排序顺序,因此可以排序(升序或降序)或不排序。 这里的输出也包含了每行每列的实际
频率值,以便更好地阅读。
心的频率表的排列顺序,因此可以已排序(升序或降序)或未排序。这里的输出也包含了实际
频率值,分别按行和按列显示,以便更好地阅读。
如果转换表大于PAGE_SIZE,读取时将返回一个-EFBIG错误。
......@@ -115,7 +116,7 @@ i行,第j列)代表从Freq_i到Freq_j的转换次数。Freq_i行和Freq_j列
3. 配置cpufreq-stats
============================
在你的内核中配置cpufreq-stats::
按以下方式在你的内核中配置cpufreq-stats::
Config Main Menu
Power management options (ACPI, APM) --->
......@@ -124,7 +125,7 @@ i行,第j列)代表从Freq_i到Freq_j的转换次数。Freq_i行和Freq_j列
[*] CPU frequency translation statistics
"CPU Frequency scaling" (CONFIG_CPU_FREQ) 应该被启用配置cpufreq-stats。
"CPU Frequency scaling" (CONFIG_CPU_FREQ) 应该被启用,以支持配置cpufreq-stats。
"CPU frequency translation statistics" (CONFIG_CPU_FREQ_STAT)提供了包括
time_in_state、total_trans和trans_table的统计数据。
......
......@@ -22,13 +22,13 @@ Documentation/translations/zh_CN/dev-tools/testing-overview.rst
:maxdepth: 2
testing-overview
sparse
gcov
kasan
Todolist:
- coccinelle
- sparse
- kcov
- ubsan
- kmemleak
......
Chinese translated version of Documentation/dev-tools/sparse.rst
Copyright 2004 Linus Torvalds
Copyright 2004 Pavel Machek <pavel@ucw.cz>
Copyright 2006 Bob Copeland <me@bobcopeland.com>
If you have any comment or update to the content, please contact the
original document maintainer directly. However, if you have a problem
communicating in English you can also ask the Chinese maintainer for
help. Contact the Chinese maintainer if this translation is outdated
or if there is a problem with the translation.
.. include:: ../disclaimer-zh_CN.rst
Chinese maintainer: Li Yang <leoyang.li@nxp.com>
---------------------------------------------------------------------
Documentation/dev-tools/sparse.rst 的中文翻译
:Original: Documentation/dev-tools/sparse.rst
如果想评论或更新本文的内容,请直接联系原文档的维护者。如果你使用英文
交流有困难的话,也可以向中文版维护者求助。如果本翻译更新不及时或者翻
译存在问题,请联系中文版维护者。
:翻译:
中文版维护者: 李阳 Li Yang <leoyang.li@nxp.com>
中文版翻译者: 李阳 Li Yang <leoyang.li@nxp.com>
Li Yang <leoyang.li@nxp.com>
:校译:
以下为正文
---------------------------------------------------------------------
司延腾 Yanteng Si <siyanteng@loongson.cn>
Copyright 2004 Linus Torvalds
Copyright 2004 Pavel Machek <pavel@ucw.cz>
Copyright 2006 Bob Copeland <me@bobcopeland.com>
.. _cn_sparse:
Sparse
======
Sparse是一个C程序的语义检查器;它可以用来发现内核代码的一些潜在问题。 关
于sparse的概述,请参见https://lwn.net/Articles/689907/;本文档包含
一些针对内核的sparse信息。
关于sparse的更多信息,主要是关于它的内部结构,可以在它的官方网页上找到:
https://sparse.docs.kernel.org。
使用 sparse 工具做类型检查
~~~~~~~~~~~~~~~~~~~~~~~~~~
"__bitwise" 是一种类型属性,所以你应该这样使用它
"__bitwise" 是一种类型属性,所以你应该这样使用它::
typedef int __bitwise pm_request_t;
......@@ -48,7 +48,7 @@ Copyright 2006 Bob Copeland <me@bobcopeland.com>
坦白来说,你并不需要使用枚举类型。上面那些实际都可以浓缩成一个特殊的"int
__bitwise"类型。
所以更简单的办法只要这样做
所以更简单的办法只要这样做::
typedef int __bitwise pm_request_t;
......@@ -60,25 +60,42 @@ __bitwise"类型。
一个小提醒:常数整数"0"是特殊的。你可以直接把常数零当作位方式整数使用而
不用担心 sparse 会抱怨。这是因为"bitwise"(恰如其名)是用来确保不同位方
式类型不会被弄混(小尾模式,大尾模式,cpu尾模式,或者其他),对他们来说
常数"0"确实是特殊的。
常数"0"确实 **是** 特殊的。
使用sparse进行锁检查
--------------------
下面的宏对于 gcc 来说是未定义的,在 sparse 运行时定义,以使用sparse的“上下文”
跟踪功能,应用于锁定。 这些注释告诉 sparse 什么时候有锁,以及注释的函数的进入和
退出。
__must_hold - 指定的锁在函数进入和退出时被持有。
__acquires - 指定的锁在函数退出时被持有,但在进入时不被持有。
__releases - 指定的锁在函数进入时被持有,但在退出时不被持有。
如果函数在不持有锁的情况下进入和退出,在函数内部以平衡的方式获取和释放锁,则不
需要注释。
上面的三个注释是针对sparse否则会报告上下文不平衡的情况。
获取 sparse 工具
~~~~~~~~~~~~~~~~
你可以从 Sparse 的主页获取最新的发布版本:
http://www.kernel.org/pub/linux/kernel/people/josh/sparse/
https://www.kernel.org/pub/software/devel/sparse/dist/
或者,你也可以使用 git 克隆最新的 sparse 开发版本:
git://git.kernel.org/pub/scm/linux/kernel/git/josh/sparse.git
git://git.kernel.org/pub/scm/devel/sparse/sparse.git
一旦你下载了源码,只要以普通用户身份运行:
make
make install
它将会被自动安装到你的 ~/bin 目录下。
如果是标准的用户,它将会被自动安装到你的~/bin目录下。
使用 sparse 工具
~~~~~~~~~~~~~~~~
......
......@@ -23,6 +23,11 @@
另外,随时欢迎您对内核文档进行改进;如果您想提供帮助,请加入vger.kernel.org
上的linux-doc邮件列表。
顺便说下,中文文档也需要遵守内核编码风格,风格中中文和英文的主要不同就是中文
的字符标点占用两个英文字符宽度, 所以,当英文要求不要超过每行100个字符时,
中文就不要超过50个字符。另外,也要注意'-','=' 等符号与相关标题的对齐。在将
补丁提交到社区之前,一定要进行必要的checkpatch.pl检查和编译测试。
许可证文档
----------
......@@ -106,6 +111,7 @@ TODOList:
virt/index
infiniband/index
accounting/index
scheduler/index
TODOList:
......@@ -140,7 +146,6 @@ TODOList:
* PCI/index
* scsi/index
* misc-devices/index
* scheduler/index
* mhi/index
体系结构无关文档
......
This diff is collapsed.
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/scheduler/index.rst
:翻译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
:校译:
===============
Linux调度器
===============
.. toctree::
:maxdepth: 1
completion
sched-arch
sched-bwc
sched-design-CFS
sched-domains
sched-capacity
TODOList:
sched-bwc
sched-deadline
sched-energy
sched-nice-design
sched-rt-group
sched-stats
text_files
.. only:: subproject and html
Indices
=======
* :ref:`genindex`
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/scheduler/sched-arch.rst
:翻译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
:校译:
===============================
架构特定代码的CPU调度器实现提示
===============================
Nick Piggin, 2005
上下文切换
==========
1. 运行队列锁
默认情况下,switch_to arch函数在调用时锁定了运行队列。这通常不是一个问题,除非
switch_to可能需要获取运行队列锁。这通常是由于上下文切换中的唤醒操作造成的。见
arch/ia64/include/asm/switch_to.h的例子。
为了要求调度器在运行队列解锁的情况下调用switch_to,你必须在头文件
中`#define __ARCH_WANT_UNLOCKED_CTXSW`(通常是定义switch_to的那个文件)。
在CONFIG_SMP的情况下,解锁的上下文切换对核心调度器的实现只带来了非常小的性能损
失。
CPU空转
=======
你的cpu_idle程序需要遵守以下规则:
1. 现在抢占应该在空闲的例程上禁用。应该只在调用schedule()时启用,然后再禁用。
2. need_resched/TIF_NEED_RESCHED 只会被设置,并且在运行任务调用 schedule()
之前永远不会被清除。空闲线程只需要查询need_resched,并且永远不会设置或清除它。
3. 当cpu_idle发现(need_resched() == 'true'),它应该调用schedule()。否则
它不应该调用schedule()。
4. 在检查need_resched时,唯一需要禁用中断的情况是,我们要让处理器休眠到下一个中
断(这并不对need_resched提供任何保护,它可以防止丢失一个中断):
4a. 这种睡眠类型的常见问题似乎是::
local_irq_disable();
if (!need_resched()) {
local_irq_enable();
*** resched interrupt arrives here ***
__asm__("sleep until next interrupt");
}
5. 当need_resched变为高电平时,TIF_POLLING_NRFLAG可以由不需要中断来唤醒它们
的空闲程序设置。换句话说,它们必须定期轮询need_resched,尽管做一些后台工作或
进入低CPU优先级可能是合理的。
- 5a. 如果TIF_POLLING_NRFLAG被设置,而我们确实决定进入一个中断睡眠,那
么需要清除它,然后发出一个内存屏障(接着测试need_resched,禁用中断,如3中解释)。
arch/x86/kernel/process.c有轮询和睡眠空闲函数的例子。
可能出现的arch/问题
===================
我发现的可能的arch问题(并试图解决或没有解决)。:
ia64 - safe_halt的调用与中断相比,是否很荒谬? (它睡眠了吗) (参考 #4a)
sh64 - 睡眠与中断相比,是否很荒谬? (参考 #4a)
sparc - 在这一点上,IRQ是开着的(?),把local_irq_save改为_disable。
- 待办事项: 需要第二个CPU来禁用抢占 (参考 #1)
This diff is collapsed.
This diff is collapsed.
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/scheduler/sched-domains.rst
:翻译:
唐艺舟 Tang Yizhou <tangyeechou@gmail.com>
:校译:
司延腾 Yanteng Si <siyanteng@loongson.cn>
======
调度域
======
每个CPU有一个“基”调度域(struct sched_domain)。调度域层次结构从基调度域构建而来,可
通过->parent指针自下而上遍历。->parent必须以NULL结尾,调度域结构体必须是per-CPU的,
因为它们无锁更新。
每个调度域管辖数个CPU(存储在->span字段中)。一个调度域的span必须是它的子调度域span的
超集(如有需求出现,这个限制可以放宽)。CPU i的基调度域必须至少管辖CPU i。每个CPU的
顶层调度域通常将会管辖系统中的全部CPU,尽管严格来说这不是必须的,假如是这样,会导致某些
CPU出现永远不会被指定任务运行的情况,直到允许的CPU掩码被显式设定。调度域的span字段意味
着“在这些CPU中做进程负载均衡”。
每个调度域必须具有一个或多个CPU调度组(struct sched_group),它们以单向循环链表的形式
组织,存储在->groups指针中。这些组的CPU掩码的并集必须和调度域span字段一致。->groups
指针指向的这些组包含的CPU,必须被调度域管辖。组包含的是只读数据,被创建之后,可能被多个
CPU共享。任意两个组的CPU掩码的交集不一定为空,如果是这种情况,对应调度域的SD_OVERLAP
标志位被设置,它管辖的调度组可能不能在多个CPU中共享。
调度域中的负载均衡发生在调度组中。也就是说,每个组被视为一个实体。组的负载被定义为它
管辖的每个CPU的负载之和。仅当组的负载不均衡后,任务才在组之间发生迁移。
在kernel/sched/core.c中,trigger_load_balance()在每个CPU上通过scheduler_tick()
周期执行。在当前运行队列下一个定期调度再平衡事件到达后,它引发一个软中断。负载均衡真正
的工作由run_rebalance_domains()->rebalance_domains()完成,在软中断上下文中执行
(SCHED_SOFTIRQ)。
后一个函数有两个入参:当前CPU的运行队列、它在scheduler_tick()调用时是否空闲。函数会从
当前CPU所在的基调度域开始迭代执行,并沿着parent指针链向上进入更高层级的调度域。在迭代
过程中,函数会检查当前调度域是否已经耗尽了再平衡的时间间隔,如果是,它在该调度域运行
load_balance()。接下来它检查父调度域(如果存在),再后来父调度域的父调度域,以此类推。
起初,load_balance()查找当前调度域中最繁忙的调度组。如果成功,在该调度组管辖的全部CPU
的运行队列中找出最繁忙的运行队列。如能找到,对当前的CPU运行队列和新找到的最繁忙运行
队列均加锁,并把任务从最繁忙队列中迁移到当前CPU上。被迁移的任务数量等于在先前迭代执行
中计算出的该调度域的调度组的不均衡值。
实现调度域
==========
基调度域会管辖CPU层次结构中的第一层。对于超线程(SMT)而言,基调度域将会管辖同一个物理
CPU的全部虚拟CPU,每个虚拟CPU对应一个调度组。
在SMP中,基调度域的父调度域将会管辖同一个结点中的全部物理CPU,每个调度组对应一个物理CPU。
接下来,如果是非统一内存访问(NUMA)系统,SMP调度域的父调度域将管辖整个机器,一个结点的
CPU掩码对应一个调度组。亦或,你可以使用多级NUMA;举例来说Opteron处理器,可能仅用一个
调度域来覆盖它的一个NUMA层级。
实现者需要阅读include/linux/sched/sd_flags.h的注释:读SD_*来了解具体情况以及调度域的
SD标志位调节了哪些东西。
体系结构可以把指定的拓扑层级的通用调度域构建器和默认的SD标志位覆盖掉,方法是创建一个
sched_domain_topology_level数组,并以该数组作为入参调用set_sched_topology()。
调度域调试基础设施可以通过CONFIG_SCHED_DEBUG开启,并在开机启动命令行中增加
“sched_verbose”。如果你忘记调整开机启动命令行了,也可以打开
/sys/kernel/debug/sched/verbose开关。这将开启调度域错误检查的解析,它应该能捕获(上文
描述过的)绝大多数错误,同时以可视化格式打印调度域的结构。
......@@ -34,7 +34,8 @@ The Linux kernel supports the following overcommit handling modes
The overcommit policy is set via the sysctl ``vm.overcommit_memory``.
The overcommit amount can be set via ``vm.overcommit_ratio`` (percentage)
or ``vm.overcommit_kbytes`` (absolute value).
or ``vm.overcommit_kbytes`` (absolute value). These only have an effect
when ``vm.overcommit_memory`` is set to 2.
The current overcommit limit and amount committed are viewable in
``/proc/meminfo`` as CommitLimit and Committed_AS respectively.
......
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment