Commit d58db3f3 authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'docs-6.12' of git://git.lwn.net/linux

Pull documentation update from Jonathan Corbet:
 "Another relatively mundane cycle for docs:

   - The beginning of an EEVDF scheduler document

   - More Chinese translations

   - A rethrashing of our bisection documentation

  ...plus the usual array of smaller fixes, and more than the usual
  number of typo fixes"

* tag 'docs-6.12' of git://git.lwn.net/linux: (48 commits)
  Remove duplicate "and" in 'Linux NVMe docs.
  docs:filesystems: fix spelling and grammar mistakes
  docs:filesystem: fix mispelled words on autofs page
  docs:mm: fixed spelling and grammar mistakes on vmalloc kernel stack page
  Documentation: PCI: fix typo in pci.rst
  docs/zh_CN: add the translation of kbuild/gcc-plugins.rst
  docs/process: fix typos
  docs:mm: fix spelling mistakes in heterogeneous memory management page
  accel/qaic: Fix a typo
  docs/zh_CN: update the translation of security-bugs
  docs: block: Fix grammar and spelling mistakes in bfq-iosched.rst
  Documentation: Fix spelling mistakes
  Documentation/gpu: Fix typo in Documentation/gpu/komeda-kms.rst
  scripts: sphinx-pre-install: remove unnecessary double check for $cur_version
  Loongarch: KVM: Add KVM hypercalls documentation for LoongArch
  Documentation: Document the kernel flag bdev_allow_write_mounted
  docs: scheduler: completion: Update member of struct completion
  docs: kerneldoc-preamble.sty: Suppress extra spaces in CJK literal blocks
  docs: submitting-patches: Advertise b4
  docs: update dev-tools/kcsan.rst url about KTSAN
  ...
parents 8202cc80 4f77c346
......@@ -52,7 +52,7 @@ driver generally needs to perform the following initialization:
- Enable DMA/processing engines
When done using the device, and perhaps the module needs to be unloaded,
the driver needs to take the follow steps:
the driver needs to take the following steps:
- Disable the device from generating IRQs
- Release the IRQ (free_irq())
......
......@@ -93,7 +93,7 @@ commands (does not impact QAIC).
uAPI
====
QAIC creates an accel device per phsyical PCIe device. This accel device exists
QAIC creates an accel device per physical PCIe device. This accel device exists
for as long as the PCIe device is known to Linux.
The PCIe device may not be in the state to accept requests from userspace at
......
Bisecting a bug
+++++++++++++++
.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)
.. [see the bottom of this file for redistribution information]
Last updated: 28 October 2016
======================
Bisecting a regression
======================
Introduction
============
This document describes how to use a ``git bisect`` to find the source code
change that broke something -- for example when some functionality stopped
working after upgrading from Linux 6.0 to 6.1.
Always try the latest kernel from kernel.org and build from source. If you are
not confident in doing that please report the bug to your distribution vendor
instead of to a kernel developer.
The text focuses on the gist of the process. If you are new to bisecting the
kernel, better follow Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst
instead: it depicts everything from start to finish while covering multiple
aspects even kernel developers occasionally forget. This includes detecting
situations early where a bisection would be a waste of time, as nobody would
care about the result -- for example, because the problem happens after the
kernel marked itself as 'tainted', occurs in an abandoned version, was already
fixed, or is caused by a .config change you or your Linux distributor performed.
Finding bugs is not always easy. Have a go though. If you can't find it don't
give up. Report as much as you have found to the relevant maintainer. See
MAINTAINERS for who that is for the subsystem you have worked on.
Finding the change causing a kernel issue using a bisection
===========================================================
Before you submit a bug report read
'Documentation/admin-guide/reporting-issues.rst'.
*Note: the following process assumes you prepared everything for a bisection.
This includes having a Git clone with the appropriate sources, installing the
software required to build and install kernels, as well as a .config file stored
in a safe place (the following example assumes '~/prepared_kernel_.config') to
use as pristine base at each bisection step; ideally, you have also worked out
a fully reliable and straight-forward way to reproduce the regression, too.*
Devices not appearing
=====================
Often this is caused by udev/systemd. Check that first before blaming it
on the kernel.
Finding patch that caused a bug
===============================
Using the provided tools with ``git`` makes finding bugs easy provided the bug
is reproducible.
Steps to do it:
- build the Kernel from its git source
- start bisect with [#f1]_::
$ git bisect start
- mark the broken changeset with::
$ git bisect bad [commit]
- mark a changeset where the code is known to work with::
$ git bisect good [commit]
- rebuild the Kernel and test
- interact with git bisect by using either::
$ git bisect good
or::
$ git bisect bad
depending if the bug happened on the changeset you're testing
- After some interactions, git bisect will give you the changeset that
likely caused the bug.
- For example, if you know that the current version is bad, and version
4.8 is good, you could do::
$ git bisect start
$ git bisect bad # Current version is bad
$ git bisect good v4.8
.. [#f1] You can, optionally, provide both good and bad arguments at git
start with ``git bisect start [BAD] [GOOD]``
For further references, please read:
- The man page for ``git-bisect``
- `Fighting regressions with git bisect <https://www.kernel.org/pub/software/scm/git/docs/git-bisect-lk2009.html>`_
- `Fully automated bisecting with "git bisect run" <https://lwn.net/Articles/317154>`_
- `Using Git bisect to figure out when brokenness was introduced <http://webchick.net/node/99>`_
* Preparation: start the bisection and tell Git about the points in the history
you consider to be working and broken, which Git calls 'good' and 'bad'::
git bisect start
git bisect good v6.0
git bisect bad v6.1
Instead of Git tags like 'v6.0' and 'v6.1' you can specify commit-ids, too.
1. Copy your prepared .config into the build directory and adjust it to the
needs of the codebase Git checked out for testing::
cp ~/prepared_kernel_.config .config
make olddefconfig
2. Now build, install, and boot a kernel. This might fail for unrelated reasons,
for example, when a compile error happens at the current stage of the
bisection a later change resolves. In such cases run ``git bisect skip`` and
go back to step 1.
3. Check if the functionality that regressed works in the kernel you just built.
If it works, execute::
git bisect good
If it is broken, run::
git bisect bad
Note, getting this wrong just once will send the rest of the bisection
totally off course. To prevent having to start anew later you thus want to
ensure what you tell Git is correct; it is thus often wise to spend a few
minutes more on testing in case your reproducer is unreliable.
After issuing one of these two commands, Git will usually check out another
bisection point and print something like 'Bisecting: 675 revisions left to
test after this (roughly 10 steps)'. In that case go back to step 1.
If Git instead prints something like 'cafecaca0c0dacafecaca0c0dacafecaca0c0da
is the first bad commit', then you have finished the bisection. In that case
move to the next point below. Note, right after displaying that line Git will
show some details about the culprit including its patch description; this can
easily fill your terminal, so you might need to scroll up to see the message
mentioning the culprit's commit-id.
In case you missed Git's output, you can always run ``git bisect log`` to
print the status: it will show how many steps remain or mention the result of
the bisection.
* Recommended complementary task: put the bisection log and the current .config
file aside for the bug report; furthermore tell Git to reset the sources to
the state before the bisection::
git bisect log > ~/bisection-log
cp .config ~/bisection-config-culprit
git bisect reset
* Recommended optional task: try reverting the culprit on top of the latest
codebase and check if that fixes your bug; if that is the case, it validates
the bisection and enables developers to resolve the regression through a
revert.
To try this, update your clone and check out latest mainline. Then tell Git
to revert the change by specifying its commit-id::
git revert --no-edit cafec0cacaca0
Git might reject this, for example when the bisection landed on a merge
commit. In that case, abandon the attempt. Do the same, if Git fails to revert
the culprit on its own because later changes depend on it -- at least unless
you bisected a stable or longterm kernel series, in which case you want to
check out its latest codebase and try a revert there.
If a revert succeeds, build and test another kernel to check if reverting
resolved your regression.
With that the process is complete. Now report the regression as described by
Documentation/admin-guide/reporting-issues.rst.
Additional reading material
---------------------------
* The `man page for 'git bisect' <https://git-scm.com/docs/git-bisect>`_ and
`fighting regressions with 'git bisect' <https://git-scm.com/docs/git-bisect-lk2009.html>`_
in the Git documentation.
* `Working with git bisect <https://nathanchance.dev/posts/working-with-git-bisect/>`_
from kernel developer Nathan Chancellor.
* `Using Git bisect to figure out when brokenness was introduced <http://webchick.net/node/99>`_.
* `Fully automated bisecting with 'git bisect run' <https://lwn.net/Articles/317154>`_.
..
end-of-content
..
This document is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If
you spot a typo or small mistake, feel free to let him know directly and
he'll fix it. You are free to do the same in a mostly informal way if you
want to contribute changes to the text -- but for copyright reasons please CC
linux-doc@vger.kernel.org and 'sign-off' your contribution as
Documentation/process/submitting-patches.rst explains in the section 'Sign
your work - the Developer's Certificate of Origin'.
..
This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top
of the file. If you want to distribute this text under CC-BY-4.0 only,
please use 'The Linux kernel development community' for author attribution
and link this as source:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/bug-bisect.rst
..
Note: Only the content of this RST file as found in the Linux kernel sources
is available under CC-BY-4.0, as versions of this text that were processed
(for example by the kernel's build system) might contain content taken from
files which use a more restrictive license.
......@@ -244,14 +244,14 @@ Reporting the bug
Once you find where the bug happened, by inspecting its location,
you could either try to fix it yourself or report it upstream.
In order to report it upstream, you should identify the mailing list
used for the development of the affected code. This can be done by using
the ``get_maintainer.pl`` script.
In order to report it upstream, you should identify the bug tracker, if any, or
mailing list used for the development of the affected code. This can be done by
using the ``get_maintainer.pl`` script.
For example, if you find a bug at the gspca's sonixj.c file, you can get
its maintainers with::
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
$ ./scripts/get_maintainer.pl --bug -f drivers/media/usb/gspca/sonixj.c
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
......@@ -267,11 +267,12 @@ Please notice that it will point to:
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab);
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- the Linux Kernel mailing list (linux-kernel@vger.kernel.org).
- The Linux Kernel mailing list (linux-kernel@vger.kernel.org);
- The bug reporting URIs for the driver/subsystem (none in the above example).
Usually, the fastest way to have your bug fixed is to report it to mailing
list used for the development of the code (linux-media ML) copying the
driver maintainer (Hans).
If the listing contains bug reporting URIs at the end, please prefer them over
email. Otherwise, please report bugs to the mailing list used for the
development of the code (linux-media ML) copying the driver maintainer (Hans).
If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
......
......@@ -162,13 +162,18 @@ iv_large_sectors
Module parameters::
max_read_size
Maximum size of read requests. When a request larger than this size
is received, dm-crypt will split the request. The splitting improves
concurrency (the split requests could be encrypted in parallel by multiple
cores), but it also causes overhead. The user should tune this parameters to
fit the actual workload.
max_write_size
Maximum size of read or write requests. When a request larger than this size
Maximum size of write requests. When a request larger than this size
is received, dm-crypt will split the request. The splitting improves
concurrency (the split requests could be encrypted in parallel by multiple
cores), but it also causes overhead. The user should tune these parameters to
cores), but it also causes overhead. The user should tune this parameters to
fit the actual workload.
......
......@@ -517,6 +517,18 @@
Format: <io>,<irq>,<mode>
See header of drivers/net/hamradio/baycom_ser_hdx.c.
bdev_allow_write_mounted=
Format: <bool>
Control the ability to open a mounted block device
for writing, i.e., allow / disallow writes that bypass
the FS. This was implemented as a means to prevent
fuzzers from crashing the kernel by overwriting the
metadata underneath a mounted FS without its awareness.
This also prevents destructive formatting of mounted
filesystems by naive storage tooling that don't use
O_EXCL. Default is Y and can be changed through the
Kconfig option CONFIG_BLK_DEV_WRITE_MOUNTED.
bert_disable [ACPI]
Disable BERT OS support on buggy BIOSes.
......
......@@ -182,3 +182,5 @@ More detailed explanation for tainting
produce extremely unusual kernel structure layouts (even performance
pathological ones), which is important to know when debugging. Set at
build time.
18) ``N`` if an in-kernel test, such as a KUnit test, has been run.
......@@ -359,7 +359,7 @@ Driver updates for STM32 DMA-MDMA chaining support in foo driver
descriptor you want a callback to be called at the end of the transfer
(dmaengine_prep_slave_sg()) or the period (dmaengine_prep_dma_cyclic()).
Depending on the direction, set the callback on the descriptor that finishes
the overal transfer:
the overall transfer:
* DMA_DEV_TO_MEM: set the callback on the "MDMA" descriptor
* DMA_MEM_TO_DEV: set the callback on the "DMA" descriptor
......@@ -371,7 +371,7 @@ Driver updates for STM32 DMA-MDMA chaining support in foo driver
As STM32 MDMA channel transfer is triggered by STM32 DMA, you must issue
STM32 MDMA channel before STM32 DMA channel.
If any, your callback will be called to warn you about the end of the overal
If any, your callback will be called to warn you about the end of the overall
transfer or the period completion.
Don't forget to terminate both channels. STM32 DMA channel is configured in
......
......@@ -26,7 +26,7 @@ There are no systems that support the physical addition (or removal) of CPUs
while the system is running, and ACPI is not able to sufficiently describe
them.
e.g. New CPUs come with new caches, but the platform's cache toplogy is
e.g. New CPUs come with new caches, but the platform's cache topology is
described in a static table, the PPTT. How caches are shared between CPUs is
not discoverable, and must be described by firmware.
......
......@@ -134,7 +134,7 @@ Hardware
* PTCR and partition table entries (partition table is in secure
memory). An attempt to write to PTCR will cause a Hypervisor
Emulation Assitance interrupt.
Emulation Assistance interrupt.
* LDBAR (LD Base Address Register) and IMC (In-Memory Collection)
non-architected registers. An attempt to write to them will cause a
......
......@@ -15,7 +15,7 @@ status for the use of Vector in userspace. The intended usage guideline for
these interfaces is to give init systems a way to modify the availability of V
for processes running under its domain. Calling these interfaces is not
recommended in libraries routines because libraries should not override policies
configured from the parant process. Also, users must noted that these interfaces
configured from the parent process. Also, users must note that these interfaces
are not portable to non-Linux, nor non-RISC-V environments, so it is discourage
to use in a portable code. To get the availability of V in an ELF program,
please read :c:macro:`COMPAT_HWCAP_ISA_V` bit of :c:macro:`ELF_HWCAP` in the
......
......@@ -162,7 +162,7 @@ Mitigation points
3. It would take a large number of these precisely-timed NMIs to mount
an actual attack. There's presumably not enough bandwidth.
4. The NMI in question occurs after a VERW, i.e. when user state is
restored and most interesting data is already scrubbed. Whats left
restored and most interesting data is already scrubbed. What's left
is only the data that NMI touches, and that may or may not be of
any interest.
......
......@@ -125,7 +125,7 @@ FSGSBASE instructions enablement
FSGSBASE instructions compiler support
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GCC version 4.6.4 and newer provide instrinsics for the FSGSBASE
GCC version 4.6.4 and newer provide intrinsics for the FSGSBASE
instructions. Clang 5 supports them as well.
=================== ===========================
......@@ -135,7 +135,7 @@ instructions. Clang 5 supports them as well.
_writegsbase_u64() Write the GS base register
=================== ===========================
To utilize these instrinsics <immintrin.h> must be included in the source
To utilize these intrinsics <immintrin.h> must be included in the source
code and the compiler option -mfsgsbase has to be added.
Compiler support for FS/GS based addressing
......
......@@ -9,7 +9,7 @@ controllers), BFQ's main features are:
- BFQ guarantees a high system and application responsiveness, and a
low latency for time-sensitive applications, such as audio or video
players;
- BFQ distributes bandwidth, and not just time, among processes or
- BFQ distributes bandwidth, not just time, among processes or
groups (switching back to time distribution when needed to keep
throughput high).
......@@ -111,7 +111,7 @@ Higher speed for code-development tasks
If some additional workload happens to be executed in parallel, then
BFQ executes the I/O-related components of typical code-development
tasks (compilation, checkout, merge, ...) much more quickly than CFQ,
tasks (compilation, checkout, merge, etc.) much more quickly than CFQ,
NOOP or DEADLINE.
High throughput
......@@ -127,9 +127,9 @@ Strong fairness, bandwidth and delay guarantees
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
BFQ distributes the device throughput, and not just the device time,
among I/O-bound applications in proportion their weights, with any
among I/O-bound applications in proportion to their weights, with any
workload and regardless of the device parameters. From these bandwidth
guarantees, it is possible to compute tight per-I/O-request delay
guarantees, it is possible to compute a tight per-I/O-request delay
guarantees by a simple formula. If not configured for strict service
guarantees, BFQ switches to time-based resource sharing (only) for
applications that would otherwise cause a throughput loss.
......@@ -199,7 +199,7 @@ plus a lot of code, are borrowed from CFQ.
- On flash-based storage with internal queueing of commands
(typically NCQ), device idling happens to be always detrimental
for throughput. So, with these devices, BFQ performs idling
to throughput. So, with these devices, BFQ performs idling
only when strictly needed for service guarantees, i.e., for
guaranteeing low latency or fairness. In these cases, overall
throughput may be sub-optimal. No solution currently exists to
......@@ -212,7 +212,7 @@ plus a lot of code, are borrowed from CFQ.
and to reduce their latency. The most important action taken to
achieve this goal is to give to the queues associated with these
applications more than their fair share of the device
throughput. For brevity, we call just "weight-raising" the whole
throughput. For brevity, we call it just "weight-raising" the whole
sets of actions taken by BFQ to privilege these queues. In
particular, BFQ provides a milder form of weight-raising for
interactive applications, and a stronger form for soft real-time
......@@ -231,7 +231,7 @@ plus a lot of code, are borrowed from CFQ.
responsive in detecting interleaved I/O (cooperating processes),
that it enables BFQ to achieve a high throughput, by queue
merging, even for queues for which CFQ needs a different
mechanism, preemption, to get a high throughput. As such EQM is a
mechanism, preemption, to get a high throughput. As such, EQM is a
unified mechanism to achieve a high throughput with interleaved
I/O.
......@@ -254,7 +254,7 @@ plus a lot of code, are borrowed from CFQ.
- First, with any proportional-share scheduler, the maximum
deviation with respect to an ideal service is proportional to
the maximum budget (slice) assigned to queues. As a consequence,
BFQ can keep this deviation tight not only because of the
BFQ can keep this deviation tight, not only because of the
accurate service of B-WF2Q+, but also because BFQ *does not*
need to assign a larger budget to a queue to let the queue
receive a higher fraction of the device throughput.
......@@ -327,7 +327,7 @@ applications. Unset this tunable if you need/want to control weights.
slice_idle
----------
This parameter specifies how long BFQ should idle for next I/O
This parameter specifies how long BFQ should idle for the next I/O
request, when certain sync BFQ queues become empty. By default
slice_idle is a non-zero value. Idling has a double purpose: boosting
throughput and making sure that the desired throughput distribution is
......@@ -365,7 +365,7 @@ terms of I/O-request dispatches. To guarantee that the actual service
order then corresponds to the dispatch order, the strict_guarantees
tunable must be set too.
There is an important flipside for idling: apart from the above cases
There is an important flip side to idling: apart from the above cases
where it is beneficial also for throughput, idling can severely impact
throughput. One important case is random workload. Because of this
issue, BFQ tends to avoid idling as much as possible, when it is not
......@@ -475,7 +475,7 @@ max_budget
Maximum amount of service, measured in sectors, that can be provided
to a BFQ queue once it is set in service (of course within the limits
of the above timeout). According to what said in the description of
of the above timeout). According to what was said in the description of
the algorithm, larger values increase the throughput in proportion to
the percentage of sequential I/O requests issued. The price of larger
values is that they coarsen the granularity of short-term bandwidth
......
......@@ -45,8 +45,9 @@ here we briefly outline their recommended usage:
* If the allocation is performed from an atomic context, e.g interrupt
handler, use ``GFP_NOWAIT``. This flag prevents direct reclaim and
IO or filesystem operations. Consequently, under memory pressure
``GFP_NOWAIT`` allocation is likely to fail. Allocations which
have a reasonable fallback should be using ``GFP_NOWARN``.
``GFP_NOWAIT`` allocation is likely to fail. Users of this flag need
to provide a suitable fallback to cope with such failures where
appropriate.
* If you think that accessing memory reserves is justified and the kernel
will be stressed unless allocation succeeds, you may use ``GFP_ATOMIC``.
* Untrusted allocations triggered from userspace should be a subject
......
......@@ -361,7 +361,8 @@ Alternatives Considered
-----------------------
An alternative data race detection approach for the kernel can be found in the
`Kernel Thread Sanitizer (KTSAN) <https://github.com/google/ktsan/wiki>`_.
`Kernel Thread Sanitizer (KTSAN)
<https://github.com/google/kernel-sanitizers/blob/master/KTSAN.md>`_.
KTSAN is a happens-before data race detector, which explicitly establishes the
happens-before order between memory operations, which can then be used to
determine data races as defined in `Data Races`_.
......
.. SPDX-License-Identifier: GPL-2.0
Checking for needed translation updates
=======================================
This script helps track the translation status of the documentation in
different locales, i.e., whether the documentation is up-to-date with
the English counterpart.
How it works
------------
It uses ``git log`` command to track the latest English commit from the
translation commit (order by author date) and the latest English commits
from HEAD. If any differences occur, the file is considered as out-of-date,
then commits that need to be updated will be collected and reported.
Features implemented
- check all files in a certain locale
- check a single file or a set of files
- provide options to change output format
- track the translation status of files that have no translation
Usage
-----
::
./scripts/checktransupdate.py --help
Please refer to the output of argument parser for usage details.
Samples
- ``./scripts/checktransupdate.py -l zh_CN``
This will print all the files that need to be updated in the zh_CN locale.
- ``./scripts/checktransupdate.py Documentation/translations/zh_CN/dev-tools/testing-overview.rst``
This will only print the status of the specified file.
Then the output is something like:
::
Documentation/dev-tools/kfence.rst
No translation in the locale of zh_CN
Documentation/translations/zh_CN/dev-tools/testing-overview.rst
commit 42fb9cfd5b18 ("Documentation: dev-tools: Add link to RV docs")
1 commits needs resolving in total
Features to be implemented
- files can be a folder instead of only a file
......@@ -12,6 +12,7 @@ How to write kernel documentation
parse-headers
contributing
maintainer-profile
checktransupdate
.. only:: subproject and html
......
......@@ -262,7 +262,7 @@ vsyscall_32.lds
wanxlfw.inc
uImage
unifdef
utf8data.h
utf8data.c
wakeup.bin
wakeup.elf
wakeup.lds
......
......@@ -391,7 +391,7 @@ PCI
devm_pci_remap_cfgspace() : ioremap PCI configuration space
devm_pci_remap_cfg_resource() : ioremap PCI configuration space resource
pcim_enable_device() : after success, all PCI ops become managed
pcim_enable_device() : after success, some PCI ops become managed
pcim_iomap() : do iomap() on a single BAR
pcim_iomap_regions() : do request_region() and iomap() on multiple BARs
pcim_iomap_regions_request_all() : do request_region() on all and iomap() on multiple BARs
......
......@@ -15,8 +15,8 @@ trigger source. Multiple data channels can be read at once from
IIO buffer sysfs interface
==========================
An IIO buffer has an associated attributes directory under
:file:`/sys/bus/iio/iio:device{X}/buffer/*`. Here are some of the existing
attributes:
:file:`/sys/bus/iio/devices/iio:device{X}/buffer/*`. Here are some of the
existing attributes:
* :file:`length`, the total number of data samples (capacity) that can be
stored by the buffer.
......@@ -28,8 +28,8 @@ IIO buffer setup
The meta information associated with a channel reading placed in a buffer is
called a scan element. The important bits configuring scan elements are
exposed to userspace applications via the
:file:`/sys/bus/iio/iio:device{X}/scan_elements/` directory. This directory contains
attributes of the following form:
:file:`/sys/bus/iio/devices/iio:device{X}/scan_elements/` directory. This
directory contains attributes of the following form:
* :file:`enable`, used for enabling a channel. If and only if its attribute
is non *zero*, then a triggered capture will contain data samples for this
......
......@@ -24,7 +24,7 @@ then we will show how a device driver makes use of an IIO device.
There are two ways for a user space application to interact with an IIO driver.
1. :file:`/sys/bus/iio/iio:device{X}/`, this represents a hardware sensor
1. :file:`/sys/bus/iio/devices/iio:device{X}/`, this represents a hardware sensor
and groups together the data channels of the same chip.
2. :file:`/dev/iio:device{X}`, character device node interface used for
buffered data transfer and for events information retrieval.
......@@ -51,8 +51,8 @@ IIO device sysfs interface
Attributes are sysfs files used to expose chip info and also allowing
applications to set various configuration parameters. For device with
index X, attributes can be found under /sys/bus/iio/iio:deviceX/ directory.
Common attributes are:
index X, attributes can be found under /sys/bus/iio/devices/iio:deviceX/
directory. Common attributes are:
* :file:`name`, description of the physical chip.
* :file:`dev`, shows the major:minor pair associated with
......@@ -140,16 +140,16 @@ Here is how we can make use of the channel's modifiers::
This channel's definition will generate two separate sysfs files for raw data
retrieval:
* :file:`/sys/bus/iio/iio:device{X}/in_intensity_ir_raw`
* :file:`/sys/bus/iio/iio:device{X}/in_intensity_both_raw`
* :file:`/sys/bus/iio/devices/iio:device{X}/in_intensity_ir_raw`
* :file:`/sys/bus/iio/devices/iio:device{X}/in_intensity_both_raw`
one file for processed data:
* :file:`/sys/bus/iio/iio:device{X}/in_illuminance_input`
* :file:`/sys/bus/iio/devices/iio:device{X}/in_illuminance_input`
and one shared sysfs file for sampling frequency:
* :file:`/sys/bus/iio/iio:device{X}/sampling_frequency`.
* :file:`/sys/bus/iio/devices/iio:device{X}/sampling_frequency`.
Here is how we can make use of the channel's indexing::
......
......@@ -141,6 +141,14 @@ configuration of fault-injection capabilities.
default is 'Y', setting it to 'N' will also inject failures into
highmem/user allocations (__GFP_HIGHMEM allocations).
- /sys/kernel/debug/failslab/cache-filter
Format: { 'Y' | 'N' }
default is 'N', setting it to 'Y' will only inject failures when
objects are requests from certain caches.
Select the cache by writing '1' to /sys/kernel/slab/<cache>/failslab:
- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
......@@ -283,7 +291,7 @@ kernel may crash because it may not be able to handle the error.
There are 4 types of errors defined in include/asm-generic/error-injection.h
EI_ETYPE_NULL
This function will return `NULL` if it fails. e.g. return an allocateed
This function will return `NULL` if it fails. e.g. return an allocated
object address.
EI_ETYPE_ERRNO
......@@ -459,6 +467,18 @@ Application Examples
losetup -d $DEVICE
rm testfile.img
------------------------------------------------------------------------------
- Inject only skbuff allocation failures ::
# mark skbuff_head_cache as faulty
echo 1 > /sys/kernel/slab/skbuff_head_cache/failslab
# Turn on cache filter (off by default)
echo 1 > /sys/kernel/debug/failslab/cache-filter
# Turn on fault injection
echo 1 > /sys/kernel/debug/failslab/times
echo 1 > /sys/kernel/debug/failslab/probability
Tool to run command with failslab or fail_page_alloc
----------------------------------------------------
......
......@@ -31,7 +31,7 @@ Other applications are described in the following papers:
* PROSE I/O: Using 9p to enable Application Partitions
http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf
* VirtFS: A Virtualization Aware File System pass-through
http://goo.gl/3WPDg
https://kernel.org/doc/ols/2010/ols2010-pages-109-120.pdf
Usage
=====
......
......@@ -18,7 +18,7 @@ key advantages:
2. The names and locations of filesystems can be stored in
a remote database and can change at any time. The content
in that data base at the time of access will be used to provide
in that database at the time of access will be used to provide
a target for the access. The interpretation of names in the
filesystem can even be programmatic rather than database-backed,
allowing wildcards for example, and can vary based on the user who
......@@ -423,7 +423,7 @@ The available ioctl commands are:
and objects are expired if the are not in use.
**AUTOFS_EXP_FORCED** causes the in use status to be ignored
and objects are expired ieven if they are in use. This assumes
and objects are expired even if they are in use. This assumes
that the daemon has requested this because it is capable of
performing the umount.
......
......@@ -137,7 +137,7 @@ Fast commits
JBD2 to also allows you to perform file-system specific delta commits known as
fast commits. In order to use fast commits, you will need to set following
callbacks that perform correspodning work:
callbacks that perform corresponding work:
`journal->j_fc_cleanup_cb`: Cleanup function called after every full commit and
fast commit.
......@@ -149,7 +149,7 @@ File system is free to perform fast commits as and when it wants as long as it
gets permission from JBD2 to do so by calling the function
:c:func:`jbd2_fc_begin_commit()`. Once a fast commit is done, the client
file system should tell JBD2 about it by calling
:c:func:`jbd2_fc_end_commit()`. If file system wants JBD2 to perform a full
:c:func:`jbd2_fc_end_commit()`. If the file system wants JBD2 to perform a full
commit immediately after stopping the fast commit it can do so by calling
:c:func:`jbd2_fc_end_commit_fallback()`. This is useful if fast commit operation
fails for some reason and the only way to guarantee consistency is for JBD2 to
......@@ -199,7 +199,7 @@ Journal Level
.. kernel-doc:: fs/jbd2/recovery.c
:internal:
Transasction Level
Transaction Level
~~~~~~~~~~~~~~~~~~
.. kernel-doc:: fs/jbd2/transaction.c
......
......@@ -86,7 +86,7 @@ types of working mode:
- Single display mode
Two pipelines work together to drive only one display output.
On this mode, pipeline_B doesn't work indenpendently, but outputs its
On this mode, pipeline_B doesn't work independently, but outputs its
composition result into pipeline_A, and its pixel timing also derived from
pipeline_A.timing_ctrlr. The pipeline_B works just like a "slave" of
pipeline_A(master)
......
......@@ -115,4 +115,4 @@ Driver provides the following LEDs for the system "msn2100":
- [1,1,1,1] = Blue blink 6Hz
Driver supports HW blinking at 3Hz and 6Hz frequency (50% duty cycle).
For 3Hz duty cylce is about 167 msec, for 6Hz is about 83 msec.
For 3Hz duty cycle is about 167 msec, for 6Hz is about 83 msec.
......@@ -66,7 +66,7 @@ combinatorial explosion in the library entry points.
Finally, with the advance of high level language constructs (in C++ but in
other languages too) it is now possible for the compiler to leverage GPUs and
other devices without programmer knowledge. Some compiler identified patterns
are only do-able with a shared address space. It is also more reasonable to use
are only doable with a shared address space. It is also more reasonable to use
a shared address space for all other patterns.
......@@ -267,7 +267,7 @@ functions are designed to make drivers easier to write and to centralize common
code across drivers.
Before migrating pages to device private memory, special device private
``struct page`` need to be created. These will be used as special "swap"
``struct page`` needs to be created. These will be used as special "swap"
page table entries so that a CPU process will fault if it tries to access
a page that has been migrated to device private memory.
......@@ -322,7 +322,7 @@ between device driver specific code and shared common code:
The ``invalidate_range_start()`` callback is passed a
``struct mmu_notifier_range`` with the ``event`` field set to
``MMU_NOTIFY_MIGRATE`` and the ``owner`` field set to
the ``args->pgmap_owner`` field passed to migrate_vma_setup(). This is
the ``args->pgmap_owner`` field passed to migrate_vma_setup(). This
allows the device driver to skip the invalidation callback and only
invalidate device private MMU mappings that are actually migrating.
This is explained more in the next section.
......@@ -405,7 +405,7 @@ can be used to make a memory range inaccessible from userspace.
This replaces all mappings for pages in the given range with special swap
entries. Any attempt to access the swap entry results in a fault which is
resovled by replacing the entry with the original mapping. A driver gets
resolved by replacing the entry with the original mapping. A driver gets
notified that the mapping has been changed by MMU notifiers, after which point
it will no longer have exclusive access to the page. Exclusive access is
guaranteed to last until the driver drops the page lock and page reference, at
......@@ -431,7 +431,7 @@ Same decision was made for memory cgroup. Device memory pages are accounted
against same memory cgroup a regular page would be accounted to. This does
simplify migration to and from device memory. This also means that migration
back from device memory to regular memory cannot fail because it would
go above memory cgroup limit. We might revisit this choice latter on once we
go above memory cgroup limit. We might revisit this choice later on once we
get more experience in how device memory is used and its impact on memory
resource control.
......
......@@ -110,7 +110,7 @@ Bulk of the code is in:
`kernel/fork.c <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/fork.c>`.
stack_vm_area pointer in task_struct keeps track of the virtually allocated
stack and a non-null stack_vm_area pointer serves as a indication that the
stack and a non-null stack_vm_area pointer serves as an indication that the
virtually mapped kernel stacks are enabled.
::
......@@ -120,8 +120,8 @@ virtually mapped kernel stacks are enabled.
Stack overflow handling
-----------------------
Leading and trailing guard pages help detect stack overflows. When stack
overflows into the guard pages, handlers have to be careful not overflow
Leading and trailing guard pages help detect stack overflows. When the stack
overflows into the guard pages, handlers have to be careful not to overflow
the stack again. When handlers are called, it is likely that very little
stack space is left.
......@@ -148,6 +148,6 @@ Conclusions
- THREAD_INFO_IN_TASK gets rid of arch-specific thread_info entirely and
simply embed the thread_info (containing only flags) and 'int cpu' into
task_struct.
- The thread stack can be free'ed as soon as the task is dead (without
- The thread stack can be freed as soon as the task is dead (without
waiting for RCU) and then, if vmapped stacks are in use, cache the
entire stack for reuse on the same cpu.
.. SPDX-License-Identifier: GPL-2.0
=======================================
Linux NVMe feature and and quirk policy
=======================================
===================================
Linux NVMe feature and quirk policy
===================================
This file explains the policy used to decide what is supported by the
Linux NVMe driver and what is not.
......
......@@ -73,7 +73,7 @@ Once you have the patch in git, you can go ahead and cherry-pick it into
your source tree. Don't forget to cherry-pick with ``-x`` if you want a
written record of where the patch came from!
Note that if you are submiting a patch for stable, the format is
Note that if you are submitting a patch for stable, the format is
slightly different; the first line after the subject line needs tobe
either::
......@@ -147,7 +147,7 @@ divergence.
It's important to always identify the commit or commits that caused the
conflict, as otherwise you cannot be confident in the correctness of
your resolution. As an added bonus, especially if the patch is in an
area you're not that famliar with, the changelogs of these commits will
area you're not that familiar with, the changelogs of these commits will
often give you the context to understand the code and potential problems
or pitfalls with your conflict resolution.
......@@ -197,7 +197,7 @@ git blame
Another way to find prerequisite commits (albeit only the most recent
one for a given conflict) is to run ``git blame``. In this case, you
need to run it against the parent commit of the patch you are
cherry-picking and the file where the conflict appared, i.e.::
cherry-picking and the file where the conflict appeared, i.e.::
git blame <commit>^ -- <path>
......
......@@ -986,7 +986,7 @@ that can go into these 5 milliseconds.
A reasonable rule of thumb is to not put inline at functions that have more
than 3 lines of code in them. An exception to this rule are the cases where
a parameter is known to be a compiletime constant, and as a result of this
a parameter is known to be a compile time constant, and as a result of this
constantness you *know* the compiler will be able to optimize most of your
function away at compile time. For a good example of this later case, see
the kmalloc() inline function.
......
......@@ -154,7 +154,7 @@ Examples for illustration:
We modify the hot cpu handling to cancel the delayed work on the dying
cpu and run the worker immediately on a different cpu in same domain. We
donot flush the worker because the MBM overflow worker reschedules the
do not flush the worker because the MBM overflow worker reschedules the
worker on same CPU and scans the domain->cpu_mask to get the domain
pointer.
......
......@@ -842,6 +842,14 @@ Make sure that base commit is in an official maintainer/mainline tree
and not in some internal, accessible only to you tree - otherwise it
would be worthless.
Tooling
-------
Many of the technical aspects of this process can be automated using
b4, documented at <https://b4.docs.kernel.org/en/latest/>. This can
help with things like tracking dependencies, running checkpatch and
with formatting and sending mails.
References
----------
......
......@@ -51,7 +51,7 @@ which has only two fields::
struct completion {
unsigned int done;
wait_queue_head_t wait;
struct swait_queue_head wait;
};
This provides the ->wait waitqueue to place tasks on for waiting (if any), and
......
......@@ -12,6 +12,7 @@ Scheduler
sched-bwc
sched-deadline
sched-design-CFS
sched-eevdf
sched-domains
sched-capacity
sched-energy
......
......@@ -8,10 +8,12 @@ CFS Scheduler
1. OVERVIEW
============
CFS stands for "Completely Fair Scheduler," and is the new "desktop" process
scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the
replacement for the previous vanilla scheduler's SCHED_OTHER interactivity
code.
CFS stands for "Completely Fair Scheduler," and is the "desktop" process
scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. When
originally merged, it was the replacement for the previous vanilla
scheduler's SCHED_OTHER interactivity code. Nowadays, CFS is making room
for EEVDF, for which documentation can be found in
Documentation/scheduler/sched-eevdf.rst.
80% of CFS's design can be summed up in a single sentence: CFS basically models
an "ideal, precise multi-tasking CPU" on real hardware.
......
===============
EEVDF Scheduler
===============
The "Earliest Eligible Virtual Deadline First" (EEVDF) was first introduced
in a scientific publication in 1995 [1]. The Linux kernel began
transitioning to EEVDF in version 6.6 (as a new option in 2024), moving
away from the earlier Completely Fair Scheduler (CFS) in favor of a version
of EEVDF proposed by Peter Zijlstra in 2023 [2-4]. More information
regarding CFS can be found in
Documentation/scheduler/sched-design-CFS.rst.
Similarly to CFS, EEVDF aims to distribute CPU time equally among all
runnable tasks with the same priority. To do so, it assigns a virtual run
time to each task, creating a "lag" value that can be used to determine
whether a task has received its fair share of CPU time. In this way, a task
with a positive lag is owed CPU time, while a negative lag means the task
has exceeded its portion. EEVDF picks tasks with lag greater or equal to
zero and calculates a virtual deadline (VD) for each, selecting the task
with the earliest VD to execute next. It's important to note that this
allows latency-sensitive tasks with shorter time slices to be prioritized,
which helps with their responsiveness.
There are ongoing discussions on how to manage lag, especially for sleeping
tasks; but at the time of writing EEVDF uses a "decaying" mechanism based
on virtual run time (VRT). This prevents tasks from exploiting the system
by sleeping briefly to reset their negative lag: when a task sleeps, it
remains on the run queue but marked for "deferred dequeue," allowing its
lag to decay over VRT. Hence, long-sleeping tasks eventually have their lag
reset. Finally, tasks can preempt others if their VD is earlier, and tasks
can request specific time slices using the new sched_setattr() system call,
which further facilitates the job of latency-sensitive applications.
REFERENCES
==========
[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=805acf7726282721504c8f00575d91ebfd750564
[2] https://lore.kernel.org/lkml/a79014e6-ea83-b316-1e12-2ae056bda6fa@linux.vnet.ibm.com/
[3] https://lwn.net/Articles/969062/
[4] https://lwn.net/Articles/925371/
......@@ -199,6 +199,8 @@
% Inactivate CJK after tableofcontents
\apptocmd{\sphinxtableofcontents}{\kerneldocCJKoff}{}{}
\xeCJKsetup{CJKspace = true}% For inter-phrase space of Korean TOC
% Suppress extra white space at latin .. non-latin in literal blocks
\AtBeginEnvironment{sphinxVerbatim}{\CJKsetecglue{}}
}{ % Don't enable CJK
% Custom macros to on/off CJK and switch CJK fonts (Dummy)
\newcommand{\kerneldocCJKon}{}
......
.. SPDX-License-Identifier: GPL-2.0
This is a simple wrapper to bring memory-barriers.txt into the RST world
until such a time as that file can be converted directly.
=========================
리눅스 커널 메모리 배리어
=========================
.. raw:: latex
\footnotesize
.. include:: ../../memory-barriers.txt
:literal:
.. raw:: latex
\normalsize
......@@ -11,18 +11,8 @@
.. toctree::
:maxdepth: 1
howto
리눅스 커널 메모리 배리어
-------------------------
.. raw:: latex
\footnotesize
.. include:: ./memory-barriers.txt
:literal:
process/howto
core-api/wrappers/memory-barriers.rst
.. raw:: latex
......
......@@ -6,3 +6,4 @@
:maxdepth: 1
sched-design-CFS
sched-eevdf
......@@ -14,10 +14,10 @@ Gestor de tareas CFS
CFS viene de las siglas en inglés de "Gestor de tareas totalmente justo"
("Completely Fair Scheduler"), y es el nuevo gestor de tareas de escritorio
implementado por Ingo Molnar e integrado en Linux 2.6.23. Es el sustituto de
el previo gestor de tareas SCHED_OTHER.
Nota: El planificador EEVDF fue incorporado más recientemente al kernel.
implementado por Ingo Molnar e integrado en Linux 2.6.23. Es el sustituto
del previo gestor de tareas SCHED_OTHER. Hoy en día se está abriendo camino
para el gestor de tareas EEVDF, cuya documentación se puede ver en
Documentation/scheduler/sched-eevdf.rst
El 80% del diseño de CFS puede ser resumido en una única frase: CFS
básicamente modela una "CPU ideal, precisa y multi-tarea" sobre hardware
......
.. include:: ../disclaimer-sp.rst
:Original: Documentation/scheduler/sched-eevdf.rst
:Translator: Sergio González Collado <sergio.collado@gmail.com>
======================
Gestor de tareas EEVDF
======================
El gestor de tareas EEVDF, del inglés: "Earliest Eligible Virtual Deadline
First", fue presentado por primera vez en una publicación científica en
1995 [1]. El kernel de Linux comenzó a transicionar hacia EEVPF en la
versión 6.6 (y como una nueva opción en 2024), alejándose del gestor
de tareas CFS, en favor de una versión de EEVDF propuesta por Peter
Zijlstra en 2023 [2-4]. Más información relativa a CFS puede encontrarse
en Documentation/scheduler/sched-design-CFS.rst.
De forma parecida a CFS, EEVDF intenta distribuir el tiempo de ejecución
de la CPU de forma equitativa entre todas las tareas que tengan la misma
prioridad y puedan ser ejecutables. Para eso, asigna un tiempo de
ejecución virtual a cada tarea, creando un "retraso" que puede ser usado
para determinar si una tarea ha recibido su cantidad justa de tiempo
de ejecución en la CPU. De esta manera, una tarea con un "retraso"
positivo, es porque se le debe tiempo de ejecución, mientras que una
con "retraso" negativo implica que la tarea ha excedido su cuota de
tiempo. EEVDF elige las tareas con un "retraso" mayor igual a cero y
calcula un tiempo límite de ejecución virtual (VD, del inglés: virtual
deadline) para cada una, eligiendo la tarea con la VD más próxima para
ser ejecutada a continuación. Es importante darse cuenta que esto permite
que la tareas que sean sensibles a la latencia que tengan porciones de
tiempos de ejecución de CPU más cortos ser priorizadas, lo cual ayuda con
su menor tiempo de respuesta.
Ahora mismo se está discutiendo cómo gestionar esos "retrasos", especialmente
en tareas que estén en un estado durmiente; pero en el momento en el que
se escribe este texto EEVDF usa un mecanismo de "decaimiento" basado en el
tiempo virtual de ejecución (VRT, del inglés: virtual run time). Esto previene
a las tareas de abusar del sistema simplemente durmiendo brevemente para
reajustar su retraso negativo: cuando una tarea duerme, esta permanece en
la cola de ejecución pero marcada para "desencolado diferido", permitiendo
a su retraso decaer a lo largo de VRT. Por tanto, las tareas que duerman
por más tiempo eventualmente eliminarán su retraso. Finalmente, las tareas
pueden adelantarse a otras si su VD es más próximo en el tiempo, y las
tareas podrán pedir porciones de tiempo específicas con la nueva llamada
del sistema sched_setattr(), todo esto facilitara el trabajo de las aplicaciones
que sean sensibles a las latencias.
REFERENCIAS
===========
[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=805acf7726282721504c8f00575d91ebfd750564
[2] https://lore.kernel.org/lkml/a79014e6-ea83-b316-1e12-2ae056bda6fa@linux.vnet.ibm.com/
[3] https://lwn.net/Articles/969062/
[4] https://lwn.net/Articles/925371/
......@@ -37,7 +37,6 @@ Todolist:
reporting-issues
reporting-regressions
security-bugs
bug-hunting
bug-bisect
tainted-kernels
......
......@@ -300,7 +300,7 @@ Documentation/admin-guide/reporting-regressions.rst 对此进行了更详细的
添加到回归跟踪列表中,以确保它不会被忽略。
什么是安全问题留给您自己判断。在继续之前,请考虑阅读
Documentation/translations/zh_CN/admin-guide/security-bugs.rst ,
Documentation/translations/zh_CN/process/security-bugs.rst ,
因为它提供了如何最恰当地处理安全问题的额外细节。
当发生了完全无法接受的糟糕事情时,此问题就是一个“非常严重的问题”。例如,
......@@ -983,7 +983,7 @@ Documentation/admin-guide/reporting-regressions.rst ;它还提供了大量其
报告,请将报告的文本转发到这些地址;但请在报告的顶部加上注释,表明您提交了
报告,并附上工单链接。
更多信息请参见 Documentation/translations/zh_CN/admin-guide/security-bugs.rst 。
更多信息请参见 Documentation/translations/zh_CN/process/security-bugs.rst 。
发布报告后的责任
......
......@@ -21,6 +21,7 @@ Documentation/translations/zh_CN/dev-tools/testing-overview.rst
testing-overview
sparse
kcov
kcsan
gcov
kasan
ubsan
......@@ -32,7 +33,6 @@ Todolist:
- checkpatch
- coccinelle
- kmsan
- kcsan
- kfence
- kgdb
- kselftest
......
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/dev-tools/kcsan.rst
:Translator: 刘浩阳 Haoyang Liu <tttturtleruss@hust.edu.cn>
内核并发消毒剂(KCSAN)
=====================
内核并发消毒剂(KCSAN)是一个动态竞争检测器,依赖编译时插桩,并且使用基于观察
点的采样方法来检测竞争。KCSAN 的主要目的是检测 `数据竞争`_。
使用
----
KCSAN 受 GCC 和 Clang 支持。使用 GCC 需要版本 11 或更高,使用 Clang 也需要
版本 11 或更高。
为了启用 KCSAN,用如下参数配置内核::
CONFIG_KCSAN = y
KCSAN 提供了几个其他的配置选项来自定义行为(见 ``lib/Kconfig.kcsan`` 中的各自的
帮助文档以获取更多信息)。
错误报告
~~~~~~~~
一个典型数据竞争的报告如下所示::
==================================================================
BUG: KCSAN: data-race in test_kernel_read / test_kernel_write
write to 0xffffffffc009a628 of 8 bytes by task 487 on cpu 0:
test_kernel_write+0x1d/0x30
access_thread+0x89/0xd0
kthread+0x23e/0x260
ret_from_fork+0x22/0x30
read to 0xffffffffc009a628 of 8 bytes by task 488 on cpu 6:
test_kernel_read+0x10/0x20
access_thread+0x89/0xd0
kthread+0x23e/0x260
ret_from_fork+0x22/0x30
value changed: 0x00000000000009a6 -> 0x00000000000009b2
Reported by Kernel Concurrency Sanitizer on:
CPU: 6 PID: 488 Comm: access_thread Not tainted 5.12.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
==================================================================
报告的头部提供了一个关于竞争中涉及到的函数的简短总结。随后是竞争中的两个线程的
访问类型和堆栈信息。如果 KCSAN 发现了一个值的变化,那么那个值的旧值和新值会在
“value changed”这一行单独显示。
另一个不太常见的数据竞争类型的报告如下所示::
==================================================================
BUG: KCSAN: data-race in test_kernel_rmw_array+0x71/0xd0
race at unknown origin, with read to 0xffffffffc009bdb0 of 8 bytes by task 515 on cpu 2:
test_kernel_rmw_array+0x71/0xd0
access_thread+0x89/0xd0
kthread+0x23e/0x260
ret_from_fork+0x22/0x30
value changed: 0x0000000000002328 -> 0x0000000000002329
Reported by Kernel Concurrency Sanitizer on:
CPU: 2 PID: 515 Comm: access_thread Not tainted 5.12.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
==================================================================
这个报告是当另一个竞争线程不可能被发现,但是可以从观测的内存地址的值改变而推断
出来的时候生成的。这类报告总是会带有“value changed”行。这类报告的出现通常是因
为在竞争线程中缺少插桩,也可能是因为其他原因,比如 DMA 访问。这类报告只会在
设置了内核参数 ``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y`` 时才会出现,而这
个参数是默认启用的。
选择性分析
~~~~~~~~~~
对于一些特定的访问,函数,编译单元或者整个子系统,可能需要禁用数据竞争检测。
对于静态黑名单,有如下可用的参数:
* KCSAN 支持使用 ``data_race(expr)`` 注解,这个注解告诉 KCSAN 任何由访问
``expr`` 所引起的数据竞争都应该被忽略,其产生的行为后果被认为是安全的。请查阅
`在 LKMM 中 "标记共享内存访问"`_ 获得更多信息。
* 与 ``data_race(...)`` 相似,可以使用类型限定符 ``__data_racy`` 来标记一个变量
,所有访问该变量而导致的数据竞争都是故意为之并且应该被 KCSAN 忽略::
struct foo {
...
int __data_racy stats_counter;
...
};
* 使用函数属性 ``__no_kcsan`` 可以对整个函数禁用数据竞争检测::
__no_kcsan
void foo(void) {
...
为了动态限制该为哪些函数生成报告,查阅 `Debug 文件系统接口`_ 黑名单/白名单特性。
* 为特定的编译单元禁用数据竞争检测,将下列参数加入到 ``Makefile`` 中::
KCSAN_SANITIZE_file.o := n
* 为 ``Makefile`` 中的所有编译单元禁用数据竞争检测,将下列参数添加到相应的
``Makefile`` 中::
KCSAN_SANITIZE := n
.. _在 LKMM 中 "标记共享内存访问": https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/access-marking.txt
此外,KCSAN 可以根据偏好设置显示或隐藏整个类别的数据竞争。可以使用如下
Kconfig 参数进行更改:
* ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY``: 如果启用了该参数并且通过观测点
(watchpoint) 观测到一个有冲突的写操作,但是对应的内存地址中存储的值没有改变,
则不会报告这起数据竞争。
* ``CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC``: 假设默认情况下,不超过字大小的简
单对齐写入操作是原子的。假设这些写入操作不会受到不安全的编译器优化影响,从而导
致数据竞争。该选项使 KCSAN 不报告仅由不超过字大小的简单对齐写入操作引起
的冲突所导致的数据竞争。
* ``CONFIG_KCSAN_PERMISSIVE``: 启用额外的宽松规则来忽略某些常见类型的数据竞争。
与上面的规则不同,这条规则更加复杂,涉及到值改变模式,访问类型和地址。这个
选项依赖编译选项 ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y``。请查看
``kernel/kcsan/permissive.h`` 获取更多细节。对于只侧重于特定子系统而不是整个
内核报告的测试者和维护者,建议禁用该选项。
要使用尽可能严格的规则,选择 ``CONFIG_KCSAN_STRICT=y``,这将配置 KCSAN 尽可
能紧密地遵循 Linux 内核内存一致性模型(LKMM)。
Debug 文件系统接口
~~~~~~~~~~~~~~~~~~
文件 ``/sys/kernel/debug/kcsan`` 提供了如下接口:
* 读 ``/sys/kernel/debug/kcsan`` 返回不同的运行时统计数据。
* 将 ``on`` 或 ``off`` 写入 ``/sys/kernel/debug/kcsan`` 允许打开或关闭 KCSAN。
* 将 ``!some_func_name`` 写入 ``/sys/kernel/debug/kcsan`` 会将
``some_func_name`` 添加到报告过滤列表中,该列表(默认)会将数据竞争报告中的顶
层堆栈帧是列表中函数的情况列入黑名单。
* 将 ``blacklist`` 或 ``whitelist`` 写入 ``/sys/kernel/debug/kcsan`` 会改变报告
过滤行为。例如,黑名单的特性可以用来过滤掉经常发生的数据竞争。白名单特性可以帮
助复现和修复测试。
性能调优
~~~~~~~~
影响 KCSAN 整体的性能和 bug 检测能力的核心参数是作为内核命令行参数公开的,其默认
值也可以通过相应的 Kconfig 选项更改。
* ``kcsan.skip_watch`` (``CONFIG_KCSAN_SKIP_WATCH``): 在另一个观测点设置之前每
个 CPU 要跳过的内存操作次数。更加频繁的设置观测点将增加观察到竞争情况的可能性
。这个参数对系统整体的性能和竞争检测能力影响最显著。
* ``kcsan.udelay_task`` (``CONFIG_KCSAN_UDELAY_TASK``): 对于任务,观测点设置之
后暂停执行的微秒延迟。值越大,检测到竞争情况的可能性越高。
* ``kcsan.udelay_interrupt`` (``CONFIG_KCSAN_UDELAY_INTERRUPT``): 对于中断,
观测点设置之后暂停执行的微秒延迟。中断对于延迟的要求更加严格,其延迟通常应该小
于为任务选择的延迟。
它们可以通过 ``/sys/module/kcsan/parameters/`` 在运行时进行调整。
数据竞争
--------
在一次执行中,如果两个内存访问存在 *冲突*,在不同的线程中并发执行,并且至少
有一个访问是 *简单访问*,则它们就形成了 *数据竞争*。如果它们访问了同一个内存地址并且
至少有一个是写操作,则称它们存在 *冲突*。有关更详细的讨论和定义,见
`LKMM 中的 "简单访问和数据竞争"`_。
.. _LKMM 中的 "简单访问和数据竞争": https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n1922
与 Linux 内核内存一致性模型(LKMM)的关系
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LKMM 定义了各种内存操作的传播和排序规则,让开发者可以推理并发代码。最终这允许确
定并发代码可能的执行情况并判断这些代码是否存在数据竞争。
KCSAN 可以识别 *被标记的原子操作* ( ``READ_ONCE``, ``WRITE_ONCE`` , ``atomic_*``
等),以及内存屏障所隐含的一部分顺序保证。启用 ``CONFIG_KCSAN_WEAK_MEMORY=y``
配置,KCSAN 会对加载或存储缓冲区进行建模,并可以检测遗漏的
``smp_mb()``, ``smp_wmb()``, ``smp_rmb()``, ``smp_store_release()``,以及所有的
具有等效隐含内存屏障的 ``atomic_*`` 操作。
请注意,KCSAN 不会报告所有由于缺失内存顺序而导致的数据竞争,特别是在需要内存屏障
来禁止后续内存操作在屏障之前重新排序的情况下。因此,开发人员应该仔细考虑那些未
被检查的内存顺序要求。
数据竞争以外的竞争检测
---------------------------
对于有着复杂并发设计的代码,竞争状况不总是表现为数据竞争。如果并发操作引起了意
料之外的系统行为,则认为发生了竞争状况。另一方面,数据竞争是在 C 语言层面定义
的。内核定义了一些宏定义用来检测非数据竞争的漏洞并发代码的属性。
.. note::
为了不引入新的文档编译警告,这里不展示宏定义的具体内容,如果想查看具体
宏定义可以结合原文(Documentation/dev-tools/kcsan.rst)阅读。
实现细节
--------
KCSAN 需要观测两个并发访问。特别重要的是,我们想要(a)增加观测到竞争的机会(尤
其是很少发生的竞争),以及(b)能够实际观测到这些竞争。我们可以通过(a)注入
不同的延迟,以及(b)使用地址观测点(或断点)来实现。
如果我们在设置了地址观察点的情况下故意延迟一个内存访问,然后观察到观察点被触发
,那么两个对同一地址的访问就发生了竞争。使用硬件观察点,这是 `DataCollider
<http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf>`_ 中采用
的方法。与 DataCollider 不同,KCSAN 不使用硬件观察点,而是依赖于编译器插桩和“软
观测点”。
在 KCSAN 中,观察点是通过一种高效的编码实现的,该编码将访问类型、大小和地址存储
在一个长整型变量中;使用“软观察点”的好处是具有可移植性和更大的灵活性。然后,
KCSAN依赖于编译器对普通访问的插桩。对于每个插桩的普通访问:
1. 检测是否存在一个符合的观测点,如果存在,并且至少有一个操作是写操作,则我们发
现了一个竞争访问。
2. 如果不存在匹配的观察点,则定期的设置一个观测点并随机延迟一小段时间。
3. 在延迟前检查数据值,并在延迟后重新检查数据值;如果值不匹配,我们推测存在一个
未知来源的竞争状况。
为了检测普通访问和标记访问之间的数据竞争,KCSAN 也对标记访问进行标记,但仅用于
检查是否存在观察点;即 KCSAN 不会在标记访问上设置观察点。通过不在标记操作上设
置观察点,如果对一个变量的所有并发访问都被正确标记,KCSAN 将永远不会触发观察点
,因此也不会报告这些访问。
弱内存建模
~~~~~~~~~~
KCSAN 通过建模访问重新排序(使用 ``CONFIG_KCSAN_WEAK_MEMORY=y``)来检测由于缺少
内存屏障而导致的数据竞争。每个设置了观察点的普通内存访问也会被选择在其函数范围
内进行模拟重新排序(最多一个正在进行的访问)。
一旦某个访问被选择用于重新排序,它将在函数范围内与每个其他访问进行检查。如果遇
到适当的内存屏障,该访问将不再被考虑进行模拟重新排序。
当内存操作的结果应该由屏障排序时,KCSAN 可以检测到仅由于缺失屏障而导致的冲突的
数据竞争。考虑下面的例子::
int x, flag;
void T1(void)
{
x = 1; // data race!
WRITE_ONCE(flag, 1); // correct: smp_store_release(&flag, 1)
}
void T2(void)
{
while (!READ_ONCE(flag)); // correct: smp_load_acquire(&flag)
... = x; // data race!
}
当启用了弱内存建模,KCSAN 将考虑对 ``T1`` 中的 ``x`` 进行模拟重新排序。在写入
``flag`` 之后,x再次被检查是否有并发访问:因为 ``T2`` 可以在写入
``flag`` 之后继续进行,因此检测到数据竞争。如果遇到了正确的屏障, ``x`` 在正确
释放 ``flag`` 后将不会被考虑重新排序,因此不会检测到数据竞争。
在复杂性上的权衡以及实际的限制意味着只能检测到一部分由于缺失内存屏障而导致的数
据竞争。由于当前可用的编译器支持,KCSAN 的实现仅限于建模“缓冲”(延迟访问)的
效果,因为运行时不能“预取”访问。同时要注意,观测点只设置在普通访问上,这是唯
一一个 KCSAN 会模拟重新排序的访问类型。这意味着标记访问的重新排序不会被建模。
上述情况的一个后果是获取 (acquire) 操作不需要屏障插桩(不需要预取)。此外,引
入地址或控制依赖的标记访问不需要特殊处理(标记访问不能重新排序,后续依赖的访问
不能被预取)。
关键属性
~~~~~~~~
1. **内存开销**:整体的内存开销只有几 MiB,取决于配置。当前的实现是使用一个小长
整型数组来编码观测点信息,几乎可以忽略不计。
2. **性能开销**:KCSAN 的运行时旨在性能开销最小化,使用一个高效的观测点编码,在
快速路径中不需要获取任何锁。在拥有 8 个 CPU 的系统上的内核启动来说:
- 使用默认 KCSAN 配置时,性能下降 5 倍;
- 仅因运行时快速路径开销导致性能下降 2.8 倍(设置非常大的
``KCSAN_SKIP_WATCH`` 并取消设置 ``KCSAN_SKIP_WATCH_RANDOMIZE``)。
3. **注解开销**:KCSAN 运行时之外需要的注释很少。因此,随着内核的发展维护的开
销也很小。
4. **检测设备的竞争写入**:由于设置观测点时会检查数据值,设备的竞争写入也可以
被检测到。
5. **内存排序**:KCSAN 只了解一部分 LKMM 排序规则;这可能会导致漏报数据竞争(
假阴性)。
6. **分析准确率**: 对于观察到的执行,由于使用采样策略,分析是 *不健全* 的
(可能有假阴性),但期望得到完整的分析(没有假阳性)。
考虑的替代方案
--------------
一个内核数据竞争检测的替代方法是 `Kernel Thread Sanitizer (KTSAN)
<https://github.com/google/kernel-sanitizers/blob/master/KTSAN.md>`_。KTSAN 是一
个基于先行发生关系(happens-before)的数据竞争检测器,它显式建立内存操作之间的先
后发生顺序,这可以用来确定 `数据竞争`_ 中定义的数据竞争。
为了建立正确的先行发生关系,KTSAN 必须了解 LKMM 的所有排序规则和同步原语。不幸
的是,任何遗漏都会导致大量的假阳性,这在包含众多自定义同步机制的内核上下文中特
别有害。为了跟踪前因后果关系,KTSAN 的实现需要为每个内存位置提供元数据(影子内
存),这意味着每页内存对应 4 页影子内存,在大型系统上可能会带来数十 GiB 的开销
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/doc-guide/checktransupdate.rst
:译者: 慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
检查翻译更新
这个脚本帮助跟踪不同语言的文档翻译状态,即文档是否与对应的英文版本保持更新。
工作原理
------------
它使用 ``git log`` 命令来跟踪翻译提交的最新英文提交(按作者日期排序)和英文文档的
最新提交。如果有任何差异,则该文件被认为是过期的,然后需要更新的提交将被收集并报告。
实现的功能
- 检查特定语言中的所有文件
- 检查单个文件或一组文件
- 提供更改输出格式的选项
- 跟踪没有翻译过的文件的翻译状态
用法
-----
::
./scripts/checktransupdate.py --help
具体用法请参考参数解析器的输出
示例
- ``./scripts/checktransupdate.py -l zh_CN``
这将打印 zh_CN 语言中需要更新的所有文件。
- ``./scripts/checktransupdate.py Documentation/translations/zh_CN/dev-tools/testing-overview.rst``
这将只打印指定文件的状态。
然后输出类似如下的内容:
::
Documentation/dev-tools/kfence.rst
No translation in the locale of zh_CN
Documentation/translations/zh_CN/dev-tools/testing-overview.rst
commit 42fb9cfd5b18 ("Documentation: dev-tools: Add link to RV docs")
1 commits needs resolving in total
待实现的功能
- 文件参数可以是文件夹而不仅仅是单个文件
......@@ -18,6 +18,7 @@
parse-headers
contributing
maintainer-profile
checktransupdate
.. only:: subproject and html
......
......@@ -89,10 +89,10 @@ TODOList:
admin-guide/index
admin-guide/reporting-issues.rst
userspace-api/index
内核构建系统 <kbuild/index>
TODOList:
* 内核构建系统 <kbuild/index>
* 用户空间工具 <tools/index>
也可参考独立于内核文档的 `Linux 手册页 <https://www.kernel.org/doc/man-pages/>`_ 。
......
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/kbuild/gcc-plugins.rst
:Translator: 慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
================
GCC 插件基础设施
================
介绍
====
GCC 插件是为编译器提供额外功能的可加载模块 [1]_。它们对于运行时插装和静态分析非常有用。
我们可以在编译过程中通过回调 [2]_,GIMPLE [3]_,IPA [4]_ 和 RTL Passes [5]_
(译者注:Pass 是编译器所采用的一种结构化技术,用于完成编译对象的分析、优化或转换等功能)
来分析、修改和添加更多的代码。
内核的 GCC 插件基础设施支持构建树外模块、交叉编译和在单独的目录中构建。插件源文件必须由
C++ 编译器编译。
目前 GCC 插件基础设施只支持一些架构。搜索 "select HAVE_GCC_PLUGINS" 来查找支持
GCC 插件的架构。
这个基础设施是从 grsecurity [6]_ 和 PaX [7]_ 移植过来的。
--
.. [1] https://gcc.gnu.org/onlinedocs/gccint/Plugins.html
.. [2] https://gcc.gnu.org/onlinedocs/gccint/Plugin-API.html#Plugin-API
.. [3] https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html
.. [4] https://gcc.gnu.org/onlinedocs/gccint/IPA.html
.. [5] https://gcc.gnu.org/onlinedocs/gccint/RTL.html
.. [6] https://grsecurity.net/
.. [7] https://pax.grsecurity.net/
目的
====
GCC 插件的设计目的是提供一个用于试验 GCC 或 Clang 上游没有的潜在编译器功能的场所。
一旦它们的实用性得到验证,这些功能将被添加到 GCC(和 Clang)的上游。随后,在所有
支持的 GCC 版本都支持这些功能后,它们会被从内核中移除。
具体来说,新插件应该只实现上游编译器(GCC 和 Clang)不支持的功能。
当 Clang 中存在 GCC 中不存在的某项功能时,应努力将该功能做到 GCC 上游(而不仅仅
是作为内核专用的 GCC 插件),以使整个生态都能从中受益。
类似的,如果 GCC 插件提供的功能在 Clang 中 **不** 存在,但该功能被证明是有用的,也应
努力将该功能上传到 GCC(和 Clang)。
在上游 GCC 提供了某项功能后,该插件将无法在相应的 GCC 版本(以及更高版本)下编译。
一旦所有内核支持的 GCC 版本都提供了该功能,该插件将从内核中移除。
文件
====
**$(src)/scripts/gcc-plugins**
这是 GCC 插件的目录。
**$(src)/scripts/gcc-plugins/gcc-common.h**
这是 GCC 插件的兼容性头文件。
应始终包含它,而不是单独的 GCC 头文件。
**$(src)/scripts/gcc-plugins/gcc-generate-gimple-pass.h,
$(src)/scripts/gcc-plugins/gcc-generate-ipa-pass.h,
$(src)/scripts/gcc-plugins/gcc-generate-simple_ipa-pass.h,
$(src)/scripts/gcc-plugins/gcc-generate-rtl-pass.h**
这些头文件可以自动生成 GIMPLE、SIMPLE_IPA、IPA 和 RTL passes 的注册结构。
与手动创建结构相比,它们更受欢迎。
用法
====
你必须为你的 GCC 版本安装 GCC 插件头文件,以 Ubuntu 上的 gcc-10 为例::
apt-get install gcc-10-plugin-dev
或者在 Fedora 上::
dnf install gcc-plugin-devel libmpc-devel
或者在 Fedora 上使用包含插件的交叉编译器时::
dnf install libmpc-devel
在内核配置中启用 GCC 插件基础设施与一些你想使用的插件::
CONFIG_GCC_PLUGINS=y
CONFIG_GCC_PLUGIN_LATENT_ENTROPY=y
...
运行 gcc(本地或交叉编译器),确保能够检测到插件头文件::
gcc -print-file-name=plugin
CROSS_COMPILE=arm-linux-gnu- ${CROSS_COMPILE}gcc -print-file-name=plugin
"plugin" 这个词意味着它们没有被检测到::
plugin
完整的路径则表示插件已经被检测到::
/usr/lib/gcc/x86_64-redhat-linux/12/plugin
编译包括插件在内的最小工具集::
make scripts
或者直接在内核中运行 make,使用循环复杂性 GCC 插件编译整个内核。
4. 如何添加新的 GCC 插件
========================
GCC 插件位于 scripts/gcc-plugins/。你需要将插件源文件放在 scripts/gcc-plugins/ 目录下。
子目录创建并不支持,你必须添加在 scripts/gcc-plugins/Makefile、scripts/Makefile.gcc-plugins
和相关的 Kconfig 文件中。
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/kbuild/headers_install.rst
:Translator: 慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
============================
导出内核头文件供用户空间使用
============================
"make headers_install" 命令以适合于用户空间程序的形式导出内核头文件。
Linux 内核导出的头文件描述了用户空间程序尝试使用内核服务的 API。这些内核
头文件被系统的 C 库(例如 glibc 和 uClibc)用于定义可用的系统调用,以及
与这些系统调用一起使用的常量和结构。C 库的头文件包括来自 linux 子目录的
内核头文件。系统的 libc 头文件通常被安装在默认位置 /usr/include,而内核
头文件在该位置的子目录中(主要是 /usr/include/linux 和 /usr/include/asm)。
内核头文件向后兼容,但不向前兼容。这意味着使用旧内核头文件的 C 库构建的程序
可以在新内核上运行(尽管它可能无法访问新特性),但使用新内核头文件构建的程序
可能无法在旧内核上运行。
"make headers_install" 命令可以在内核源代码的顶层目录中运行(或使用标准
的树外构建)。它接受两个可选参数::
make headers_install ARCH=i386 INSTALL_HDR_PATH=/usr
ARCH 表明为其生成头文件的架构,默认为当前架构。导出内核头文件的 linux/asm
目录是基于特定平台的,要查看支持架构的完整列表,使用以下命令::
ls -d include/asm-* | sed 's/.*-//'
INSTALL_HDR_PATH 表明头文件的安装位置,默认为 "./usr"。
该命令会在 INSTALL_HDR_PATH 中自动创建创建一个 'include' 目录,而头文件
会被安装在 INSTALL_HDR_PATH/include 中。
内核头文件导出的基础设施由 David Woodhouse <dwmw2@infradead.org> 维护。
.. SPDX-License-Identifier: GPL-2.0
.. include:: ../disclaimer-zh_CN.rst
:Original: Documentation/kbuild/index.rst
:Translator: 慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
============
内核编译系统
============
.. toctree::
:maxdepth: 1
headers_install
gcc-plugins
TODO:
- kconfig-language
- kconfig-macro-language
- kbuild
- kconfig
- makefiles
- modules
- issues
- reproducible-builds
- llvm
.. only:: subproject and html
目录
=====
* :ref:`genindex`
......@@ -49,10 +49,11 @@ TODOLIST:
embargoed-hardware-issues
cve
security-bugs
TODOLIST:
* security-bugs
* handling-regressions
其它大多数开发人员感兴趣的社区指南:
......
.. SPDX-License-Identifier: GPL-2.0-or-later
.. include:: ../disclaimer-zh_CN.rst
:Original: :doc:`../../../process/security-bugs`
......@@ -5,6 +7,7 @@
:译者:
吴想成 Wu XiangCheng <bobwxc@email.cn>
慕冬亮 Dongliang Mu <dzm91@hust.edu.cn>
安全缺陷
=========
......@@ -17,13 +20,13 @@ Linux内核开发人员非常重视安全性。因此我们想知道何时发现
可以通过电子邮件<security@kernel.org>联系Linux内核安全团队。这是一个安全人员
的私有列表,他们将帮助验证错误报告并开发和发布修复程序。如果您已经有了一个
修复,请将其包含在您的报告中,这样可以大大加快进程。安全团队可能会从区域维护
修复,请将其包含在您的报告中,这样可以大大加快处理进程。安全团队可能会从区域维护
人员那里获得额外的帮助,以理解和修复安全漏洞。
与任何缺陷一样,提供的信息越多,诊断和修复就越容易。如果您不清楚哪些信息有用,
请查看“Documentation/translations/zh_CN/admin-guide/reporting-issues.rst”中
概述的步骤。任何利用漏洞的攻击代码都非常有用,未经报告者同意不会对外发布,
非已经公开。
概述的步骤。任何利用漏洞的攻击代码都非常有用,未经报告者同意不会对外发布,
非已经公开。
请尽可能发送无附件的纯文本电子邮件。如果所有的细节都藏在附件里,那么就很难对
一个复杂的问题进行上下文引用的讨论。把它想象成一个
......@@ -49,24 +52,31 @@ Linux内核开发人员非常重视安全性。因此我们想知道何时发现
换句话说,我们唯一感兴趣的是修复缺陷。提交给安全列表的所有其他资料以及对报告
的任何后续讨论,即使在解除限制之后,也将永久保密。
协调
------
与其他团队协调
--------------
虽然内核安全团队仅关注修复漏洞,但还有其他组织关注修复发行版上的安全问题以及协调
操作系统厂商的漏洞披露。协调通常由 "linux-distros" 邮件列表处理,而披露则由
公共 "oss-security" 邮件列表进行。两者紧密关联且被展示在 linux-distros 维基:
<https://oss-security.openwall.org/wiki/mailing-lists/distros>
请注意,这三个列表的各自政策和规则是不同的,因为它们追求不同的目标。内核安全团队
与其他团队之间的协调很困难,因为对于内核安全团队,保密期(即最大允许天数)是从补丁
可用时开始,而 "linux-distros" 则从首次发布到列表时开始计算,无论是否存在补丁。
对敏感缺陷(例如那些可能导致权限提升的缺陷)的修复可能需要与私有邮件列表
<linux-distros@vs.openwall.org>进行协调,以便分发供应商做好准备,在公开披露
上游补丁时发布一个已修复的内核。发行版将需要一些时间来测试建议的补丁,通常
会要求至少几天的限制,而供应商更新发布更倾向于周二至周四。若合适,安全团队
可以协助这种协调,或者报告者可以从一开始就包括linux发行版。在这种情况下,请
记住在电子邮件主题行前面加上“[vs]”,如linux发行版wiki中所述:
<http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists>。
因此,内核安全团队强烈建议,作为一位潜在安全问题的报告者,在受影响代码的维护者
接受补丁之前,且在您阅读上述发行版维基页面并完全理解联系 "linux-distros"
邮件列表会对您和内核社区施加的要求之前,不要联系 "linux-distros" 邮件列表。
这也意味着通常情况下不要同时抄送两个邮件列表,除非在协调时有已接受但尚未合并的补丁。
换句话说,在补丁被接受之前,不要抄送 "linux-distros";在修复程序被合并之后,
不要抄送内核安全团队。
CVE分配
--------
安全团队通常不分配CVE,我们也不需要它们来进行报告或修复,因为这会使过程不必
要的复杂化,并可能耽误缺陷处理。如果报告者希望在公开披露之前分配一个CVE编号,
他们需要联系上述的私有linux-distros列表。当在提供补丁之前已有这样的CVE编号时,
如报告者愿意,最好在提交消息中提及它。
安全团队不分配 CVE,同时我们也不需要 CVE 来报告或修复漏洞,因为这会使过程不必要
的复杂化,并可能延误漏洞处理。如果报告者希望为确认的问题分配一个 CVE 编号,
可以联系 :doc:`内核 CVE 分配团队 <../process/cve>` 获取。
保密协议
---------
......
......@@ -208,7 +208,7 @@ torvalds@linux-foundation.org 。他收到的邮件很多,所以一般来说
如果您有修复可利用安全漏洞的补丁,请将该补丁发送到 security@kernel.org 。对于
严重的bug,可以考虑短期禁令以允许分销商(有时间)向用户发布补丁;在这种情况下,
显然不应将补丁发送到任何公共列表。
参见 Documentation/translations/zh_CN/admin-guide/security-bugs.rst 。
参见 Documentation/translations/zh_CN/process/security-bugs.rst 。
修复已发布内核中严重错误的补丁程序应该抄送给稳定版维护人员,方法是把以下列行
放进补丁的签准区(注意,不是电子邮件收件人)::
......
......@@ -301,7 +301,7 @@ Documentation/admin-guide/reporting-regressions.rst 對此進行了更詳細的
添加到迴歸跟蹤列表中,以確保它不會被忽略。
什麼是安全問題留給您自己判斷。在繼續之前,請考慮閱讀
Documentation/translations/zh_CN/admin-guide/security-bugs.rst ,
Documentation/translations/zh_CN/process/security-bugs.rst ,
因爲它提供瞭如何最恰當地處理安全問題的額外細節。
當發生了完全無法接受的糟糕事情時,此問題就是一個“非常嚴重的問題”。例如,
......@@ -984,7 +984,7 @@ Documentation/admin-guide/reporting-regressions.rst ;它還提供了大量其
報告,請將報告的文本轉發到這些地址;但請在報告的頂部加上註釋,表明您提交了
報告,並附上工單鏈接。
更多信息請參見 Documentation/translations/zh_CN/admin-guide/security-bugs.rst 。
更多信息請參見 Documentation/translations/zh_CN/process/security-bugs.rst 。
發佈報告後的責任
......
......@@ -209,7 +209,7 @@ torvalds@linux-foundation.org 。他收到的郵件很多,所以一般來說
如果您有修復可利用安全漏洞的補丁,請將該補丁發送到 security@kernel.org 。對於
嚴重的bug,可以考慮短期禁令以允許分銷商(有時間)向用戶發佈補丁;在這種情況下,
顯然不應將補丁發送到任何公共列表。
參見 Documentation/translations/zh_CN/admin-guide/security-bugs.rst 。
參見 Documentation/translations/zh_CN/process/security-bugs.rst 。
修復已發佈內核中嚴重錯誤的補丁程序應該抄送給穩定版維護人員,方法是把以下列行
放進補丁的籤準區(注意,不是電子郵件收件人)::
......
......@@ -78,6 +78,7 @@ Code Seq# Include File Comments
0x03 all linux/hdreg.h
0x04 D2-DC linux/umsdos_fs.h Dead since 2.6.11, but don't reuse these.
0x06 all linux/lp.h
0x07 9F-D0 linux/vmw_vmci_defs.h, uapi/linux/vm_sockets.h
0x09 all linux/raid/md_u.h
0x10 00-0F drivers/char/s390/vmcp.h
0x10 10-1F arch/s390/include/uapi/sclp_ctl.h
......@@ -292,6 +293,7 @@ Code Seq# Include File Comments
't' 80-8F linux/isdn_ppp.h
't' 90-91 linux/toshiba.h toshiba and toshiba_acpi SMM
'u' 00-1F linux/smb_fs.h gone
'u' 00-2F linux/ublk_cmd.h conflict!
'u' 20-3F linux/uvcvideo.h USB video class host driver
'u' 40-4f linux/udmabuf.h userspace dma-buf misc device
'v' 00-1F linux/ext2_fs.h conflict!
......
......@@ -14,6 +14,7 @@ KVM
s390/index
ppc-pv
x86/index
loongarch/index
locking
vcpu-requests
......
.. SPDX-License-Identifier: GPL-2.0
===================================
The LoongArch paravirtual interface
===================================
KVM hypercalls use the HVCL instruction with code 0x100 and the hypercall
number is put in a0. Up to five arguments may be placed in registers a1 - a5.
The return value is placed in v0 (an alias of a0).
Source code for this interface can be found in arch/loongarch/kvm*.
Querying for existence
======================
To determine if the host is running on KVM, we can utilize the cpucfg()
function at index CPUCFG_KVM_BASE (0x40000000).
The CPUCFG_KVM_BASE range, spanning from 0x40000000 to 0x400000FF, The
CPUCFG_KVM_BASE range between 0x40000000 - 0x400000FF is marked as reserved.
Consequently, all current and future processors will not implement any
feature within this range.
On a KVM-virtualized Linux system, a read operation on cpucfg() at index
CPUCFG_KVM_BASE (0x40000000) returns the magic string 'KVM\0'.
Once you have determined that your host is running on a paravirtualization-
capable KVM, you may now use hypercalls as described below.
KVM hypercall ABI
=================
The KVM hypercall ABI is simple, with one scratch register a0 (v0) and at most
five generic registers (a1 - a5) used as input parameters. The FP (Floating-
point) and vector registers are not utilized as input registers and must
remain unmodified during a hypercall.
Hypercall functions can be inlined as it only uses one scratch register.
The parameters are as follows:
======== ================= ================
Register IN OUT
======== ================= ================
a0 function number Return code
a1 1st parameter -
a2 2nd parameter -
a3 3rd parameter -
a4 4th parameter -
a5 5th parameter -
======== ================= ================
The return codes may be one of the following:
==== =========================
Code Meaning
==== =========================
0 Success
-1 Hypercall not implemented
-2 Bad Hypercall parameter
==== =========================
KVM Hypercalls Documentation
============================
The template for each hypercall is as follows:
1. Hypercall name
2. Purpose
1. KVM_HCALL_FUNC_IPI
------------------------
:Purpose: Send IPIs to multiple vCPUs.
- a0: KVM_HCALL_FUNC_IPI
- a1: Lower part of the bitmap for destination physical CPUIDs
- a2: Higher part of the bitmap for destination physical CPUIDs
- a3: The lowest physical CPUID in the bitmap
The hypercall lets a guest send multiple IPIs (Inter-Process Interrupts) with
at most 128 destinations per hypercall. The destinations are represented in a
bitmap contained in the first two input registers (a1 and a2).
Bit 0 of a1 corresponds to the physical CPUID in the third input register (a3)
and bit 1 corresponds to the physical CPUID in a3+1, and so on.
PV IPI on LoongArch includes both PV IPI multicast sending and PV IPI receiving,
and SWI is used for PV IPI inject since there is no VM-exits accessing SWI registers.
.. SPDX-License-Identifier: GPL-2.0
=========================
KVM for LoongArch systems
=========================
.. toctree::
:maxdepth: 2
hypercalls.rst
......@@ -249,7 +249,7 @@ Note that not all devices support these two calls, and some only
support the GETBOOTSTATUS call.
Some drivers can measure the temperature using the GETTEMP ioctl. The
returned value is the temperature in degrees fahrenheit::
returned value is the temperature in degrees Fahrenheit::
int temperature;
ioctl(fd, WDIOC_GETTEMP, &temperature);
......
......@@ -6753,6 +6753,7 @@ DOCUMENTATION PROCESS
M: Jonathan Corbet <corbet@lwn.net>
L: workflows@vger.kernel.org
S: Maintained
F: Documentation/dev-tools/
F: Documentation/maintainer/
F: Documentation/process/
......@@ -6760,6 +6761,7 @@ DOCUMENTATION REPORTING ISSUES
M: Thorsten Leemhuis <linux@leemhuis.info>
L: linux-doc@vger.kernel.org
S: Maintained
F: Documentation/admin-guide/bug-bisect.rst
F: Documentation/admin-guide/quickly-build-trimmed-linux.rst
F: Documentation/admin-guide/reporting-issues.rst
F: Documentation/admin-guide/verify-bugs-and-bisect-regressions.rst
......@@ -12360,6 +12362,7 @@ L: kvm@vger.kernel.org
L: loongarch@lists.linux.dev
S: Maintained
T: git git://git.kernel.org/pub/scm/virt/kvm/kvm.git
F: Documentation/virt/kvm/loongarch/
F: arch/loongarch/include/asm/kvm*
F: arch/loongarch/include/uapi/asm/kvm*
F: arch/loongarch/kvm/
......
......@@ -10,31 +10,28 @@ differences occur, report the file and commits that need to be updated.
The usage is as follows:
- ./scripts/checktransupdate.py -l zh_CN
This will print all the files that need to be updated in the zh_CN locale.
This will print all the files that need to be updated or translated in the zh_CN locale.
- ./scripts/checktransupdate.py Documentation/translations/zh_CN/dev-tools/testing-overview.rst
This will only print the status of the specified file.
The output is something like:
Documentation/translations/zh_CN/dev-tools/testing-overview.rst (1 commits)
Documentation/dev-tools/kfence.rst
No translation in the locale of zh_CN
Documentation/translations/zh_CN/dev-tools/testing-overview.rst
commit 42fb9cfd5b18 ("Documentation: dev-tools: Add link to RV docs")
1 commits needs resolving in total
"""
import os
from argparse import ArgumentParser, BooleanOptionalAction
import time
import logging
from argparse import ArgumentParser, ArgumentTypeError, BooleanOptionalAction
from datetime import datetime
flag_p_c = False
flag_p_uf = False
flag_debug = False
def dprint(*args, **kwargs):
if flag_debug:
print("[DEBUG] ", end="")
print(*args, **kwargs)
def get_origin_path(file_path):
"""Get the origin path from the translation path"""
paths = file_path.split("/")
tidx = paths.index("translations")
opaths = paths[:tidx]
......@@ -43,17 +40,16 @@ def get_origin_path(file_path):
def get_latest_commit_from(file_path, commit):
command = "git log --pretty=format:%H%n%aD%n%cD%n%n%B {} -1 -- {}".format(
commit, file_path
)
dprint(command)
"""Get the latest commit from the specified commit for the specified file"""
command = f"git log --pretty=format:%H%n%aD%n%cD%n%n%B {commit} -1 -- {file_path}"
logging.debug(command)
pipe = os.popen(command)
result = pipe.read()
result = result.split("\n")
if len(result) <= 1:
return None
dprint("Result: {}".format(result[0]))
logging.debug("Result: %s", result[0])
return {
"hash": result[0],
......@@ -64,17 +60,19 @@ def get_latest_commit_from(file_path, commit):
def get_origin_from_trans(origin_path, t_from_head):
"""Get the latest origin commit from the translation commit"""
o_from_t = get_latest_commit_from(origin_path, t_from_head["hash"])
while o_from_t is not None and o_from_t["author_date"] > t_from_head["author_date"]:
o_from_t = get_latest_commit_from(origin_path, o_from_t["hash"] + "^")
if o_from_t is not None:
dprint("tracked origin commit id: {}".format(o_from_t["hash"]))
logging.debug("tracked origin commit id: %s", o_from_t["hash"])
return o_from_t
def get_commits_count_between(opath, commit1, commit2):
command = "git log --pretty=format:%H {}...{} -- {}".format(commit1, commit2, opath)
dprint(command)
"""Get the commits count between two commits for the specified file"""
command = f"git log --pretty=format:%H {commit1}...{commit2} -- {opath}"
logging.debug(command)
pipe = os.popen(command)
result = pipe.read().split("\n")
# filter out empty lines
......@@ -83,50 +81,120 @@ def get_commits_count_between(opath, commit1, commit2):
def pretty_output(commit):
command = "git log --pretty='format:%h (\"%s\")' -1 {}".format(commit)
dprint(command)
"""Pretty print the commit message"""
command = f"git log --pretty='format:%h (\"%s\")' -1 {commit}"
logging.debug(command)
pipe = os.popen(command)
return pipe.read()
def valid_commit(commit):
"""Check if the commit is valid or not"""
msg = pretty_output(commit)
return "Merge tag" not in msg
def check_per_file(file_path):
"""Check the translation status for the specified file"""
opath = get_origin_path(file_path)
if not os.path.isfile(opath):
dprint("Error: Cannot find the origin path for {}".format(file_path))
logging.error("Cannot find the origin path for {file_path}")
return
o_from_head = get_latest_commit_from(opath, "HEAD")
t_from_head = get_latest_commit_from(file_path, "HEAD")
if o_from_head is None or t_from_head is None:
print("Error: Cannot find the latest commit for {}".format(file_path))
logging.error("Cannot find the latest commit for %s", file_path)
return
o_from_t = get_origin_from_trans(opath, t_from_head)
if o_from_t is None:
print("Error: Cannot find the latest origin commit for {}".format(file_path))
logging.error("Error: Cannot find the latest origin commit for %s", file_path)
return
if o_from_head["hash"] == o_from_t["hash"]:
if flag_p_uf:
print("No update needed for {}".format(file_path))
return
logging.debug("No update needed for %s", file_path)
else:
print("{}".format(file_path), end="\t")
logging.info(file_path)
commits = get_commits_count_between(
opath, o_from_t["hash"], o_from_head["hash"]
)
print("({} commits)".format(len(commits)))
if flag_p_c:
count = 0
for commit in commits:
msg = pretty_output(commit)
if "Merge tag" not in msg:
print("commit", msg)
if valid_commit(commit):
logging.info("commit %s", pretty_output(commit))
count += 1
logging.info("%d commits needs resolving in total\n", count)
def valid_locales(locale):
"""Check if the locale is valid or not"""
script_path = os.path.dirname(os.path.abspath(__file__))
linux_path = os.path.join(script_path, "..")
if not os.path.isdir(f"{linux_path}/Documentation/translations/{locale}"):
raise ArgumentTypeError("Invalid locale: {locale}")
return locale
def list_files_with_excluding_folders(folder, exclude_folders, include_suffix):
"""List all files with the specified suffix in the folder and its subfolders"""
files = []
stack = [folder]
while stack:
pwd = stack.pop()
# filter out the exclude folders
if os.path.basename(pwd) in exclude_folders:
continue
# list all files and folders
for item in os.listdir(pwd):
ab_item = os.path.join(pwd, item)
if os.path.isdir(ab_item):
stack.append(ab_item)
else:
if ab_item.endswith(include_suffix):
files.append(ab_item)
return files
class DmesgFormatter(logging.Formatter):
"""Custom dmesg logging formatter"""
def format(self, record):
timestamp = time.time()
formatted_time = f"[{timestamp:>10.6f}]"
log_message = f"{formatted_time} {record.getMessage()}"
return log_message
def config_logging(log_level, log_file="checktransupdate.log"):
"""configure logging based on the log level"""
# set up the root logger
logger = logging.getLogger()
logger.setLevel(log_level)
# Create console handler
console_handler = logging.StreamHandler()
console_handler.setLevel(log_level)
# Create file handler
file_handler = logging.FileHandler(log_file)
file_handler.setLevel(log_level)
# Create formatter and add it to the handlers
formatter = DmesgFormatter()
console_handler.setFormatter(formatter)
file_handler.setFormatter(formatter)
# Add the handler to the logger
logger.addHandler(console_handler)
logger.addHandler(file_handler)
def main():
"""Main function of the script"""
script_path = os.path.dirname(os.path.abspath(__file__))
linux_path = os.path.join(script_path, "..")
......@@ -134,62 +202,62 @@ def main():
parser.add_argument(
"-l",
"--locale",
default="zh_CN",
type=valid_locales,
help="Locale to check when files are not specified",
)
parser.add_argument(
"--print-commits",
"--print-missing-translations",
action=BooleanOptionalAction,
default=True,
help="Print commits between the origin and the translation",
help="Print files that do not have translations",
)
parser.add_argument(
"--print-updated-files",
action=BooleanOptionalAction,
default=False,
help="Print files that do no need to be updated",
)
'--log',
default='INFO',
choices=['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL'],
help='Set the logging level')
parser.add_argument(
"--debug",
action=BooleanOptionalAction,
help="Print debug information",
default=False,
)
'--logfile',
default='checktransupdate.log',
help='Set the logging file (default: checktransupdate.log)')
parser.add_argument(
"files", nargs="*", help="Files to check, if not specified, check all files"
)
args = parser.parse_args()
global flag_p_c, flag_p_uf, flag_debug
flag_p_c = args.print_commits
flag_p_uf = args.print_updated_files
flag_debug = args.debug
# Configure logging based on the --log argument
log_level = getattr(logging, args.log.upper(), logging.INFO)
config_logging(log_level)
# get files related to linux path
# Get files related to linux path
files = args.files
if len(files) == 0:
if args.locale is not None:
files = (
os.popen(
"find {}/Documentation/translations/{} -type f".format(
linux_path, args.locale
)
)
.read()
.split("\n")
offical_files = list_files_with_excluding_folders(
os.path.join(linux_path, "Documentation"), ["translations", "output"], "rst"
)
for file in offical_files:
# split the path into parts
path_parts = file.split(os.sep)
# find the index of the "Documentation" directory
kindex = path_parts.index("Documentation")
# insert the translations and locale after the Documentation directory
new_path_parts = path_parts[:kindex + 1] + ["translations", args.locale] \
+ path_parts[kindex + 1 :]
# join the path parts back together
new_file = os.sep.join(new_path_parts)
if os.path.isfile(new_file):
files.append(new_file)
else:
files = (
os.popen(
"find {}/Documentation/translations -type f".format(linux_path)
)
.read()
.split("\n")
)
if args.print_missing_translations:
logging.info(os.path.relpath(os.path.abspath(file), linux_path))
logging.info("No translation in the locale of %s\n", args.locale)
files = list(filter(lambda x: x != "", files))
files = list(map(lambda x: os.path.relpath(os.path.abspath(x), linux_path), files))
# cd to linux root directory
......
......@@ -54,6 +54,7 @@ my $output_section_maxlen = 50;
my $scm = 0;
my $tree = 1;
my $web = 0;
my $bug = 0;
my $subsystem = 0;
my $status = 0;
my $letters = "";
......@@ -271,6 +272,7 @@ if (!GetOptions(
'scm!' => \$scm,
'tree!' => \$tree,
'web!' => \$web,
'bug!' => \$bug,
'letters=s' => \$letters,
'pattern-depth=i' => \$pattern_depth,
'k|keywords!' => \$keywords,
......@@ -320,13 +322,14 @@ if ($sections || $letters ne "") {
$status = 0;
$subsystem = 0;
$web = 0;
$bug = 0;
$keywords = 0;
$keywords_in_file = 0;
$interactive = 0;
} else {
my $selections = $email + $scm + $status + $subsystem + $web;
my $selections = $email + $scm + $status + $subsystem + $web + $bug;
if ($selections == 0) {
die "$P: Missing required option: email, scm, status, subsystem or web\n";
die "$P: Missing required option: email, scm, status, subsystem, web or bug\n";
}
}
......@@ -631,6 +634,7 @@ my %hash_list_to;
my @list_to = ();
my @scm = ();
my @web = ();
my @bug = ();
my @subsystem = ();
my @status = ();
my %deduplicate_name_hash = ();
......@@ -662,6 +666,11 @@ if ($web) {
output(@web);
}
if ($bug) {
@bug = uniq(@bug);
output(@bug);
}
exit($exit);
sub self_test {
......@@ -847,6 +856,7 @@ sub get_maintainers {
@list_to = ();
@scm = ();
@web = ();
@bug = ();
@subsystem = ();
@status = ();
%deduplicate_name_hash = ();
......@@ -1069,6 +1079,7 @@ MAINTAINER field selection options:
--status => print status if any
--subsystem => print subsystem name if any
--web => print website(s) if any
--bug => print bug reporting info if any
Output type options:
--separator [, ] => separator for multiple entries on 1 line
......@@ -1382,6 +1393,8 @@ sub add_categories {
push(@scm, $pvalue . $suffix);
} elsif ($ptype eq "W") {
push(@web, $pvalue . $suffix);
} elsif ($ptype eq "B") {
push(@bug, $pvalue . $suffix);
} elsif ($ptype eq "S") {
push(@status, $pvalue . $suffix);
}
......
......@@ -300,8 +300,6 @@ sub check_sphinx()
}
$cur_version = get_sphinx_version($sphinx);
die ("$sphinx returned an error") if (!$cur_version);
die "$sphinx didn't return its version" if (!$cur_version);
if ($cur_version lt $min_version) {
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment