Commit a7668df2 authored by Patrick Mochel's avatar Patrick Mochel

[power] Update documentation.

Just the basics, more to come.
parent b4b1a575
Device Power Management
Device power management encompasses two areas - the ability to save
state and transition a device to a low-power state when the system is
entering a low-power state; and the ability to transition a device to
a low-power state while the system is running (and independently of
any other power management activity).
Methods
The methods to suspend and resume devices reside in struct bus_type:
struct bus_type {
...
int (*suspend)(struct device * dev, u32 state);
int (*resume)(struct device * dev);
};
Each bus driver is responsible implementing these methods, translating
the call into a bus-specific request and forwarding the call to the
bus-specific drivers. For example, PCI drivers implement suspend() and
resume() methods in struct pci_driver. The PCI core is simply
responsible for translating the pointers to PCI-specific ones and
calling the low-level driver.
This is done to a) ease transition to the new power management methods
and leverage the existing PM code in various bus drivers; b) allow
buses to implement generic and default PM routines for devices, and c)
make the flow of execution obvious to the reader.
System Power Management
When the system enters a low-power state, the device tree is walked in
a depth-first fashion to transition each device into a low-power
state. The ordering of the device tree is guaranteed by the order in
which devices get registered - children are never registered before
their ancestors, and devices are placed at the back of the list when
registered. By walking the list in reverse order, we are guaranteed to
suspend devices in the proper order.
Devices are suspended once with interrupts enabled. Drivers are
expected to stop I/O transactions, save device state, and place the
device into a low-power state. Drivers may sleep, allocate memory,
etc. at will.
Some devices are broken and will inevitably have problems powering
down or disabling themselves with interrupts enabled. For these
special cases, they may return -EAGAIN. This will put the device on a
list to be taken care of later. When interrupts are disabled, before
we enter the low-power state, their drivers are called again to put
their device to sleep.
On resume, the devices that returned -EAGAIN will be called to power
themselves back on with interrupts disabled. Once interrupts have been
re-enabled, the rest of the drivers will be called to resume their
devices. On resume, a driver is responsible for powering back on each
device, restoring state, and re-enabling I/O transactions for that
device.
System devices follow a slightly different API, which can be found in
include/linux/sysdev.h
drivers/base/sys.c
System devices will only be suspended with interrupts disabled, and
after all other devices have been suspended. On resume, they will be
resumed before any other devices, and also with interrupts disabled.
Power Management Interface
The power management subsystem provides a unified sysfs interface to
userspace, regardless of what architecture or platform one is
running. The interface exists in /sys/power/ directory (assuming sysfs
is mounted at /sys).
/sys/power/state controls system power state. Reading from this file
returns what states are supported, which is hard-coded to 'standby'
(Power-On Suspend), 'mem' (Suspend-to-RAM), and 'disk'
(Suspend-to-Disk).
Writing to this file one of those strings causes the system to
transition into that state. Please see the file
Documentation/power/states.txt for a description of each of those
states.
/sys/power/disk controls the operating mode of the suspend-to-disk
mechanism. Suspend-to-disk can be handled in several ways. The
greatest distinction is who writes memory to disk - the firmware or
the kernel. If the firmware does it, we assume that it also handles
suspending the system.
If the kernel does it, then we have three options for putting the system
to sleep - using the platform driver (e.g. ACPI or other PM
registers), powering off the system or rebooting the system (for
testing). The system will support either 'firmware' or 'platform', and
that is known a priori. But, the user may choose 'shutdown' or
'reboot' as alternatives.
Reading from this file will display what the mode is currently set
to. Writing to this file will accept one of
'firmware'
'platform'
'shutdown'
'reboot'
It will only change to 'firmware' or 'platform' if the system supports
it.
System Power Management States
The kernel supports three power management states generically, though
each is dependent on platform support code to implement the low-level
details for each state. This file describes each state, what they are
commonly called, what ACPI state they map to, and what string to write
to /sys/power/state to enter that state
State: Standby / Power-On Suspend
ACPI State: S1
String: "standby"
This state offers minimal, though real, power savings, while providing
a very low-latency transition back to a working system. No operating
state is lost (the CPU retains power), so the system easily starts up
again where it left off.
We try to put devices in a low-power state equivalent to D1, which
also offers low power savings, but low resume latency. Not all devices
support D1, and those that don't are left on.
A transition from Standby to the On state should take about 1-2
seconds.
State: Suspend-to-RAM
ACPI State: S3
String: "mem"
This state offers significant power savings as everything in the
system is put into a low-power state, except for memory, which is
placed in self-refresh mode to retain its contents.
System and device state is saved and kept in memory. All devices are
suspended and put into D3. In many cases, all peripheral buses lose
power when entering STR, so devices must be able to handle the
transition back to the On state.
For at least ACPI, STR requires some minimal boot-strapping code to
resume the system from STR. This may be true on other platforms.
A transition from Suspend-to-RAM to the On state should take about
3-5 seconds.
State: Suspend-to-disk
ACPI State: S4
String: "disk"
This state offers the greatest power savings, and can be used even in
the absence of low-level platform support for power management. This
state operates similarly to Suspend-to-RAM, but includes a final step
of writing memory contents to disk. On resume, this is read and memory
is restored to its pre-suspend state.
STD can be handled by the firmware or the kernel. If it is handled by
the firmware, it usually requires a dedicated partition that must be
setup via another operating system for it to use. Despite the
inconvenience, this method requires minimal work by the kernel, since
the firmware will also handle restoring memory contents on resume.
If the kernel is responsible for persistantly saving state, a mechanism
called 'swsusp' (Swap Suspend) is used to write memory contents to
free swap space. swsusp has some restrictive requirements, but should
work in most cases. Some, albeit outdated, documentation can be found
in Documentation/power/swsusp.txt.
Once memory state is written to disk, the system may either enter a
low-power state (like ACPI S4), or it may simply power down. Powering
down offers greater savings, and allows this mechanism to work on any
system. However, entering a real low-power state allows the user to
trigger wake up events (e.g. pressing a key or opening a laptop lid).
A transition from Suspend-to-Disk to the On state should take about 30
seconds, though it's typically a bit more with the current
implementation.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment