Commit 0bf1bd84 authored by Jonathan Corbet

Merge branch 'rppt' into docs-next

This contains a set of early-boot memory-management docs from Mike
Rapoport.  It's been circulating on linux-mm for a long time; I finally
picked it up even though it changes a lot of .c files under mm/ (comments
only).
parents d1634e1a ae9d8845
===========================
Boot time memory management
===========================
Early system initialization cannot use "normal" memory management
simply because it is not set up yet. But there is still a need to
allocate memory for various data structures, for instance for the
physical page allocator. To address this, a specialized allocator
called the :ref:`Boot Memory Allocator <bootmem>`, or bootmem, was
introduced. Several years later PowerPC developers added a "Logical
Memory Blocks" allocator, which was later adopted by other
architectures and renamed to :ref:`memblock <memblock>`. There is also
a compatibility layer called `nobootmem` that translates bootmem
allocation interfaces to memblock calls.
The selection of the early allocator is done using
``CONFIG_NO_BOOTMEM`` and ``CONFIG_HAVE_MEMBLOCK`` kernel
configuration options. These options are enabled or disabled
statically by the architectures' Kconfig files.

* Architectures that rely only on bootmem select
  ``CONFIG_NO_BOOTMEM=n && CONFIG_HAVE_MEMBLOCK=n``.
* The users of memblock with the nobootmem compatibility layer set
  ``CONFIG_NO_BOOTMEM=y && CONFIG_HAVE_MEMBLOCK=y``.
* And for those that use both memblock and bootmem the configuration
  includes ``CONFIG_NO_BOOTMEM=n && CONFIG_HAVE_MEMBLOCK=y``.

Whichever allocator is used, it is the responsibility of the
architecture-specific initialization code to set it up in
:c:func:`setup_arch` and tear it down in :c:func:`mem_init`.
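
Exactly what this involves varies between architectures. The fragment
below is a purely illustrative, hypothetical sketch of a memblock-based
flow; the symbols ``dram_base``, ``dram_size``, ``kernel_start`` and
``kernel_end`` are made up and stand in for whatever the architecture
learns from firmware or the device tree.

.. code-block:: c

   #include <linux/bootmem.h>
   #include <linux/memblock.h>

   void __init setup_arch(char **cmdline_p)
   {
           /* Register the physical memory reported by the platform... */
           memblock_add(dram_base, dram_size);

           /* ...and make sure the kernel image itself is never handed out. */
           memblock_reserve(kernel_start, kernel_end - kernel_start);

           /* Restrict early allocations to directly mapped memory. */
           memblock_set_current_limit((phys_addr_t)max_low_pfn << PAGE_SHIFT);
   }

   void __init mem_init(void)
   {
           /* Release everything that is still free to the buddy allocator. */
           free_all_bootmem();
   }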

Once the early memory management is available it offers a variety of
functions and macros for memory allocations. The allocation request
may be directed to the first (and probably the only) node or to a
particular node in a NUMA system. There are API variants that panic
when an allocation fails and those that don't, as the sketch below
illustrates. The more recent and more capable memblock also provides
interfaces for controlling its own behaviour.
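
The bootmem-style interfaces documented later in this patch come in
exactly these flavours. A minimal sketch, with arbitrary sizes,
alignments and a ``goal`` of 0 chosen purely for illustration:

.. code-block:: c

   #include <linux/bootmem.h>
   #include <linux/sizes.h>

   static void __init early_alloc_example(int nid)
   {
           void *tbl, *buf, *node_buf;

           /* Panics if the request cannot be satisfied. */
           tbl = __alloc_bootmem(SZ_64K, SMP_CACHE_BYTES, 0);

           /* Returns NULL on failure instead of panicking. */
           buf = __alloc_bootmem_nopanic(PAGE_SIZE, PAGE_SIZE, 0);
           if (!buf)
                   pr_warn("early allocation failed, falling back\n");

           /* Direct the allocation to a particular NUMA node. */
           node_buf = __alloc_bootmem_node(NODE_DATA(nid), PAGE_SIZE,
                                           SMP_CACHE_BYTES, 0);
   }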

.. _bootmem:

Bootmem
=======

(mostly stolen from Mel Gorman's "Understanding the Linux Virtual
Memory Manager" `book`_)

.. _book: https://www.kernel.org/doc/gorman/

.. kernel-doc:: mm/bootmem.c
   :doc: bootmem overview

.. _memblock:

Memblock
========

.. kernel-doc:: mm/memblock.c
   :doc: memblock overview

Functions and structures
========================

Common API
----------

The functions that are described in this section are available
regardless of which early memory manager is enabled.

.. kernel-doc:: mm/nobootmem.c

Bootmem specific API
--------------------

These interfaces are available only with bootmem, i.e. when
``CONFIG_NO_BOOTMEM=n``.

.. kernel-doc:: include/linux/bootmem.h
.. kernel-doc:: mm/bootmem.c
   :nodocs:

Memblock specific API
---------------------

Here is the description of memblock data structures, functions and
macros. Some of them are actually internal, but since they are
documented it would be silly to omit them. Besides, reading the
descriptions for the internal functions can help to understand what
really happens under the hood.
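
To make the descriptions a little more concrete, here is a hypothetical
early-boot fragment using a few of these interfaces; the addresses and
sizes are invented and carry no meaning beyond illustration:

.. code-block:: c

   #include <linux/memblock.h>
   #include <linux/sizes.h>

   static void __init memblock_usage_example(void)
   {
           phys_addr_t pa;

           /* Keep a firmware-owned buffer out of the free memory pool. */
           memblock_reserve(0x80000000, SZ_1M);

           /* Exclude a device region from the kernel's direct mapping. */
           memblock_mark_nomap(0x90000000, SZ_64K);

           /* Allocate 2 MiB of physical memory below 4 GiB; panics on failure. */
           pa = memblock_alloc_base(SZ_2M, SZ_2M, SZ_4G);
   }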

.. kernel-doc:: include/linux/memblock.h
.. kernel-doc:: mm/memblock.c
   :nodocs:
......@@ -29,6 +29,7 @@ Core utilities
circular-buffers
gfp_mask-from-fs-io
timekeeping
boot-time-mm
Interfaces for kernel debugging
===============================
......
......@@ -27,9 +27,20 @@ extern unsigned long max_pfn;
extern unsigned long long max_possible_pfn;
#ifndef CONFIG_NO_BOOTMEM
/*
* node_bootmem_map is a map pointer - the bits represent all physical
* memory pages (including holes) on the node.
/**
* struct bootmem_data - per-node information used by the bootmem allocator
* @node_min_pfn: the starting physical address of the node's memory
* @node_low_pfn: the end physical address of the directly addressable memory
* @node_bootmem_map: a bitmap pointer - the bits represent all physical
*                    memory pages (including holes) on the node.
* @last_end_off: the offset within the page of the end of the last allocation;
* if 0, the page used is full
* @hint_idx: the PFN of the page used with the last allocation;
*            together with @last_end_off, this allows checking whether a
*            new allocation can be merged into the page used for the
*            last allocation rather than using up a full new page.
* @list: list entry in the linked list ordered by the memory addresses
*/
typedef struct bootmem_data {
unsigned long node_min_pfn;
......
......@@ -20,31 +20,60 @@
#define INIT_MEMBLOCK_REGIONS 128
#define INIT_PHYSMEM_REGIONS 4
/* Definition of memblock flags. */
enum {
/**
* enum memblock_flags - definition of memory region attributes
* @MEMBLOCK_NONE: no special request
* @MEMBLOCK_HOTPLUG: hotpluggable region
* @MEMBLOCK_MIRROR: mirrored region
* @MEMBLOCK_NOMAP: don't add to kernel direct mapping
*/
enum memblock_flags {
MEMBLOCK_NONE = 0x0, /* No special request */
MEMBLOCK_HOTPLUG = 0x1, /* hotpluggable region */
MEMBLOCK_MIRROR = 0x2, /* mirrored region */
MEMBLOCK_NOMAP = 0x4, /* don't add to kernel direct mapping */
};
/**
* struct memblock_region - represents a memory region
* @base: physical address of the region
* @size: size of the region
* @flags: memory region attributes
* @nid: NUMA node id
*/
struct memblock_region {
phys_addr_t base;
phys_addr_t size;
unsigned long flags;
enum memblock_flags flags;
#ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
int nid;
#endif
};
/**
* struct memblock_type - collection of memory regions of certain type
* @cnt: number of regions
* @max: size of the allocated array
* @total_size: size of all regions
* @regions: array of regions
* @name: the memory type symbolic name
*/
struct memblock_type {
unsigned long cnt; /* number of regions */
unsigned long max; /* size of the allocated array */
phys_addr_t total_size; /* size of all regions */
unsigned long cnt;
unsigned long max;
phys_addr_t total_size;
struct memblock_region *regions;
char *name;
};
/**
* struct memblock - memblock allocator metadata
* @bottom_up: is bottom up direction?
* @current_limit: physical address of the current allocation limit
* @memory: usable memory regions
* @reserved: reserved memory regions
* @physmem: all physical memory
*/
struct memblock {
bool bottom_up; /* is bottom up direction? */
phys_addr_t current_limit;
......@@ -72,7 +101,7 @@ void memblock_discard(void);
phys_addr_t memblock_find_in_range_node(phys_addr_t size, phys_addr_t align,
phys_addr_t start, phys_addr_t end,
int nid, ulong flags);
int nid, enum memblock_flags flags);
phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
phys_addr_t size, phys_addr_t align);
void memblock_allow_resize(void);
......@@ -89,19 +118,19 @@ int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
ulong choose_memblock_flags(void);
enum memblock_flags choose_memblock_flags(void);
/* Low level functions */
int memblock_add_range(struct memblock_type *type,
phys_addr_t base, phys_addr_t size,
int nid, unsigned long flags);
int nid, enum memblock_flags flags);
void __next_mem_range(u64 *idx, int nid, ulong flags,
void __next_mem_range(u64 *idx, int nid, enum memblock_flags flags,
struct memblock_type *type_a,
struct memblock_type *type_b, phys_addr_t *out_start,
phys_addr_t *out_end, int *out_nid);
void __next_mem_range_rev(u64 *idx, int nid, ulong flags,
void __next_mem_range_rev(u64 *idx, int nid, enum memblock_flags flags,
struct memblock_type *type_a,
struct memblock_type *type_b, phys_addr_t *out_start,
phys_addr_t *out_end, int *out_nid);
......@@ -239,7 +268,6 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
/**
* for_each_resv_unavail_range - iterate through reserved and unavailable memory
* @i: u64 used as loop variable
* @flags: pick from blocks based on memory attributes
* @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
* @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
*
......@@ -253,13 +281,13 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
NUMA_NO_NODE, MEMBLOCK_NONE, p_start, p_end, NULL)
static inline void memblock_set_region_flags(struct memblock_region *r,
unsigned long flags)
enum memblock_flags flags)
{
r->flags |= flags;
}
static inline void memblock_clear_region_flags(struct memblock_region *r,
unsigned long flags)
enum memblock_flags flags)
{
r->flags &= ~flags;
}
......@@ -317,10 +345,10 @@ static inline bool memblock_bottom_up(void)
phys_addr_t __init memblock_alloc_range(phys_addr_t size, phys_addr_t align,
phys_addr_t start, phys_addr_t end,
ulong flags);
enum memblock_flags flags);
phys_addr_t memblock_alloc_base_nid(phys_addr_t size,
phys_addr_t align, phys_addr_t max_addr,
int nid, ulong flags);
int nid, enum memblock_flags flags);
phys_addr_t memblock_alloc_base(phys_addr_t size, phys_addr_t align,
phys_addr_t max_addr);
phys_addr_t __memblock_alloc_base(phys_addr_t size, phys_addr_t align,
......@@ -367,8 +395,10 @@ phys_addr_t memblock_get_current_limit(void);
*/
/**
* memblock_region_memory_base_pfn - Return the lowest pfn intersecting with the memory region
* memblock_region_memory_base_pfn - get the lowest pfn of the memory region
* @reg: memblock_region structure
*
* Return: the lowest pfn intersecting with the memory region
*/
static inline unsigned long memblock_region_memory_base_pfn(const struct memblock_region *reg)
{
......@@ -376,8 +406,10 @@ static inline unsigned long memblock_region_memory_base_pfn(const struct membloc
}
/**
* memblock_region_memory_end_pfn - Return the end_pfn this region
* memblock_region_memory_end_pfn - get the end pfn of the memory region
* @reg: memblock_region structure
*
* Return: the end_pfn of the memory region
*/
static inline unsigned long memblock_region_memory_end_pfn(const struct memblock_region *reg)
{
......@@ -385,8 +417,10 @@ static inline unsigned long memblock_region_memory_end_pfn(const struct memblock
}
/**
* memblock_region_reserved_base_pfn - Return the lowest pfn intersecting with the reserved region
* memblock_region_reserved_base_pfn - get the lowest pfn of the reserved region
* @reg: memblock_region structure
*
* Return: the lowest pfn intersecting with the reserved region
*/
static inline unsigned long memblock_region_reserved_base_pfn(const struct memblock_region *reg)
{
......@@ -394,8 +428,10 @@ static inline unsigned long memblock_region_reserved_base_pfn(const struct membl
}
/**
* memblock_region_reserved_end_pfn - Return the end_pfn this region
* memblock_region_reserved_end_pfn - get the end pfn of the reserved region
* @reg: memblock_region structure
*
* Return: the end_pfn of the reserved region
*/
static inline unsigned long memblock_region_reserved_end_pfn(const struct memblock_region *reg)
{
......
......@@ -21,6 +21,53 @@
#include "internal.h"
/**
* DOC: bootmem overview
*
* Bootmem is a boot-time physical memory allocator and configurator.
*
* It is used early in the boot process before the page allocator is
* set up.
*
* Bootmem is based on the most basic of allocators, a First Fit
* allocator which uses a bitmap to represent memory. If a bit is 1,
* the page is allocated; if it is 0, the page is free. To satisfy allocations
* of sizes smaller than a page, the allocator records the Page Frame
* Number (PFN) of the last allocation and the offset the allocation
* ended at. Subsequent small allocations are merged together and
* stored on the same page.
*
* The information used by the bootmem allocator is represented by
* :c:type:`struct bootmem_data`. An array to hold up to %MAX_NUMNODES
* such structures is statically allocated and then it is discarded
* when the system initialization completes. Each entry in this array
* corresponds to a node with memory. For UMA systems only entry 0 is
* used.
*
* The bootmem allocator is initialized during early architecture
* specific setup. Each architecture is required to supply a
* :c:func:`setup_arch` function which, among other tasks, is
* responsible for acquiring the necessary parameters to initialize
* the boot memory allocator. These parameters define limits of usable
* physical memory:
*
* * @min_low_pfn - the lowest PFN that is available in the system
* * @max_low_pfn - the highest PFN that may be addressed by low
* memory (%ZONE_NORMAL)
* * @max_pfn - the last PFN available to the system.
*
* After those limits are determined, the :c:func:`init_bootmem` or
* :c:func:`init_bootmem_node` function should be called to initialize
* the bootmem allocator. The UMA case should use the `init_bootmem`
* function. It will initialize the ``contig_page_data`` structure that
* represents the only memory node in the system. In the NUMA case the
* `init_bootmem_node` function should be called to initialize the
* bootmem allocator for each node.
*
* Once the allocator is set up, it is possible to use either single
* node or NUMA variant of the allocation APIs.
*/
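/*
 * A deliberately simplified model of the first-fit bitmap search described
 * above; illustration only, not part of bootmem.c.  The real allocator
 * additionally honours alignment and allocation goals, and merges sub-page
 * allocations with the help of @last_end_off and @hint_idx.
 */
static unsigned long __init toy_first_fit(unsigned long *map,
					  unsigned long nr_pages,
					  unsigned long want)
{
	unsigned long start = 0;

	while (start + want <= nr_pages) {
		unsigned long i;

		for (i = 0; i < want; i++)
			if (test_bit(start + i, map))
				break;	/* this page is already allocated */
		if (i == want) {
			/* Found a run of 'want' free pages: mark it used. */
			for (i = 0; i < want; i++)
				__set_bit(start + i, map);
			return start;
		}
		/* Restart the search just past the allocated page. */
		start += i + 1;
	}
	return ~0UL;	/* no suitable range found */
}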
#ifndef CONFIG_NEED_MULTIPLE_NODES
struct pglist_data __refdata contig_page_data = {
.bdata = &bootmem_node_data[0]
......@@ -62,6 +109,8 @@ static unsigned long __init bootmap_bytes(unsigned long pages)
/**
* bootmem_bootmap_pages - calculate bitmap size in pages
* @pages: number of pages the bitmap has to represent
*
* Return: the number of pages needed to hold the bitmap.
*/
unsigned long __init bootmem_bootmap_pages(unsigned long pages)
{
......@@ -121,7 +170,7 @@ static unsigned long __init init_bootmem_core(bootmem_data_t *bdata,
* @startpfn: first pfn on the node
* @endpfn: first pfn after the node
*
* Returns the number of bytes needed to hold the bitmap for this node.
* Return: the number of bytes needed to hold the bitmap for this node.
*/
unsigned long __init init_bootmem_node(pg_data_t *pgdat, unsigned long freepfn,
unsigned long startpfn, unsigned long endpfn)
......@@ -134,7 +183,7 @@ unsigned long __init init_bootmem_node(pg_data_t *pgdat, unsigned long freepfn,
* @start: pfn where the bitmap is to be placed
* @pages: number of available physical pages
*
* Returns the number of bytes needed to hold the bitmap.
* Return: the number of bytes needed to hold the bitmap.
*/
unsigned long __init init_bootmem(unsigned long start, unsigned long pages)
{
......@@ -143,15 +192,6 @@ unsigned long __init init_bootmem(unsigned long start, unsigned long pages)
return init_bootmem_core(NODE_DATA(0)->bdata, start, 0, pages);
}
/*
* free_bootmem_late - free bootmem pages directly to page allocator
* @addr: starting physical address of the range
* @size: size of the range in bytes
*
* This is only useful when the bootmem allocator has already been torn
* down, but we are still initializing the system. Pages are given directly
* to the page allocator, no bootmem metadata is updated because it is gone.
*/
void __init free_bootmem_late(unsigned long physaddr, unsigned long size)
{
unsigned long cursor, end;
......@@ -264,11 +304,6 @@ void __init reset_all_zones_managed_pages(void)
reset_managed_pages_done = 1;
}
/**
* free_all_bootmem - release free pages to the buddy allocator
*
* Returns the number of pages actually released.
*/
unsigned long __init free_all_bootmem(void)
{
unsigned long total_pages = 0;
......@@ -385,16 +420,6 @@ static int __init mark_bootmem(unsigned long start, unsigned long end,
BUG();
}
/**
* free_bootmem_node - mark a page range as usable
* @pgdat: node the range resides on
* @physaddr: starting address of the range
* @size: size of the range in bytes
*
* Partial pages will be considered reserved and left as they are.
*
* The range must reside completely on the specified node.
*/
void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
unsigned long size)
{
......@@ -408,15 +433,6 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
mark_bootmem_node(pgdat->bdata, start, end, 0, 0);
}
/**
* free_bootmem - mark a page range as usable
* @physaddr: starting physical address of the range
* @size: size of the range in bytes
*
* Partial pages will be considered reserved and left as they are.
*
* The range must be contiguous but may span node boundaries.
*/
void __init free_bootmem(unsigned long physaddr, unsigned long size)
{
unsigned long start, end;
......@@ -439,6 +455,8 @@ void __init free_bootmem(unsigned long physaddr, unsigned long size)
* Partial pages will be reserved.
*
* The range must reside completely on the specified node.
*
* Return: 0 on success, -errno on failure.
*/
int __init reserve_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
unsigned long size, int flags)
......@@ -460,6 +478,8 @@ int __init reserve_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
* Partial pages will be reserved.
*
* The range must be contiguous but may span node boundaries.
*
* Return: 0 on success, -errno on failure.
*/
int __init reserve_bootmem(unsigned long addr, unsigned long size,
int flags)
......@@ -646,19 +666,6 @@ static void * __init ___alloc_bootmem_nopanic(unsigned long size,
return NULL;
}
/**
* __alloc_bootmem_nopanic - allocate boot memory without panicking
* @size: size of the request in bytes
* @align: alignment of the region
* @goal: preferred starting address of the region
*
* The goal is dropped if it can not be satisfied and the allocation will
* fall back to memory below @goal.
*
* Allocation may happen on any node in the system.
*
* Returns NULL on failure.
*/
void * __init __alloc_bootmem_nopanic(unsigned long size, unsigned long align,
unsigned long goal)
{
......@@ -682,19 +689,6 @@ static void * __init ___alloc_bootmem(unsigned long size, unsigned long align,
return NULL;
}
/**
* __alloc_bootmem - allocate boot memory
* @size: size of the request in bytes
* @align: alignment of the region
* @goal: preferred starting address of the region
*
* The goal is dropped if it can not be satisfied and the allocation will
* fall back to memory below @goal.
*
* Allocation may happen on any node in the system.
*
* The function panics if the request can not be satisfied.
*/
void * __init __alloc_bootmem(unsigned long size, unsigned long align,
unsigned long goal)
{
......@@ -754,21 +748,6 @@ void * __init ___alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
return NULL;
}
/**
* __alloc_bootmem_node - allocate boot memory from a specific node
* @pgdat: node to allocate from
* @size: size of the request in bytes
* @align: alignment of the region
* @goal: preferred starting address of the region
*
* The goal is dropped if it can not be satisfied and the allocation will
* fall back to memory below @goal.
*
* Allocation may fall back to any node in the system if the specified node
* can not hold the requested memory.
*
* The function panics if the request can not be satisfied.
*/
void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
unsigned long align, unsigned long goal)
{
......@@ -807,19 +786,6 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
}
/**
* __alloc_bootmem_low - allocate low boot memory
* @size: size of the request in bytes
* @align: alignment of the region
* @goal: preferred starting address of the region
*
* The goal is dropped if it can not be satisfied and the allocation will
* fall back to memory below @goal.
*
* Allocation may happen on any node in the system.
*
* The function panics if the request can not be satisfied.
*/
void * __init __alloc_bootmem_low(unsigned long size, unsigned long align,
unsigned long goal)
{
......@@ -834,21 +800,6 @@ void * __init __alloc_bootmem_low_nopanic(unsigned long size,
ARCH_LOW_ADDRESS_LIMIT);
}
/**
* __alloc_bootmem_low_node - allocate low boot memory from a specific node
* @pgdat: node to allocate from
* @size: size of the request in bytes
* @align: alignment of the region
* @goal: preferred starting address of the region
*
* The goal is dropped if it can not be satisfied and the allocation will
* fall back to memory below @goal.
*
* Allocation may fall back to any node in the system if the specified node
* can not hold the requested memory.
*
* The function panics if the request can not be satisfied.
*/
void * __init __alloc_bootmem_low_node(pg_data_t *pgdat, unsigned long size,
unsigned long align, unsigned long goal)
{
......
......@@ -42,7 +42,7 @@ static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
{
void *ptr;
u64 addr;
ulong flags = choose_memblock_flags();
enum memblock_flags flags = choose_memblock_flags();
if (limit > memblock.current_limit)
limit = memblock.current_limit;
......@@ -72,7 +72,7 @@ static void * __init __alloc_memory_core_early(int nid, u64 size, u64 align,
return ptr;
}
/*
/**
* free_bootmem_late - free bootmem pages directly to page allocator
* @addr: starting address of the range
* @size: size of the range in bytes
......@@ -176,7 +176,7 @@ void __init reset_all_zones_managed_pages(void)
/**
* free_all_bootmem - release free pages to the buddy allocator
*
* Returns the number of pages actually released.
* Return: the number of pages actually released.
*/
unsigned long __init free_all_bootmem(void)
{
......@@ -193,7 +193,7 @@ unsigned long __init free_all_bootmem(void)
/**
* free_bootmem_node - mark a page range as usable
* @pgdat: node the range resides on
* @physaddr: starting address of the range
* @physaddr: starting physical address of the range
* @size: size of the range in bytes
*
* Partial pages will be considered reserved and left as they are.
......@@ -208,7 +208,7 @@ void __init free_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
/**
* free_bootmem - mark a page range as usable
* @addr: starting address of the range
* @addr: starting physical address of the range
* @size: size of the range in bytes
*
* Partial pages will be considered reserved and left as they are.
......@@ -256,7 +256,7 @@ static void * __init ___alloc_bootmem_nopanic(unsigned long size,
*
* Allocation may happen on any node in the system.
*
* Returns NULL on failure.
* Return: address of the allocated region or %NULL on failure.
*/
void * __init __alloc_bootmem_nopanic(unsigned long size, unsigned long align,
unsigned long goal)
......@@ -293,6 +293,8 @@ static void * __init ___alloc_bootmem(unsigned long size, unsigned long align,
* Allocation may happen on any node in the system.
*
* The function panics if the request can not be satisfied.
*
* Return: address of the allocated region.
*/
void * __init __alloc_bootmem(unsigned long size, unsigned long align,
unsigned long goal)
......@@ -367,6 +369,8 @@ static void * __init ___alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
* can not hold the requested memory.
*
* The function panics if the request can not be satisfied.
*
* Return: address of the allocated region.
*/
void * __init __alloc_bootmem_node(pg_data_t *pgdat, unsigned long size,
unsigned long align, unsigned long goal)
......@@ -396,6 +400,8 @@ void * __init __alloc_bootmem_node_high(pg_data_t *pgdat, unsigned long size,
* Allocation may happen on any node in the system.
*
* The function panics if the request can not be satisfied.
*
* Return: address of the allocated region.
*/
void * __init __alloc_bootmem_low(unsigned long size, unsigned long align,
unsigned long goal)
......@@ -425,6 +431,8 @@ void * __init __alloc_bootmem_low_nopanic(unsigned long size,
* can not hold the requested memory.
*
* The function panics if the request can not be satisfied.
*
* Return: address of the allocated region.
*/
void * __init __alloc_bootmem_low_node(pg_data_t *pgdat, unsigned long size,
unsigned long align, unsigned long goal)
......