Commit 192a5022 authored by Sidhartha Kumar's avatar Sidhartha Kumar Committed by Andrew Morton

Documentation/mm: update hugetlbfs documentation to mention alloc_hugetlb_folio

Link: https://lkml.kernel.org/r/20230125170537.96973-9-sidhartha.kumar@oracle.comSigned-off-by: default avatarSidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
parent 371607a3
...@@ -181,14 +181,14 @@ Consuming Reservations/Allocating a Huge Page ...@@ -181,14 +181,14 @@ Consuming Reservations/Allocating a Huge Page
Reservations are consumed when huge pages associated with the reservations Reservations are consumed when huge pages associated with the reservations
are allocated and instantiated in the corresponding mapping. The allocation are allocated and instantiated in the corresponding mapping. The allocation
is performed within the routine alloc_huge_page():: is performed within the routine alloc_hugetlb_folio()::
struct page *alloc_huge_page(struct vm_area_struct *vma, struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve) unsigned long addr, int avoid_reserve)
alloc_huge_page is passed a VMA pointer and a virtual address, so it can alloc_hugetlb_folio is passed a VMA pointer and a virtual address, so it can
consult the reservation map to determine if a reservation exists. In addition, consult the reservation map to determine if a reservation exists. In addition,
alloc_huge_page takes the argument avoid_reserve which indicates reserves alloc_hugetlb_folio takes the argument avoid_reserve which indicates reserves
should not be used even if it appears they have been set aside for the should not be used even if it appears they have been set aside for the
specified address. The avoid_reserve argument is most often used in the case specified address. The avoid_reserve argument is most often used in the case
of Copy on Write and Page Migration where additional copies of an existing of Copy on Write and Page Migration where additional copies of an existing
...@@ -208,7 +208,8 @@ a reservation for the allocation. After determining whether a reservation ...@@ -208,7 +208,8 @@ a reservation for the allocation. After determining whether a reservation
exists and can be used for the allocation, the routine dequeue_huge_page_vma() exists and can be used for the allocation, the routine dequeue_huge_page_vma()
is called. This routine takes two arguments related to reservations: is called. This routine takes two arguments related to reservations:
- avoid_reserve, this is the same value/argument passed to alloc_huge_page() - avoid_reserve, this is the same value/argument passed to
alloc_hugetlb_folio().
- chg, even though this argument is of type long only the values 0 or 1 are - chg, even though this argument is of type long only the values 0 or 1 are
passed to dequeue_huge_page_vma. If the value is 0, it indicates a passed to dequeue_huge_page_vma. If the value is 0, it indicates a
reservation exists (see the section "Memory Policy and Reservations" for reservation exists (see the section "Memory Policy and Reservations" for
...@@ -233,9 +234,9 @@ the scope reservations. Even if a surplus page is allocated, the same ...@@ -233,9 +234,9 @@ the scope reservations. Even if a surplus page is allocated, the same
reservation based adjustments as above will be made: SetPagePrivate(page) and reservation based adjustments as above will be made: SetPagePrivate(page) and
resv_huge_pages--. resv_huge_pages--.
After obtaining a new huge page, (page)->private is set to the value of After obtaining a new hugetlb folio, (folio)->_hugetlb_subpool is set to the
the subpool associated with the page if it exists. This will be used for value of the subpool associated with the page if it exists. This will be used
subpool accounting when the page is freed. for subpool accounting when the folio is freed.
The routine vma_commit_reservation() is then called to adjust the reserve The routine vma_commit_reservation() is then called to adjust the reserve
map based on the consumption of the reservation. In general, this involves map based on the consumption of the reservation. In general, this involves
...@@ -246,8 +247,8 @@ was no reservation in a shared mapping or this was a private mapping a new ...@@ -246,8 +247,8 @@ was no reservation in a shared mapping or this was a private mapping a new
entry must be created. entry must be created.
It is possible that the reserve map could have been changed between the call It is possible that the reserve map could have been changed between the call
to vma_needs_reservation() at the beginning of alloc_huge_page() and the to vma_needs_reservation() at the beginning of alloc_hugetlb_folio() and the
call to vma_commit_reservation() after the page was allocated. This would call to vma_commit_reservation() after the folio was allocated. This would
be possible if hugetlb_reserve_pages was called for the same page in a shared be possible if hugetlb_reserve_pages was called for the same page in a shared
mapping. In such cases, the reservation count and subpool free page count mapping. In such cases, the reservation count and subpool free page count
will be off by one. This rare condition can be identified by comparing the will be off by one. This rare condition can be identified by comparing the
......
...@@ -142,14 +142,14 @@ HPAGE_RESV_OWNER标志被设置,以表明该VMA拥有预留。 ...@@ -142,14 +142,14 @@ HPAGE_RESV_OWNER标志被设置,以表明该VMA拥有预留。
消耗预留/分配一个巨页 消耗预留/分配一个巨页
=========================== ===========================
当与预留相关的巨页在相应的映射中被分配和实例化时,预留就被消耗了。该分配是在函数alloc_huge_page() 当与预留相关的巨页在相应的映射中被分配和实例化时,预留就被消耗了。该分配是在函数alloc_hugetlb_folio()
中进行的:: 中进行的::
struct page *alloc_huge_page(struct vm_area_struct *vma, struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
unsigned long addr, int avoid_reserve) unsigned long addr, int avoid_reserve)
alloc_huge_page被传递给一个VMA指针和一个虚拟地址,因此它可以查阅预留映射以确定是否存在预留。 alloc_hugetlb_folio被传递给一个VMA指针和一个虚拟地址,因此它可以查阅预留映射以确定是否存在预留。
此外,alloc_huge_page需要一个参数avoid_reserve,该参数表示即使看起来已经为指定的地址预留了 此外,alloc_hugetlb_folio需要一个参数avoid_reserve,该参数表示即使看起来已经为指定的地址预留了
预留,也不应该使用预留。avoid_reserve参数最常被用于写时拷贝和页面迁移的情况下,即现有页面的额 预留,也不应该使用预留。avoid_reserve参数最常被用于写时拷贝和页面迁移的情况下,即现有页面的额
外拷贝被分配。 外拷贝被分配。
...@@ -162,7 +162,7 @@ vma_needs_reservation()返回的值通常为0或1。如果该地址存在预留 ...@@ -162,7 +162,7 @@ vma_needs_reservation()返回的值通常为0或1。如果该地址存在预留
确定预留是否存在并可用于分配后,调用dequeue_huge_page_vma()函数。这个函数需要两个与预留有关 确定预留是否存在并可用于分配后,调用dequeue_huge_page_vma()函数。这个函数需要两个与预留有关
的参数: 的参数:
- avoid_reserve,这是传递给alloc_huge_page()的同一个值/参数。 - avoid_reserve,这是传递给alloc_hugetlb_folio()的同一个值/参数。
- chg,尽管这个参数的类型是long,但只有0或1的值被传递给dequeue_huge_page_vma。如果该值为0, - chg,尽管这个参数的类型是long,但只有0或1的值被传递给dequeue_huge_page_vma。如果该值为0,
则表明存在预留(关于可能的问题,请参见 “预留和内存策略” 一节)。如果值 则表明存在预留(关于可能的问题,请参见 “预留和内存策略” 一节)。如果值
为1,则表示不存在预留,如果可能的话,必须从全局空闲池中取出该页。 为1,则表示不存在预留,如果可能的话,必须从全局空闲池中取出该页。
...@@ -179,7 +179,7 @@ free_huge_pages的值被递减。如果有一个与该页相关的预留,将 ...@@ -179,7 +179,7 @@ free_huge_pages的值被递减。如果有一个与该页相关的预留,将
的剩余巨页和超额分配的问题。即使分配了一个多余的页面,也会进行与上面一样的基于预留的调整: 的剩余巨页和超额分配的问题。即使分配了一个多余的页面,也会进行与上面一样的基于预留的调整:
SetPagePrivate(page) 和 resv_huge_pages--. SetPagePrivate(page) 和 resv_huge_pages--.
在获得一个新的巨页后,(page)->private被设置为与该页面相关的子池的值,如果它存在的话。当页 在获得一个新的巨页后,(folio)->_hugetlb_subpool被设置为与该页面相关的子池的值,如果它存在的话。当页
面被释放时,这将被用于子池的计数。 面被释放时,这将被用于子池的计数。
然后调用函数vma_commit_reservation(),根据预留的消耗情况调整预留映射。一般来说,这涉及 然后调用函数vma_commit_reservation(),根据预留的消耗情况调整预留映射。一般来说,这涉及
...@@ -199,7 +199,7 @@ SetPagePrivate(page)和resv_huge_pages-。 ...@@ -199,7 +199,7 @@ SetPagePrivate(page)和resv_huge_pages-。
已经存在,所以不做任何改变。然而,如果共享映射中没有预留,或者这是一个私有映射,则必须创建 已经存在,所以不做任何改变。然而,如果共享映射中没有预留,或者这是一个私有映射,则必须创建
一个新的条目。 一个新的条目。
在alloc_huge_page()开始调用vma_needs_reservation()和页面分配后调用 在alloc_hugetlb_folio()开始调用vma_needs_reservation()和页面分配后调用
vma_commit_reservation()之间,预留映射有可能被改变。如果hugetlb_reserve_pages在共 vma_commit_reservation()之间,预留映射有可能被改变。如果hugetlb_reserve_pages在共
享映射中为同一页面被调用,这将是可能的。在这种情况下,预留计数和子池空闲页计数会有一个偏差。 享映射中为同一页面被调用,这将是可能的。在这种情况下,预留计数和子池空闲页计数会有一个偏差。
这种罕见的情况可以通过比较vma_needs_reservation和vma_commit_reservation的返回值来 这种罕见的情况可以通过比较vma_needs_reservation和vma_commit_reservation的返回值来
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment