- 06 May, 2024 40 commits
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. If our folio has a stable node, it is a (small) KSM folio -- see folio_stable_node(). Let's use folio_mapcount() in stable_tree_search() instead, which results in no functional change. The mapcount > 1 check is a bit confusing, because that's usually a check for page sharing. Looks like the reason is that we are guaranteed to not exceed ksm_max_page_sharing for the tree KSM folio when merging with that. Let's update the documentation to make that clearer. Link: https://lkml.kernel.org/r/20240416172533.663418-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alex Shi <alexs@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
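A minimal sketch of the check being described; the helper name stable_node_folio_is_shared() is made up here, while the real check sits inline in mm/ksm.c:stable_tree_search():

    /*
     * Illustrative sketch only. A stable-tree KSM folio is always a small
     * folio, so folio_mapcount() matches the old page_mapcount() value.
     */
    static bool stable_node_folio_is_shared(struct folio *folio)
    {
        /*
         * This is not the usual "is the page shared?" test: it bounds
         * sharing of the tree folio so that merging with it cannot
         * exceed ksm_max_page_sharing.
         */
        return folio_mapcount(folio) > 1;
    }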
-
Yosry Ahmed authored
These knobs offer more fine-grained control to userspace than needed and directly expose/influence kernel implementation; remove them. For disabling same_filled handling, there is no logical reason to refuse storing same-filled pages more efficiently and opt for compression. Scanning pages for patterns may be an argument, but the page contents will be read into the CPU cache anyway during compression. Also, removing the same_filled handling code does not move the needle significantly in terms of performance anyway [1]. For disabling non_same_filled handling, it was added when the compressed pages in zswap were not being properly charged to memcgs, as workloads could escape the accounting with compression [2]. This is no longer the case after commit f4840ccf ("zswap: memcg accounting"), and using zswap without compression does not make much sense. [1] https://lore.kernel.org/lkml/CAJD7tkaySFP2hBQw4pnZHJJwe3bMdjJ1t9VC2VJd=khn1_TXvA@mail.gmail.com/ [2] https://lore.kernel.org/lkml/19d5cdee-2868-41bd-83d5-6da75d72e940@maciej.szmigiero.name/ [yosryahmed@google.com: remove same_filled_pages from docs] Link: https://lkml.kernel.org/r/ZhxFVggdyvCo79jc@google.com Link: https://lkml.kernel.org/r/20240413022407.785696-5-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Yosry Ahmed authored
Currently, zswap_store() checks zswap_same_filled_pages_enabled, kmaps the folio, then calls zswap_is_page_same_filled() to check the folio contents. Move this logic into zswap_is_page_same_filled() as well (and rename it to use 'folio' while we are at it). This makes zswap_store() cleaner, and makes following changes to that logic contained within the helper. While we are at it:
- Rename the insert_entry label to store_entry to match xa_store().
- Add comment headers for same-filled functions and the main API functions (load, store, invalidate, swapon, swapoff).
No functional change intended. Link: https://lkml.kernel.org/r/20240413022407.785696-4-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
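A sketch of what the folio-based helper might look like after the move; treat the shape and the exact name (zswap_is_folio_same_filled) as assumptions rather than the verbatim mm/zswap.c code:

    /*
     * Sketch, not the literal kernel function: map the folio, check
     * whether every word equals the first one, and report that word so
     * the caller can store it instead of a compressed buffer.
     */
    static bool zswap_is_folio_same_filled(struct folio *folio,
                                           unsigned long *value)
    {
        unsigned long *data = kmap_local_folio(folio, 0);
        unsigned long val = data[0];
        bool ret = true;
        unsigned int pos;

        for (pos = 1; pos < PAGE_SIZE / sizeof(*data); pos++) {
            if (data[pos] != val) {
                ret = false;
                goto out;
            }
        }
        *value = val;
    out:
        kunmap_local(data);
        return ret;
    }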
-
Yosry Ahmed authored
Refactor limit and acceptance threshold checking outside of zswap_store(). This code will be moved around in a following patch, so it would be cleaner to move a function call around. Link: https://lkml.kernel.org/r/20240413022407.785696-3-yosryahmed@google.com Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
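A rough sketch of a factored-out limit check; the helper name zswap_check_limits() and the *_pages() accessors are assumptions standing in for whatever the series actually uses, while zswap_pool_limit_hit and zswap_pool_reached_full are existing zswap globals:

    /*
     * Sketch only: pull the limit/acceptance-threshold logic out of
     * zswap_store() so the caller just gets a yes/no answer.
     */
    static bool zswap_check_limits(void)
    {
        unsigned long cur = zswap_total_pages();        /* assumed helper */
        unsigned long max = zswap_max_pages();          /* assumed helper */

        if (cur >= max) {
            zswap_pool_limit_hit++;
            zswap_pool_reached_full = true;
        } else if (zswap_pool_reached_full &&
                   cur <= zswap_accept_thr_pages()) {   /* assumed helper */
            zswap_pool_reached_full = false;
        }
        return zswap_pool_reached_full;
    }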
-
Yosry Ahmed authored
Patch series "zswap same-filled and limit checking cleanups", v3. Miscellaneous cleanups for limit checking and same-filled handling in the store path. This series was broken out of the "zswap: store zero-filled pages more efficiently" series [1]. It contains the cleanups and drops the main functional changes. [1]https://lore.kernel.org/lkml/20240325235018.2028408-1-yosryahmed@google.com/ This patch (of 4): The cleanup code in zswap_store() is not pretty, particularly the 'shrink' label at the bottom that ends up jumping between cleanup labels. Instead of having a dedicated label to shrink the pool, just use zswap_pool_reached_full directly to figure out if the pool needs shrinking. zswap_pool_reached_full should be true if and only if the pool needs shrinking. The only caveat is that the value of zswap_pool_reached_full may be changed by concurrent zswap_store() calls between checking the limit and testing zswap_pool_reached_full in the cleanup code. This is fine because: - If zswap_pool_reached_full was true during limit checking then became false during the cleanup code, then someone else already took care of shrinking the pool and there is no need to queue the worker. That would be a good change. - If zswap_pool_reached_full was false during limit checking then became true during the cleanup code, then someone else hit the limit meanwhile. In this case, both threads will try to queue the worker, but it never gets queued more than once anyway. Also, calling queue_work() multiple times when the limit is hit could already happen today, so this isn't a significant change in any way. Link: https://lkml.kernel.org/r/20240413022407.785696-1-yosryahmed@google.com Link: https://lkml.kernel.org/r/20240413022407.785696-2-yosryahmed@google.comSigned-off-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Suren Baghdasaryan authored
When folio is moved with UFFDIO_MOVE it gets locked before the rmap and index are modified. Due to the folio lock being already held, WRITE_ONCE() is not needed when setting the folio index. Remove it. Link: https://lkml.kernel.org/r/20240415020821.1152951-1-surenb@google.com Reported-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Cc: Lokesh Gidra <lokeshgidra@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
Currently, compaction_capture() does not allow lower-order allocations to directly capture the movable free pages, even though lower-order allocations might also be requesting movable pages; that can lead to more compaction scanning. And, with the enablement of mTHP, such situations will become more common. Thus, allowing lower-order (mTHP) allocations of movable page types to directly capture the movable free pages can avoid unnecessary compaction scanning, while not polluting the movable pageblock. Testing 1M mTHP compaction shows that compaction scanning is significantly reduced:
                                    mm-unstable        patched
Ops Compaction pages isolated     116598741.00   120946702.00
Ops Compaction migrate scanned   1764870054.00  1488621550.00
Ops Compaction free scanned      7707879039.00  4986299318.00
Ops Compact scan efficiency             22.90          29.85
Ops Compaction cost                  73797.69       72933.48
Link: https://lkml.kernel.org/r/8118a5d66a034736a48433beddaca60ed78577c4.1712892329.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Hildenbrand <david@redhat.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
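A sketch of the relaxed capture rule, paraphrasing the check in mm/page_alloc.c:compaction_capture() rather than quoting the actual diff; the field and constant names are the usual ones, but the exact condition should be treated as an approximation:

    /*
     * Sketch: only refuse to capture a free page from a MOVABLE pageblock
     * when the compaction request itself is *not* for movable pages; a
     * movable (e.g. mTHP) request capturing a movable page cannot pollute
     * the pageblock, so letting it capture avoids extra scanning.
     */
    if (order < pageblock_order && migratetype == MIGRATE_MOVABLE &&
        capc->cc->migratetype != MIGRATE_MOVABLE)
        return false;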
-
Kefeng Wang authored
Like copy_pte_range()/zap_pte_range(), batch the mm counter updating in filemap_map_pages(). Since the folios in filemap_map_pages() are all of the same type (MM_SHMEMPAGES or MM_FILEPAGES), checking the type of the first folio is enough. The 'lat_pagefault -P 1 file' test from lmbench shows a 12% improvement, and percpu_counter_add_batch() is gone from the perf flame graph. Link: https://lkml.kernel.org/r/20240412064751.119015-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
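A sketch of the batching idea with the per-folio loop elided; add_mm_counter() is an existing kernel helper, while the variable names and the folio-taking mm_counter_file() call are assumptions:

    /*
     * Sketch: all folios mapped by one filemap_map_pages() call are
     * file-backed and share a counter type, so accumulate locally and
     * issue a single percpu counter update at the end.
     */
    unsigned long rss = 0;
    int type = mm_counter_file(folio);   /* MM_SHMEMPAGES or MM_FILEPAGES */

    /* ... for each folio (or sub-page range) mapped: rss += nr; ... */

    if (rss)
        add_mm_counter(vma->vm_mm, type, rss);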
-
Kefeng Wang authored
Patch series "mm: batch mm counter updating in filemap_map_pages()", v3. Let's batch mm counter updating to accelerate filemap_map_pages(). This patch (of 2): In order to support batch mm counter updating in filemap_map_pages(), move mm counter updating out of set_pte_range(), the folios are file from filemap, and distinguish folios by vmf->flags and vma->vm_flags from another caller finish_fault(). Link: https://lkml.kernel.org/r/20240412064751.119015-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20240412064751.119015-2-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Barry Song authored
The documentation does not align with the code. In __do_huge_pmd_anonymous_page(), THP_FAULT_FALLBACK is incremented when mem_cgroup_charge() fails, despite the allocation succeeding, whereas THP_FAULT_ALLOC is only incremented after a successful charge. Link: https://lkml.kernel.org/r/20240412114858.407208-5-21cnbao@gmail.comSigned-off-by: Barry Song <v-songbaohua@oppo.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Chris Li <chrisl@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Kairui Song <kasong@tencent.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Barry Song authored
This patch includes documentation for mTHP counters and an ABI file for sys-kernel-mm-transparent-hugepage, which appears to have been missing for some time. [v-songbaohua@oppo.com: fix the name and unexpected indentation] Link: https://lkml.kernel.org/r/20240415054538.17071-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20240412114858.407208-4-21cnbao@gmail.comSigned-off-by: Barry Song <v-songbaohua@oppo.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Chris Li <chrisl@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Kairui Song <kasong@tencent.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Barry Song authored
This helps to display the fragmentation situation of the swapfile, knowing the proportion of how much we haven't split large folios. So far, we only support non-split swapout for anon memory, with the possibility of expanding to shmem in the future. So, we add the "anon" prefix to the counter names. Link: https://lkml.kernel.org/r/20240412114858.407208-3-21cnbao@gmail.comSigned-off-by: Barry Song <v-songbaohua@oppo.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Chris Li <chrisl@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Kairui Song <kasong@tencent.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Barry Song authored
Patch series "mm: add per-order mTHP alloc and swpout counters", v6. The patchset introduces a framework to facilitate mTHP counters, starting with the allocation and swap-out counters. Currently, only four new nodes are appended to the stats directory for each mTHP size. /sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats anon_fault_alloc anon_fault_fallback anon_fault_fallback_charge anon_swpout anon_swpout_fallback These nodes are crucial for us to monitor the fragmentation levels of both the buddy system and the swap partitions. In the future, we may consider adding additional nodes for further insights. This patch (of 4): Profiling a system blindly with mTHP has become challenging due to the lack of visibility into its operations. Presenting the success rate of mTHP allocations appears to be pressing need. Recently, I've been experiencing significant difficulty debugging performance improvements and regressions without these figures. It's crucial for us to understand the true effectiveness of mTHP in real-world scenarios, especially in systems with fragmented memory. This patch establishes the framework for per-order mTHP counters. It begins by introducing the anon_fault_alloc and anon_fault_fallback counters. Additionally, to maintain consistency with thp_fault_fallback_charge in /proc/vmstat, this patch also tracks anon_fault_fallback_charge when mem_cgroup_charge fails for mTHP. Incorporating additional counters should now be straightforward as well. Link: https://lkml.kernel.org/r/20240412114858.407208-1-21cnbao@gmail.com Link: https://lkml.kernel.org/r/20240412114858.407208-2-21cnbao@gmail.comSigned-off-by: Barry Song <v-songbaohua@oppo.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Chris Li <chrisl@kernel.org> Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Cc: Kairui Song <kasong@tencent.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Sidhartha Kumar authored
dissolve_free_huge_pages() only uses folios internally; rename it to dissolve_free_hugetlb_folios() and update the comments that reference it. [akpm@linux-foundation.org: remove unneeded `extern'] Link: https://lkml.kernel.org/r/20240412182139.120871-2-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Sidhartha Kumar authored
Allows us to rename dissolve_free_huge_page() to dissolve_free_hugetlb_folio(). Convert one caller to pass in a folio directly and use page_folio() to convert the caller in mm/memory-failure. [sidhartha.kumar@oracle.com: remove unneeded `extern'] Link: https://lkml.kernel.org/r/71760ed4-e80d-493a-95ea-2545414b1aba@oracle.com [sidhartha.kumar@oracle.com: v2] Link: https://lkml.kernel.org/r/20240412182139.120871-1-sidhartha.kumar@oracle.com Link: https://lkml.kernel.org/r/20240411164756.261178-1-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Jane Chu <jane.chu@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Only a single page can be involved at the point where we set the stable node after write-protecting it, so use the folio-converted function in place of the page one, and remove the now unused set_page_stable_node(). Link: https://lkml.kernel.org/r/20240411061713.1847574-11-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
As we are removing get_ksm_page_flags(), make the flags match the new function name. Link: https://lkml.kernel.org/r/20240411061713.1847574-10-alexs@kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Alex Shi <alexs@kernel.org> Reviewed-by: Alex Shi <alexs@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
In the KSM stable tree all pages are single pages, so let's convert them to use folios, along with the stable_tree_insert()/stable_tree_search() functions, and replace get_ksm_page() with ksm_get_folio() since the former is no longer needed. This saves a few compound_head() calls. Link: https://lkml.kernel.org/r/20240411061713.1847574-9-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Compound pages are checked and skipped before write_protect_page() is called, so use a folio inside it to save a few compound_head() calls. Link: https://lkml.kernel.org/r/20240411061713.1847574-8-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Save a compound_head call. Link: https://lkml.kernel.org/r/20240411061713.1847574-7-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Use ksm_get_folio() and save 2 compound_head calls. Link: https://lkml.kernel.org/r/20240411061713.1847574-6-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Pages in the stable tree are all single, normal pages, so use ksm_get_folio() and folio_set_stable_node(); this also saves 3 calls to compound_head(). Link: https://lkml.kernel.org/r/20240411061713.1847574-5-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Turn set_page_stable_node() into a wrapper around a new folio_set_stable_node(), and then use the latter to replace the former. We will merge them together once all places are converted to folios. Link: https://lkml.kernel.org/r/20240411061713.1847574-4-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
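A minimal sketch of the wrapper relationship; the mapping-encoding detail is illustrative of how KSM stores the stable node, not a verbatim copy of mm/ksm.c:

    static inline void folio_set_stable_node(struct folio *folio,
                                             struct ksm_stable_node *stable_node)
    {
        /* KSM keeps the stable node in ->mapping, tagged as a KSM mapping */
        folio->mapping = (void *)((unsigned long)stable_node | PAGE_MAPPING_KSM);
    }

    static inline void set_page_stable_node(struct page *page,
                                            struct ksm_stable_node *stable_node)
    {
        /* legacy wrapper, kept only until every caller passes a folio */
        folio_set_stable_node(page_folio(page), stable_node);
    }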
-
Alex Shi (tencent) authored
To save 2 compound_head calls. Link: https://lkml.kernel.org/r/20240411061713.1847574-3-alexs@kernel.org Signed-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alex Shi (tencent) authored
Patch series "transfer page to folio in KSM". This is the first part of page to folio transfer on KSM. Since only single page could be stored in KSM, we could safely transfer stable tree pages to folios. This patchset could reduce ksm.o 57kbytes from 2541776 bytes on latest akpm/mm-stable branch with CONFIG_DEBUG_VM enabled. It pass the KSM testing in LTP and kernel selftest. Thanks for Matthew Wilcox and David Hildenbrand's suggestions and comments! This patch (of 10): The ksm only contains single pages, so we could add a new func ksm_get_folio for get_ksm_page to use folio instead of pages to save a couple of compound_head calls. After all caller replaced, get_ksm_page will be removed. Link: https://lkml.kernel.org/r/20240411061713.1847574-1-alexs@kernel.org Link: https://lkml.kernel.org/r/20240411061713.1847574-2-alexs@kernel.orgSigned-off-by: Alex Shi (tencent) <alexs@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
On a bad map or access, directly set the si_code to SEGV_MAPERR or SEGV_ACCERR, set fault to 0 and go to the error handling path; this lets us drop the arch's special vm_fault reasons. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20240411130925.73281-3-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Aishwarya TCV <aishwarya.tcv@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Cristian Marussi <cristian.marussi@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
Patch series "mm: remove arch's private VM_FAULT_BADMAP/BADACCESS", v2. Directly set SEGV_MAPRR or SEGV_ACCERR for arm/arm64 to remove the last two arch's private vm_fault reasons. This patch (of 2): If bad map or access, directly set si_code to SEGV_MAPRR or SEGV_ACCERR, also set fault to 0 and goto error handling, which make us to drop the arch's special vm fault reason. Link: https://lkml.kernel.org/r/20240411130925.73281-1-wangkefeng.wang@huawei.com Link: https://lkml.kernel.org/r/20240411130925.73281-2-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Aishwarya TCV <aishwarya.tcv@arm.com> Cc: Cristian Marussi <cristian.marussi@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Let's stop talking about page_mapcount(). Link: https://lkml.kernel.org/r/20240409192301.907377-19-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Let's simplify and only print the page mapcount: we already print the large folio mapcount and the entire folio mapcount for large folios separately; that should be sufficient to figure out what's happening. While at it, print the page mapcount also if it had an underflow, filtering out only typed pages. Link: https://lkml.kernel.org/r/20240409192301.907377-18-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. So let's convert check_tlb_entry() to perform sanity checks on folios instead of pages. This essentially already happened: page_count() is mapped to folio_ref_count(), and page_mapped() to folio_mapped() internally. However, we would have printed the page_mapcount(), which does not really match what page_mapped() would have checked. Let's simply print the folio mapcount to avoid using page_mapcount(). For small folios there is no change. Link: https://lkml.kernel.org/r/20240409192301.907377-17-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. We already trace raw page->refcount, raw page->flags and raw page->mapping, and don't involve any folios. Let's also trace the raw mapcount value that does not consider the entire mapcount of large folios, and we don't add "1" to it. When dealing with typed folios, this makes a lot more sense. ... and it's for debugging purposes only either way. Link: https://lkml.kernel.org/r/20240409192301.907377-16-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. Let's convert migrate_vma_check_page() to work on a folio internally so we can remove the page_mapcount() usage. Note that we reject any large folios. There is a lot more folio conversion to be had, but that has to wait for another day. No functional change intended. Link: https://lkml.kernel.org/r/20240409192301.907377-15-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. Let's use folio_mapcount() instead of page_mapcount() in filemap_unaccount_folio(). No functional change intended, because we're only dealing with small folios. Link: https://lkml.kernel.org/r/20240409192301.907377-14-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. We're already using folio_mapped in copy_user_highpage() and copy_to_user_page() for a similar purpose so ... let's also simply use it for copy_from_user_page(). There is no change for small folios. Likely we won't stumble over many large folios on sh in that code either way. Link: https://lkml.kernel.org/r/20240409192301.907377-13-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. In add_page_for_migration(), we actually want to check if the folio is mapped shared, to reject such folios. So let's use folio_likely_mapped_shared() instead. For small folios, fully mapped THP, and hugetlb folios, there is no change. For partially mapped, shared THP, we should now do a better job at rejecting such folios. Link: https://lkml.kernel.org/r/20240409192301.907377-12-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
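A sketch of the rejection check described above; the label name and surrounding context are assumptions (the real code is in mm/migrate.c:add_page_for_migration()), while folio_likely_mapped_shared() and MPOL_MF_MOVE_ALL are actual kernel symbols:

    struct folio *folio = page_folio(page);

    /*
     * Only migrate folios mapped by a single process unless the caller
     * explicitly asked for MPOL_MF_MOVE_ALL.
     */
    if (folio_likely_mapped_shared(folio) && !(flags & MPOL_MF_MOVE_ALL))
        goto out_putfolio;      /* assumed label: drop reference and bail */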
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. For tracing purposes, we use page_mapcount() in __alloc_contig_migrate_range(). Adding that mapcount to total_mapped sounds strange: total_migrated and total_reclaimed would count each page only once, not multiple times. But then, isolate_migratepages_range() adds each folio only once to the list. So for large folios, we would query the mapcount of the first page of the folio, which doesn't make too much sense for large folios. Let's simply use folio_mapped() * folio_nr_pages(), which makes more sense as nr_migratepages is also incremented by the number of pages in the folio in case of successful migration. Link: https://lkml.kernel.org/r/20240409192301.907377-11-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. We can only unmap full folios; page_mapped(), which we check here, is translated to folio_mapped() -- based on folio_mapcount(). So let's print the folio mapcount instead. Link: https://lkml.kernel.org/r/20240409192301.907377-10-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. Let's similarly check for folio_mapcount() underflows instead of page_mapcount() underflows like we do in zap_present_folio_ptes() now. Instead of the VM_BUG_ON(), we should actually be doing something like print_bad_pte(). For now, let's keep it simple and use WARN_ON_ONCE(), performing that check independently of DEBUG_VM. Link: https://lkml.kernel.org/r/20240409192301.907377-9-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
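A sketch of the relaxed check; folio_remove_rmap_pmd() and folio_mapcount() exist in current kernels, but the exact placement in the PMD zap path is an approximation:

    /* unmap the PMD-mapped folio, then sanity-check the folio mapcount */
    folio_remove_rmap_pmd(folio, page, vma);
    WARN_ON_ONCE(folio_mapcount(folio) < 0);   /* underflow: rmap accounting bug */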
-
David Hildenbrand authored
We want to limit the use of page_mapcount() to the places where it is absolutely necessary. In zap_present_folio_ptes(), let's simply check the folio mapcount(). If there is some issue, it will underflow at some point either way when unmapping. As indicated already in commit 10ebac4f ("mm/memory: optimize unmap/zap with PTE-mapped THP"), we already documented "If we ever have a cheap folio_mapcount(), we might just want to check for underflows there.". There is no change for small folios. For large folios, we'll now catch more underflows when batch-unmapping, because instead of only testing the mapcount of the first subpage, we'll test if the folio mapcount underflows. Link: https://lkml.kernel.org/r/20240409192301.907377-8-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
We already handle it properly for large folios. Let's also return "0" for small typed folios, like page_mapcount() currently would. Consequently, folio_mapcount() will never return negative values for typed folios, but may return negative values for underflows. [david@redhat.com: make folio_mapcount() slightly more efficient] Link: https://lkml.kernel.org/r/c30fcda1-ed87-46f5-8297-cdedbddac009@redhat.com Link: https://lkml.kernel.org/r/20240409192301.907377-7-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Chris Zankel <chris@zankel.net> Cc: Hugh Dickins <hughd@google.com> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Richard Chang <richardycc@google.com> Cc: Rich Felker <dalias@libc.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yang Shi <shy828301@gmail.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
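A sketch mirroring the described behaviour for small folios; treat it as an approximation of the include/linux/mm.h helper (hence the _sketch suffix) rather than the exact implementation:

    static inline int folio_mapcount_sketch(const struct folio *folio)
    {
        int mapcount;

        if (likely(!folio_test_large(folio))) {
            mapcount = atomic_read(&folio->_mapcount) + 1;
            /* typed pages store a page type here; report 0, never negative */
            if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
                mapcount = 0;
            return mapcount;
        }
        /* large folios: fall back to the existing total-mapcount logic */
        return folio_total_mapcount(folio);
    }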
-