- 03 Oct, 2022 14 commits
-
-
Matthew Wilcox (Oracle) authored
Remove the assertion that the page is not Compound as this function now handles large folios correctly. Link: https://lkml.kernel.org/r/20220902194653.1739778-8-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Even though we will split any large folio that comes in, write the code to handle large folios so as to not leave a trap for whoever tries to handle large folios in the swap cache. Link: https://lkml.kernel.org/r/20220902194653.1739778-7-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Convert lru_cache_add_inactive_or_unevictable() to folio_add_lru_vma() and add a compatibility wrapper. Link: https://lkml.kernel.org/r/20220902194653.1739778-6-willy@infradead.orgSigned-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
This wrapper removes a need to use split_huge_page(&folio->page). Convert two callers. Link: https://lkml.kernel.org/r/20220902194653.1739778-5-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Instead of calling compound_order() and compound_nr_pages(), use the folio directly. Saves 1905 bytes from mm/filemap.o due to folio_test_large() now being a cheaper check than PageHead(). Link: https://lkml.kernel.org/r/20220902194653.1739778-4-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Some of the static checkers get confused by extracting the page from the folio and referring to fields in the first tail page. Adding these fields to struct folio lets us avoid doing that. It has the risk that people will refer to those fields without checking that the folio is actually a large folio, so prefix them with underscores and document the preferred function to use instead. Link: https://lkml.kernel.org/r/20220902194653.1739778-3-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "MM folio changes for 6.1", v2. My focus this round has been on shmem. I believe it is now fully converted to folios. Of course, shmem interacts with a lot of the swap cache and other parts of the kernel, so there are patches all over the MM. This patch series survives a round of xfstests on tmpfs, which is nice, but hardly an exhaustive test. Hugh was nice enough to run a round of tests on it and found a bug which is fixed in this edition. This patch (of 57): A lot of comments mention pages when they should say folios. Fix them up. [akpm@linux-foundation.org: fixups for mglru additions] Link: https://lkml.kernel.org/r/20220902194653.1739778-1-willy@infradead.org Link: https://lkml.kernel.org/r/20220902194653.1739778-2-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
Convert to use common struct mm_slot, no functional change. Link: https://lkml.kernel.org/r/20220831031951.43152-8-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
In order to use common struct mm_slot, convert ksm_mm_slot.link to ksm_mm_slot.hash in advance, no functional change. Link: https://lkml.kernel.org/r/20220831031951.43152-7-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
In order to use common struct mm_slot, convert ksm_mm_slot.mm_list to ksm_mm_slot.mm_node in advance, no functional change. Link: https://lkml.kernel.org/r/20220831031951.43152-6-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
In order to prevent the name of the private structure of ksm from being the same as the name of the common structure used in subsequent patches, prefix their names with ksm in advance. Link: https://lkml.kernel.org/r/20220831031951.43152-5-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
Currently, for struct stable_node, no one uses it in both the include/linux/ksm.h file and the file that contains it. For struct mem_cgroup, it's also not used in ksm.h. So they're all redundant, just remove them. Link: https://lkml.kernel.org/r/20220831031951.43152-4-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
Rename private struct mm_slot to struct khugepaged_mm_slot and convert to use common struct mm_slot with no functional change. [zhengqi.arch@bytedance.com: fix build error with CONFIG_SHMEM disabled] Link: https://lkml.kernel.org/r/639fa8d5-8e5b-2333-69dc-40ed46219364@bytedance.com Link: https://lkml.kernel.org/r/20220831031951.43152-3-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Qi Zheng authored
Patch series "add common struct mm_slot and use it in THP and KSM", v2. At present, both THP and KSM module have similar structures mm_slot for organizing and recording the information required for scanning mm, and each defines the following exactly the same operation functions: - alloc_mm_slot - free_mm_slot - get_mm_slot - insert_to_mm_slots_hash In order to de-duplicate these codes, this patchset introduces a common struct mm_slot, and lets THP and KSM to use it. This patch (of 7): At present, both THP and KSM module have similar structures mm_slot for organizing and recording the information required for scanning mm, and each defines the following exactly the same operation functions: - alloc_mm_slot - free_mm_slot - get_mm_slot - insert_to_mm_slots_hash In order to de-duplicate these codes, this patch introduces a common struct mm_slot, and subsequent patches will let THP and KSM to use it. Link: https://lkml.kernel.org/r/20220831031951.43152-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220831031951.43152-2-zhengqi.arch@bytedance.comSigned-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- 27 Sep, 2022 26 commits
-
-
xu xin authored
Add the description of KSM profit and how to determine it separately in system-wide range and inner a single process. Link: https://lkml.kernel.org/r/20220830144003.299870-1-xu.xin16@zte.com.cnSigned-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: Yang Yang <yang.yang29@zte.com.cn> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
xu xin authored
Patch series "ksm: count allocated rmap_items and update documentation", v5. KSM can save memory by merging identical pages, but also can consume additional memory, because it needs to generate rmap_items to save each scanned page's brief rmap information. To determine how beneficial the ksm-policy (like madvise), they are using brings, so we add a new interface /proc/<pid>/ksm_stat for each process The value "ksm_rmap_items" in it indicates the total allocated ksm rmap_items of this process. The detailed description can be seen in the following patches' commit message. This patch (of 2): KSM can save memory by merging identical pages, but also can consume additional memory, because it needs to generate rmap_items to save each scanned page's brief rmap information. Some of these pages may be merged, but some may not be abled to be merged after being checked several times, which are unprofitable memory consumed. The information about whether KSM save memory or consume memory in system-wide range can be determined by the comprehensive calculation of pages_sharing, pages_shared, pages_unshared and pages_volatile. A simple approximate calculation: profit =~ pages_sharing * sizeof(page) - (all_rmap_items) * sizeof(rmap_item); where all_rmap_items equals to the sum of pages_sharing, pages_shared, pages_unshared and pages_volatile. But we cannot calculate this kind of ksm profit inner single-process wide because the information of ksm rmap_item's number of a process is lacked. For user applications, if this kind of information could be obtained, it helps upper users know how beneficial the ksm-policy (like madvise) they are using brings, and then optimize their app code. For example, one application madvise 1000 pages as MERGEABLE, while only a few pages are really merged, then it's not cost-efficient. So we add a new interface /proc/<pid>/ksm_stat for each process in which the value of ksm_rmap_itmes is only shown now and so more values can be added in future. So similarly, we can calculate the ksm profit approximately for a single process by: profit =~ ksm_merging_pages * sizeof(page) - ksm_rmap_items * sizeof(rmap_item); where ksm_merging_pages is shown at /proc/<pid>/ksm_merging_pages, and ksm_rmap_items is shown in /proc/<pid>/ksm_stat. Link: https://lkml.kernel.org/r/20220830143731.299702-1-xu.xin16@zte.com.cn Link: https://lkml.kernel.org/r/20220830143838.299758-1-xu.xin16@zte.com.cnSigned-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: Xiaokai Ran <ran.xiaokai@zte.com.cn> Reviewed-by: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: CGEL ZTE <cgel.zte@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Izik Eidus <izik.eidus@ravellosystems.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Shakeel Butt authored
There are three users (mmzone.h, memcontrol.h, page_counter.h) using similar code for forcing cacheline padding between fields of different structures. Dedup that code. Link: https://lkml.kernel.org/r/20220826230642.566725-1-shakeelb@google.comSigned-off-by: Shakeel Butt <shakeelb@google.com> Suggested-by: Feng Tang <feng.tang@intel.com> Reviewed-by: Feng Tang <feng.tang@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Michal Hocko authored
While discussing early DMA pool pre-allocation failure with Christoph [1] I have realized that the allocation failure warning is rather noisy for constrained allocations like GFP_DMA{32}. Those zones are usually not populated on all nodes very often as their memory ranges are constrained. This is an attempt to reduce the ballast that doesn't provide any relevant information for those allocation failures investigation. Please note that I have only compile tested it (in my default config setup) and I am throwing it mostly to see what people think about it. [1] http://lkml.kernel.org/r/20220817060647.1032426-1-hch@lst.de [mhocko@suse.com: update] Link: https://lkml.kernel.org/r/Yw29bmJTIkKogTiW@dhcp22.suse.cz [mhocko@suse.com: fix build] [akpm@linux-foundation.org: fix it for mapletree] [akpm@linux-foundation.org: update it for Michal's update] [mhocko@suse.com: fix arch/powerpc/xmon/xmon.c] Link: https://lkml.kernel.org/r/Ywh3C4dKB9B93jIy@dhcp22.suse.cz [akpm@linux-foundation.org: fix arch/sparc/kernel/setup_32.c] Link: https://lkml.kernel.org/r/YwScVmVofIZkopkF@dhcp22.suse.czSigned-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Hellwig <hch@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
pte_numa() no longer exists -- replaced by pte_protnone() -- and PROT_NUMA probably never existed: MM_CP_PROT_NUMA also ends up using PROT_NONE. Let's fixup the doc. Link: https://lkml.kernel.org/r/20220825164659.89824-4-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
There seems to be no reason why FOLL_FORCE during GUP-fast would have to fallback to the slow path when stumbling over a PROT_NONE mapped page. We only have to trigger hinting faults in case FOLL_FORCE is not set, and any kind of fault handling naturally happens from the slow path -- where NUMA hinting accounting/handling would be performed. Note that the comment regarding THP migration is outdated: commit 2b4847e7 ("mm: numa: serialise parallel get_user_page against THP migration") described that this was required for THP due to lack of PMD migration entries. Nowadays, we do have proper PMD migration entries in place -- see set_pmd_migration_entry(), which does a proper pmdp_invalidate() when placing the migration entry. So let's just reuse gup_can_follow_protnone() here to make it consistent and drop the somewhat outdated comments. Link: https://lkml.kernel.org/r/20220825164659.89824-3-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Hildenbrand authored
Patch series "mm: minor cleanups around NUMA hinting". Working on some GUP cleanups (e.g., getting rid of some FOLL_ flags) and preparing for other GUP changes (getting rid of FOLL_FORCE|FOLL_WRITE for for taking a R/O longterm pin), this is something I can easily send out independently. Get rid of FOLL_NUMA, allow FOLL_FORCE access to PROT_NONE mapped pages in GUP-fast, and fixup some documentation around NUMA hinting. This patch (of 3): No need for a special flag that is not even properly documented to be internal-only. Let's just factor this check out and get rid of this flag. The separate function has the nice benefit that we can centralize comments. Link: https://lkml.kernel.org/r/20220825164659.89824-2-david@redhat.com Link: https://lkml.kernel.org/r/20220825164659.89824-1-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Haiyue Wang authored
The handling Non-LRU pages returned by follow_page() jumps directly, it doesn't call put_page() to handle the reference count, since 'FOLL_GET' flag for follow_page() has get_page() called. Fix the zone device page check by handling the page reference count correctly before returning. And as David reviewed, "device pages are never PageKsm pages". Drop this zone device page check for break_ksm(). Since the zone device page can't be a transparent huge page, so drop the redundant zone device page check for split_huge_pages_pid(). (by Miaohe) Link: https://lkml.kernel.org/r/20220823135841.934465-3-haiyue.wang@intel.com Fixes: 3218f871 ("mm: handling Non-LRU pages returned by vm_normal_pages") Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jakub Matěna authored
When mremap call results in expansion, it might be possible to merge the VMA with the next VMA which might become adjacent. This patch adds vma_merge call after the expansion is done to try and merge. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/20220603145719.1012094-3-matenajakub@gmail.comSigned-off-by: Jakub Matěna <matenajakub@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jakub Matěna authored
Patch series "Refactor of vma_merge and new merge call", v4. I am currently working on my master's thesis trying to increase number of merges of VMAs currently failing because of page offset incompatibility and difference in their anon_vmas. The following refactor and added merge call included in this series is just two smaller upgrades I created along the way. This patch (of 2): Refactor vma_merge() to make it shorter and more understandable. Main change is the elimination of code duplicity in the case of merge next check. This is done by first doing checks and caching the results before executing the merge itself. The variable 'area' is divided into 'mid' and 'res' as previously it was used for two purposes, as the middle VMA between prev and next and also as the result of the merge itself. Exit paths are also unified. Link: https://lkml.kernel.org/r/20220603145719.1012094-1-matenajakub@gmail.com Link: https://lkml.kernel.org/r/20220603145719.1012094-2-matenajakub@gmail.comSigned-off-by: Jakub Matěna <matenajakub@gmail.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Hugh Dickins <hughd@google.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Rik van Riel <riel@surriel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Suren Baghdasaryan authored
With the last usage of MMF_OOM_VICTIM in exit_mmap gone, this flag is now unused and can be removed. [akpm@linux-foundation.org: remove comment about now-removed mm_is_oom_victim()] Link: https://lkml.kernel.org/r/20220531223100.510392-2-surenb@google.comSigned-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Minchan Kim <minchan@kernel.org> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Suren Baghdasaryan authored
The primary reason to invoke the oom reaper from the exit_mmap path used to be a prevention of an excessive oom killing if the oom victim exit races with the oom reaper (see [1] for more details). The invocation has moved around since then because of the interaction with the munlock logic but the underlying reason has remained the same (see [2]). Munlock code is no longer a problem since [3] and there shouldn't be any blocking operation before the memory is unmapped by exit_mmap so the oom reaper invocation can be dropped. The unmapping part can be done with the non-exclusive mmap_sem and the exclusive one is only required when page tables are freed. Remove the oom_reaper from exit_mmap which will make the code easier to read. This is really unlikely to make any observable difference although some microbenchmarks could benefit from one less branch that needs to be evaluated even though it almost never is true. [1] 21292580 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") [2] 27ae357f ("mm, oom: fix concurrent munlock and oom reaper unmap, v3") [3] a213e5cf ("mm/munlock: delete munlock_vma_pages_all(), allow oomreap") [akpm@linux-foundation.org: restore Suren's mmap_read_lock() optimization] Link: https://lkml.kernel.org/r/20220531223100.510392-1-surenb@google.comSigned-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christian Brauner (Microsoft) <brauner@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A . Shutemov" <kirill@shutemov.name> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam Howlett authored
The check for mm being null has never been needed since the only caller has always passed in current->mm. Remove the check from count_mm_mlocked_page_nr(). Link: https://lkml.kernel.org/r/20220615174050.738523-1-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
__vma_link_file() resolves the mapping from the file, if there is one. Pass through the mapping and check the vm_file externally since most places already have the required information and check of vm_file. Link: https://lkml.kernel.org/r/20220906194824.2110408-71-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Since there is no longer a linked list, the range_has_overlap() function is identical to the find_vma_intersection() function. Link: https://lkml.kernel.org/r/20220906194824.2110408-70-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Replace any vm_next use with vma_find(). Update free_pgtables(), unmap_vmas(), and zap_page_range() to use the maple tree. Use the new free_pgtables() and unmap_vmas() in do_mas_align_munmap(). At the same time, alter the loop to be more compact. Now that free_pgtables() and unmap_vmas() take a maple tree as an argument, rearrange do_mas_align_munmap() to use the new tree to hold the vmas to remove. Remove __vma_link_list() and __vma_unlink_list() as they are exclusively used to update the linked list. Drop linked list update from __insert_vm_struct(). Rework validation of tree as it was depending on the linked list. [yang.lee@linux.alibaba.com: fix one kernel-doc comment] Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=1949 Link: https://lkml.kernel.org/r/20220824021918.94116-1-yang.lee@linux.alibaba.comLink: https://lkml.kernel.org/r/20220906194824.2110408-69-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use the vma iterator in in get_next_vma() instead of the linked list. [yuzhao@google.com: mm/vmscan: use the proper VMA iterator] Link: https://lkml.kernel.org/r/Yx+QGOgHg1Wk8tGK@google.com Link: https://lkml.kernel.org/r/20220906194824.2110408-68-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Yu Zhao <yuzhao@google.com> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Remove the linked list use in favour of the vma iterator. Link: https://lkml.kernel.org/r/20220906194824.2110408-67-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Use the maple tree or VMA iterator instead. This is faster and will allow us to shrink the VMA. Link: https://lkml.kernel.org/r/20220906194824.2110408-66-Liam.Howlett@oracle.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Replace the linked list in probe_range() with the VMA iterator. Link: https://lkml.kernel.org/r/20220906194824.2110408-65-Liam.Howlett@oracle.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
unuse_mm() no longer needs to reference the linked list. Link: https://lkml.kernel.org/r/20220906194824.2110408-64-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
walk_page_range() no longer uses the one vma linked list reference. Link: https://lkml.kernel.org/r/20220906194824.2110408-63-Liam.Howlett@oracle.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Use vma iterator in preparation of removing the linked list. Link: https://lkml.kernel.org/r/20220906194824.2110408-62-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Remove a single use of the vma linked list in preparation for the removal of the linked list. Uses find_vma() to get the next element. Link: https://lkml.kernel.org/r/20220906194824.2110408-61-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Using the vma_find_intersection() call allows for cleaner code and removes linked list users in preparation of the linked list removal. Also remove one user of the linked list at the same time in favour of find_vma(). Link: https://lkml.kernel.org/r/20220906194824.2110408-60-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Switch to navigating the VMA list with the maple tree operators in preparation for removing the linked list. Link: https://lkml.kernel.org/r/20220906194824.2110408-59-Liam.Howlett@oracle.comSigned-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-