- 30 Apr, 2021 40 commits
-
-
Chuck Lever authored
Patch series "SUNRPC consumer for the bulk page allocator" This patch set and the measurements below are based on yesterday's bulk allocator series: git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v5r9 The patches change SUNRPC to invoke the array-based bulk allocator instead of alloc_page(). The micro-benchmark results are promising. I ran a mixture of 256KB reads and writes over NFSv3. The server's kernel is built with KASAN enabled, so the comparison is exaggerated but I believe it is still valid. I instrumented svc_recv() to measure the latency of each call to svc_alloc_arg() and report it via a trace point. The following results are averages across the trace events. Single page: 25.007 us per call over 532,571 calls Bulk list: 6.258 us per call over 517,034 calls Bulk array: 4.590 us per call over 517,442 calls This patch (of 2) Refactor: I'm about to use the loop variable @i for something else. As far as the "i++" is concerned, that is a post-increment. The value of @i is not used subsequently, so the increment operator is unnecessary and can be removed. Also note that nfsd_read_actor() was renamed nfsd_splice_actor() by commit cf8208d0 ("sendfile: convert nfsd to splice_direct_to_actor()"). Link: https://lkml.kernel.org/r/20210325114228.27719-7-mgorman@techsingularity.netSigned-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jesper Dangaard Brouer authored
When __alloc_pages_bulk() got introduced two callers of __rmqueue_pcplist exist and the compiler chooses to not inline this function. ./scripts/bloat-o-meter vmlinux-before vmlinux-inline__rmqueue_pcplist add/remove: 0/1 grow/shrink: 2/0 up/down: 164/-125 (39) Function old new delta rmqueue 2197 2296 +99 __alloc_pages_bulk 1921 1986 +65 __rmqueue_pcplist 125 - -125 Total: Before=19374127, After=19374166, chg +0.00% modprobe page_bench04_bulk loops=$((10**7)) Type:time_bulk_page_alloc_free_array - Per elem: 106 cycles(tsc) 29.595 ns (step:64) - (measurement period time:0.295955434 sec time_interval:295955434) - (invoke count:10000000 tsc_interval:1065447105) Before: - Per elem: 110 cycles(tsc) 30.633 ns (step:64) Link: https://lkml.kernel.org/r/20210325114228.27719-6-mgorman@techsingularity.netSigned-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David Miller <davem@davemloft.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jesper Dangaard Brouer authored
Looking at perf-report and ASM-code for __alloc_pages_bulk() it is clear that the code activated is suboptimal. The compiler guesses wrong and places unlikely code at the beginning. Due to the use of WARN_ON_ONCE() macro the UD2 asm instruction is added to the code, which confuse the I-cache prefetcher in the CPU. [mgorman@techsingularity.net: minor changes and rebasing] Link: https://lkml.kernel.org/r/20210325114228.27719-5-mgorman@techsingularity.netSigned-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Acked-By: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David Miller <davem@davemloft.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
The proposed callers for the bulk allocator store pages from the bulk allocator in an array. This patch adds an array-based interface to the API to avoid multiple list iterations. The page list interface is preserved to avoid requiring all users of the bulk API to allocate and manage enough storage to store the pages. [akpm@linux-foundation.org: remove now unused local `allocated'] Link: https://lkml.kernel.org/r/20210325114228.27719-4-mgorman@techsingularity.netSigned-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David Miller <davem@davemloft.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
This patch adds a new page allocator interface via alloc_pages_bulk, and __alloc_pages_bulk_nodemask. A caller requests a number of pages to be allocated and added to a list. The API is not guaranteed to return the requested number of pages and may fail if the preferred allocation zone has limited free memory, the cpuset changes during the allocation or page debugging decides to fail an allocation. It's up to the caller to request more pages in batch if necessary. Note that this implementation is not very efficient and could be improved but it would require refactoring. The intent is to make it available early to determine what semantics are required by different callers. Once the full semantics are nailed down, it can be refactored. [mgorman@techsingularity.net: fix alloc_pages_bulk() return type, per Matthew] Link: https://lkml.kernel.org/r/20210325123713.GQ3697@techsingularity.net [mgorman@techsingularity.net: fix uninit var warning] Link: https://lkml.kernel.org/r/20210330114847.GX3697@techsingularity.net [mgorman@techsingularity.net: fix comment, per Vlastimil] Link: https://lkml.kernel.org/r/20210412110255.GV3697@techsingularity.net Link: https://lkml.kernel.org/r/20210325114228.27719-3-mgorman@techsingularity.netSigned-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Tested-by: Colin Ian King <colin.king@canonical.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: David Miller <davem@davemloft.net> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
Patch series "Introduce a bulk order-0 page allocator with two in-tree users", v6. This series introduces a bulk order-0 page allocator with sunrpc and the network page pool being the first users. The implementation is not efficient as semantics needed to be ironed out first. If no other semantic changes are needed, it can be made more efficient. Despite that, this is a performance-related for users that require multiple pages for an operation without multiple round-trips to the page allocator. Quoting the last patch for the high-speed networking use-case Kernel XDP stats CPU pps Delta Baseline XDP-RX CPU total 3,771,046 n/a List XDP-RX CPU total 3,940,242 +4.49% Array XDP-RX CPU total 4,249,224 +12.68% Via the SUNRPC traces of svc_alloc_arg() Single page: 25.007 us per call over 532,571 calls Bulk list: 6.258 us per call over 517,034 calls Bulk array: 4.590 us per call over 517,442 calls Both potential users in this series are corner cases (NFS and high-speed networks) so it is unlikely that most users will see any benefit in the short term. Other potential other users are batch allocations for page cache readahead, fault around and SLUB allocations when high-order pages are unavailable. It's unknown how much benefit would be seen by converting multiple page allocation calls to a single batch or what difference it may make to headline performance. Light testing of my own running dbench over NFS passed. Chuck and Jesper conducted their own tests and details are included in the changelogs. Patch 1 renames a variable name that is particularly unpopular Patch 2 adds a bulk page allocator Patch 3 adds an array-based version of the bulk allocator Patches 4-5 adds micro-optimisations to the implementation Patches 6-7 SUNRPC user Patches 8-9 Network page_pool user This patch (of 9): Review feedback of the bulk allocator twice found problems with "alloced" being a counter for pages allocated. The naming was based on the API name "alloc" and was based on the idea that verbal communication about malloc tends to use the fake word "malloced" instead of the fake word mallocated. To be consistent, this preparation patch renames alloced to allocated in rmqueue_bulk so the bulk allocator and per-cpu allocator use similar names when the bulk allocator is introduced. Link: https://lkml.kernel.org/r/20210325114228.27719-1-mgorman@techsingularity.net Link: https://lkml.kernel.org/r/20210325114228.27719-2-mgorman@techsingularity.netSigned-off-by: Mel Gorman <mgorman@techsingularity.net> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Alexander Lobakin <alobakin@pm.me> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
zhouchuangao authored
linux/vmalloc.h is repeatedly in the file page_alloc.c Link: https://lkml.kernel.org/r/1616468751-80656-1-git-send-email-zhouchuangao@vivo.comSigned-off-by: zhouchuangao <zhouchuangao@vivo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kefeng Wang authored
The start_pfn and end_pfn are already available in move_freepages_block(), there is no need to go back and forth between page and pfn in move_freepages and move_freepages_block, and pfn_valid_within() should validate pfn first before touching the page. Link: https://lkml.kernel.org/r/20210323131215.934472-1-liushixin2@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Geert Uytterhoeven authored
Commit 214496cb ("ia64: make SPARSEMEM default and disable DISCONTIGMEM") removed the last enabler of ARCH_DISCONTIGMEM_DEFAULT, hence the memory model can no longer default to DISCONTIGMEM_MANUAL. Link: https://lkml.kernel.org/r/20210312141208.3465520-1-geert@linux-m68k.orgSigned-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Minchan Kim authored
Currently, debugging CMA allocation failures is quite limited. The most common source of these failures seems to be page migration which doesn't provide any useful information on the reason of the failure by itself. alloc_contig_range can report those failures as it holds a list of migrate-failed pages. The information logged by dump_page() has already proven helpful for debugging allocation issues, like identifying long-term pinnings on ZONE_MOVABLE or MIGRATE_CMA. Let's use the dynamic debugging infrastructure, such that we avoid flooding the logs and creating a lot of noise on frequent alloc_contig_range() calls. This information is helpful for debugging only. There are two ifdefery conditions to support common dyndbg options: - CONFIG_DYNAMIC_DEBUG_CORE && DYNAMIC_DEBUG_MODULE It aims for supporting the feature with only specific file with adding ccflags. - CONFIG_DYNAMIC_DEBUG It aims for supporting the feature with system wide globally. A simple example to enable the feature: Admin could enable the dump like this(by default, disabled) echo "func alloc_contig_dump_pages +p" > control Admin could disable it. echo "func alloc_contig_dump_pages =_" > control Detail goes Documentation/admin-guide/dynamic-debug-howto.rst A concern is utility functions in dump_page use inconsistent loglevels. In the future, we might want to make the loglevels used inside dump_page() consistent and eventually rework the way we log the information here. See [1]. [1] https://lore.kernel.org/linux-mm/YEh4doXvyuRl5BDB@google.com/ Link: https://lkml.kernel.org/r/20210311194042.825152-1-minchan@kernel.orgSigned-off-by: Minchan Kim <minchan@kernel.org> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: John Dias <joaodias@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Sphinx interprets the Return section as a list and complains about it. Turn it into a sentence and move it to the end of the kernel-doc to fit the kernel-doc style. Link: https://lkml.kernel.org/r/20210225150642.2582252-8-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
The current formatting doesn't quite work with kernel-doc. Link: https://lkml.kernel.org/r/20210225150642.2582252-7-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Document alloc_pages() for both NUMA and non-NUMA cases as kernel-doc doesn't care. Link: https://lkml.kernel.org/r/20210225150642.2582252-6-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
When CONFIG_NUMA is enabled, alloc_pages() is a wrapper around alloc_pages_current(). This is pointless, just implement alloc_pages() directly. Link: https://lkml.kernel.org/r/20210225150642.2582252-5-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
There are only two callers of __alloc_pages() so prune the thicket of alloc_page variants by combining the two functions together. Current callers of __alloc_pages() simply add an extra 'NULL' parameter and current callers of __alloc_pages_nodemask() call __alloc_pages() instead. Link: https://lkml.kernel.org/r/20210225150642.2582252-4-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Shorten some overly-long lines by renaming this identifier. Link: https://lkml.kernel.org/r/20210225150642.2582252-3-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "Rationalise __alloc_pages wrappers", v3. I was poking around the __alloc_pages variants trying to understand why they each exist, and couldn't really find a good justification for keeping __alloc_pages and __alloc_pages_nodemask as separate functions. That led to getting rid of alloc_pages_current() and then I noticed the documentation was bad, and then I noticed the mempolicy documentation wasn't included. Anyway, this is all cleanups & doc fixes. This patch (of 7): We have two masks involved -- the nodemask and the gfp mask, so alloc_mask is an unclear name. Link: https://lkml.kernel.org/r/20210225150642.2582252-2-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yu Zhao authored
Tidy things up and delete comments stating the obvious with typos or making no sense. Link: https://lkml.kernel.org/r/20210303071609.797782-2-yuzhao@google.comSigned-off-by: Yu Zhao <yuzhao@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yu Zhao authored
The naming convention used in include/linux/page-flags-layout.h: *_SHIFT: the number of bits trying to allocate *_WIDTH: the number of bits successfully allocated So when it comes to LAST_CPUPID_WIDTH, we need to check whether all previous *_WIDTH and LAST_CPUPID_SHIFT can fit into page flags. This means we need to use NODES_WIDTH, not NODES_SHIFT. Link: https://lkml.kernel.org/r/20210303071609.797782-1-yuzhao@google.comSigned-off-by: Yu Zhao <yuzhao@google.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Minchan Kim authored
__alloc_contig_migrate_range already has lru_add_drain_all call via migrate_prep. It's necessary to move LRU taget pages into LRU list to be able to isolated. However, lru_add_drain_all call after __alloc_contig_migrate_range is pointless since it has changed source page freeing from putback_lru_pages to put_page[1]. This patch removes it. [1] c6c919eb, ("mm: use put_page() to free page instead of putback_lru_page()" Link: https://lkml.kernel.org/r/20210303204512.2863087-1-minchan@kernel.orgSigned-off-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Hildenbrand authored
The information that some PFNs are busy is: a) not helpful for ordinary users: we don't even know *who* called alloc_contig_range(). This is certainly not worth a pr_info.*(). b) not really helpful for debugging: we don't have any details *why* these PFNs are busy, and that is what we usually care about. c) not complete: there are other cases where we fail alloc_contig_range() using different paths that are not getting recorded. For example, we reach this path once we succeeded in isolating pageblocks, but failed to migrate some pages - which can happen easily on ZONE_NORMAL (i.e., has_unmovable_pages() is racy) but also on ZONE_MOVABLE i.e., we would have to retry longer to migrate). For example via virtio-mem when unplugging memory, we can create quite some noise (especially with ZONE_NORMAL) that is not of interest to users - it's expected that some allocations may fail as memory is busy. Let's just drop that pr_info_ratelimit() and rather implement a dynamic debugging mechanism in the future that can give us a better reason why alloc_contig_range() failed on specific pages. Link: https://lkml.kernel.org/r/20210301150945.77012-1-david@redhat.comSigned-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kefeng Wang authored
mem_init_print_info() is called in mem_init() on each architecture, and pass NULL argument, so using void argument and move it into mm_init(). Link: https://lkml.kernel.org/r/20210317015210.33641-1-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> [x86] Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> [powerpc] Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Anatoly Pugachev <matorola@gmail.com> [sparc64] Acked-by: Russell King <rmk+kernel@armlinux.org.uk> [arm] Acked-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: "Peter Zijlstra" <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zqiang authored
Add the irq_work_queue() call stack into the KASAN auxiliary stack in order to improve KASAN reports. this will let us know where the irq work be queued. Link: https://lkml.kernel.org/r/20210331063202.28770-1-qiang.zhang@windriver.comSigned-off-by: Zqiang <qiang.zhang@windriver.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Walter Wu <walter-zh.wu@mediatek.com> Cc: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Currently, KASAN-KUnit tests can check that a particular annotated part of code causes a KASAN report. However, they do not check that no unwanted reports happen between the annotated parts. This patch implements these checks. It is done by setting report_data.report_found to false in kasan_test_init() and at the end of KUNIT_EXPECT_KASAN_FAIL() and then checking that it remains false at the beginning of KUNIT_EXPECT_KASAN_FAIL() and in kasan_test_exit(). kunit_add_named_resource() call is moved to kasan_test_init(), and the value of fail_data.report_expected is kept as false in between KUNIT_EXPECT_KASAN_FAIL() annotations for consistency. Link: https://lkml.kernel.org/r/48079c52cc329fbc52f4386996598d58022fb872.1617207873.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Walter Wu authored
Why record task_work_add() call stack? Syzbot reports many use-after-free issues for task_work, see [1]. After seeing the free stack and the current auxiliary stack, we think they are useless, we don't know where the work was registered. This work may be the free call stack, so we miss the root cause and don't solve the use-after-free. Add the task_work_add() call stack into the KASAN auxiliary stack in order to improve KASAN reports. It helps programmers solve use-after-free issues. [1]: https://groups.google.com/g/syzkaller-bugs/search?q=kasan%20use-after-free%20task_work_run Link: https://lkml.kernel.org/r/20210316024410.19967-1-walter-zh.wu@mediatek.comSigned-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Tests" section in KASAN documentation: - Add an introductory sentence. - Add proper indentation for the list of ways to run KUnit tests. - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/fb08845e25c8847ffda271fa19cda2621c04a65b.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Ignoring accesses" section in KASAN documentation: - Mention __no_sanitize_address/noinstr. - Mention kasan_disable/enable_current(). - Mention kasan_reset_tag()/page_kasan_tag_reset(). - Readability and punctuation clean-ups. Link: https://lkml.kernel.org/r/4531ba5f3eca61f6aade863c136778cc8c807a64.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Shadow memory" section in KASAN documentation: - Rearrange the introduction paragraph do it doesn't give a "KASAN has an issue" impression. - Update the list of architectures with vmalloc support. - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/00f8c38b0fd5290a3f4dced04eaba41383e67e14.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Implementation details" section for HW_TAGS KASAN: - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/ee2caf4c138cc1fd239822c2abefd5af6c057744.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Implementation details" section for SW_TAGS KASAN: - Clarify the introduction sentence. - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/69b9b2e49d8cf789358fa24558be3fc0ce4ee32c.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Implementation details" section for generic KASAN: - Don't mention kmemcheck, it's not present in the kernel anymore. - Don't mention GCC as the only supported compiler. - Update kasan_mem_to_shadow() definition to match actual code. - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/f2f35fdab701f8c709f63d328f98aec2982c8acc.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Boot parameters" section in KASAN documentation: - Mention panic_on_warn. - Mention kasan_multi_shot and its interaction with panic_on_warn. - Clarify kasan.fault=panic interaction with panic_on_warn. - A readability clean-up. Link: https://lkml.kernel.org/r/01364952f15789948f0627d6733b5cdf5209f83a.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Error reports" section in KASAN documentation: - Mention that bug titles are best-effort. - Move and reword the part about auxiliary stacks from "Implementation details". - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/3531e8fe6972cf39d1954e3643237b19eb21227e.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Usage" section in KASAN documentation: - Add inline code snippet markers. - Reword the part about stack traces for clarity. - Other minor clean-ups. Link: https://lkml.kernel.org/r/48427809cd4b8b5d6bc00926cbe87e2b5081df17.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update the "Overview" section in KASAN documentation: - Outline main use cases for each mode. - Mention that HW_TAGS mode need compiler support too. - Move the part about SLUB/SLAB support from "Usage" to "Overview". - Punctuation, readability, and other minor clean-ups. Link: https://lkml.kernel.org/r/1486fba8514de3d7db2f47df2192db59228b0a7b.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
Update KASAN documentation: - Give some sections clearer names. - Remove unneeded subsections in the "Tests" section. - Move the "For developers" section and split into subsections. Link: https://lkml.kernel.org/r/c2bbb56eaea80ad484f0ee85bb71959a3a63f1d7.1615559068.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
This change uses the previously added memory initialization feature of HW_TAGS KASAN routines for slab memory when init_on_free is enabled. With this change, memory initialization memset() is no longer called when both HW_TAGS KASAN and init_on_free are enabled. Instead, memory is initialized in KASAN runtime. For SLUB, the memory initialization memset() is moved into slab_free_hook() that currently directly follows the initialization loop. A new argument is added to slab_free_hook() that indicates whether to initialize the memory or not. To avoid discrepancies with which memory gets initialized that can be caused by future changes, both KASAN hook and initialization memset() are put together and a warning comment is added. Combining setting allocation tags with memory initialization improves HW_TAGS KASAN performance when init_on_free is enabled. Link: https://lkml.kernel.org/r/190fd15c1886654afdec0d19ebebd5ade665b601.1615296150.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
This change uses the previously added memory initialization feature of HW_TAGS KASAN routines for slab memory when init_on_alloc is enabled. With this change, memory initialization memset() is no longer called when both HW_TAGS KASAN and init_on_alloc are enabled. Instead, memory is initialized in KASAN runtime. The memory initialization memset() is moved into slab_post_alloc_hook() that currently directly follows the initialization loop. A new argument is added to slab_post_alloc_hook() that indicates whether to initialize the memory or not. To avoid discrepancies with which memory gets initialized that can be caused by future changes, both KASAN hook and initialization memset() are put together and a warning comment is added. Combining setting allocation tags with memory initialization improves HW_TAGS KASAN performance when init_on_alloc is enabled. Link: https://lkml.kernel.org/r/c1292aeb5d519da221ec74a0684a949b027d7720.1615296150.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
This change uses the previously added memory initialization feature of HW_TAGS KASAN routines for page_alloc memory when init_on_alloc/free is enabled. With this change, kernel_init_free_pages() is no longer called when both HW_TAGS KASAN and init_on_alloc/free are enabled. Instead, memory is initialized in KASAN runtime. To avoid discrepancies with which memory gets initialized that can be caused by future changes, both KASAN and kernel_init_free_pages() hooks are put together and a warning comment is added. This patch changes the order in which memory initialization and page poisoning hooks are called. This doesn't lead to any side-effects, as whenever page poisoning is enabled, memory initialization gets disabled. Combining setting allocation tags with memory initialization improves HW_TAGS KASAN performance when init_on_alloc/free is enabled. [andreyknvl@google.com: fix for "integrate page_alloc init with HW_TAGS"] Link: https://lkml.kernel.org/r/65b6028dea2e9a6e8e2cb779b5115c09457363fc.1617122211.git.andreyknvl@google.com Link: https://lkml.kernel.org/r/e77f0d5b1b20658ef0b8288625c74c2b3690e725.1615296150.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Tested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Sergei Trofimovich <slyfox@gentoo.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Konovalov authored
This change adds an argument to kasan_poison() and kasan_unpoison() that allows initializing memory along with setting the tags for HW_TAGS. Combining setting allocation tags with memory initialization will improve HW_TAGS KASAN performance when init_on_alloc/free is enabled. This change doesn't integrate memory initialization with KASAN, this is done is subsequent patches in this series. Link: https://lkml.kernel.org/r/3054314039fa64510947e674180d675cab1b4c41.1615296150.git.andreyknvl@google.comSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Branislav Rankov <Branislav.Rankov@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Evgenii Stepanov <eugenis@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-