- 08 Aug, 2024 21 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: - Have reading of event format files test if the metadata still exists. When a event is freed, a flag (EVENT_FILE_FL_FREED) in the metadata is set to state that it is to prevent any new references to it from happening while waiting for existing references to close. When the last reference closes, the metadata is freed. But the "format" was missing a check to this flag (along with some other files) that allowed new references to happen, and a use-after-free bug to occur. - Have the trace event meta data use the refcount infrastructure instead of relying on its own atomic counters. - Have tracefs inodes use alloc_inode_sb() for allocation instead of using kmem_cache_alloc() directly. - Have eventfs_create_dir() return an ERR_PTR instead of NULL as the callers expect a real object or an ERR_PTR. - Have release_ei() use call_srcu() and not call_rcu() as all the protection is on SRCU and not RCU. - Fix ftrace_graph_ret_addr() to use the task passed in and not current. - Fix overflow bug in get_free_elt() where the counter can overflow the integer and cause an infinite loop. - Remove unused function ring_buffer_nr_pages() - Have tracefs freeing use the inode RCU infrastructure instead of creating its own. When the kernel had randomize structure fields enabled, the rcu field of the tracefs_inode was overlapping the rcu field of the inode structure, and corrupting it. Instead, use the destroy_inode() callback to do the initial cleanup of the code, and then have free_inode() free it. * tag 'trace-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracefs: Use generic inode RCU for synchronizing freeing ring-buffer: Remove unused function ring_buffer_nr_pages() tracing: Fix overflow in get_free_elt() function_graph: Fix the ret_stack used by ftrace_graph_ret_addr() eventfs: Use SRCU for freeing eventfs_inodes eventfs: Don't return NULL in eventfs_create_dir() tracefs: Fix inode allocation tracing: Use refcount for trace_event_file reference counter tracing: Have format file honor EVENT_FILE_FL_FREED
-
git://evilpiepirate.org/bcachefsLinus Torvalds authored
Pull bcachefs fixes from Kent Overstreet: "Assorted little stuff: - lockdep fixup for lockdep_set_notrack_class() - we can now remove a device when using erasure coding without deadlocking, though we still hit other issues - the 'allocator stuck' timeout is now configurable, and messages are ratelimited. The default timeout has been increased from 10 seconds to 30" * tag 'bcachefs-2024-08-08' of git://evilpiepirate.org/bcachefs: bcachefs: Use bch2_wait_on_allocator() in btree node alloc path bcachefs: Make allocator stuck timeout configurable, ratelimit messages bcachefs: Add missing path_traverse() to btree_iter_next_node() bcachefs: ec should not allocate from ro devs bcachefs: Improved allocator debugging for ec bcachefs: Add missing bch2_trans_begin() call bcachefs: Add a comment for bucket helper types bcachefs: Don't rely on implicit unsigned -> signed integer conversion lockdep: Fix lockdep_set_notrack_class() for CONFIG_LOCK_STAT bcachefs: Fix double free of ca->buckets_nouse
-
Linus Torvalds authored
Russell King reported that the arm cbc(aes) crypto module hangs when loaded, and Herbert Xu bisected it to commit 9b9879fc ("modules: catch concurrent module loads, treat them as idempotent"), and noted: "So what's happening here is that the first modprobe tries to load a fallback CBC implementation, in doing so it triggers a load of the exact same module due to module aliases. IOW we're loading aes-arm-bs which provides cbc(aes). However, this needs a fallback of cbc(aes) to operate, which is made out of the generic cbc module + any implementation of aes, or ecb(aes). The latter happens to also be provided by aes-arm-cb so that's why it tries to load the same module again" So loading the aes-arm-bs module ends up wanting to recursively load itself, and the recursive load then ends up waiting for the original module load to complete. This is a regression, in that it used to be that we just tried to load the module multiple times, and then as we went on to install it the second time we would instead just error out because the module name already existed. That is actually also exactly what the original "catch concurrent loads" patch did in commit 9828ed3f ("module: error out early on concurrent load of the same module file"), but it turns out that it ends up being racy, in that erroring out before the module has been fully initialized will cause failures in dependent module loading. See commit ac2263b5 (which was the revert of that "error out early") commit for details about why erroring out before the module has been initialized is actually fundamentally racy. Now, for the actual recursive module load (as opposed to just concurrently loading the same module twice), the race is not an issue. At the same time it's hard for the kernel to see that this is recursion, because the module load is always done from a usermode helper, so the recursion is not some simple callchain within the kernel. End result: this is not the real fix, but this at least adds a warning for the situation (admittedly much too late for all the debugging pain that Russell and Herbert went through) and if we can come to a resolution on how to detect the recursion properly, this re-organizes the code to make that easier. Link: https://lore.kernel.org/all/ZrFHLqvFqhzykuYw@shell.armlinux.org.uk/Reported-by: Russell King <linux@armlinux.org.uk> Debugged-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Merge tag 'loongarch-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Enable general EFI poweroff method to make poweroff usable on hardwares which lack ACPI S5, use accessors to page table entries instead of direct dereference to avoid potential problems, and two trivial kvm cleanups" * tag 'loongarch-fixes-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Remove undefined a6 argument comment for kvm_hypercall() LoongArch: KVM: Remove unnecessary definition of KVM_PRIVATE_MEM_SLOTS LoongArch: Use accessors to page table entries instead of direct dereference LoongArch: Enable general EFI poweroff method
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2024-08-07-18-32' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Nine hotfixes. Five are cc:stable, the others either pertain to post-6.10 material or aren't considered necessary for earlier kernels. Five are MM and four are non-MM. No identifiable theme here - please see the individual changelogs" * tag 'mm-hotfixes-stable-2024-08-07-18-32' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: padata: Fix possible divide-by-0 panic in padata_mt_helper() mailmap: update entry for David Heidelberg memcg: protect concurrent access to mem_cgroup_idr mm: shmem: fix incorrect aligned index when checking conflicts mm: shmem: avoid allocating huge pages larger than MAX_PAGECACHE_ORDER for shmem mm: list_lru: fix UAF for memory cgroup kcov: properly check for softirq context MAINTAINERS: Update LTP members and web selftests: mm: add s390 to ARCH check
-
Waiman Long authored
We are hit with a not easily reproducible divide-by-0 panic in padata.c at bootup time. [ 10.017908] Oops: divide error: 0000 1 PREEMPT SMP NOPTI [ 10.017908] CPU: 26 PID: 2627 Comm: kworker/u1666:1 Not tainted 6.10.0-15.el10.x86_64 #1 [ 10.017908] Hardware name: Lenovo ThinkSystem SR950 [7X12CTO1WW]/[7X12CTO1WW], BIOS [PSE140J-2.30] 07/20/2021 [ 10.017908] Workqueue: events_unbound padata_mt_helper [ 10.017908] RIP: 0010:padata_mt_helper+0x39/0xb0 : [ 10.017963] Call Trace: [ 10.017968] <TASK> [ 10.018004] ? padata_mt_helper+0x39/0xb0 [ 10.018084] process_one_work+0x174/0x330 [ 10.018093] worker_thread+0x266/0x3a0 [ 10.018111] kthread+0xcf/0x100 [ 10.018124] ret_from_fork+0x31/0x50 [ 10.018138] ret_from_fork_asm+0x1a/0x30 [ 10.018147] </TASK> Looking at the padata_mt_helper() function, the only way a divide-by-0 panic can happen is when ps->chunk_size is 0. The way that chunk_size is initialized in padata_do_multithreaded(), chunk_size can be 0 when the min_chunk in the passed-in padata_mt_job structure is 0. Fix this divide-by-0 panic by making sure that chunk_size will be at least 1 no matter what the input parameters are. Link: https://lkml.kernel.org/r/20240806174647.1050398-1-longman@redhat.com Fixes: 004ed426 ("padata: add basic support for multithreaded jobs") Signed-off-by: Waiman Long <longman@redhat.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Waiman Long <longman@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Heidelberg authored
Link my old gmail address to my active email. Link: https://lkml.kernel.org/r/20240804054704.859503-1-david@ixit.czSigned-off-by: David Heidelberg <david@ixit.cz> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Kosina <jikos@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Shakeel Butt authored
Commit 73f576c0 ("mm: memcontrol: fix cgroup creation failure after many small jobs") decoupled the memcg IDs from the CSS ID space to fix the cgroup creation failures. It introduced IDR to maintain the memcg ID space. The IDR depends on external synchronization mechanisms for modifications. For the mem_cgroup_idr, the idr_alloc() and idr_replace() happen within css callback and thus are protected through cgroup_mutex from concurrent modifications. However idr_remove() for mem_cgroup_idr was not protected against concurrency and can be run concurrently for different memcgs when they hit their refcnt to zero. Fix that. We have been seeing list_lru based kernel crashes at a low frequency in our fleet for a long time. These crashes were in different part of list_lru code including list_lru_add(), list_lru_del() and reparenting code. Upon further inspection, it looked like for a given object (dentry and inode), the super_block's list_lru didn't have list_lru_one for the memcg of that object. The initial suspicions were either the object is not allocated through kmem_cache_alloc_lru() or somehow memcg_list_lru_alloc() failed to allocate list_lru_one() for a memcg but returned success. No evidence were found for these cases. Looking more deeply, we started seeing situations where valid memcg's id is not present in mem_cgroup_idr and in some cases multiple valid memcgs have same id and mem_cgroup_idr is pointing to one of them. So, the most reasonable explanation is that these situations can happen due to race between multiple idr_remove() calls or race between idr_alloc()/idr_replace() and idr_remove(). These races are causing multiple memcgs to acquire the same ID and then offlining of one of them would cleanup list_lrus on the system for all of them. Later access from other memcgs to the list_lru cause crashes due to missing list_lru_one. Link: https://lkml.kernel.org/r/20240802235822.1830976-1-shakeel.butt@linux.dev Fixes: 73f576c0 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Muchun Song <muchun.song@linux.dev> Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
In the shmem_suitable_orders() function, xa_find() is used to check for conflicts in the pagecache to select suitable huge orders. However, when checking each huge order in every loop, the aligned index is calculated from the previous iteration, which may cause suitable huge orders to be missed. We should use the original index each time in the loop to calculate a new aligned index for checking conflicts to avoid this issue. Link: https://lkml.kernel.org/r/07433b0f16a152bffb8cee34934a5c040e8e2ad6.1722404078.git.baolin.wang@linux.alibaba.com Fixes: e7a2ab7b ("mm: shmem: add mTHP support for anonymous shmem") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Barry Song <21cnbao@gmail.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Barry Song <baohua@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Baolin Wang authored
Similar to commit d659b715 ("mm/huge_memory: avoid PMD-size page cache if needed"), ARM64 can support 512MB PMD-sized THP when the base page size is 64KB, which is larger than the maximum supported page cache size MAX_PAGECACHE_ORDER. This is not expected. To fix this issue, use THP_ORDERS_ALL_FILE_DEFAULT for shmem to filter allowable huge orders. [baolin.wang@linux.alibaba.com: remove comment, per Barry] Link: https://lkml.kernel.org/r/c55d7ef7-78aa-4ed6-b897-c3e03a3f3ab7@linux.alibaba.com [wangkefeng.wang@huawei.com: remove local `orders'] Link: https://lkml.kernel.org/r/87769ae8-b6c6-4454-925d-1864364af9c8@huawei.com Link: https://lkml.kernel.org/r/117121665254442c3c7f585248296495e5e2b45c.1722404078.git.baolin.wang@linux.alibaba.com Fixes: e7a2ab7b ("mm: shmem: add mTHP support for anonymous shmem") Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Barry Song <baohua@kernel.org> Cc: Barry Song <21cnbao@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Muchun Song authored
The mem_cgroup_from_slab_obj() is supposed to be called under rcu lock or cgroup_mutex or others which could prevent returned memcg from being freed. Fix it by adding missing rcu read lock. Found by code inspection. [songmuchun@bytedance.com: only grab rcu lock when necessary, per Vlastimil] Link: https://lkml.kernel.org/r/20240801024603.1865-1-songmuchun@bytedance.com Link: https://lkml.kernel.org/r/20240718083607.42068-1-songmuchun@bytedance.com Fixes: 0a97c01c ("list_lru: allow explicit memcg and NUMA node selection") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Andrey Konovalov authored
When collecting coverage from softirqs, KCOV uses in_serving_softirq() to check whether the code is running in the softirq context. Unfortunately, in_serving_softirq() is > 0 even when the code is running in the hardirq or NMI context for hardirqs and NMIs that happened during a softirq. As a result, if a softirq handler contains a remote coverage collection section and a hardirq with another remote coverage collection section happens during handling the softirq, KCOV incorrectly detects a nested softirq coverate collection section and prints a WARNING, as reported by syzbot. This issue was exposed by commit a7f3813e ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler"), which switched dummy_hcd to using hrtimer and made the timer's callback be executed in the hardirq context. Change the related checks in KCOV to account for this behavior of in_serving_softirq() and make KCOV ignore remote coverage collection sections in the hardirq and NMI contexts. This prevents the WARNING printed by syzbot but does not fix the inability of KCOV to collect coverage from the __usb_hcd_giveback_urb when dummy_hcd is in use (caused by a7f3813e); a separate patch is required for that. Link: https://lkml.kernel.org/r/20240729022158.92059-1-andrey.konovalov@linux.dev Fixes: 5ff3b30a ("kcov: collect coverage from interrupts") Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com> Reported-by: syzbot+2388cdaeb6b10f0c13ac@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13acAcked-by: Marco Elver <elver@google.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Aleksandr Nogikh <nogikh@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Marcello Sylvester Bauer <sylv@sylv.io> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Petr Vorel authored
LTP project uses now readthedocs.org instance instead of GitHub wiki. LTP maintainers are listed in alphabetical order. Link: https://lkml.kernel.org/r/20240726072009.1021599-1-pvorel@suse.czSigned-off-by: Petr Vorel <pvorel@suse.cz> Reviewed-by: Li Wang <liwang@redhat.com> Reviewed-by: Cyril Hrubis <chrubis@suse.cz> Cc: Jan Stancek <jstancek@redhat.com> Cc: Xiao Yang <yangx.jy@fujitsu.com> Cc: Yang Xu <xuyang2018.jy@fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Nico Pache authored
commit 0518dbe9 ("selftests/mm: fix cross compilation with LLVM") changed the env variable for the architecture from MACHINE to ARCH. This is preventing 3 required TEST_GEN_FILES from being included when cross compiling s390x and errors when trying to run the test suite. This is due to the ARCH variable already being set and the arch folder name being s390. Add "s390" to the filtered list to cover this case and have the 3 files included in the build. Link: https://lkml.kernel.org/r/20240724213517.23918-1-npache@redhat.com Fixes: 0518dbe9 ("selftests/mm: fix cross compilation with LLVM") Signed-off-by: Nico Pache <npache@redhat.com> Cc: Mark Brown <broonie@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kent Overstreet authored
If the allocator gets stuck, we need to know why. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Limit these messages to once every 2 minutes to avoid spamming logs; with multiple devices the output can be quite significant. Also, up the default timeout to 30 seconds from 10 seconds. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
This fixes a bug exposed by the next path - we pop an assert in path_set_should_be_locked(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Steven Rostedt authored
With structure layout randomization enabled for 'struct inode' we need to avoid overlapping any of the RCU-used / initialized-only-once members, e.g. i_lru or i_sb_list to not corrupt related list traversals when making use of the rcu_head. For an unlucky structure layout of 'struct inode' we may end up with the following splat when running the ftrace selftests: [<...>] list_del corruption, ffff888103ee2cb0->next (tracefs_inode_cache+0x0/0x4e0 [slab object]) is NULL (prev is tracefs_inode_cache+0x78/0x4e0 [slab object]) [<...>] ------------[ cut here ]------------ [<...>] kernel BUG at lib/list_debug.c:54! [<...>] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [<...>] CPU: 3 PID: 2550 Comm: mount Tainted: G N 6.8.12-grsec+ #122 ed2f536ca62f28b087b90e3cc906a8d25b3ddc65 [<...>] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [<...>] RIP: 0010:[<ffffffff84656018>] __list_del_entry_valid_or_report+0x138/0x3e0 [<...>] Code: 48 b8 99 fb 65 f2 ff ff ff ff e9 03 5c d9 fc cc 48 b8 99 fb 65 f2 ff ff ff ff e9 33 5a d9 fc cc 48 b8 99 fb 65 f2 ff ff ff ff <0f> 0b 4c 89 e9 48 89 ea 48 89 ee 48 c7 c7 60 8f dd 89 31 c0 e8 2f [<...>] RSP: 0018:fffffe80416afaf0 EFLAGS: 00010283 [<...>] RAX: 0000000000000098 RBX: ffff888103ee2cb0 RCX: 0000000000000000 [<...>] RDX: ffffffff84655fe8 RSI: ffffffff89dd8b60 RDI: 0000000000000001 [<...>] RBP: ffff888103ee2cb0 R08: 0000000000000001 R09: fffffbd0082d5f25 [<...>] R10: fffffe80416af92f R11: 0000000000000001 R12: fdf99c16731d9b6d [<...>] R13: 0000000000000000 R14: ffff88819ad4b8b8 R15: 0000000000000000 [<...>] RBX: tracefs_inode_cache+0x0/0x4e0 [slab object] [<...>] RDX: __list_del_entry_valid_or_report+0x108/0x3e0 [<...>] RSI: __func__.47+0x4340/0x4400 [<...>] RBP: tracefs_inode_cache+0x0/0x4e0 [slab object] [<...>] RSP: process kstack fffffe80416afaf0+0x7af0/0x8000 [mount 2550 2550] [<...>] R09: kasan shadow of process kstack fffffe80416af928+0x7928/0x8000 [mount 2550 2550] [<...>] R10: process kstack fffffe80416af92f+0x792f/0x8000 [mount 2550 2550] [<...>] R14: tracefs_inode_cache+0x78/0x4e0 [slab object] [<...>] FS: 00006dcb380c1840(0000) GS:ffff8881e0600000(0000) knlGS:0000000000000000 [<...>] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [<...>] CR2: 000076ab72b30e84 CR3: 000000000b088004 CR4: 0000000000360ef0 shadow CR4: 0000000000360ef0 [<...>] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [<...>] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [<...>] ASID: 0003 [<...>] Stack: [<...>] ffffffff818a2315 00000000f5c856ee ffffffff896f1840 ffff888103ee2cb0 [<...>] ffff88812b6b9750 0000000079d714b6 fffffbfff1e9280b ffffffff8f49405f [<...>] 0000000000000001 0000000000000000 ffff888104457280 ffffffff8248b392 [<...>] Call Trace: [<...>] <TASK> [<...>] [<ffffffff818a2315>] ? lock_release+0x175/0x380 fffffe80416afaf0 [<...>] [<ffffffff8248b392>] list_lru_del+0x152/0x740 fffffe80416afb48 [<...>] [<ffffffff8248ba93>] list_lru_del_obj+0x113/0x280 fffffe80416afb88 [<...>] [<ffffffff8940fd19>] ? _atomic_dec_and_lock+0x119/0x200 fffffe80416afb90 [<...>] [<ffffffff8295b244>] iput_final+0x1c4/0x9a0 fffffe80416afbb8 [<...>] [<ffffffff8293a52b>] dentry_unlink_inode+0x44b/0xaa0 fffffe80416afbf8 [<...>] [<ffffffff8293fefc>] __dentry_kill+0x23c/0xf00 fffffe80416afc40 [<...>] [<ffffffff8953a85f>] ? __this_cpu_preempt_check+0x1f/0xa0 fffffe80416afc48 [<...>] [<ffffffff82949ce5>] ? shrink_dentry_list+0x1c5/0x760 fffffe80416afc70 [<...>] [<ffffffff82949b71>] ? shrink_dentry_list+0x51/0x760 fffffe80416afc78 [<...>] [<ffffffff82949da8>] shrink_dentry_list+0x288/0x760 fffffe80416afc80 [<...>] [<ffffffff8294ae75>] shrink_dcache_sb+0x155/0x420 fffffe80416afcc8 [<...>] [<ffffffff8953a7c3>] ? debug_smp_processor_id+0x23/0xa0 fffffe80416afce0 [<...>] [<ffffffff8294ad20>] ? do_one_tree+0x140/0x140 fffffe80416afcf8 [<...>] [<ffffffff82997349>] ? do_remount+0x329/0xa00 fffffe80416afd18 [<...>] [<ffffffff83ebf7a1>] ? security_sb_remount+0x81/0x1c0 fffffe80416afd38 [<...>] [<ffffffff82892096>] reconfigure_super+0x856/0x14e0 fffffe80416afd70 [<...>] [<ffffffff815d1327>] ? ns_capable_common+0xe7/0x2a0 fffffe80416afd90 [<...>] [<ffffffff82997436>] do_remount+0x416/0xa00 fffffe80416afdd0 [<...>] [<ffffffff829b2ba4>] path_mount+0x5c4/0x900 fffffe80416afe28 [<...>] [<ffffffff829b25e0>] ? finish_automount+0x13a0/0x13a0 fffffe80416afe60 [<...>] [<ffffffff82903812>] ? user_path_at_empty+0xb2/0x140 fffffe80416afe88 [<...>] [<ffffffff829b2ff5>] do_mount+0x115/0x1c0 fffffe80416afeb8 [<...>] [<ffffffff829b2ee0>] ? path_mount+0x900/0x900 fffffe80416afed8 [<...>] [<ffffffff8272461c>] ? __kasan_check_write+0x1c/0xa0 fffffe80416afee0 [<...>] [<ffffffff829b31cf>] __do_sys_mount+0x12f/0x280 fffffe80416aff30 [<...>] [<ffffffff829b36cd>] __x64_sys_mount+0xcd/0x2e0 fffffe80416aff70 [<...>] [<ffffffff819f8818>] ? syscall_trace_enter+0x218/0x380 fffffe80416aff88 [<...>] [<ffffffff8111655e>] x64_sys_call+0x5d5e/0x6720 fffffe80416affa8 [<...>] [<ffffffff8952756d>] do_syscall_64+0xcd/0x3c0 fffffe80416affb8 [<...>] [<ffffffff8100119b>] entry_SYSCALL_64_safe_stack+0x4c/0x87 fffffe80416affe8 [<...>] </TASK> [<...>] <PTREGS> [<...>] RIP: 0033:[<00006dcb382ff66a>] vm_area_struct[mount 2550 2550 file 6dcb38225000-6dcb3837e000 22 55(read|exec|mayread|mayexec)]+0x0/0xb8 [userland map] [<...>] Code: 48 8b 0d 29 18 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f6 17 0d 00 f7 d8 64 89 01 48 [<...>] RSP: 002b:0000763d68192558 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 [<...>] RAX: ffffffffffffffda RBX: 00006dcb38433264 RCX: 00006dcb382ff66a [<...>] RDX: 000017c3e0d11210 RSI: 000017c3e0d1a5a0 RDI: 000017c3e0d1ae70 [<...>] RBP: 000017c3e0d10fb0 R08: 000017c3e0d11260 R09: 00006dcb383d1be0 [<...>] R10: 000000000020002e R11: 0000000000000246 R12: 0000000000000000 [<...>] R13: 000017c3e0d1ae70 R14: 000017c3e0d11210 R15: 000017c3e0d10fb0 [<...>] RBX: vm_area_struct[mount 2550 2550 file 6dcb38433000-6dcb38434000 5b 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RCX: vm_area_struct[mount 2550 2550 file 6dcb38225000-6dcb3837e000 22 55(read|exec|mayread|mayexec)]+0x0/0xb8 [userland map] [<...>] RDX: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RSI: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RDI: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RBP: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] RSP: vm_area_struct[mount 2550 2550 anon 763d68173000-763d68195000 7ffffffdd 100133(read|write|mayread|maywrite|growsdown|account)]+0x0/0xb8 [userland map] [<...>] R08: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R09: vm_area_struct[mount 2550 2550 file 6dcb383d1000-6dcb383d3000 1cd 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R13: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R14: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] R15: vm_area_struct[mount 2550 2550 anon 17c3e0d0f000-17c3e0d31000 17c3e0d0f 100033(read|write|mayread|maywrite|account)]+0x0/0xb8 [userland map] [<...>] </PTREGS> [<...>] Modules linked in: [<...>] ---[ end trace 0000000000000000 ]--- The list debug message as well as RBX's symbolic value point out that the object in question was allocated from 'tracefs_inode_cache' and that the list's '->next' member is at offset 0. Dumping the layout of the relevant parts of 'struct tracefs_inode' gives the following: struct tracefs_inode { union { struct inode { struct list_head { struct list_head * next; /* 0 8 */ struct list_head * prev; /* 8 8 */ } i_lru; [...] } vfs_inode; struct callback_head { void (*func)(struct callback_head *); /* 0 8 */ struct callback_head * next; /* 8 8 */ } rcu; }; [...] }; Above shows that 'vfs_inode.i_lru' overlaps with 'rcu' which will destroy the 'i_lru' list as soon as the 'rcu' member gets used, e.g. in call_rcu() or later when calling the RCU callback. This will disturb concurrent list traversals as well as object reuse which assumes these list heads will keep their integrity. For reproduction, the following diff manually overlays 'i_lru' with 'rcu' as, otherwise, one would require some good portion of luck for gambling an unlucky RANDSTRUCT seed: --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -629,6 +629,7 @@ struct inode { umode_t i_mode; unsigned short i_opflags; kuid_t i_uid; + struct list_head i_lru; /* inode LRU list */ kgid_t i_gid; unsigned int i_flags; @@ -690,7 +691,6 @@ struct inode { u16 i_wb_frn_avg_time; u16 i_wb_frn_history; #endif - struct list_head i_lru; /* inode LRU list */ struct list_head i_sb_list; struct list_head i_wb_list; /* backing dev writeback list */ union { The tracefs inode does not need to supply its own RCU delayed destruction of its inode. The inode code itself offers both a "destroy_inode()" callback that gets called when the last reference of the inode is released, and the "free_inode()" which is called after a RCU synchronization period from the "destroy_inode()". The tracefs code can unlink the inode from its list in the destroy_inode() callback, and the simply free it from the free_inode() callback. This should provide the same protection. Link: https://lore.kernel.org/all/20240807115143.45927-3-minipli@grsecurity.net/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Ilkka =?utf-8?b?TmF1bGFww6TDpA==?= <digirigawa@gmail.com> Link: https://lore.kernel.org/20240807185402.61410544@gandalf.local.home Fixes: baa23a8d ("tracefs: Reset permissions on remount if permissions are options") Reported-by: Mathias Krause <minipli@grsecurity.net> Reported-by: Brad Spengler <spender@grsecurity.net> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Jianhui Zhou authored
Because ring_buffer_nr_pages() is not an inline function and user accesses buffer->buffers[cpu]->nr_pages directly, the function ring_buffer_nr_pages is removed. Signed-off-by: Jianhui Zhou <912460177@qq.com> Link: https://lore.kernel.org/tencent_F4A7E9AB337F44E0F4B858D07D19EF460708@qq.comSigned-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Tze-nan Wu authored
"tracing_map->next_elt" in get_free_elt() is at risk of overflowing. Once it overflows, new elements can still be inserted into the tracing_map even though the maximum number of elements (`max_elts`) has been reached. Continuing to insert elements after the overflow could result in the tracing_map containing "tracing_map->max_size" elements, leaving no empty entries. If any attempt is made to insert an element into a full tracing_map using `__tracing_map_insert()`, it will cause an infinite loop with preemption disabled, leading to a CPU hang problem. Fix this by preventing any further increments to "tracing_map->next_elt" once it reaches "tracing_map->max_elt". Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 08d43a5f ("tracing: Add lock-free tracing_map") Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com> Link: https://lore.kernel.org/20240805055922.6277-1-Tze-nan.Wu@mediatek.comSigned-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Petr Pavlu authored
When ftrace_graph_ret_addr() is invoked to convert a found stack return address to its original value, the function can end up producing the following crash: [ 95.442712] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 95.442720] #PF: supervisor read access in kernel mode [ 95.442724] #PF: error_code(0x0000) - not-present page [ 95.442727] PGD 0 P4D 0- [ 95.442731] Oops: Oops: 0000 [#1] PREEMPT SMP PTI [ 95.442736] CPU: 1 UID: 0 PID: 2214 Comm: insmod Kdump: loaded Tainted: G OE K 6.11.0-rc1-default #1 67c62a3b3720562f7e7db5f11c1fdb40b7a2857c [ 95.442747] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE, [K]=LIVEPATCH [ 95.442750] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 95.442754] RIP: 0010:ftrace_graph_ret_addr+0x42/0xc0 [ 95.442766] Code: [...] [ 95.442773] RSP: 0018:ffff979b80ff7718 EFLAGS: 00010006 [ 95.442776] RAX: ffffffff8ca99b10 RBX: ffff979b80ff7760 RCX: ffff979b80167dc0 [ 95.442780] RDX: ffffffff8ca99b10 RSI: ffff979b80ff7790 RDI: 0000000000000005 [ 95.442783] RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000 [ 95.442786] R10: 0000000000000005 R11: 0000000000000000 R12: ffffffff8e9491e0 [ 95.442790] R13: ffffffff8d6f70f0 R14: ffff979b80167da8 R15: ffff979b80167dc8 [ 95.442793] FS: 00007fbf83895740(0000) GS:ffff8a0afdd00000(0000) knlGS:0000000000000000 [ 95.442797] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.442800] CR2: 0000000000000028 CR3: 0000000005070002 CR4: 0000000000370ef0 [ 95.442806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 95.442809] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 95.442816] Call Trace: [ 95.442823] <TASK> [ 95.442896] unwind_next_frame+0x20d/0x830 [ 95.442905] arch_stack_walk_reliable+0x94/0xe0 [ 95.442917] stack_trace_save_tsk_reliable+0x7d/0xe0 [ 95.442922] klp_check_and_switch_task+0x55/0x1a0 [ 95.442931] task_call_func+0xd3/0xe0 [ 95.442938] klp_try_switch_task.part.5+0x37/0x150 [ 95.442942] klp_try_complete_transition+0x79/0x2d0 [ 95.442947] klp_enable_patch+0x4db/0x890 [ 95.442960] do_one_initcall+0x41/0x2e0 [ 95.442968] do_init_module+0x60/0x220 [ 95.442975] load_module+0x1ebf/0x1fb0 [ 95.443004] init_module_from_file+0x88/0xc0 [ 95.443010] idempotent_init_module+0x190/0x240 [ 95.443015] __x64_sys_finit_module+0x5b/0xc0 [ 95.443019] do_syscall_64+0x74/0x160 [ 95.443232] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 95.443236] RIP: 0033:0x7fbf82f2c709 [ 95.443241] Code: [...] [ 95.443247] RSP: 002b:00007fffd5ea3b88 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 95.443253] RAX: ffffffffffffffda RBX: 000056359c48e750 RCX: 00007fbf82f2c709 [ 95.443257] RDX: 0000000000000000 RSI: 000056356ed4efc5 RDI: 0000000000000003 [ 95.443260] RBP: 000056356ed4efc5 R08: 0000000000000000 R09: 00007fffd5ea3c10 [ 95.443263] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 [ 95.443267] R13: 000056359c48e6f0 R14: 0000000000000000 R15: 0000000000000000 [ 95.443272] </TASK> [ 95.443274] Modules linked in: [...] [ 95.443385] Unloaded tainted modules: intel_uncore_frequency(E):1 isst_if_common(E):1 skx_edac(E):1 [ 95.443414] CR2: 0000000000000028 The bug can be reproduced with kselftests: cd linux/tools/testing/selftests make TARGETS='ftrace livepatch' (cd ftrace; ./ftracetest test.d/ftrace/fgraph-filter.tc) (cd livepatch; ./test-livepatch.sh) The problem is that ftrace_graph_ret_addr() is supposed to operate on the ret_stack of a selected task but wrongly accesses the ret_stack of the current task. Specifically, the above NULL dereference occurs when task->curr_ret_stack is non-zero, but current->ret_stack is NULL. Correct ftrace_graph_ret_addr() to work with the right ret_stack. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reported-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/20240803131211.17255-1-petr.pavlu@suse.com Fixes: 7aa1eaef ("function_graph: Allow multiple users to attach to function graph") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
- 07 Aug, 2024 17 commits
-
-
Mathias Krause authored
To mirror the SRCU lock held in eventfs_iterate() when iterating over eventfs inodes, use call_srcu() to free them too. This was accidentally(?) degraded to RCU in commit 43aa6f97 ("eventfs: Get rid of dentry pointers without refcounts"). Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20240723210755.8970-1-minipli@grsecurity.net Fixes: 43aa6f97 ("eventfs: Get rid of dentry pointers without refcounts") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Mathias Krause authored
Commit 77a06c33 ("eventfs: Test for ei->is_freed when accessing ei->dentry") added another check, testing if the parent was freed after we released the mutex. If so, the function returns NULL. However, all callers expect it to either return a valid pointer or an error pointer, at least since commit 5264a2f4 ("tracing: Fix a NULL vs IS_ERR() bug in event_subsystem_dir()"). Returning NULL will therefore fail the error condition check in the caller. Fix this by substituting the NULL return value with a fitting error pointer. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org Fixes: 77a06c33 ("eventfs: Test for ei->is_freed when accessing ei->dentry") Link: https://lore.kernel.org/20240723122522.2724-1-minipli@grsecurity.netReviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Mathias Krause authored
The leading comment above alloc_inode_sb() is pretty explicit about it: /* * This must be used for allocating filesystems specific inodes to set * up the inode reclaim context correctly. */ Switch tracefs over to alloc_inode_sb() to make sure inodes are properly linked. Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20240807115143.45927-2-minipli@grsecurity.net Fixes: ba37ff75 ("eventfs: Implement tracefs_inode_cache") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt authored
Instead of using an atomic counter for the trace_event_file reference counter, use the refcount interface. It has various checks to make sure the reference counting is correct, and will warn if it detects an error (like refcount_inc() on '0'). Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240726144208.687cce24@rorschach.local.homeAcked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt authored
When eventfs was introduced, special care had to be done to coordinate the freeing of the file meta data with the files that are exposed to user space. The file meta data would have a ref count that is set when the file is created and would be decremented and freed after the last user that opened the file closed it. When the file meta data was to be freed, it would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed, and any new references made (like new opens or reads) would fail as it is marked freed. This allowed other meta data to be freed after this flag was set (under the event_mutex). All the files that were dynamically created in the events directory had a pointer to the file meta data and would call event_release() when the last reference to the user space file was closed. This would be the time that it is safe to free the file meta data. A shortcut was made for the "format" file. It's i_private would point to the "call" entry directly and not point to the file's meta data. This is because all format files are the same for the same "call", so it was thought there was no reason to differentiate them. The other files maintain state (like the "enable", "trigger", etc). But this meant if the file were to disappear, the "format" file would be unaware of it. This caused a race that could be trigger via the user_events test (that would create dynamic events and free them), and running a loop that would read the user_events format files: In one console run: # cd tools/testing/selftests/user_events # while true; do ./ftrace_test; done And in another console run: # cd /sys/kernel/tracing/ # while true; do cat events/user_events/__test_event/format; done 2>/dev/null With KASAN memory checking, it would trigger a use-after-free bug report (which was a real bug). This was because the format file was not checking the file's meta data flag "EVENT_FILE_FL_FREED", so it would access the event that the file meta data pointed to after the event was freed. After inspection, there are other locations that were found to not check the EVENT_FILE_FL_FREED flag when accessing the trace_event_file. Add a new helper function: event_file_file() that will make sure that the event_mutex is held, and will return NULL if the trace_event_file has the EVENT_FILE_FL_FREED flag set. Have the first reference of the struct file pointer use event_file_file() and check for NULL. Later uses can still use the event_file_data() helper function if the event_mutex is still held and was not released since the event_file_file() call. Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Ilkka Naulapää <digirigawa@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Florian Fainelli <florian.fainelli@broadcom.com> Cc: Alexey Makhalov <alexey.makhalov@broadcom.com> Cc: Vasavi Sirnapalli <vasavi.sirnapalli@broadcom.com> Link: https://lore.kernel.org/20240730110657.3b69d3c1@gandalf.local.home Fixes: b63db58e ("eventfs/tracing: Add callback for release of an eventfs_inode") Reported-by: Mathias Krause <minipli@grsecurity.net> Tested-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds authored
Pull btrfs fixes from David Sterba: - fix double inode unlock for direct IO sync writes (reported by syzbot) - fix root tree id/name map definitions, don't use fixed size buffers for name (reported by -Werror=unterminated-string-initialization) - fix qgroup reserve leaks in bufferd write path - update scrub status structure more often so it can be reported in user space more accurately and let 'resume' not repeat work - in preparation to remove space cache v1 in the future print a warning if it's detected * tag 'for-6.11-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid using fixed char array size for tree names btrfs: fix double inode unlock for direct IO sync writes btrfs: emit a warning about space cache v1 being deprecated btrfs: fix qgroup reserve leaks in cow_file_range btrfs: implement launder_folio for clearing dirty page reserve btrfs: scrub: update last_physical after scrubbing one stripe btrfs: factor out stripe length calculation into a helper
-
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supplyLinus Torvalds authored
Pull power supply fixes from Sebastian Reichel: "rt5033: - fix driver regression causing kernel oops axp288-charger: - fix charge voltage setup qcom-battmgr: - fix thermal zone spamming errors - fix init on Qualcomm X Elite" * tag 'for-v6.11-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: qcom_battmgr: Ignore extra __le32 in info payload power: supply: qcom_battmgr: return EAGAIN when firmware service is not up power: supply: axp288_charger: Round constant_charge_voltage writes down power: supply: axp288_charger: Fix constant_charge_voltage writes power: supply: rt5033: Bring back i2c_set_clientdata
-
Kent Overstreet authored
This fixes a device removal deadlock when using erasure coding. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
chasing down a device removal deadlock with erasure coding Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
We've had bugs in the past with incorrect integer conversions in disk accounting code, which is why bucket helpers now always return s64s; add a comment explaining this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
implicit integer conversion is a fertile source of bugs, and we really would rather not have the min()/max() macros doing it implicitly. bcachefs appears to be the only place in the kernel where this happens, so let's fix it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Kent Overstreet authored
We won't find a contended lock if it's not being tracked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
-
Dandan Zhang authored
The kvm_hypercall() set for LoongArch is limited to a1-a5. So the mention of a6 in the comment is undefined that needs to be rectified. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Dandan Zhang <zhangdandan@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Yuli Wang authored
1. "KVM_PRIVATE_MEM_SLOTS" is renamed as "KVM_INTERNAL_MEM_SLOTS". 2. "KVM_INTERNAL_MEM_SLOTS" defaults to zero, so it is not necessary to define it in LoongArch's asm/kvm_host.h. Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bdd1c37a315bc50ab14066c4852bc8dcf070451e Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b075450868dbc0950f0942617f222eeb989cad10Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Huacai Chen authored
As very well explained in commit 20a004e7 ("arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables"), an architecture whose page table walker can modify the PTE in parallel must use READ_ONCE()/ WRITE_ONCE() macro to avoid any compiler transformation. So apply that to LoongArch which is such an architecture, in order to avoid potential problems. Similar to commit edf95564 ("riscv: Use accessors to page table entries instead of direct dereference"). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
Miao Wang authored
efi_shutdown_init() can register a general sys_off handler named efi_power_off(). Enable this by providing efi_poweroff_required(), like arm and x86. Since EFI poweroff is also supported on LoongArch, and the enablement makes the poweroff function usable for hardwares which lack ACPI S5. We prefer ACPI poweroff rather than EFI poweroff (like x86), so we only require EFI poweroff if acpi_gbl_reduced_hardware or acpi_no_s5 is true. Cc: stable@vger.kernel.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Miao Wang <shankerwangmiao@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
-
- 06 Aug, 2024 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds authored
Pull virtio fix from Michael Tsirkin: "Fix a single, long-standing issue with kick pass-through vdpa" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes: - Fix ACPI notifier racing with itself (intel-vbtn) - Initialize local variable to cover a timeout corner case (intel/ifs) - WMI docs spelling New device IDs: - amd/{pmc,pmf}: AMD 1Ah model 60h series. - amd/pmf: SPS quirk support for ASUS ROG Ally X" * tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/ifs: Initialize union ifs_status to zero platform/x86: msi-wmi-platform: Fix spelling mistakes platform/x86/amd/pmf: Add new ACPI ID AMDI0107 platform/x86/amd/pmc: Send OS_HINT command for new AMD platform platform/x86/amd: pmf: Add quirk for ROG Ally X platform/x86: intel-vbtn: Protect ACPI notify handler against recursion
-