- 03 Sep, 2021 28 commits
-
-
Sebastian Andrzej Siewior authored
flush_all() flushes a specific SLAB cache on each CPU (where the cache is present). The deactivate_slab()/__free_slab() invocation happens within an IPI handler and is problematic for PREEMPT_RT. The flush operation is neither frequent nor a hot path, so the per-CPU flush operation can be moved into a workqueue. Because a workqueue handler, unlike an IPI handler, does not run with irqs disabled, flush_slab() now has to disable them itself while working with the kmem_cache_cpu fields. deactivate_slab() is safe to call with irqs enabled. [vbabka@suse.cz: adapt to new SLUB changes] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
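A minimal sketch of the workqueue-based flush this describes, assuming the existing SLUB internals flush_slab(), unfreeze_partials() and has_cpu_slab(); the per-CPU work item plumbing here is illustrative, not the exact patch:

    struct slub_flush_work {
        struct work_struct work;
        struct kmem_cache *s;
        bool skip;
    };

    static DEFINE_MUTEX(flush_lock);
    static DEFINE_PER_CPU(struct slub_flush_work, slub_flush);

    static void flush_cpu_slab(struct work_struct *w)
    {
        struct slub_flush_work *sfw = container_of(w, struct slub_flush_work, work);
        struct kmem_cache *s = sfw->s;

        /* process context, irqs on: flush_slab() disables irqs itself */
        if (this_cpu_ptr(s->cpu_slab)->page)
            flush_slab(s, this_cpu_ptr(s->cpu_slab));

        unfreeze_partials(s);
    }

    static void flush_all(struct kmem_cache *s)
    {
        unsigned int cpu;

        mutex_lock(&flush_lock);
        for_each_online_cpu(cpu) {
            struct slub_flush_work *sfw = &per_cpu(slub_flush, cpu);

            sfw->skip = !has_cpu_slab(cpu, s);
            if (sfw->skip)
                continue;
            INIT_WORK(&sfw->work, flush_cpu_slab);
            sfw->s = s;
            schedule_work_on(cpu, &sfw->work);
        }
        for_each_online_cpu(cpu) {
            struct slub_flush_work *sfw = &per_cpu(slub_flush, cpu);

            if (!sfw->skip)
                flush_work(&sfw->work);
        }
        mutex_unlock(&flush_lock);
    }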
-
Vlastimil Babka authored
flush_slab() is called either as part of an IPI handler on a given live cpu, or as a cleanup on behalf of another cpu that went offline. The first case needs to protect updating the kmem_cache_cpu fields with disabled irqs. Currently the whole call happens with irqs disabled by the IPI handler, but the following patch will change from an IPI to a workqueue, and flush_slab() will have to disable irqs (to be replaced with a local lock later) in the critical part. To prepare for this change, replace the call to flush_slab() for the dead cpu handling with an open-coded variant that will not disable irqs nor take a local lock. Suggested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
slub_cpu_dead() cleans up for an offlined cpu from another cpu and calls only functions that are now irq safe, so we don't need to disable irqs anymore. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
__unfreeze_partials() no longer needs to have irqs disabled, except for making the spin_lock operations irq-safe, so convert the spin_lock operations to their irq saving/restoring variants and remove the separate irq handling. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
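The conversion pattern, as a sketch (n->list_lock is SLUB's per-node lock; the body is elided):

    /* before: callers had to run this with irqs already disabled */
    spin_lock(&n->list_lock);
    /* ... return partial slabs to the node list ... */
    spin_unlock(&n->list_lock);

    /* after: the lock operation saves and disables irqs itself */
    unsigned long flags;

    spin_lock_irqsave(&n->list_lock, flags);
    /* ... return partial slabs to the node list ... */
    spin_unlock_irqrestore(&n->list_lock, flags);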
-
Vlastimil Babka authored
Unfreezing the percpu partial list can be split into two phases: detaching the list from struct kmem_cache_cpu, and processing the detached list. The whole operation does not need to be protected by disabled irqs. Restructure the code to separate the detaching (with disabled irqs) and the unfreezing (with irq disabling to be reduced in the next patch). Also, unfreeze_partials() can be called from another cpu on behalf of a cpu that is being offlined, where disabling irqs on the local cpu makes no sense, so restructure the code as follows:
- __unfreeze_partials() is the bulk of unfreeze_partials() that processes the detached percpu partial list
- unfreeze_partials() detaches the list from the current cpu with irqs disabled and calls __unfreeze_partials()
- unfreeze_partials_cpu() is to be called for an offlined cpu, so it needs no irq disabling, and is called from __flush_cpu_slab()
- flush_cpu_slab() is for the local cpu, thus it needs to call unfreeze_partials(); it can no longer simply call __flush_cpu_slab(smp_processor_id()), so the proper calls are open-coded
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
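A sketch of the resulting structure (simplified; field access via the percpu accessors is assumed, and the CONFIG_SLUB_CPU_PARTIAL case is shown):

    static void __unfreeze_partials(struct kmem_cache *s, struct page *partial_page)
    {
        /* walk the detached list; move each page to its node's partial
         * list or discard it (irq handling reduced by the next patch) */
    }

    static void unfreeze_partials(struct kmem_cache *s)
    {
        struct page *partial_page;
        unsigned long flags;

        /* local cpu: detach the list under disabled irqs */
        local_irq_save(flags);
        partial_page = this_cpu_read(s->cpu_slab->partial);
        this_cpu_write(s->cpu_slab->partial, NULL);
        local_irq_restore(flags);

        if (partial_page)
            __unfreeze_partials(s, partial_page);
    }

    static void unfreeze_partials_cpu(struct kmem_cache *s, struct kmem_cache_cpu *c)
    {
        /* offlined cpu: nobody else touches c, no irq disabling needed */
        struct page *partial_page = slub_percpu_partial(c);

        c->partial = NULL;
        if (partial_page)
            __unfreeze_partials(s, partial_page);
    }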
-
Vlastimil Babka authored
Instead of iterating through the live percpu partial list, detach it from the kmem_cache_cpu at once. This is simpler and will allow further optimization. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
No need for disabled irqs when discarding slabs, so re-enable them before discarding. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
unfreeze_partials() can be optimized so that it doesn't need irqs disabled for the whole time. As the first step, move irq control into the function and remove it from the put_cpu_partial() caller. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
The function is now safe to be called with irqs enabled, so move the calls outside of irq disabled sections. When called from ___slab_alloc() -> flush_slab() we have irqs disabled, so to re-enable them before deactivate_slab() we need to open-code flush_slab() in ___slab_alloc() and re-enable irqs after modifying the kmem_cache_cpu fields. But that means an IRQ handler could meanwhile have assigned a new page to kmem_cache_cpu.page, so we have to retry the whole check. The remaining callers of flush_slab() are the IPI handler, which has irqs disabled anyway, and slub_cpu_dead(), which will be dealt with in the following patch. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
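The open-coded variant in ___slab_alloc() with the retry, sketched (labels and locals illustrative):

    reread_page:
        page = READ_ONCE(c->page);
        /* ... decide the cpu slab must be deactivated ... */

        local_irq_save(flags);
        if (unlikely(page != c->page)) {
            /* an irq handler installed a new cpu slab meanwhile */
            local_irq_restore(flags);
            goto reread_page;
        }
        freelist = c->freelist;
        c->page = NULL;
        c->freelist = NULL;
        local_irq_restore(flags);

        /* irqs are enabled again, which deactivate_slab() now tolerates */
        deactivate_slab(s, page, freelist);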
-
Vlastimil Babka authored
deactivate_slab() now no longer touches the kmem_cache_cpu structure, so it will be possible to call it with irqs enabled. Just convert the spin_lock calls to their irq saving/restoring variants to make it irq-safe. Note we now have to use cmpxchg_double_slab() for irq-safe slab_lock(), because in some situations we don't take the list_lock, which would disable irqs. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
deactivate_slab() removes the cpu slab by merging the cpu freelist with slab's freelist and putting the slab on the proper node's list. It also sets the respective kmem_cache_cpu pointers to NULL. By extracting the kmem_cache_cpu operations from the function, we can make it not dependent on disabled irqs. Also if we return a single free pointer from ___slab_alloc, we no longer have to assign kmem_cache_cpu.page before deactivation or care if somebody preempted us and assigned a different page to our kmem_cache_cpu in the process. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
The function get_partial() does not need to have irqs disabled as a whole. It's sufficient to convert spin_lock operations to their irq saving/restoring versions. As a result, it's now possible to reach the page allocator from the slab allocator without disabling and re-enabling interrupts on the way. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
Building on top of the previous patch, re-enable irqs before checking new pages. alloc_debug_processing() is now called with irqs enabled, so remove the VM_BUG_ON(!irqs_disabled()) in check_slab(); there doesn't seem to be a need for it anyway. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
When we obtain a new slab page from the node partial list or the page allocator, we assign it to kmem_cache_cpu, perform some checks, and if they fail, we undo the assignment. In order to allow doing the checks without irqs disabled, restructure the code so that the checks are done first, and the kmem_cache_cpu.page assignment only after they pass. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
allocate_slab() currently re-enables irqs before calling into the page allocator, depending on gfpflags_allow_blocking() to determine whether it's safe to do so. Now we can instead simply restore irqs before calling it, through new_slab(). The other caller, early_kmem_cache_node_alloc(), is unaffected by this. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
Continue reducing the irq disabled scope. Check for per-cpu partial slabs first with irqs enabled, then recheck with irqs disabled before grabbing the slab page. Mostly preparatory for the following patches. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
As another step of shortening irq disabled sections in ___slab_alloc(), delay disabling irqs until we pass the initial checks of whether there is a cached percpu slab and whether it's suitable for our allocation. Now we have to recheck c->page after actually disabling irqs, as an allocation in an irq handler might have replaced it. Because we call pfmemalloc_match() as one of the checks, we might hit VM_BUG_ON_PAGE(!PageSlab(page)) in PageSlabPfmemalloc() in case we get interrupted and the page is freed. Thus introduce a pfmemalloc_match_unsafe() variant that lacks the PageSlab check. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net>
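A sketch of the unsafe variant, assuming a raw flag test (spelled __PageSlabPfmemalloc() here) that omits the PageSlab assertion:

    /*
     * Only for speculative checks with irqs enabled: the page may be
     * freed and reused under us, so don't assert PageSlab; the result
     * is re-verified later with irqs disabled.
     */
    static inline bool pfmemalloc_match_unsafe(struct page *page, gfp_t gfpflags)
    {
        if (unlikely(__PageSlabPfmemalloc(page)))
            return gfp_pfmemalloc_allowed(gfpflags);

        return true;
    }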
-
Vlastimil Babka authored
Currently __slab_alloc() disables irqs around the whole ___slab_alloc(). This includes cases where this is not needed, such as when the allocation ends up in the page allocator and has to awkwardly enable irqs back based on gfp flags. Also the whole kmem_cache_alloc_bulk() is executed with irqs disabled even when it hits the __slab_alloc() slow path, and long periods with disabled interrupts are undesirable.

As a first step towards reducing irq disabled periods, move irq handling into ___slab_alloc(). Callers will instead prevent the s->cpu_slab percpu pointer from becoming invalid via get_cpu_ptr(), thus preempt_disable(). This does not protect against modification by an irq handler, which is still done by disabling irqs for most of ___slab_alloc(). As a small immediate benefit, slab_out_of_memory() from ___slab_alloc() is now called with irqs enabled.

kmem_cache_alloc_bulk() disables irqs for its fastpath and then re-enables them before calling ___slab_alloc(), which then disables them at its discretion. The whole kmem_cache_alloc_bulk() operation also disables preemption. When ___slab_alloc() calls new_slab() to allocate a new page, re-enable preemption, because new_slab() will re-enable interrupts in contexts that allow blocking (this will be improved by later patches).

The patch itself will thus increase overhead a bit due to disabled preemption (on configs where it matters) and increased disabling/enabling of irqs in kmem_cache_alloc_bulk(), but that will be gradually improved in the following patches.

Note that in __slab_alloc() we need to change the #ifdef CONFIG_PREEMPT guard to CONFIG_PREEMPT_COUNT to make sure preempt disable/enable is properly paired in all configurations. On configs without involuntary preemption and debugging, the re-read of the kmem_cache_cpu pointer is still compiled out as it was before.

[ Mike Galbraith <efault@gmx.de>: Fix kmem_cache_alloc_bulk() error path ] Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
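The caller side after the move, roughly, showing the CONFIG_PREEMPT_COUNT pairing mentioned above (a sketch, not the verbatim patch):

    static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
                              unsigned long addr, struct kmem_cache_cpu *c)
    {
        void *p;

    #ifdef CONFIG_PREEMPT_COUNT
        /*
         * We may have been preempted and rescheduled on a different cpu
         * before disabling preemption; re-fetch the percpu pointer.
         */
        c = get_cpu_ptr(s->cpu_slab);
    #endif

        p = ___slab_alloc(s, gfpflags, node, addr, c);

    #ifdef CONFIG_PREEMPT_COUNT
        put_cpu_ptr(s->cpu_slab);
    #endif
        return p;
    }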
-
Vlastimil Babka authored
In the slab_alloc_node() and do_slab_free() fastpaths we need to guarantee that our kmem_cache_cpu pointer is from the same cpu as the tid value. Currently that's done by reading the tid first using this_cpu_read(), then the kmem_cache_cpu pointer, and verifying that we read the same tid through the pointer with a plain READ_ONCE(). This can be simplified to just fetching the kmem_cache_cpu pointer and then reading the tid through it, which guarantees they are from the same cpu. We don't need to read the tid using this_cpu_read(), because the value will be validated by this_cpu_cmpxchg_double(), making sure we are on the correct cpu and that the freelist wasn't changed by anyone preempting us since the tid was read. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net>
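The before/after of the fastpath read, sketched:

    /* before: tid first, then the pointer, looping until they agree */
    do {
        tid = this_cpu_read(s->cpu_slab->tid);
        c = raw_cpu_ptr(s->cpu_slab);
    } while (IS_ENABLED(CONFIG_PREEMPTION) &&
             unlikely(tid != READ_ONCE(c->tid)));

    /* after: one read through the pointer; the later
     * this_cpu_cmpxchg_double() validates cpu and freelist anyway */
    c = raw_cpu_ptr(s->cpu_slab);
    tid = READ_ONCE(c->tid);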
-
Vlastimil Babka authored
When we allocate slab object from a newly acquired page (from node's partial list or page allocator), we usually also retain the page as a new percpu slab. There are two exceptions - when pfmemalloc status of the page doesn't match our gfp flags, or when the cache has debugging enabled. The current code for these decisions is not easy to follow, so restructure it and add comments. The new structure will also help with the following changes. No functional change. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net>
-
Vlastimil Babka authored
The function get_partial() finds a suitable page on a partial list, acquires and returns its freelist, and assigns the page pointer to kmem_cache_cpu. In a later patch we will need more control over the kmem_cache_cpu.page assignment, so instead of passing a kmem_cache_cpu pointer, pass a pointer to a page pointer that get_partial() can fill, so the caller can do the kmem_cache_cpu.page assignment itself. No functional change as all of this still happens with disabled IRQs. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
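The interface change, sketched (bodies elided, local names assumed):

    /* returns the acquired freelist and fills *ret_page on success */
    static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
                             struct page **ret_page);

    /* caller in ___slab_alloc(), still under disabled irqs for now: */
    freelist = get_partial(s, gfpflags, node, &page);
    if (freelist)
        c->page = page;    /* the assignment is now the caller's decision */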
-
Vlastimil Babka authored
The later patches will need more fine-grained control over individual actions in ___slab_alloc(), the only caller of new_slab_objects(), so dissolve it there. This is a preparatory step with no functional change. The only minor change is moving the WARN_ON_ONCE() for using a constructor together with __GFP_ZERO into new_slab(), which makes it trigger somewhat less frequently, but still able to catch a development change introducing a systematic misuse. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Mel Gorman <mgorman@techsingularity.net>
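Where the check ends up, as a sketch of new_slab() (the gfp mask handling follows the existing code and is assumed unchanged):

    static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
    {
        if (unlikely(flags & GFP_SLAB_BUG_MASK))
            flags = kmalloc_fix_flags(flags);

        /* fires once per new slab rather than per slow-path call */
        WARN_ON_ONCE(s->ctor && (flags & __GFP_ZERO));

        return allocate_slab(s,
            flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
    }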
-
Vlastimil Babka authored
The later patches will need more fine-grained control over individual actions in ___slab_alloc(), the only caller of new_slab_objects(), so this is a first preparatory step with no functional change. It adds a goto label that appears unnecessary at this point, but will be useful for later changes. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com>
-
Vlastimil Babka authored
Commit d6e0b7fa ("slub: make dead caches discard free slabs immediately") introduced cpu partial flushing for kmemcg caches, based on setting the target cpu_partial to 0 and adding a flushing check in put_cpu_partial(). The code that sets cpu_partial to 0 was later moved by c9fc5864 ("slab: introduce __kmemcg_cache_deactivate()") and ultimately removed by 9855609b ("mm: memcg/slab: use a single set of kmem_caches for all accounted allocations"). However the check and flush in put_cpu_partial() was never removed, although it's effectively dead code, so this patch removes it. Note that d6e0b7fa also added preempt_disable()/enable() to unfreeze_partials(), which could thus also be considered unnecessary. But further patches will rely on it, so keep it. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
-
Vlastimil Babka authored
In slab_free_hook() we disable irqs around the debug_check_no_locks_freed() call, which is unnecessary, as irqs are already being disabled inside the call. This seems to be a leftover from the past when there were more calls inside the irq-disabled sections. Remove the irq disable/enable operations. Mel noted:
> Looks like it was needed for kmemcheck which went away back in 4.15
Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@techsingularity.net>
-
Vlastimil Babka authored
validate_slab_cache() is called either to handle a sysfs write, or from a self-test context. In both situations it's straightforward to preallocate a private object bitmap instead of grabbing the shared static one meant for critical sections, so let's do that. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Mel Gorman <mgorman@techsingularity.net>
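The preallocation pattern, sketched (oo_objects() is SLUB's per-slab object count; error handling abbreviated):

    unsigned long *obj_map;

    obj_map = bitmap_zalloc(oo_objects(s->oo), GFP_KERNEL);
    if (!obj_map)
        return -ENOMEM;

    /* ... validate slabs, marking free objects in obj_map while
     * holding the node's list_lock ... */

    bitmap_free(obj_map);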
-
Vlastimil Babka authored
SLUB has a static, spinlock-protected bitmap for marking which objects are on a freelist when it wants to list them, for situations where dynamically allocating such a map could lead to recursion or locking issues, and an on-stack bitmap would be too large. The handlers of the debugfs files alloc_traces and free_traces also currently use this shared bitmap, but their syscall context makes it straightforward to allocate a private map before entering locked sections, so switch these processing paths to use a private bitmap. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Mel Gorman <mgorman@techsingularity.net>
-
Vlastimil Babka authored
slab_debug_trace_open() can only be called on caches with SLAB_STORE_USER flag and as with all slub debugging flags, such caches avoid cpu or percpu partial slabs altogether, so there's nothing to flush. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Christoph Lameter <cl@linux.com>
-
- 29 Aug, 2021 8 commits
-
-
Linus Torvalds authored
Linux 5.14
-
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Linus Torvalds authored
Pull clk fix from Stephen Boyd:
 "One hotfix for a NULL pointer deref in the Renesas usb clk driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference
-
Merge tag 'sched_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull scheduler fixes from Borislav Petkov:
- Have get_push_task() check whether current has migration disabled and thus avoid useless invocations of the migration thread
- Rework initialization flow so that all rq->core's are initialized, even for CPUs which have not been onlined yet, so that iterating over them all works as expected

* tag 'sched_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix get_push_task() vs migrate_disable()
  sched: Fix Core-wide rq->lock for uninitialized CPUs
-
Merge tag 'irq_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull irq fix from Borislav Petkov:
- Have msix_mask_all() check a global control which says whether MSI-X masking should be done, and thus make it usable on Xen-PV too

* tag 'irq_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Skip masking MSI-X on Xen PV
-
Merge tag 'perf_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull perf fixes from Borislav Petkov:
- Prevent the amd/power module from being removed while in use
- Mark AMD IBS as not supporting content exclusion
- Add a workaround for AMD erratum #1197 where IBS registers might not be restored properly after exiting CC6 state
- Fix a potential truncation of a 32-bit variable due to shifting
- Read the correct bits describing the number of configurable address ranges on Intel PT

* tag 'perf_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/power: Assign pmu.module
  perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
  perf/x86/amd/ibs: Work around erratum #1197
  perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32
  perf/x86/intel/pt: Fix mask of num_address_ranges
-
Merge tag 'x86_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fixes from Borislav Petkov:
- Fix build error on RHEL where -Werror=maybe-uninitialized is set
- Restore the firmware's IDT when calling EFI boot services and before ExitBootServices() has been called. This fixes a boot failure on what appears to be a tablet with 32-bit UEFI running a 64-bit kernel.

* tag 'x86_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Fix a maybe-uninitialized build warning treated as error
  x86/efi: Restore Firmware IDT before calling ExitBootServices()
-
Helge Deller authored
This reverts commit 83af58f8. It turns out that at least the assembly implementation for strncpy() was buggy. Revert the whole commit and return to the default implementations. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.4+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Adam Ford authored
The probe was manually passing NULL instead of dev to devm_clk_hw_register(). This caused an 'Unable to handle kernel NULL pointer dereference' error. Fix this by passing 'dev'. Signed-off-by: Adam Ford <aford173@gmail.com> Fixes: a20a40a8 ("clk: renesas: rcar-usb2-clock-sel: Fix error handling in .probe()") Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
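The essence of the one-line fix (local variable names assumed for illustration):

    /* before: ret = devm_clk_hw_register(NULL, clk_hw);
     * passing NULL leaves devres without a valid device to attach to */
    ret = devm_clk_hw_register(dev, clk_hw);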
-
- 28 Aug, 2021 3 commits
-
-
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds authored
Pull SCSI fix from James Bottomley:
 "A single fix for a race introduced by a fix that went into 5.14-rc5"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: Fix hang of freezing queue between blocking and running device
-
Merge tag 'usb-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds authored
Pull USB fixes from Greg KH:
 "Here are a few tiny USB fixes for reported issues with some USB drivers. These fixes include:
- gadget driver fixes for regressions
- tcpm driver fix
- dwc3 driver fixes
- xhci renesas firmware loading fix, again
- usb serial option driver device id addition
- usb serial ch341 revert for regression
All of these have been in linux-next with no reported problems"

* tag 'usb-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: u_audio: fix race condition on endpoint stop
  usb: gadget: f_uac2: fixup feedback endpoint stop
  usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running
  usb: renesas-xhci: Prefer firmware loading on unknown ROM state
  usb: dwc3: gadget: Stop EP0 transfers during pullup disable
  usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
  Revert "USB: serial: ch341: fix character loss at high transfer rates"
  USB: serial: option: add new VID/PID to support Fibocom FG150
-
Merge tag 'powerpc-5.14-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Linus Torvalds authored
Pull powerpc fixes from Michael Ellerman:
- Fix scv implicit soft-mask table for relocated (eg. kdump) kernels
- Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK, which was disabled due to a typo

Thanks to Lukas Bulwahn, Nicholas Piggin, and Daniel Axtens.

* tag 'powerpc-5.14-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
  powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
-
- 27 Aug, 2021 1 commit
-
-
Merge tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block
Linus Torvalds authored
Pull block fixes from Jens Axboe:
- Revert the mq-deadline priority handling, it's causing serious performance regressions. While experimental patches exist to fix this up, it's too late to do so now. Revert it and re-do it properly for 5.15 instead.
- Fix a NULL vs IS_ERR() regression in this release (Dan)
- Fix a mq-deadline accounting regression in this release (Bart)
- Mark cryptoloop as deprecated. It's broken, dm-crypt fully covers the use case, and it's actively interfering with loop. Plan on removal for 5.16 (Christoph)

* tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block:
  cryptoloop: add a deprecation warning
  pd: fix a NULL vs IS_ERR() check
  Revert "block/mq-deadline: Prioritize high-priority requests"
  mq-deadline: Fix request accounting
-