Commit 52abb27a authored by Linus Torvalds

Merge tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - The "common kmalloc v4" series [1] by Hyeonggon Yoo.

   While the plan after LPC is to try again to see whether SLOB and SLAB
   can be removed (and, if any critical aspect of those cannot be achieved
   with SLUB today, to modify SLUB accordingly), that will take a while
   even if there are no objections.

   Meanwhile this is a nice cleanup and some parts (e.g. to the
   tracepoints) will be useful even if we end up with a single slab
   implementation in the future:

      - Improves the mm/slab_common.c wrappers to allow deleting
        duplicated code between SLAB and SLUB.

      - Large kmalloc() allocations in SLAB are now passed to the page
        allocator, as in SLUB, reducing the number of kmalloc caches.

      - Removes the {kmem_cache_alloc,kmalloc}_node tracepoint variants;
        a node id parameter is added to the non-_node variants instead
        (see the sketch after the links below).

 - Addition of kmalloc_size_roundup()

   The first two patches from a series by Kees Cook [2] introduce
   kmalloc_size_roundup(). This will allow per-subsystem patches using the
   new function to be merged, and will ultimately stop (ab)using ksize()
   in a way that causes ongoing trouble for debugging functionality and
   static checkers (a usage sketch follows the links below).

 - Wasted kmalloc() memory tracking in debugfs alloc_traces

   A patch from Feng Tang enhances the existing debugfs alloc_traces file
   for kmalloc caches with information about how much space is wasted by
   allocations that need less space than the particular kmalloc cache
   provides.

 - My series [3] to fix validation races for caches with enabled
   debugging:

      - Decoupling the debug cache operations further from the non-debug
        fastpaths made extra locking simplifications possible; these were
        done afterwards.

      - Additional cleanup of PREEMPT_RT specific code on top, by Thomas
        Gleixner.

      - A late fix for slab page leaks caused by the series, by Feng
        Tang.

 - Smaller fixes and cleanups:

      - Unneeded variable removals, by ye xingchen

      - A cleanup removing a BUG_ON() in create_unique_id(), by Chao Yu

Link: https://lore.kernel.org/all/20220817101826.236819-1-42.hyeyoo@gmail.com/ [1]
Link: https://lore.kernel.org/all/20220923202822.2667581-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/all/20220823170400.26546-1-vbabka@suse.cz/ [3]
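
A rough sketch of the new tracepoint calling convention (illustrative only;
the real call sites are in mm/sl[aou]b.c and the event definitions are in the
include/trace/events/kmem.h hunk further down; the helper name below is made
up for the example, while trace_kmem_cache_alloc() and NUMA_NO_NODE are the
real symbols):

    #include <linux/slab.h>
    #include <trace/events/kmem.h>

    /* After the series there is no *_node event variant any more: every
     * allocation reports its node through the regular event, with
     * NUMA_NO_NODE when the caller did not ask for a specific node.
     */
    static inline void report_cache_alloc(struct kmem_cache *s, void *obj,
                                          gfp_t gfpflags, int node)
    {
            trace_kmem_cache_alloc(_RET_IP_, obj, s, gfpflags, node);
    }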
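
As a usage sketch for kmalloc_size_roundup() (illustrative only; the function
and variable names below are made up, the API is the one added by [2]):
callers that used to allocate and then grow into whatever ksize() reported can
instead round the size up front, so the requested size and the usable size
agree and KASAN/FORTIFY bounds stay accurate:

    #include <linux/slab.h>

    static void *alloc_tracked_buffer(size_t want, size_t *usable)
    {
            /* e.g. want = 100 rounds up to the kmalloc-128 bucket */
            size_t full = kmalloc_size_roundup(want);
            void *buf = kmalloc(full, GFP_KERNEL);

            if (buf)
                    *usable = full; /* all of it may be used; no ksize() needed */
            return buf;
    }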

* tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (30 commits)
  mm/slub: fix a slab missed to be freed problem
  slab: Introduce kmalloc_size_roundup()
  slab: Remove __malloc attribute from realloc functions
  mm/slub: clean up create_unique_id()
  mm/slub: enable debugging memory wasting of kmalloc
  slub: Make PREEMPT_RT support less convoluted
  mm/slub: simplify __cmpxchg_double_slab() and slab_[un]lock()
  mm/slub: convert object_map_lock to non-raw spinlock
  mm/slub: remove slab_lock() usage for debug operations
  mm/slub: restrict sysfs validation to debug caches and make it safe
  mm/sl[au]b: check if large object is valid in __ksize()
  mm/slab_common: move declaration of __ksize() to mm/slab.h
  mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
  mm/slab_common: unify NUMA and UMA version of tracepoints
  mm/sl[au]b: cleanup kmem_cache_alloc[_node]_trace()
  mm/sl[au]b: generalize kmalloc subsystem
  mm/slub: move free_debug_processing() further
  mm/sl[au]b: introduce common alloc/free functions without tracepoint
  mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
  mm/slab_common: cleanup kmalloc_large()
  ...
parents 55be6084 00a7829b
@@ -400,21 +400,30 @@ information:
    allocated objects. The output is sorted by frequency of each trace.
    Information in the output:
-   Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
-   pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
+   Number of objects, allocating function, possible memory wastage of
+   kmalloc objects(total/per-object), minimal/average/maximal jiffies
+   since alloc, pid range of the allocating processes, cpu mask of
+   allocating cpus, numa node mask of origins of memory, and stack trace.
    Example:::
-   1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
-     __slab_alloc+0x6d/0x90
-     kmem_cache_alloc_trace+0x2eb/0x300
-     populate_error_injection_list+0x97/0x110
-     init_error_injection+0x1b/0x71
-     do_one_initcall+0x5f/0x2d0
-     kernel_init_freeable+0x26f/0x2d7
-     kernel_init+0xe/0x118
-     ret_from_fork+0x22/0x30
+   338 pci_alloc_dev+0x2c/0xa0 waste=521872/1544 age=290837/291891/293509 pid=1 cpus=106 nodes=0-1
+     __kmem_cache_alloc_node+0x11f/0x4e0
+     kmalloc_trace+0x26/0xa0
+     pci_alloc_dev+0x2c/0xa0
+     pci_scan_single_device+0xd2/0x150
+     pci_scan_slot+0xf7/0x2d0
+     pci_scan_child_bus_extend+0x4e/0x360
+     acpi_pci_root_create+0x32e/0x3b0
+     pci_acpi_scan_root+0x2b9/0x2d0
+     acpi_pci_root_add.cold.11+0x110/0xb0a
+     acpi_bus_attach+0x262/0x3f0
+     device_for_each_child+0xb7/0x110
+     acpi_dev_for_each_child+0x77/0xa0
+     acpi_bus_attach+0x108/0x3f0
 2. free_traces::
...
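
(For reference: alloc_traces output like the example above is only available
when allocation tracking is enabled, e.g. by booting with slub_debug=U or
slub_debug=U,kmalloc-*, and is read from
/sys/kernel/debug/slab/<cache>/alloc_traces; the waste=total/per-object field
is the new information added by this series.)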
@@ -35,7 +35,8 @@
 /*
  * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
- * available and includes other attributes.
+ * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined
+ * in compiler-gcc.h, due to misbehaviors.
  *
  * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
  * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
...
@@ -271,14 +271,16 @@ struct ftrace_likely_data {
 /*
  * Any place that could be marked with the "alloc_size" attribute is also
- * a place to be marked with the "malloc" attribute. Do this as part of the
- * __alloc_size macro to avoid redundant attributes and to avoid missing a
- * __malloc marking.
+ * a place to be marked with the "malloc" attribute, except those that may
+ * be performing a _reallocation_, as that may alias the existing pointer.
+ * For these, use __realloc_size().
  */
 #ifdef __alloc_size__
 # define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc
+# define __realloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__)
 #else
 # define __alloc_size(x, ...) __malloc
+# define __realloc_size(x, ...)
 #endif
 #ifndef asm_volatile_goto
...
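
As an illustration of how the split attribute is meant to be used (a sketch;
the declarations below are made-up examples, while krealloc() in
include/linux/slab.h is the in-tree user of the __realloc_size() form):

    /* An allocator returns a pointer that cannot alias anything else,
     * so it keeps the combined size + __malloc hint:
     */
    void *my_alloc_buffer(size_t size) __alloc_size(1);

    /* A reallocator may return the original (aliasing) pointer, so it
     * keeps only the size hint:
     */
    void *my_resize_buffer(void *buf, size_t new_size) __realloc_size(2);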
This diff is collapsed.
@@ -9,16 +9,15 @@
 #include <linux/tracepoint.h>
 #include <trace/events/mmflags.h>
-DECLARE_EVENT_CLASS(kmem_alloc,
+TRACE_EVENT(kmem_cache_alloc,
        TP_PROTO(unsigned long call_site,
                 const void *ptr,
                 struct kmem_cache *s,
-                size_t bytes_req,
-                size_t bytes_alloc,
-                gfp_t gfp_flags),
+                gfp_t gfp_flags,
+                int node),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags),
+       TP_ARGS(call_site, ptr, s, gfp_flags, node),
        TP_STRUCT__entry(
                __field( unsigned long, call_site )
@@ -26,56 +25,42 @@ DECLARE_EVENT_CLASS(kmem_alloc,
                __field( size_t, bytes_req )
                __field( size_t, bytes_alloc )
                __field( unsigned long, gfp_flags )
+               __field( int, node )
                __field( bool, accounted )
        ),
        TP_fast_assign(
                __entry->call_site = call_site;
                __entry->ptr = ptr;
-               __entry->bytes_req = bytes_req;
-               __entry->bytes_alloc = bytes_alloc;
+               __entry->bytes_req = s->object_size;
+               __entry->bytes_alloc = s->size;
                __entry->gfp_flags = (__force unsigned long)gfp_flags;
+               __entry->node = node;
                __entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
                                     ((gfp_flags & __GFP_ACCOUNT) ||
-                                    (s && s->flags & SLAB_ACCOUNT)) : false;
+                                    (s->flags & SLAB_ACCOUNT)) : false;
        ),
-       TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s accounted=%s",
+       TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
                (void *)__entry->call_site,
                __entry->ptr,
                __entry->bytes_req,
                __entry->bytes_alloc,
                show_gfp_flags(__entry->gfp_flags),
+               __entry->node,
                __entry->accounted ? "true" : "false")
 );
-DEFINE_EVENT(kmem_alloc, kmalloc,
-       TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
-                size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
-);
-DEFINE_EVENT(kmem_alloc, kmem_cache_alloc,
-       TP_PROTO(unsigned long call_site, const void *ptr, struct kmem_cache *s,
-                size_t bytes_req, size_t bytes_alloc, gfp_t gfp_flags),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags)
-);
-DECLARE_EVENT_CLASS(kmem_alloc_node,
+TRACE_EVENT(kmalloc,
        TP_PROTO(unsigned long call_site,
                 const void *ptr,
-                struct kmem_cache *s,
                 size_t bytes_req,
                 size_t bytes_alloc,
                 gfp_t gfp_flags,
                 int node),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node),
+       TP_ARGS(call_site, ptr, bytes_req, bytes_alloc, gfp_flags, node),
        TP_STRUCT__entry(
                __field( unsigned long, call_site )
@@ -84,7 +69,6 @@ DECLARE_EVENT_CLASS(kmem_alloc_node,
                __field( size_t, bytes_alloc )
                __field( unsigned long, gfp_flags )
                __field( int, node )
-               __field( bool, accounted )
        ),
        TP_fast_assign(
@@ -94,9 +78,6 @@ DECLARE_EVENT_CLASS(kmem_alloc_node,
                __entry->bytes_alloc = bytes_alloc;
                __entry->gfp_flags = (__force unsigned long)gfp_flags;
                __entry->node = node;
-               __entry->accounted = IS_ENABLED(CONFIG_MEMCG_KMEM) ?
-                                    ((gfp_flags & __GFP_ACCOUNT) ||
-                                    (s && s->flags & SLAB_ACCOUNT)) : false;
        ),
        TP_printk("call_site=%pS ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s node=%d accounted=%s",
@@ -106,25 +87,8 @@ DECLARE_EVENT_CLASS(kmem_alloc_node,
                __entry->bytes_alloc,
                show_gfp_flags(__entry->gfp_flags),
                __entry->node,
-               __entry->accounted ? "true" : "false")
-);
-DEFINE_EVENT(kmem_alloc_node, kmalloc_node,
-       TP_PROTO(unsigned long call_site, const void *ptr,
-                struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc,
-                gfp_t gfp_flags, int node),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node)
-);
-DEFINE_EVENT(kmem_alloc_node, kmem_cache_alloc_node,
-       TP_PROTO(unsigned long call_site, const void *ptr,
-                struct kmem_cache *s, size_t bytes_req, size_t bytes_alloc,
-                gfp_t gfp_flags, int node),
-       TP_ARGS(call_site, ptr, s, bytes_req, bytes_alloc, gfp_flags, node)
+               (IS_ENABLED(CONFIG_MEMCG_KMEM) &&
+                (__entry->gfp_flags & (__force unsigned long)__GFP_ACCOUNT)) ? "true" : "false")
 );
 TRACE_EVENT(kfree,
@@ -149,20 +113,20 @@ TRACE_EVENT(kfree,
 TRACE_EVENT(kmem_cache_free,
-       TP_PROTO(unsigned long call_site, const void *ptr, const char *name),
+       TP_PROTO(unsigned long call_site, const void *ptr, const struct kmem_cache *s),
-       TP_ARGS(call_site, ptr, name),
+       TP_ARGS(call_site, ptr, s),
        TP_STRUCT__entry(
                __field( unsigned long, call_site )
                __field( const void *, ptr )
-               __string( name, name )
+               __string( name, s->name )
        ),
        TP_fast_assign(
                __entry->call_site = call_site;
                __entry->ptr = ptr;
-               __assign_str(name, name);
+               __assign_str(name, s->name);
        ),
        TP_printk("call_site=%pS ptr=%p name=%s",
...
@@ -86,6 +86,7 @@ static int get_stack_skipnr(const unsigned long stack_entries[], int num_entries
        /* Also the *_bulk() variants by only checking prefixes. */
        if (str_has_prefix(buf, ARCH_FUNC_PREFIX "kfree") ||
            str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_free") ||
+           str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmem_cache_free") ||
            str_has_prefix(buf, ARCH_FUNC_PREFIX "__kmalloc") ||
            str_has_prefix(buf, ARCH_FUNC_PREFIX "kmem_cache_alloc"))
                goto found;
...
This diff is collapsed.
@@ -273,6 +273,11 @@ void create_kmalloc_caches(slab_flags_t);
 /* Find the kmalloc slab corresponding for a certain size */
 struct kmem_cache *kmalloc_slab(size_t, gfp_t);
+void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
+                              int node, size_t orig_size,
+                              unsigned long caller);
+void __kmem_cache_free(struct kmem_cache *s, void *x, unsigned long caller);
 #endif
 gfp_t kmalloc_fix_flags(gfp_t flags);
@@ -658,8 +663,13 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
        print_tracking(cachep, x);
        return cachep;
 }
+void free_large_kmalloc(struct folio *folio, void *object);
 #endif /* CONFIG_SLOB */
+size_t __ksize(const void *objp);
 static inline size_t slab_ksize(const struct kmem_cache *s)
 {
 #ifndef CONFIG_SLUB
...
@@ -511,13 +511,9 @@ EXPORT_SYMBOL(kmem_cache_destroy);
  */
 int kmem_cache_shrink(struct kmem_cache *cachep)
 {
-       int ret;
        kasan_cache_shrink(cachep);
-       ret = __kmem_cache_shrink(cachep);
-       return ret;
+       return __kmem_cache_shrink(cachep);
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
@@ -665,7 +661,8 @@ struct kmem_cache *__init create_kmalloc_cache(const char *name,
        if (!s)
                panic("Out of memory when creating slab %s\n", name);
-       create_boot_cache(s, name, size, flags, useroffset, usersize);
+       create_boot_cache(s, name, size, flags | SLAB_KMALLOC, useroffset,
+                         usersize);
        kasan_cache_create_kmalloc(s);
        list_add(&s->list, &slab_caches);
        s->refcount = 1;
@@ -737,6 +734,26 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
        return kmalloc_caches[kmalloc_type(flags)][index];
 }
+size_t kmalloc_size_roundup(size_t size)
+{
+       struct kmem_cache *c;
+       /* Short-circuit the 0 size case. */
+       if (unlikely(size == 0))
+               return 0;
+       /* Short-circuit saturated "too-large" case. */
+       if (unlikely(size == SIZE_MAX))
+               return SIZE_MAX;
+       /* Above the smaller buckets, size is a multiple of page size. */
+       if (size > KMALLOC_MAX_CACHE_SIZE)
+               return PAGE_SIZE << get_order(size);
+       /* The flags don't matter since size_index is common to all. */
+       c = kmalloc_slab(size, GFP_KERNEL);
+       return c ? c->object_size : 0;
+}
+EXPORT_SYMBOL(kmalloc_size_roundup);
 #ifdef CONFIG_ZONE_DMA
 #define KMALLOC_DMA_NAME(sz) .name[KMALLOC_DMA] = "dma-kmalloc-" #sz,
 #else
@@ -760,8 +777,8 @@ struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
 /*
  * kmalloc_info[] is to make slub_debug=,kmalloc-xx option work at boot time.
- * kmalloc_index() supports up to 2^25=32MB, so the final entry of the table is
- * kmalloc-32M.
+ * kmalloc_index() supports up to 2^21=2MB, so the final entry of the table is
+ * kmalloc-2M.
  */
 const struct kmalloc_info_struct kmalloc_info[] __initconst = {
        INIT_KMALLOC_INFO(0, 0),
@@ -785,11 +802,7 @@ const struct kmalloc_info_struct kmalloc_info[] __initconst = {
        INIT_KMALLOC_INFO(262144, 256k),
        INIT_KMALLOC_INFO(524288, 512k),
        INIT_KMALLOC_INFO(1048576, 1M),
-       INIT_KMALLOC_INFO(2097152, 2M),
-       INIT_KMALLOC_INFO(4194304, 4M),
-       INIT_KMALLOC_INFO(8388608, 8M),
-       INIT_KMALLOC_INFO(16777216, 16M),
-       INIT_KMALLOC_INFO(33554432, 32M)
+       INIT_KMALLOC_INFO(2097152, 2M)
 };
 /*
@@ -902,6 +915,155 @@ void __init create_kmalloc_caches(slab_flags_t flags)
        /* Kmalloc array is now usable */
        slab_state = UP;
 }
void free_large_kmalloc(struct folio *folio, void *object)
{
unsigned int order = folio_order(folio);
if (WARN_ON_ONCE(order == 0))
pr_warn_once("object pointer: 0x%p\n", object);
kmemleak_free(object);
kasan_kfree_large(object);
mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
-(PAGE_SIZE << order));
__free_pages(folio_page(folio, 0), order);
}
static void *__kmalloc_large_node(size_t size, gfp_t flags, int node);
static __always_inline
void *__do_kmalloc_node(size_t size, gfp_t flags, int node, unsigned long caller)
{
struct kmem_cache *s;
void *ret;
if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
ret = __kmalloc_large_node(size, flags, node);
trace_kmalloc(_RET_IP_, ret, size,
PAGE_SIZE << get_order(size), flags, node);
return ret;
}
s = kmalloc_slab(size, flags);
if (unlikely(ZERO_OR_NULL_PTR(s)))
return s;
ret = __kmem_cache_alloc_node(s, flags, node, size, caller);
ret = kasan_kmalloc(s, ret, size, flags);
trace_kmalloc(_RET_IP_, ret, size, s->size, flags, node);
return ret;
}
void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
return __do_kmalloc_node(size, flags, node, _RET_IP_);
}
EXPORT_SYMBOL(__kmalloc_node);
void *__kmalloc(size_t size, gfp_t flags)
{
return __do_kmalloc_node(size, flags, NUMA_NO_NODE, _RET_IP_);
}
EXPORT_SYMBOL(__kmalloc);
void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
int node, unsigned long caller)
{
return __do_kmalloc_node(size, flags, node, caller);
}
EXPORT_SYMBOL(__kmalloc_node_track_caller);
/**
* kfree - free previously allocated memory
* @object: pointer returned by kmalloc.
*
* If @object is NULL, no operation is performed.
*
* Don't free memory not originally allocated by kmalloc()
* or you will run into trouble.
*/
void kfree(const void *object)
{
struct folio *folio;
struct slab *slab;
struct kmem_cache *s;
trace_kfree(_RET_IP_, object);
if (unlikely(ZERO_OR_NULL_PTR(object)))
return;
folio = virt_to_folio(object);
if (unlikely(!folio_test_slab(folio))) {
free_large_kmalloc(folio, (void *)object);
return;
}
slab = folio_slab(folio);
s = slab->slab_cache;
__kmem_cache_free(s, (void *)object, _RET_IP_);
}
EXPORT_SYMBOL(kfree);
/**
* __ksize -- Report full size of underlying allocation
* @objp: pointer to the object
*
* This should only be used internally to query the true size of allocations.
* It is not meant to be a way to discover the usable size of an allocation
* after the fact. Instead, use kmalloc_size_roundup(). Using memory beyond
* the originally requested allocation size may trigger KASAN, UBSAN_BOUNDS,
* and/or FORTIFY_SOURCE.
*
* Return: size of the actual memory used by @objp in bytes
*/
size_t __ksize(const void *object)
{
struct folio *folio;
if (unlikely(object == ZERO_SIZE_PTR))
return 0;
folio = virt_to_folio(object);
if (unlikely(!folio_test_slab(folio))) {
if (WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE))
return 0;
if (WARN_ON(object != folio_address(folio)))
return 0;
return folio_size(folio);
}
return slab_ksize(folio_slab(folio)->slab_cache);
}
#ifdef CONFIG_TRACING
void *kmalloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
{
void *ret = __kmem_cache_alloc_node(s, gfpflags, NUMA_NO_NODE,
size, _RET_IP_);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmalloc_trace);
void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size)
{
void *ret = __kmem_cache_alloc_node(s, gfpflags, node, size, _RET_IP_);
trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, node);
ret = kasan_kmalloc(s, ret, size, gfpflags);
return ret;
}
EXPORT_SYMBOL(kmalloc_node_trace);
#endif /* !CONFIG_TRACING */
 #endif /* !CONFIG_SLOB */
 gfp_t kmalloc_fix_flags(gfp_t flags)
@@ -921,37 +1083,50 @@ gfp_t kmalloc_fix_flags(gfp_t flags)
  * directly to the page allocator. We use __GFP_COMP, because we will need to
  * know the allocation order to free the pages properly in kfree.
  */
-void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
+static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-       void *ret = NULL;
        struct page *page;
+       void *ptr = NULL;
+       unsigned int order = get_order(size);
        if (unlikely(flags & GFP_SLAB_BUG_MASK))
                flags = kmalloc_fix_flags(flags);
        flags |= __GFP_COMP;
-       page = alloc_pages(flags, order);
-       if (likely(page)) {
-               ret = page_address(page);
+       page = alloc_pages_node(node, flags, order);
+       if (page) {
+               ptr = page_address(page);
                mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
                                      PAGE_SIZE << order);
        }
-       ret = kasan_kmalloc_large(ret, size, flags);
-       /* As ret might get tagged, call kmemleak hook after KASAN. */
-       kmemleak_alloc(ret, size, 1, flags);
+       ptr = kasan_kmalloc_large(ptr, size, flags);
+       /* As ptr might get tagged, call kmemleak hook after KASAN. */
+       kmemleak_alloc(ptr, size, 1, flags);
+       return ptr;
+}
+void *kmalloc_large(size_t size, gfp_t flags)
+{
+       void *ret = __kmalloc_large_node(size, flags, NUMA_NO_NODE);
+       trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+                     flags, NUMA_NO_NODE);
        return ret;
 }
-EXPORT_SYMBOL(kmalloc_order);
+EXPORT_SYMBOL(kmalloc_large);
-#ifdef CONFIG_TRACING
-void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
+void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-       void *ret = kmalloc_order(size, flags, order);
-       trace_kmalloc(_RET_IP_, ret, NULL, size, PAGE_SIZE << order, flags);
+       void *ret = __kmalloc_large_node(size, flags, node);
+       trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+                     flags, node);
        return ret;
 }
-EXPORT_SYMBOL(kmalloc_order_trace);
-#endif
+EXPORT_SYMBOL(kmalloc_large_node);
 #ifdef CONFIG_SLAB_FREELIST_RANDOM
 /* Randomize a generic freelist */
@@ -1150,8 +1325,8 @@ module_init(slab_proc_init);
 #endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */
-static __always_inline void *__do_krealloc(const void *p, size_t new_size,
-                                          gfp_t flags)
+static __always_inline __realloc_size(2) void *
+__do_krealloc(const void *p, size_t new_size, gfp_t flags)
 {
        void *ret;
        size_t ks;
@@ -1283,8 +1458,6 @@ EXPORT_SYMBOL(ksize);
 /* Tracepoints definitions. */
 EXPORT_TRACEPOINT_SYMBOL(kmalloc);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc);
-EXPORT_TRACEPOINT_SYMBOL(kmalloc_node);
-EXPORT_TRACEPOINT_SYMBOL(kmem_cache_alloc_node);
 EXPORT_TRACEPOINT_SYMBOL(kfree);
 EXPORT_TRACEPOINT_SYMBOL(kmem_cache_free);
...
@@ -507,8 +507,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
                *m = size;
                ret = (void *)m + minalign;
-               trace_kmalloc_node(caller, ret, NULL,
-                                  size, size + minalign, gfp, node);
+               trace_kmalloc(caller, ret, size, size + minalign, gfp, node);
        } else {
                unsigned int order = get_order(size);
@@ -516,8 +515,7 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
                        gfp |= __GFP_COMP;
                ret = slob_new_pages(gfp, order, node);
-               trace_kmalloc_node(caller, ret, NULL,
-                                  size, PAGE_SIZE << order, gfp, node);
+               trace_kmalloc(caller, ret, size, PAGE_SIZE << order, gfp, node);
        }
        kmemleak_alloc(ret, size, 1, gfp);
@@ -530,20 +528,12 @@ void *__kmalloc(size_t size, gfp_t gfp)
 }
 EXPORT_SYMBOL(__kmalloc);
-void *__kmalloc_track_caller(size_t size, gfp_t gfp, unsigned long caller)
-{
-       return __do_kmalloc_node(size, gfp, NUMA_NO_NODE, caller);
-}
-EXPORT_SYMBOL(__kmalloc_track_caller);
-#ifdef CONFIG_NUMA
 void *__kmalloc_node_track_caller(size_t size, gfp_t gfp,
                                  int node, unsigned long caller)
 {
        return __do_kmalloc_node(size, gfp, node, caller);
 }
 EXPORT_SYMBOL(__kmalloc_node_track_caller);
-#endif
 void kfree(const void *block)
 {
@@ -574,6 +564,20 @@ void kfree(const void *block)
 }
 EXPORT_SYMBOL(kfree);
+size_t kmalloc_size_roundup(size_t size)
+{
+       /* Short-circuit the 0 size case. */
+       if (unlikely(size == 0))
+               return 0;
+       /* Short-circuit saturated "too-large" case. */
+       if (unlikely(size == SIZE_MAX))
+               return SIZE_MAX;
+       return ALIGN(size, ARCH_KMALLOC_MINALIGN);
+}
+EXPORT_SYMBOL(kmalloc_size_roundup);
 /* can't use ksize for kmem_cache_alloc memory, only kmalloc */
 size_t __ksize(const void *block)
 {
@@ -594,7 +598,6 @@ size_t __ksize(const void *block)
        m = (unsigned int *)(block - align);
        return SLOB_UNITS(*m) * SLOB_UNIT;
 }
-EXPORT_SYMBOL(__ksize);
 int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
 {
@@ -602,6 +605,9 @@ int __kmem_cache_create(struct kmem_cache *c, slab_flags_t flags)
                /* leave room for rcu footer at the end of object */
                c->size += sizeof(struct slob_rcu);
        }
+       /* Actual size allocated */
+       c->size = SLOB_UNITS(c->size) * SLOB_UNIT;
        c->flags = flags;
        return 0;
 }
@@ -616,14 +622,10 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
        if (c->size < PAGE_SIZE) {
                b = slob_alloc(c->size, flags, c->align, node, 0);
-               trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size,
-                                           SLOB_UNITS(c->size) * SLOB_UNIT,
-                                           flags, node);
+               trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
        } else {
                b = slob_new_pages(flags, get_order(c->size), node);
-               trace_kmem_cache_alloc_node(_RET_IP_, b, NULL, c->object_size,
-                                           PAGE_SIZE << get_order(c->size),
-                                           flags, node);
+               trace_kmem_cache_alloc(_RET_IP_, b, c, flags, node);
        }
        if (b && c->ctor) {
@@ -647,7 +649,7 @@ void *kmem_cache_alloc_lru(struct kmem_cache *cachep, struct list_lru *lru, gfp_
        return slob_alloc_node(cachep, flags, NUMA_NO_NODE);
 }
 EXPORT_SYMBOL(kmem_cache_alloc_lru);
-#ifdef CONFIG_NUMA
 void *__kmalloc_node(size_t size, gfp_t gfp, int node)
 {
        return __do_kmalloc_node(size, gfp, node, _RET_IP_);
@@ -659,7 +661,6 @@ void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t gfp, int node)
        return slob_alloc_node(cachep, gfp, node);
 }
 EXPORT_SYMBOL(kmem_cache_alloc_node);
-#endif
 static void __kmem_cache_free(void *b, int size)
 {
@@ -680,7 +681,7 @@ static void kmem_rcu_free(struct rcu_head *head)
 void kmem_cache_free(struct kmem_cache *c, void *b)
 {
        kmemleak_free_recursive(b, c->flags);
-       trace_kmem_cache_free(_RET_IP_, b, c->name);
+       trace_kmem_cache_free(_RET_IP_, b, c);
        if (unlikely(c->flags & SLAB_TYPESAFE_BY_RCU)) {
                struct slob_rcu *slob_rcu;
                slob_rcu = b + (c->size - sizeof(struct slob_rcu));
...
This diff is collapsed.