1. 30 Nov, 2022 3 commits
    • Merge branch 'slab/for-6.2/kmalloc_redzone' into slab/for-next · 61766652
      Vlastimil Babka authored
      Add a new slub_kunit test for the extended kmalloc redzone check, by
      Feng Tang. Also prevent unwanted kfence interaction with all slub kunit
      tests.
      61766652
    • mm/slub, kunit: Add a test case for kmalloc redzone check · 6cd6d33c
      Feng Tang authored
      The kmalloc redzone check for SLUB has been merged, so it is better to add
      a kunit test case for it. The test is inspired by a real-world bug described
      in commit 120ee599 ("staging: octeon-usb: prevent memory corruption"):
      
      "
        octeon-hcd will crash the kernel when SLOB is used. This usually happens
        after the 18-byte control transfer when a device descriptor is read.
        The DMA engine is always transferring full 32-bit words and if the
        transfer is shorter, some random garbage appears after the buffer.
        The problem is not visible with SLUB since it rounds up the allocations
        to word boundary, and the extra bytes will go undetected.
      "
      
      To avoid interfering with the normal operation of the kmalloc caches, a
      kmem_cache mimicking a kmalloc cache is created with similar flags, and
      kmalloc_trace() is used to really exercise the orig_size and redzone setup,
      as in the sketch below.
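
      A minimal sketch of such a test case, assuming the test_kmem_cache_create()
      wrapper, validate_slab_cache() and the slab_errors counter used elsewhere in
      lib/slub_kunit.c (names and exact expectations are illustrative, not
      necessarily the merged hunk):

      	static void test_kmalloc_redzone_access(struct kunit *test)
      	{
      		struct kmem_cache *s = test_kmem_cache_create("TestSlub_RZ_kmalloc", 32,
      					SLAB_KMALLOC | SLAB_STORE_USER | SLAB_RED_ZONE);
      		u8 *p = kmalloc_trace(s, GFP_KERNEL, 18);

      		/* poke into the 18..31 byte range, which is redzoned because
      		 * only 18 bytes of the 32-byte object were requested */
      		p[18] = 0xab;
      		p[19] = 0xab;

      		kmem_cache_free(s, p);
      		validate_slab_cache(s);
      		KUNIT_EXPECT_EQ(test, 2, slab_errors);
      		kmem_cache_destroy(s);
      	}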
      Suggested-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      6cd6d33c
    • mm/slub, kunit: add SLAB_SKIP_KFENCE flag for cache creation · 4d9dd4b0
      Feng Tang authored
      When kfence is enabled, a buffer allocated in a test case may come from
      a kfence pool, and the offending operation can then be caught and
      reported by kfence first, causing the test case to fail.

      With the default kfence settings this is very hard to trigger, but after
      changing CONFIG_KFENCE_NUM_OBJECTS from 255 to 16383 and
      CONFIG_KFENCE_SAMPLE_INTERVAL from 100 to 5, kfence allocations hit
      different slub_kunit cases 7 times in 900 boot tests.

      To avoid this, the initial approach was to check with is_kfence_address()
      and repeat the allocation until a non-kfence address was returned.
      Vlastimil Babka suggested that the SLAB_SKIP_KFENCE flag can be used to
      achieve this, and that a wrapper function should be added to simplify
      cache creation, as sketched below.
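
      A hedged sketch of such a wrapper for lib/slub_kunit.c (the helper name and
      the use of SLAB_NO_USER_FLAGS are assumptions for illustration):

      	static struct kmem_cache *test_kmem_cache_create(const char *name,
      					unsigned int size, slab_flags_t flags)
      	{
      		struct kmem_cache *s = kmem_cache_create(name, size, 0,
      					flags | SLAB_NO_USER_FLAGS, NULL);

      		/* keep kfence from intercepting the test allocations */
      		s->flags |= SLAB_SKIP_KFENCE;
      		return s;
      	}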
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      4d9dd4b0
  2. 21 Nov, 2022 14 commits
    • Merge branch 'slab/for-6.2/alloc_size' into slab/for-next · b5e72d27
      Vlastimil Babka authored
      Two patches from Kees Cook [1]:
      
      These patches work around a deficiency in GCC (>=11) and Clang (<16)
      where the __alloc_size attribute does not apply to inlines.
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503
      
      This manifests as reduced overflow detection coverage for many allocation
      sites under CONFIG_FORTIFY_SOURCE=y, where the allocation size was not
      actually being propagated to __builtin_dynamic_object_size().
      
      [1] https://lore.kernel.org/all/20221118034713.gonna.754-kees@kernel.org/
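
      A minimal illustration of the problem (generic code, not from the kernel):
      the attribute is honored on the extern declaration, but the affected
      compilers do not propagate it through a static inline wrapper, so fortified
      callers of the wrapper lose the __builtin_dynamic_object_size() information.

      	void *buf_alloc(size_t n) __attribute__((alloc_size(1)));

      	static inline void *buf_alloc_wrapper(size_t n)
      	{
      		return buf_alloc(n);	/* size hint lost here on GCC >= 11, Clang < 16 */
      	}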
      b5e72d27
    • Merge branch 'slab/for-6.2/kmalloc_redzone' into slab/for-next · 90e9b23a
      Vlastimil Babka authored
      kmalloc() redzone improvements by Feng Tang
      
      From cover letter [1]:
      
      The kmalloc() API family is critical for mm, and one of its characteristics
      is that it rounds the requested size up to a fixed size (mostly a power
      of 2). When a user requests memory for '2^n + 1' bytes, 2^(n+1) bytes may
      actually be allocated, so there is extra space beyond what was originally
      requested.

      This patchset extends the redzone sanity check to this extra kmalloc'ed
      space beyond the requested size, to better detect illegitimate access to
      it (depends on SLAB_STORE_USER & SLAB_RED_ZONE).
      
      [1] https://lore.kernel.org/all/20221021032405.1825078-1-feng.tang@intel.com/
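
      A concrete illustration of the space being covered (sizes are just an
      example):

      	void *buf = kmalloc(33, GFP_KERNEL);	/* served from the kmalloc-64 cache */
      	/* ksize(buf) == 64; bytes 33..63 are unused by the caller and are now
      	 * filled and checked as a redzone when SLAB_STORE_USER and
      	 * SLAB_RED_ZONE are enabled */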
      90e9b23a
    • Merge branch 'slab/for-6.2/fit_rcu_head' into slab/for-next · 76537db3
      Vlastimil Babka authored
      A series by myself to reorder fields in struct slab to allow the
      embedded rcu_head to grow (for debugging purposes). Requires changes to
      isolate_movable_page() to skip slab pages which can otherwise become
      false-positive __PageMovable due to its use of low bits in
      page->mapping.
      76537db3
    • Merge branch 'slab/for-6.2/tools' into slab/for-next · 1c1aaa33
      Vlastimil Babka authored
      A patch for tools/vm/slabinfo to give more useful feedback when not run
      as root, by Rong Tao.
      1c1aaa33
    • Merge branch 'slab/for-6.2/slub-sysfs' into slab/for-next · c64b95d3
      Vlastimil Babka authored
      - Two patches for SLUB's sysfs by Rasmus Villemoes to remove dead code
        and optimize boot time with late initialization.
      - Allow SLUB's sysfs 'failslab' parameter to be runtime-controllable
        again as it can be both useful and safe, by Alexander Atanasov.
      c64b95d3
    • Merge branch 'slab/for-6.2/locking' into slab/for-next · 14d3eb66
      Vlastimil Babka authored
      A patch from Jiri Kosina that makes SLAB's list_lock a raw_spinlock_t.
      While there are no plans to make SLAB actually compatible with PREEMPT_RT,
      or any other plans for its future, this keeps lockdep happy on !PREEMPT_RT
      builds.
      14d3eb66
    • Merge branch 'slab/for-6.2/cleanups' into slab/for-next · 4b28ba9e
      Vlastimil Babka authored
      - Removal of dead code from deactivate_slab() by Hyeonggon Yoo.
      - Fix of BUILD_BUG_ON() for sufficient early percpu size by Baoquan He.
      - Make kmem_cache_alloc() kernel-doc less misleading, by myself.
      4b28ba9e
    • slab: Remove special-casing of const 0 size allocations · 6fa57d78
      Kees Cook authored
      Passing a constant-0 size allocation into kmalloc() or kmalloc_node()
      does not need to be a fast-path operation, so the static return value
      can be removed entirely. This makes sure that all paths through the
      inlines result in a full extern function call, where __alloc_size()
      hints will actually be seen[1] by GCC. (A constant return value of 0
      means the "0" allocation size won't be propagated by the inline.)
      
      [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96503
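
      The removed special case was a branch of roughly this shape in the
      kmalloc()/kmalloc_node() inlines (a sketch, not the exact diff), whose
      compile-time ZERO_SIZE_PTR return kept a constant size of 0 from ever
      reaching the out-of-line allocator where __alloc_size() applies:

      	if (!index)
      		return ZERO_SIZE_PTR;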
      
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: linux-mm@kvack.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      6fa57d78
    • slab: Clean up SLOB vs kmalloc() definition · 3bf01933
      Kees Cook authored
      As already done for kmalloc_node(), clean up the #ifdef usage in the
      definition of kmalloc() so that the SLOB-only version is an entirely
      separate and much more readable function.
      
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: linux-mm@kvack.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      3bf01933
    • mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head · 130d4df5
      Vlastimil Babka authored
      Joel reports [1] that increasing the rcu_head size for debugging
      purposes used to work before struct slab was split from struct page, but
      now runs into the various SLAB_MATCH() sanity checks of the layout.
      
      This is because the rcu_head in struct page is in union with large
      sub-structures and has space to grow without exceeding their size, while
      in struct slab (for SLAB and SLUB) it's in union only with a list_head.
      
      On closer inspection (and after the previous patch) we can put all fields
      except slab_cache into a union with rcu_head, as slab_cache is sufficient
      for the rcu freeing callbacks to work and the rest can be overwritten by
      rcu_head without causing issues.
      
      This is only somewhat complicated by the need to keep SLUB's
      freelist+counters aligned for cmpxchg_double. As a result the fields
      need to be reordered so that slab_cache is first (after page flags) and
      the union with rcu_head follows. For consistency, do that for SLAB as
      well, although not necessary there.
      
      As a result, the rcu_head field in struct page and struct slab is no
      longer at the same offset, but that doesn't matter as there is no
      casting that would rely on that in the slab freeing callbacks, so we can
      just drop the respective SLAB_MATCH() check.
      
      Also we need to update the SLAB_MATCH() for compound_head to reflect the
      new ordering.
      
      While at it, also add a static_assert to check the alignment needed for
      cmpxchg_double so mistakes are found sooner than a runtime GPF.
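
      A simplified sketch of the resulting SLUB-side layout in mm/slab.h (config
      guards and some details are elided, so treat this as illustrative rather
      than the exact definition):

      	struct slab {
      		unsigned long __page_flags;

      		/* kept outside the union: all the rcu freeing callback needs */
      		struct kmem_cache *slab_cache;
      		union {
      			struct {
      				union {
      					struct list_head slab_list;
      					struct {	/* percpu partial list */
      						struct slab *next;
      						int slabs;
      					};
      				};
      				/* double-word boundary required by cmpxchg_double */
      				void *freelist;
      				union {
      					unsigned long counters;
      					struct {
      						unsigned inuse:16;
      						unsigned objects:15;
      						unsigned frozen:1;
      					};
      				};
      			};
      			struct rcu_head rcu_head;	/* free to grow over the above */
      		};
      		unsigned int __unused;
      	};

      	/* catch misalignment at build time instead of a runtime GPF */
      	static_assert(IS_ALIGNED(offsetof(struct slab, freelist), 2 * sizeof(void *)));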
      
      [1] https://lore.kernel.org/all/85afd876-d8bb-0804-b2c5-48ed3055e702@joelfernandes.org/
      Reported-by: Joel Fernandes <joel@joelfernandes.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      130d4df5
    • mm/migrate: make isolate_movable_page() skip slab pages · 8b881763
      Vlastimil Babka authored
      In the next commit we want to rearrange struct slab fields to allow a larger
      rcu_head. Afterwards, the page->mapping field will overlap with SLUB's
      "struct list_head slab_list", where the value of the prev pointer can become
      LIST_POISON2, i.e. 0x122 + POISON_POINTER_DELTA. Unfortunately, bit 1 being
      set can turn PageMovable() into a false positive and cause a GPF, as
      reported by lkp [1].
      
      To fix this, make isolate_movable_page() skip pages with the PageSlab flag set.
      This is a bit tricky as we need to add memory barriers to SLAB and SLUB's page
      allocation and freeing, and their counterparts to isolate_movable_page().
      
      Based on my RFC from [2]. Added a comment update from Matthew's variant in [3]
      and, as done there, moved the PageSlab checks to happen before trying to take
      the page lock.
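
      A hedged sketch of the added early bail-out in isolate_movable_page()
      (simplified; the exact labels and the re-check after taking a page
      reference are omitted):

      	/* slab pages are never movable; bail out before looking at
      	 * page->mapping, which may now hold LIST_POISON2 */
      	if (PageSlab(page))
      		goto out;
      	/* pairs with an smp_wmb() on the slab allocation/freeing side */
      	smp_rmb();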
      
      [1] https://lore.kernel.org/all/208c1757-5edd-fd42-67d4-1940cc43b50f@intel.com/
      [2] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@suse.cz/
      [3] https://lore.kernel.org/all/YzsVM8eToHUeTP75@casper.infradead.org/
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      8b881763
    • mm/slab: move and adjust kernel-doc for kmem_cache_alloc · 838de63b
      Vlastimil Babka authored
      Alexander reports an issue with the kmem_cache_alloc() comment in
      mm/slab.c:
      
      > The current comment mentioned that the flags only matters if the
      > cache has no available objects. It's different for the __GFP_ZERO
      > flag which will ensure that the returned object is always zeroed
      > in any case.
      
      > I have the feeling I run into this question already two times if
      > the user need to zero the object or not, but the user does not need
      > to zero the object afterwards. However another use of __GFP_ZERO
      > and only zero the object if the cache has no available objects would
      > also make no sense.
      
      and suggests thus mentioning __GFP_ZERO as the exception. But on closer
      inspection, the part about flags being only relevant if cache has no
      available objects is misleading. The slab user has no reliable way to
      determine if there are available objects, and e.g. the might_sleep()
      debug check can be performed even if objects are available, so passing
      correct flags given the allocation context always matters.
      
      Thus remove that sentence completely, and while at it, move the comment
      from the SLAB-specific mm/slab.c to the common include/linux/slab.h.
      The comment otherwise refers to the flags description for kmalloc(), so
      add a __GFP_ZERO comment there and remove the very misleading GFP_HIGHUSER
      description (not applicable to slab) from there. Mention the kzalloc() and
      kmem_cache_zalloc() shortcuts.
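
      For reference, a small usage sketch of the documented behaviour
      ('my_cache' is a hypothetical cache, not something from the patch):

      	/* __GFP_ZERO always zeroes the returned object, whether or not the
      	 * cache currently has free objects available */
      	obj = kmem_cache_zalloc(my_cache, GFP_KERNEL);
      	/* equivalent to kmem_cache_alloc(my_cache, GFP_KERNEL | __GFP_ZERO) */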
      Reported-by: Alexander Aring <aahringo@redhat.com>
      Link: https://lore.kernel.org/all/20221011145413.8025-1-aahringo@redhat.com/
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      838de63b
    • mm/slub, percpu: correct the calculation of early percpu allocation size · a0dc161a
      Baoquan He authored
      The SLUB allocator relies on the percpu allocator to initialize its
      ->cpu_slab during early boot. For that, the dynamic percpu chunk that
      serves early allocations needs to be large enough to satisfy the creation
      of the kmalloc caches.

      However, the current BUILD_BUG_ON() in alloc_kmem_cache_cpus() doesn't
      account for the kmalloc array having NR_KMALLOC_TYPES types. Fix that with
      the correct calculation, as sketched below.
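
      The corrected assertion, as also quoted by the build error in the follow-up
      percpu patch below:

      	BUILD_BUG_ON(PERCPU_DYNAMIC_EARLY_SIZE <
      		     NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH *
      		     sizeof(struct kmem_cache_cpu));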
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      a0dc161a
    • percpu: adjust the value of PERCPU_DYNAMIC_EARLY_SIZE · e8753e41
      Baoquan He authored
      LKP reported the build failure below against the preceding patch "mm/slub,
      percpu: correct the calculation of early percpu allocation size":
      
      ~~~~~~
      In file included from <command-line>:
      In function 'alloc_kmem_cache_cpus',
         inlined from 'kmem_cache_open' at mm/slub.c:4340:6:
      >> >> include/linux/compiler_types.h:357:45: error: call to '__compiletime_assert_474' declared with attribute error:
      BUILD_BUG_ON failed: PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu)
           357 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      ~~~~~~
      
      According to the kernel config file provided by LKP, the build was done on
      arm64 with the following Kconfig items enabled:
      
        CONFIG_ZONE_DMA=y
        CONFIG_SLUB_CPU_PARTIAL=y
        CONFIG_DEBUG_LOCK_ALLOC=y
        CONFIG_SLUB_STATS=y
        CONFIG_ARM64_PAGE_SHIFT=16
        CONFIG_ARM64_64K_PAGES=y
      
      Then we will have:
        NR_KMALLOC_TYPES:4
        KMALLOC_SHIFT_HIGH:17
        sizeof(struct kmem_cache_cpu):184
      
      Their product is 12512, which is bigger than PERCPU_DYNAMIC_EARLY_SIZE
      (12K). Hence, the BUILD_BUG_ON() in alloc_kmem_cache_cpus() is triggered.
      
      Earlier, in commit 099a19d9 ("percpu: allow limited allocation
      before slab is online"), PERCPU_DYNAMIC_EARLY_SIZE was introduced and
      set to 12K, which was equal to the then PERCPU_DYNAMIC_RESERVE.
      Later, in commit 1a4d7607 ("percpu: implement asynchronous chunk
      population"), PERCPU_DYNAMIC_RESERVE was increased by 8K, while
      PERCPU_DYNAMIC_EARLY_SIZE was kept unchanged.

      So increase PERCPU_DYNAMIC_EARLY_SIZE by 8K too, to accommodate SLUB's
      requirement (see the sketch below).
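
      A sketch of the resulting definition in include/linux/percpu.h (the exact
      notation of the constant is an assumption; the point is 12K + 8K = 20K):

      	#define PERCPU_DYNAMIC_EARLY_SIZE	(20 << 10)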
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      e8753e41
  3. 11 Nov, 2022 1 commit
    • mm/slub: extend redzone check to extra allocated kmalloc space than requested · 946fa0db
      Feng Tang authored
      kmalloc will round up the requested size to a fixed size (mostly a power
      of 2), so there can be extra space beyond what is requested, whose size is
      the actual buffer size minus the original request size.

      To better detect out-of-bounds access to, or abuse of, this space, add a
      redzone sanity check for it.
      
      In the current kernel, some kmalloc users already know about the existence
      of this space and utilize it after calling ksize() to learn the real size
      of the allocated buffer. So skip the sanity check for objects on which
      ksize() has been called, treating them as legitimate users (see the
      illustration below). Kees Cook is working on sanitizing all these use
      cases by using kmalloc_size_roundup() to avoid ambiguous usage; once that
      is done, this special handling for ksize() can be removed.
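
      An illustration of the ksize() exemption (sizes chosen for the example):

      	u8 *buf = kmalloc(18, GFP_KERNEL);	/* kmalloc-32: bytes 18..31 redzoned */
      	size_t sz = ksize(buf);			/* returns 32; from now on the object is
      						 * treated as a legitimate 32-byte user and
      						 * the extended redzone check is skipped */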
      
      In some cases the free pointer can be stored in the latter part of the
      object data area, where it may overlap the redzone (for small kmalloc
      objects). As suggested by Hyeonggon Yoo, force the free pointer into the
      metadata area when kmalloc redzone debugging is enabled, so that all
      kmalloc objects are covered by the redzone check.
      Suggested-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      946fa0db
  4. 10 Nov, 2022 3 commits
  5. 07 Nov, 2022 1 commit
  6. 06 Nov, 2022 1 commit
    • mm/slab_common: Restore passing "caller" for tracing · 32868715
      Kees Cook authored
      The "caller" argument was accidentally being ignored in a few places
      that were recently refactored. Restore these "caller" arguments, instead
      of _RET_IP_.
      
      Fixes: 11e9734b ("mm/slab_common: unify NUMA and UMA version of tracepoints")
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: linux-mm@kvack.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      32868715
  7. 04 Nov, 2022 1 commit
  8. 03 Nov, 2022 1 commit
  9. 24 Oct, 2022 6 commits
    • mm/slub: perform free consistency checks before call_rcu · bc29d5bd
      Vlastimil Babka authored
      For SLAB_TYPESAFE_BY_RCU caches we use call_rcu to perform empty slab
      freeing. The rcu callback rcu_free_slab() calls __free_slab() that
      currently includes checking the slab consistency for caches with
      SLAB_CONSISTENCY_CHECKS flags. This check needs the slab->objects field
      to be intact.
      
      Because in the next patch we want to allow rcu_head in struct slab to
      become larger in debug configurations and thus potentially overwrite
      more fields through a union than slab_list, we want to limit the fields
      used in rcu_free_slab().  Thus move the consistency checks to
      free_slab() before call_rcu(). This can be done safely even for
      SLAB_TYPESAFE_BY_RCU caches where accesses to the objects can still
      occur after freeing them.
      
      As a result, only the slab->slab_cache field has to be physically
      separate from rcu_head for the freeing callback to work. We also save
      some cycles in the rcu callback for caches with consistency checks
      enabled.
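
      A hedged sketch of the reworked free_slab() (helper names follow existing
      mm/slub.c conventions, but the exact hunk may differ):

      	static void free_slab(struct kmem_cache *s, struct slab *slab)
      	{
      		if (kmem_cache_debug_flags(s, SLAB_CONSISTENCY_CHECKS)) {
      			void *p;

      			slab_pad_check(s, slab);
      			for_each_object(p, s, slab_address(slab), slab->objects)
      				check_object(s, slab, p, SLUB_RED_INACTIVE);
      		}

      		if (unlikely(s->flags & SLAB_TYPESAFE_BY_RCU))
      			call_rcu(&slab->rcu_head, rcu_free_slab);
      		else
      			__free_slab(s, slab);
      	}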
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      bc29d5bd
    • mm/slab: Annotate kmem_cache_node->list_lock as raw · b539ce9f
      Jiri Kosina authored
      The list_lock can be taken in hardirq context when do_drain() is being
      called via IPI on all cores, and therefore lockdep complains about it,
      because it can't be preempted on PREEMPT_RT.
      
      That's not a real issue, as SLAB can't be built on PREEMPT_RT anyway, but
      we still want to get rid of the warning on non-PREEMPT_RT builds.
      
      Therefore annotate it as a raw lock in order to get rid of the lockdep
      warning below.
      
      	 =============================
      	 [ BUG: Invalid wait context ]
      	 6.1.0-rc1-00134-ge35184f3 #4 Not tainted
      	 -----------------------------
      	 swapper/3/0 is trying to lock:
      	 ffff8bc88086dc18 (&parent->list_lock){..-.}-{3:3}, at: do_drain+0x57/0xb0
      	 other info that might help us debug this:
      	 context-{2:2}
      	 no locks held by swapper/3/0.
      	 stack backtrace:
      	 CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.1.0-rc1-00134-ge35184f3 #4
      	 Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
      	 Call Trace:
      	  <IRQ>
      	  dump_stack_lvl+0x6b/0x9d
      	  __lock_acquire+0x1519/0x1730
      	  ? build_sched_domains+0x4bd/0x1590
      	  ? __lock_acquire+0xad2/0x1730
      	  lock_acquire+0x294/0x340
      	  ? do_drain+0x57/0xb0
      	  ? sched_clock_tick+0x41/0x60
      	  _raw_spin_lock+0x2c/0x40
      	  ? do_drain+0x57/0xb0
      	  do_drain+0x57/0xb0
      	  __flush_smp_call_function_queue+0x138/0x220
      	  __sysvec_call_function+0x4f/0x210
      	  sysvec_call_function+0x4b/0x90
      	  </IRQ>
      	  <TASK>
      	  asm_sysvec_call_function+0x16/0x20
      	 RIP: 0010:mwait_idle+0x5e/0x80
      	 Code: 31 d2 65 48 8b 04 25 80 ed 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 75 14 66 90 0f 00 2d 0b 78 46 00 31 c0 48 89 c1 fb 0f 01 c9 <eb> 06 fb 0f 1f 44 00 00 65 48 8b 04 25 80 ed 01 00 f0 80 60 02 df
      	 RSP: 0000:ffffa90940217ee0 EFLAGS: 00000246
      	 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      	 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9bb9f93a
      	 RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000001
      	 R10: ffffa90940217ea8 R11: 0000000000000000 R12: ffffffffffffffff
      	 R13: 0000000000000000 R14: ffff8bc88127c500 R15: 0000000000000000
      	  ? default_idle_call+0x1a/0xa0
      	  default_idle_call+0x4b/0xa0
      	  do_idle+0x1f1/0x2c0
      	  ? _raw_spin_unlock_irqrestore+0x56/0x70
      	  cpu_startup_entry+0x19/0x20
      	  start_secondary+0x122/0x150
      	  secondary_startup_64_no_verify+0xce/0xdb
      	  </TASK>
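
      A minimal sketch of the change (fragment only; the full patch converts the
      lock declaration and its lock/unlock sites on the SLAB side):

      	/* in struct kmem_cache_node (SLAB) */
      	raw_spinlock_t list_lock;			/* was: spinlock_t list_lock; */

      	/* and at the call sites, e.g. on the do_drain() path */
      	raw_spin_lock_irqsave(&n->list_lock, flags);	/* was: spin_lock_irqsave() */
      	raw_spin_unlock_irqrestore(&n->list_lock, flags);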
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      b539ce9f
    • mm/slub: remove dead code for debug caches on deactivate_slab() · a8e53869
      Hyeonggon Yoo authored
      After commit c7323a5a ("mm/slub: restrict sysfs validation to debug
      caches and make it safe"), SLUB never installs a percpu slab for debug
      caches and thus never deactivates a percpu slab for them.

      Since only debug caches use the full list, SLUB no longer deactivates to
      the full list. Remove the dead code in deactivate_slab().
      Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      a8e53869
    • mm: Make failslab writable again · 7c82b3b3
      Alexander Atanasov authored
      In commit 060807f8 ("mm, slub: make remaining slub_debug related
      attributes read-only"), failslab was made read-only.
      I think it became a collateral victim of the two other options, for which
      the reasons are perfectly valid.
      Here is why:
       - sanity_checks and trace are slab-internal debug options, whereas
         failslab is used for fault injection.
       - for fault injections, which are by presumption random, it does not
         matter if the option is not set atomically, and you need to set at
         least one more option to trigger fault injection anyway.
       - in a testing scenario you may need to change it at runtime, for
         example when testing module loading: you first test all allocations
         limited by the space option, then move on to testing only your
         module's own slabs.
       - when set via command line flags it effectively disables all cache
         merges.
      
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Vijayanand Jitta <vjitta@codeaurora.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz
      Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      7c82b3b3
    • mm: slub: make slab_sysfs_init() a late_initcall · 1a5ad30b
      Rasmus Villemoes authored
      Currently, slab_sysfs_init() is an __initcall aka device_initcall. It
      is rather time-consuming; on my board it takes around 11ms. That's
      about 1% of the time budget I have between U-Boot letting go and Linux
      having to assume responsibility for keeping the external watchdog
      happy.
      
      There's no particular reason this would need to run at device_initcall
      time, so instead make it a late_initcall to allow vital functionality
      to get started a bit sooner.
      
      This actually ends up winning more than just those 11ms, because the
      slab caches that get created during other device_initcalls (and before
      my watchdog device gets probed) now don't end up doing the somewhat
      expensive sysfs_slab_add() themselves. Some example lines (with
      initcall_debug set) before/after:
      
      initcall ext4_init_fs+0x0/0x1ac returned 0 after 1386 usecs
      initcall journal_init+0x0/0x138 returned 0 after 517 usecs
      initcall init_fat_fs+0x0/0x68 returned 0 after 294 usecs
      
      initcall ext4_init_fs+0x0/0x1ac returned 0 after 240 usecs
      initcall journal_init+0x0/0x138 returned 0 after 32 usecs
      initcall init_fat_fs+0x0/0x68 returned 0 after 18 usecs
      
      Altogether, this means I now get to pet the watchdog around 17ms
      sooner. [Of course, the time the other initcalls save is instead spent
      in slab_sysfs_init(), which goes from 11ms to 16ms, so there's no
      overall change in boot time.]
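
      In essence the change is a one-liner in mm/slub.c:

      	late_initcall(slab_sysfs_init);		/* was: __initcall(slab_sysfs_init); */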
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      1a5ad30b
    • mm: slub: remove dead and buggy code from sysfs_slab_add() · 979857ea
      Rasmus Villemoes authored
      The function sysfs_slab_add() has two callers:
      
      One is slab_sysfs_init(), which first initializes slab_kset, and only
      when that succeeds sets slab_state to FULL, and then proceeds to call
      sysfs_slab_add() for all previously created slabs.
      
      The other is __kmem_cache_create(), but only after a
      
      	if (slab_state <= UP)
      		return 0;
      
      check.
      
      So in other words, sysfs_slab_add() is never called without
      slab_kset (aka the return value of cache_kset()) being non-NULL.
      
      And this is just as well, because if we ever did take this path and
      called kobject_init(&s->kobj), then later, when sysfs_slab_add() is
      called again from slab_sysfs_init() and ends up calling
      kobject_init_and_add(), we would hit
      
      	if (kobj->state_initialized) {
      		/* do not error out as sometimes we can recover */
      		pr_err("kobject (%p): tried to init an initialized object, something is seriously wrong.\n",
      		       kobj);
      		dump_stack();
      	}
      
      in kobject.c.
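
      A hedged sketch of the removed (unreachable) branch in sysfs_slab_add(),
      reconstructed from the description above rather than quoted from the diff:

      	kset = cache_kset(s);
      	if (unlikely(!kset)) {
      		kobject_init(&s->kobj, &slab_ktype);
      		return 0;
      	}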
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      979857ea
  10. 23 Oct, 2022 9 commits