1. 21 Nov, 2022 7 commits
    • Merge branch 'slab/for-6.2/tools' into slab/for-next · 1c1aaa33
      Vlastimil Babka authored
      A patch for tools/vm/slabinfo by Rong Tao to give more useful feedback
      when not run as root.
    • Merge branch 'slab/for-6.2/slub-sysfs' into slab/for-next · c64b95d3
      Vlastimil Babka authored
      - Two patches for SLUB's sysfs by Rasmus Villemoes to remove dead code
        and optimize boot time with late initialization.
      - Allow SLUB's sysfs 'failslab' parameter to be runtime-controllable
        again as it can be both useful and safe, by Alexander Atanasov.
    • Merge branch 'slab/for-6.2/locking' into slab/for-next · 14d3eb66
      Vlastimil Babka authored
      A patch from Jiri Kosina that makes SLAB's list_lock a raw_spinlock_t.
      While there are no plans to make SLAB actually compatible with
      PREEMPT_RT now or in the future, it makes !PREEMPT_RT lockdep happy.
    • Merge branch 'slab/for-6.2/cleanups' into slab/for-next · 4b28ba9e
      Vlastimil Babka authored
      - Removal of dead code from deactivate_slab() by Hyeonggon Yoo.
      - Fix of BUILD_BUG_ON() for sufficient early percpu size by Baoquan He.
      - Make kmem_cache_alloc() kernel-doc less misleading, by myself.
    • mm/slab: move and adjust kernel-doc for kmem_cache_alloc · 838de63b
      Vlastimil Babka authored
      Alexander reports an issue with the kmem_cache_alloc() comment in
      mm/slab.c:
      
      > The current comment mentions that the flags only matter if the
      > cache has no available objects. That is different for the __GFP_ZERO
      > flag, which ensures that the returned object is always zeroed in
      > any case.

      > I have the feeling I have run into this question twice already:
      > whether the user needs to zero the object or not, when in fact the
      > user does not need to zero it afterwards. However, a reading of
      > __GFP_ZERO that zeroes the object only when the cache has no
      > available objects would also make no sense.
      
      and suggests thus mentioning __GFP_ZERO as the exception. But on closer
      inspection, the part about flags being only relevant if cache has no
      available objects is misleading. The slab user has no reliable way to
      determine if there are available objects, and e.g. the might_sleep()
      debug check can be performed even if objects are available, so passing
      correct flags given the allocation context always matters.
      
      Thus remove that sentence completely, and while at it, move the comment
      from the SLAB-specific mm/slab.c to the common include/linux/slab.h.
      The comment otherwise refers to the flags description for kmalloc(), so
      add a __GFP_ZERO note there and remove the very misleading GFP_HIGHUSER
      (not applicable to slab) description. Mention the kzalloc() and
      kmem_cache_zalloc() shortcuts.
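      The corrected semantics can be illustrated with a userspace analogue
      (a toy cache, not the kernel implementation; TOY_GFP_ZERO and all the
      helpers below are hypothetical stand-ins): a __GFP_ZERO-style flag
      zeroes the returned object unconditionally, whether the object comes
      from a freelist of recycled objects or from a fresh allocation.

      ```c
      #include <stdlib.h>
      #include <string.h>

      #define TOY_GFP_ZERO 0x1u  /* hypothetical stand-in for __GFP_ZERO */

      struct toy_cache {
              size_t size;
              void *freelist;  /* one recycled object, may hold stale data */
      };

      /* Hand an object back; a real allocator would keep a full list. */
      static void toy_cache_free(struct toy_cache *c, void *obj)
      {
              c->freelist = obj;
      }

      /*
       * As the fixed kernel-doc describes: the zeroing implied by the flag
       * does not depend on whether a cached object was available.
       */
      static void *toy_cache_alloc(struct toy_cache *c, unsigned int flags)
      {
              void *obj;

              if (c->freelist) {           /* "available object" path */
                      obj = c->freelist;
                      c->freelist = NULL;
              } else {                     /* "no available objects" path */
                      obj = malloc(c->size);
              }
              if (obj && (flags & TOY_GFP_ZERO))
                      memset(obj, 0, c->size);  /* zero unconditionally */
              return obj;
      }
      ```

      A kmem_cache_zalloc()-style shortcut is then just
      toy_cache_alloc(c, TOY_GFP_ZERO).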
      Reported-by: Alexander Aring <aahringo@redhat.com>
      Link: https://lore.kernel.org/all/20221011145413.8025-1-aahringo@redhat.com/
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • mm/slub, percpu: correct the calculation of early percpu allocation size · a0dc161a
      Baoquan He authored
      The SLUB allocator relies on the percpu allocator to initialize its
      ->cpu_slab during early boot. For that, the dynamic percpu chunk that
      serves early allocations needs to be large enough to satisfy the
      kmalloc cache creation.

      However, the current BUILD_BUG_ON() in alloc_kmem_cache_cpus() doesn't
      account for the kmalloc array having NR_KMALLOC_TYPES rows. Fix that
      with the correct calculation.
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • percpu: adjust the value of PERCPU_DYNAMIC_EARLY_SIZE · e8753e41
      Baoquan He authored
      LKP reported a build failure as below on the following patch "mm/slub,
      percpu: correct the calculation of early percpu allocation size"
      
      ~~~~~~
      In file included from <command-line>:
      In function 'alloc_kmem_cache_cpus',
         inlined from 'kmem_cache_open' at mm/slub.c:4340:6:
       >> include/linux/compiler_types.h:357:45: error: call to '__compiletime_assert_474' declared with attribute error:
      BUILD_BUG_ON failed: PERCPU_DYNAMIC_EARLY_SIZE < NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * sizeof(struct kmem_cache_cpu)
           357 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      ~~~~~~
      
      From the kernel config file provided by LKP, the building was made on
      arm64 with below Kconfig item enabled:
      
        CONFIG_ZONE_DMA=y
        CONFIG_SLUB_CPU_PARTIAL=y
        CONFIG_DEBUG_LOCK_ALLOC=y
        CONFIG_SLUB_STATS=y
        CONFIG_ARM64_PAGE_SHIFT=16
        CONFIG_ARM64_64K_PAGES=y
      
      Then we will have:
        NR_KMALLOC_TYPES:4
        KMALLOC_SHIFT_HIGH:17
        sizeof(struct kmem_cache_cpu):184
      
      The product of them is 12512, which is bigger than PERCPU_DYNAMIC_EARLY_SIZE,
      12K. Hence, the BUILD_BUG_ON in alloc_kmem_cache_cpus() is triggered.
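      The arithmetic can be checked with a trivial userspace program (the
      constants are hard-coded from the values quoted above, not pulled from
      kernel headers):

      ```c
      #include <stdio.h>

      int main(void)
      {
              /* Values from the failing arm64 64K-pages config above. */
              int nr_kmalloc_types = 4;        /* NR_KMALLOC_TYPES */
              int kmalloc_shift_high = 17;     /* KMALLOC_SHIFT_HIGH */
              int kmem_cache_cpu_size = 184;   /* sizeof(struct kmem_cache_cpu) */
              int early_size = 12 << 10;       /* PERCPU_DYNAMIC_EARLY_SIZE, 12K */

              int need = nr_kmalloc_types * kmalloc_shift_high * kmem_cache_cpu_size;

              printf("need %d bytes, early size %d bytes\n", need, early_size);
              printf("BUILD_BUG_ON %s\n",
                     early_size < need ? "triggers" : "passes");
              return 0;
      }
      ```

      With the quoted values this reports a needed size of 12512 bytes
      against 12288 bytes available, i.e. the assertion fires.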
      
      Earlier, in commit 099a19d9 ("percpu: allow limited allocation
      before slab is online"), PERCPU_DYNAMIC_EARLY_SIZE was introduced and
      set to 12K, equal to the then PERCPU_DYNAMIC_RESERVE.
      Later, in commit 1a4d7607 ("percpu: implement asynchronous chunk
      population"), PERCPU_DYNAMIC_RESERVE was increased by 8K, while
      PERCPU_DYNAMIC_EARLY_SIZE was kept unchanged.
      
      So, here increase PERCPU_DYNAMIC_EARLY_SIZE by 8K too, to accommodate
      SLUB's requirement.
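      A compile-time check in the spirit of the kernel's BUILD_BUG_ON() can
      be sketched in standard C with _Static_assert (again with the worst-case
      constants hard-coded here; the kernel derives them from Kconfig):

      ```c
      /* After the fix: PERCPU_DYNAMIC_EARLY_SIZE raised from 12K to 20K. */
      #define PERCPU_DYNAMIC_EARLY_SIZE (20 << 10)

      /* Worst case from the arm64 64K-pages config quoted above. */
      #define NR_KMALLOC_TYPES       4
      #define KMALLOC_SHIFT_HIGH     17
      #define KMEM_CACHE_CPU_SIZE    184

      /* Userspace equivalent of the BUILD_BUG_ON() in alloc_kmem_cache_cpus(). */
      _Static_assert(PERCPU_DYNAMIC_EARLY_SIZE >=
                     NR_KMALLOC_TYPES * KMALLOC_SHIFT_HIGH * KMEM_CACHE_CPU_SIZE,
                     "early percpu area too small for kmalloc caches");
      ```

      With the new 20K value the assertion holds (20480 >= 12512), so the
      build succeeds even on the configuration LKP reported.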
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Baoquan He <bhe@redhat.com>
      Acked-by: Dennis Zhou <dennis@kernel.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
  2. 10 Nov, 2022 1 commit
    • tools/vm/slabinfo: indicates the cause of the EACCES error · 654058e6
      Rong Tao authored
      If slabinfo is not run as root, read_slab_dir() gets 0 back from
      get_obj_and_str("slabs", &t) because fopen() fails (typically with
      EACCES), which makes slabcache() return immediately without printing
      any error. Instead of exiting successfully ($? = 0) in silence, tell
      the user about the EACCES problem.
      
       For example:
       $ ./slabinfo
       Permission denied, Try using superuser  <== What this submission did
       $ sudo ./slabinfo
       Name            Objects Objsize   Space Slabs/Part/Cpu  O/S O %Fr %Ef Flg
       Acpi-Namespace     5950      48  286.7K         65/0/5   85 0   0  99
       Acpi-Operand      13664      72  999.4K       231/0/13   56 0   0  98
       ...
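      The reporting pattern is the usual errno dance; a minimal sketch of the
      idea (not the actual slabinfo code, whose helpers are get_obj_and_str()
      and slabcache(); open_or_complain() below is hypothetical):

      ```c
      #include <errno.h>
      #include <stdio.h>
      #include <string.h>

      /*
       * Open a sysfs-style file and explain *why* it failed instead of
       * silently reporting success. Returns 0 on success, -1 on error.
       */
      static int open_or_complain(const char *path, FILE **out)
      {
              *out = fopen(path, "r");
              if (!*out) {
                      if (errno == EACCES)
                              fprintf(stderr,
                                      "Permission denied, Try using superuser\n");
                      else
                              fprintf(stderr, "%s: %s\n", path, strerror(errno));
                      return -1;
              }
              return 0;
      }
      ```

      The caller can then propagate the -1 so $? reflects the failure.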
      Signed-off-by: Rong Tao <rongtao@cestc.cn>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
  3. 07 Nov, 2022 1 commit
  4. 06 Nov, 2022 1 commit
    • mm/slab_common: Restore passing "caller" for tracing · 32868715
      Kees Cook authored
      The "caller" argument was accidentally being ignored in a few places
      that were recently refactored. Restore these "caller" arguments, instead
      of _RET_IP_.
      
      Fixes: 11e9734b ("mm/slab_common: unify NUMA and UMA version of tracepoints")
      Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: linux-mm@kvack.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
  5. 04 Nov, 2022 1 commit
  6. 03 Nov, 2022 1 commit
  7. 24 Oct, 2022 5 commits
    • mm/slab: Annotate kmem_cache_node->list_lock as raw · b539ce9f
      Jiri Kosina authored
      The list_lock can be taken in hardirq context when do_drain() is being
      called via IPI on all cores, and therefore lockdep complains about it,
      because it can't be preempted on PREEMPT_RT.
      
      That's not a real issue, as SLAB can't be built on PREEMPT_RT anyway, but
      we still want to get rid of the warning on non-PREEMPT_RT builds.
      
      Annotate it therefore as a raw lock in order to get rid of the lockdep
      warning below.
      
      	 =============================
      	 [ BUG: Invalid wait context ]
      	 6.1.0-rc1-00134-ge35184f3 #4 Not tainted
      	 -----------------------------
      	 swapper/3/0 is trying to lock:
      	 ffff8bc88086dc18 (&parent->list_lock){..-.}-{3:3}, at: do_drain+0x57/0xb0
      	 other info that might help us debug this:
      	 context-{2:2}
      	 no locks held by swapper/3/0.
      	 stack backtrace:
      	 CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.1.0-rc1-00134-ge35184f3 #4
      	 Hardware name: LENOVO 20K5S22R00/20K5S22R00, BIOS R0IET38W (1.16 ) 05/31/2017
      	 Call Trace:
      	  <IRQ>
      	  dump_stack_lvl+0x6b/0x9d
      	  __lock_acquire+0x1519/0x1730
      	  ? build_sched_domains+0x4bd/0x1590
      	  ? __lock_acquire+0xad2/0x1730
      	  lock_acquire+0x294/0x340
      	  ? do_drain+0x57/0xb0
      	  ? sched_clock_tick+0x41/0x60
      	  _raw_spin_lock+0x2c/0x40
      	  ? do_drain+0x57/0xb0
      	  do_drain+0x57/0xb0
      	  __flush_smp_call_function_queue+0x138/0x220
      	  __sysvec_call_function+0x4f/0x210
      	  sysvec_call_function+0x4b/0x90
      	  </IRQ>
      	  <TASK>
      	  asm_sysvec_call_function+0x16/0x20
      	 RIP: 0010:mwait_idle+0x5e/0x80
      	 Code: 31 d2 65 48 8b 04 25 80 ed 01 00 48 89 d1 0f 01 c8 48 8b 00 a8 08 75 14 66 90 0f 00 2d 0b 78 46 00 31 c0 48 89 c1 fb 0f 01 c9 <eb> 06 fb 0f 1f 44 00 00 65 48 8b 04 25 80 ed 01 00 f0 80 60 02 df
      	 RSP: 0000:ffffa90940217ee0 EFLAGS: 00000246
      	 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      	 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff9bb9f93a
      	 RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000001
      	 R10: ffffa90940217ea8 R11: 0000000000000000 R12: ffffffffffffffff
      	 R13: 0000000000000000 R14: ffff8bc88127c500 R15: 0000000000000000
      	  ? default_idle_call+0x1a/0xa0
      	  default_idle_call+0x4b/0xa0
      	  do_idle+0x1f1/0x2c0
      	  ? _raw_spin_unlock_irqrestore+0x56/0x70
      	  cpu_startup_entry+0x19/0x20
      	  start_secondary+0x122/0x150
      	  secondary_startup_64_no_verify+0xce/0xdb
      	  </TASK>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • mm/slub: remove dead code for debug caches on deactivate_slab() · a8e53869
      Hyeonggon Yoo authored
      After commit c7323a5a ("mm/slub: restrict sysfs validation to debug
      caches and make it safe"), SLUB never installs percpu slab for debug caches
      and thus never deactivates percpu slab for them.
      
      Since only debug caches use the full list, SLUB no longer deactivates to
      full list. Remove dead code in deactivate_slab().
      Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • mm: Make failslab writable again · 7c82b3b3
      Alexander Atanasov authored
      In commit 060807f8 ("mm, slub: make remaining slub_debug related
      attributes read-only"), failslab was made read-only.
      I think it became a collateral victim of the two other options, for
      which the reasons are perfectly valid.
      Here is why:
       - sanity_checks and trace are slab internal debug options,
         failslab is used for fault injection.
       - for fault injection, which by presumption is random, it
         does not matter if it is not set atomically. And you need to
         set at least one more option to trigger fault injection.
       - in a testing scenario you may need to change it at runtime
         example: module loading - you test all allocations limited
         by the space option. Then you move to test only your module's
         own slabs.
       - when set by command line flags it effectively disables all
         cache merges.
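      The runtime-control point can be illustrated with a userspace analogue
      of a fault-injection knob (the names below are made up; in the kernel
      the knob is the per-cache sysfs 'failslab' attribute combined with the
      fault-injection options):

      ```c
      #include <stdbool.h>
      #include <stdlib.h>

      static bool failslab_enabled;  /* analogue of the sysfs 'failslab' bit */
      static unsigned int alloc_count;

      /* Fail every 4th allocation, but only while the knob is on. */
      static bool should_fail_alloc(void)
      {
              if (!failslab_enabled)
                      return false;
              return (++alloc_count % 4) == 0;
      }

      static void *traced_alloc(size_t size)
      {
              if (should_fail_alloc())
                      return NULL;  /* injected failure */
              return malloc(size);
      }
      ```

      Flipping failslab_enabled at runtime switches fault injection on for
      exactly the allocations under test, which is the testing scenario the
      commit message describes.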
      
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Vijayanand Jitta <vjitta@codeaurora.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz
      Signed-off-by: Alexander Atanasov <alexander.atanasov@virtuozzo.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • mm: slub: make slab_sysfs_init() a late_initcall · 1a5ad30b
      Rasmus Villemoes authored
      Currently, slab_sysfs_init() is an __initcall aka device_initcall. It
      is rather time-consuming; on my board it takes around 11ms. That's
      about 1% of the time budget I have from U-Boot letting go until Linux
      must assume responsibility for keeping the external watchdog happy.
      
      There's no particular reason this would need to run at device_initcall
      time, so instead make it a late_initcall to allow vital functionality
      to get started a bit sooner.
      
      This actually ends up winning more than just those 11ms, because the
      slab caches that get created during other device_initcalls (and before
      my watchdog device gets probed) now don't end up doing the somewhat
      expensive sysfs_slab_add() themselves. Some example lines (with
      initcall_debug set) before/after:
      
      initcall ext4_init_fs+0x0/0x1ac returned 0 after 1386 usecs
      initcall journal_init+0x0/0x138 returned 0 after 517 usecs
      initcall init_fat_fs+0x0/0x68 returned 0 after 294 usecs
      
      initcall ext4_init_fs+0x0/0x1ac returned 0 after 240 usecs
      initcall journal_init+0x0/0x138 returned 0 after 32 usecs
      initcall init_fat_fs+0x0/0x68 returned 0 after 18 usecs
      
      Altogether, this means I now get to pet the watchdog around 17ms
      sooner. [Of course, the time the other initcalls save is instead spent
      in slab_sysfs_init(), which goes from 11ms to 16ms, so there's no
      overall change in total boot time.]
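      The mechanism being exploited is plain initcall ordering: all
      device_initcalls (level 6) run before any late_initcall (level 7). A
      userspace sketch of that ordering (the kernel's real machinery places
      function pointers in per-level linker sections; this toy version just
      loops over levels):

      ```c
      #include <string.h>

      struct initcall {
              int level;          /* 6 = device_initcall, 7 = late_initcall */
              void (*fn)(void);
      };

      static const char *order[2];
      static int n_run;

      static void watchdog_probe(void)  { order[n_run++] = "watchdog"; }
      static void slab_sysfs_init(void) { order[n_run++] = "slab_sysfs"; }

      /* Run registered calls level by level, regardless of registration order. */
      static void run_initcalls(struct initcall *calls, int count)
      {
              for (int level = 6; level <= 7; level++)
                      for (int i = 0; i < count; i++)
                              if (calls[i].level == level)
                                      calls[i].fn();
      }
      ```

      Moving slab_sysfs_init to level 7 thus guarantees that vital level-6
      drivers, such as a watchdog, get probed first.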
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    • mm: slub: remove dead and buggy code from sysfs_slab_add() · 979857ea
      Rasmus Villemoes authored
      The function sysfs_slab_add() has two callers:
      
      One is slab_sysfs_init(), which first initializes slab_kset, and only
      when that succeeds sets slab_state to FULL, and then proceeds to call
      sysfs_slab_add() for all previously created slabs.
      
      The other is __kmem_cache_create(), but only after a
      
      	if (slab_state <= UP)
      		return 0;
      
      check.
      
      So in other words, sysfs_slab_add() is never called without
      slab_kset (aka the return value of cache_kset()) being non-NULL.
      
      And this is just as well, because if we ever did take this path and
      called kobject_init(&s->kobj), and then later when called again from
      slab_sysfs_init() would end up calling kobject_init_and_add(), we
      would hit
      
      	if (kobj->state_initialized) {
      		/* do not error out as sometimes we can recover */
      		pr_err("kobject (%p): tried to init an initialized object, something is seriously wrong.\n",
      		       kobj);
      		dump_stack();
      	}
      
      in kobject.c.
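      The hazard being dead-coded away is a double initialization. A
      userspace analogue of the state_initialized guard (a toy struct, not
      the kernel's struct kobject) shows why the second init must be
      rejected:

      ```c
      #include <stdio.h>

      struct toy_kobj {
              int state_initialized;
      };

      /*
       * Returns 0 on success, -1 if the object was already initialized,
       * mirroring the complaint kobject_init_and_add() would print.
       */
      static int toy_kobj_init(struct toy_kobj *kobj)
      {
              if (kobj->state_initialized) {
                      fprintf(stderr,
                              "kobject (%p): tried to init an initialized object\n",
                              (void *)kobj);
                      return -1;
              }
              kobj->state_initialized = 1;
              return 0;
      }
      ```

      Since the buggy path could only ever reach that second init, removing
      it is strictly safer than keeping it.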
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
  8. 23 Oct, 2022 9 commits
  9. 22 Oct, 2022 14 commits