Commit 8ff12cfc authored by Christoph Lameter, committed by Christoph Lameter

SLUB: Support for performance statistics

The statistics provided here allow the monitoring of allocator behavior but
at the cost of some (minimal) loss of performance. Counters are placed in
SLUB's per cpu data structure. The per cpu structure may be extended by the
statistics to grow larger than one cacheline which will increase the cache
footprint of SLUB.

There is a compile option to enable/disable the inclusion of the runtime
statistics, and it is off by default.
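
As a rough illustration of the pattern described above (counters embedded in a
per cpu structure and compiled out when the option is off), here is a minimal
userspace sketch. The names DEMO_STATS, demo_cpu_data, demo_stat and demo_sum
are hypothetical stand-ins invented for this example, not kernel identifiers.

```c
#include <stddef.h>

/* Hypothetical switch standing in for the kernel's compile option. */
#define DEMO_STATS 1

enum demo_stat_item {
	DEMO_ALLOC_FASTPATH,
	DEMO_ALLOC_SLOWPATH,
	NR_DEMO_STAT_ITEMS
};

/* Per cpu data; the counter array only exists when statistics are built in. */
struct demo_cpu_data {
	void *freelist;
#if DEMO_STATS
	unsigned stat[NR_DEMO_STAT_ITEMS];
#endif
};

/* Bump one counter; compiles to nothing when statistics are disabled. */
static inline void demo_stat(struct demo_cpu_data *c, enum demo_stat_item si)
{
#if DEMO_STATS
	c->stat[si]++;
#else
	(void)c;
	(void)si;
#endif
}

/* Sum one counter over all per cpu structures (assumed to be a plain array). */
static unsigned long demo_sum(const struct demo_cpu_data *cpus, size_t ncpus,
			      enum demo_stat_item si)
{
	unsigned long sum = 0;
#if DEMO_STATS
	size_t i;

	for (i = 0; i < ncpus; i++)
		sum += cpus[i].stat[si];
#else
	(void)cpus;
	(void)ncpus;
	(void)si;
#endif
	return sum;
}
```

Because each cpu only touches its own counters, the increment needs no
locking; the cost is the larger per cpu structure mentioned above.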

The slabinfo tool is enhanced to support these statistics via the following
options:

-D 	Switches the line of information displayed for a slab from size
	mode to activity mode.

-A	Sorts the slabs displayed by activity. This allows the display of
	the slabs most important to the performance of a certain load.

-r	Report option will report detailed statistics on a single slab.
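
The ordering behind -A can be sketched as follows: activity is the sum of the
four fastpath/slowpath counters, and slabs are listed with the largest sum
first. This sketch uses the C library's qsort() as a stand-in (the tool has its
own sort loop), and the demo_* names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical subset of the per-slab counters used for ranking. */
struct demo_slab {
	const char *name;
	unsigned long alloc_fastpath, alloc_slowpath;
	unsigned long free_fastpath, free_slowpath;
};

/* Activity is the total number of allocations and frees seen by the slab. */
static unsigned long demo_activity(const struct demo_slab *s)
{
	return s->alloc_fastpath + s->free_fastpath +
		s->alloc_slowpath + s->free_slowpath;
}

/* qsort() comparator: the more active slab sorts first (descending order). */
static int demo_cmp_activity(const void *a, const void *b)
{
	unsigned long x = demo_activity(a);
	unsigned long y = demo_activity(b);

	return (x < y) - (x > y);
}

static void demo_sort_by_activity(struct demo_slab *slabs, size_t n)
{
	qsort(slabs, n, sizeof(*slabs), demo_cmp_activity);
}
```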

Example (tbench load):

slabinfo -AD		->Shows the most active slabs

Name                   Objects    Alloc     Free   %Fast
skbuff_fclone_cache         33 111953835 111953835  99  99
:0000192                  2666  5283688  5281047  99  99
:0001024                   849  5247230  5246389  83  83
vm_area_struct            1349   119642   118355  91  22
:0004096                    15    66753    66751  98  98
:0000064                  2067    25297    23383  98  78
dentry                   10259    28635    18464  91  45
:0000080                 11004    18950     8089  98  98
:0000096                  1703    12358    10784  99  98
:0000128                   762    10582     9875  94  18
:0000512                   184     9807     9647  95  81
:0002048                   479     9669     9195  83  65
anon_vma                   777     9461     9002  99  71
kmalloc-8                 6492     9981     5624  99  97
:0000768                   258     7174     6931  58  15

So the skbuff_fclone_cache is of highest importance for the tbench load.
There is also a pretty high load on the 192-byte slab. Look up its aliases:

slabinfo -a | grep 000192
:0000192     <- xfs_btree_cur filp kmalloc-192 uid_cache tw_sock_TCP
	request_sock_TCPv6 tw_sock_TCPv6 skbuff_head_cache xfs_ili

Likely skbuff_head_cache.


Looking into the statistics of the skbuff_fclone_cache is possible through

slabinfo skbuff_fclone_cache	->-r option implied if cache name is mentioned


.... Usual output ...

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath             111953360 111946981  99  99
Slowpath                 1044     7423   0   0
Page Alloc                272      264   0   0
Add partial                25      325   0   0
Remove partial             86      264   0   0
RemoteObj/SlabFrozen      350     4832   0   0
Total                111954404 111954404

Flushes       49 Refill        0
Deactivate Full=325(92%) Empty=0(0%) ToHead=24(6%) ToTail=1(0%)

Looks good because the fastpath is overwhelmingly taken.
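
The percentage columns are plain integer arithmetic: counter * 100 / total,
truncated. A small sketch of that calculation follows; the helper name
demo_percent is made up here, and the zero-total guard and the use of
unsigned long long (to avoid overflow of the multiplication on 32-bit
systems) are additions for this example.

```c
/* Integer percentage of part in total, truncated; 0 when total is 0. */
static unsigned long long demo_percent(unsigned long long part,
				       unsigned long long total)
{
	return total ? part * 100 / total : 0;
}
```

With the skbuff_fclone_cache numbers above, demo_percent(111953360, 111954404)
evaluates to 99 and demo_percent(1044, 111954404) evaluates to 0, matching the
Fastpath and Slowpath rows of the Alloc column.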


skbuff_head_cache:

Slab Perf Counter       Alloc     Free %Al %Fr
--------------------------------------------------
Fastpath              5297262  5259882  99  99
Slowpath                 4477    39586   0   0
Page Alloc                937      824   0   0
Add partial                 0     2515   0   0
Remove partial           1691      824   0   0
RemoteObj/SlabFrozen     2621     9684   0   0
Total                 5301739  5299468

Deactivate Full=2620(100%) Empty=0(0%) ToHead=0(0%) ToTail=0(0%)


Descriptions of the output:

Total:		The total number of allocations and frees that occurred for a
		slab

Fastpath:	The number of allocations/frees that used the fastpath.

Slowpath:	The remaining allocations/frees, which did not use the fastpath.

Page Alloc:	Number of calls to the page allocator as a result of slowpath
		processing

Add Partial:	Number of slabs added to the partial list through free or
		alloc (occurs during cpuslab flushes)

Remove Partial:	Number of slabs removed from the partial list as a result of
		allocations retrieving a partial slab or by a free freeing
		the last object of a slab.

RemoteObj/Froz:	RemoteObj: How many times remotely freed objects were
		encountered when a slab was about to be deactivated. Frozen:
		How many times a free was able to skip list processing because
		the slab was in use as the cpuslab of another processor.

Flushes:	Number of times the cpuslab was flushed on request
		(kmem_cache_shrink, may result from races in __slab_alloc)

Refill:		Number of times we were able to refill the cpuslab from
		remotely freed objects for the same slab.

Deactivate:	Statistics on how slabs were deactivated. Shows how they were
		put onto the partial list.

In general fastpath is very good. Slowpath without partial list processing is
also desirable. Any access to the partial list takes node specific locks,
which may cause list lock contention.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
parent 1f84260c
@@ -32,6 +32,13 @@ struct slabinfo {
 	int sanity_checks, slab_size, store_user, trace;
 	int order, poison, reclaim_account, red_zone;
 	unsigned long partial, objects, slabs;
+	unsigned long alloc_fastpath, alloc_slowpath;
+	unsigned long free_fastpath, free_slowpath;
+	unsigned long free_frozen, free_add_partial, free_remove_partial;
+	unsigned long alloc_from_partial, alloc_slab, free_slab, alloc_refill;
+	unsigned long cpuslab_flush, deactivate_full, deactivate_empty;
+	unsigned long deactivate_to_head, deactivate_to_tail;
+	unsigned long deactivate_remote_frees;
 	int numa[MAX_NODES];
 	int numa_partial[MAX_NODES];
 } slabinfo[MAX_SLABS];
@@ -64,8 +71,10 @@ int show_inverted = 0;
 int show_single_ref = 0;
 int show_totals = 0;
 int sort_size = 0;
+int sort_active = 0;
 int set_debug = 0;
 int show_ops = 0;
+int show_activity = 0;

 /* Debug options */
 int sanity = 0;
@@ -93,8 +102,10 @@ void usage(void)
 	printf("slabinfo 5/7/2007. (c) 2007 sgi. clameter@sgi.com\n\n"
 		"slabinfo [-ahnpvtsz] [-d debugopts] [slab-regexp]\n"
 		"-a|--aliases Show aliases\n"
+		"-A|--activity Most active slabs first\n"
 		"-d<options>|--debug=<options> Set/Clear Debug options\n"
-		"-e|--empty Show empty slabs\n"
+		"-D|--display-active Switch line format to activity\n"
+		"-e|--empty Show empty slabs\n"
 		"-f|--first-alias Show first alias\n"
 		"-h|--help Show usage information\n"
 		"-i|--inverted Inverted list\n"
@@ -281,8 +292,11 @@ int line = 0;

 void first_line(void)
 {
-	printf("Name Objects Objsize Space "
-		"Slabs/Part/Cpu O/S O %%Fr %%Ef Flg\n");
+	if (show_activity)
+		printf("Name Objects Alloc Free %%Fast\n");
+	else
+		printf("Name Objects Objsize Space "
+			"Slabs/Part/Cpu O/S O %%Fr %%Ef Flg\n");
 }

 /*
@@ -309,6 +323,12 @@ unsigned long slab_size(struct slabinfo *s)
 	return s->slabs * (page_size << s->order);
 }

+unsigned long slab_activity(struct slabinfo *s)
+{
+	return s->alloc_fastpath + s->free_fastpath +
+		s->alloc_slowpath + s->free_slowpath;
+}
+
 void slab_numa(struct slabinfo *s, int mode)
 {
 	int node;
@@ -392,6 +412,71 @@ const char *onoff(int x)
 	return "Off";
 }

+void slab_stats(struct slabinfo *s)
+{
+	unsigned long total_alloc;
+	unsigned long total_free;
+	unsigned long total;
+
+	if (!s->alloc_slab)
+		return;
+
+	total_alloc = s->alloc_fastpath + s->alloc_slowpath;
+	total_free = s->free_fastpath + s->free_slowpath;
+
+	if (!total_alloc)
+		return;
+
+	printf("\n");
+	printf("Slab Perf Counter Alloc Free %%Al %%Fr\n");
+	printf("--------------------------------------------------\n");
+	printf("Fastpath %8lu %8lu %3lu %3lu\n",
+		s->alloc_fastpath, s->free_fastpath,
+		s->alloc_fastpath * 100 / total_alloc,
+		s->free_fastpath * 100 / total_free);
+	printf("Slowpath %8lu %8lu %3lu %3lu\n",
+		total_alloc - s->alloc_fastpath, s->free_slowpath,
+		(total_alloc - s->alloc_fastpath) * 100 / total_alloc,
+		s->free_slowpath * 100 / total_free);
+	printf("Page Alloc %8lu %8lu %3lu %3lu\n",
+		s->alloc_slab, s->free_slab,
+		s->alloc_slab * 100 / total_alloc,
+		s->free_slab * 100 / total_free);
+	printf("Add partial %8lu %8lu %3lu %3lu\n",
+		s->deactivate_to_head + s->deactivate_to_tail,
+		s->free_add_partial,
+		(s->deactivate_to_head + s->deactivate_to_tail) * 100 / total_alloc,
+		s->free_add_partial * 100 / total_free);
+	printf("Remove partial %8lu %8lu %3lu %3lu\n",
+		s->alloc_from_partial, s->free_remove_partial,
+		s->alloc_from_partial * 100 / total_alloc,
+		s->free_remove_partial * 100 / total_free);
+
+	printf("RemoteObj/SlabFrozen %8lu %8lu %3lu %3lu\n",
+		s->deactivate_remote_frees, s->free_frozen,
+		s->deactivate_remote_frees * 100 / total_alloc,
+		s->free_frozen * 100 / total_free);
+
+	printf("Total %8lu %8lu\n\n", total_alloc, total_free);
+
+	if (s->cpuslab_flush)
+		printf("Flushes %8lu\n", s->cpuslab_flush);
+
+	if (s->alloc_refill)
+		printf("Refill %8lu\n", s->alloc_refill);
+
+	total = s->deactivate_full + s->deactivate_empty +
+			s->deactivate_to_head + s->deactivate_to_tail;
+
+	if (total)
+		printf("Deactivate Full=%lu(%lu%%) Empty=%lu(%lu%%) "
+			"ToHead=%lu(%lu%%) ToTail=%lu(%lu%%)\n",
+			s->deactivate_full, (s->deactivate_full * 100) / total,
+			s->deactivate_empty, (s->deactivate_empty * 100) / total,
+			s->deactivate_to_head, (s->deactivate_to_head * 100) / total,
+			s->deactivate_to_tail, (s->deactivate_to_tail * 100) / total);
+}
+
 void report(struct slabinfo *s)
 {
 	if (strcmp(s->name, "*") == 0)
@@ -430,6 +515,7 @@ void report(struct slabinfo *s)
 	ops(s);
 	show_tracking(s);
 	slab_numa(s, 1);
+	slab_stats(s);
 }

 void slabcache(struct slabinfo *s)
@@ -479,13 +565,27 @@ void slabcache(struct slabinfo *s)
 		*p++ = 'T';

 	*p = 0;
-	printf("%-21s %8ld %7d %8s %14s %4d %1d %3ld %3ld %s\n",
-		s->name, s->objects, s->object_size, size_str, dist_str,
-		s->objs_per_slab, s->order,
-		s->slabs ? (s->partial * 100) / s->slabs : 100,
-		s->slabs ? (s->objects * s->object_size * 100) /
-			(s->slabs * (page_size << s->order)) : 100,
-		flags);
+	if (show_activity) {
+		unsigned long total_alloc;
+		unsigned long total_free;
+
+		total_alloc = s->alloc_fastpath + s->alloc_slowpath;
+		total_free = s->free_fastpath + s->free_slowpath;
+
+		printf("%-21s %8ld %8ld %8ld %3ld %3ld \n",
+			s->name, s->objects,
+			total_alloc, total_free,
+			total_alloc ? (s->alloc_fastpath * 100 / total_alloc) : 0,
+			total_free ? (s->free_fastpath * 100 / total_free) : 0);
+	}
+	else
+		printf("%-21s %8ld %7d %8s %14s %4d %1d %3ld %3ld %s\n",
+			s->name, s->objects, s->object_size, size_str, dist_str,
+			s->objs_per_slab, s->order,
+			s->slabs ? (s->partial * 100) / s->slabs : 100,
+			s->slabs ? (s->objects * s->object_size * 100) /
+				(s->slabs * (page_size << s->order)) : 100,
+			flags);
 }

 /*
@@ -892,6 +992,8 @@ void sort_slabs(void)
 			if (sort_size)
 				result = slab_size(s1) < slab_size(s2);
+			else if (sort_active)
+				result = slab_activity(s1) < slab_activity(s2);
 			else
 				result = strcasecmp(s1->name, s2->name);
@@ -1074,6 +1176,23 @@ void read_slab_dir(void)
 			free(t);
 			slab->store_user = get_obj("store_user");
 			slab->trace = get_obj("trace");
+			slab->alloc_fastpath = get_obj("alloc_fastpath");
+			slab->alloc_slowpath = get_obj("alloc_slowpath");
+			slab->free_fastpath = get_obj("free_fastpath");
+			slab->free_slowpath = get_obj("free_slowpath");
+			slab->free_frozen= get_obj("free_frozen");
+			slab->free_add_partial = get_obj("free_add_partial");
+			slab->free_remove_partial = get_obj("free_remove_partial");
+			slab->alloc_from_partial = get_obj("alloc_from_partial");
+			slab->alloc_slab = get_obj("alloc_slab");
+			slab->alloc_refill = get_obj("alloc_refill");
+			slab->free_slab = get_obj("free_slab");
+			slab->cpuslab_flush = get_obj("cpuslab_flush");
+			slab->deactivate_full = get_obj("deactivate_full");
+			slab->deactivate_empty = get_obj("deactivate_empty");
+			slab->deactivate_to_head = get_obj("deactivate_to_head");
+			slab->deactivate_to_tail = get_obj("deactivate_to_tail");
+			slab->deactivate_remote_frees = get_obj("deactivate_remote_frees");
 			chdir("..");
 			if (slab->name[0] == ':')
 				alias_targets++;
@@ -1124,7 +1243,9 @@ void output_slabs(void)

 struct option opts[] = {
 	{ "aliases", 0, NULL, 'a' },
+	{ "activity", 0, NULL, 'A' },
 	{ "debug", 2, NULL, 'd' },
+	{ "display-activity", 0, NULL, 'D' },
 	{ "empty", 0, NULL, 'e' },
 	{ "first-alias", 0, NULL, 'f' },
 	{ "help", 0, NULL, 'h' },
@@ -1149,7 +1270,7 @@ int main(int argc, char *argv[])

 	page_size = getpagesize();

-	while ((c = getopt_long(argc, argv, "ad::efhil1noprstvzTS",
+	while ((c = getopt_long(argc, argv, "aAd::Defhil1noprstvzTS",
 						opts, NULL)) != -1)
 		switch (c) {
 		case '1':
@@ -1158,11 +1279,17 @@ int main(int argc, char *argv[])
 		case 'a':
 			show_alias = 1;
 			break;
+		case 'A':
+			sort_active = 1;
+			break;
 		case 'd':
 			set_debug = 1;
 			if (!debug_opt_scan(optarg))
 				fatal("Invalid debug option '%s'\n", optarg);
 			break;
+		case 'D':
+			show_activity = 1;
+			break;
 		case 'e':
 			show_empty = 1;
 			break;
......
@@ -11,12 +11,35 @@
 #include <linux/workqueue.h>
 #include <linux/kobject.h>

+enum stat_item {
+	ALLOC_FASTPATH,		/* Allocation from cpu slab */
+	ALLOC_SLOWPATH,		/* Allocation by getting a new cpu slab */
+	FREE_FASTPATH,		/* Free to cpu slub */
+	FREE_SLOWPATH,		/* Freeing not to cpu slab */
+	FREE_FROZEN,		/* Freeing to frozen slab */
+	FREE_ADD_PARTIAL,	/* Freeing moves slab to partial list */
+	FREE_REMOVE_PARTIAL,	/* Freeing removes last object */
+	ALLOC_FROM_PARTIAL,	/* Cpu slab acquired from partial list */
+	ALLOC_SLAB,		/* Cpu slab acquired from page allocator */
+	ALLOC_REFILL,		/* Refill cpu slab from slab freelist */
+	FREE_SLAB,		/* Slab freed to the page allocator */
+	CPUSLAB_FLUSH,		/* Abandoning of the cpu slab */
+	DEACTIVATE_FULL,	/* Cpu slab was full when deactivated */
+	DEACTIVATE_EMPTY,	/* Cpu slab was empty when deactivated */
+	DEACTIVATE_TO_HEAD,	/* Cpu slab was moved to the head of partials */
+	DEACTIVATE_TO_TAIL,	/* Cpu slab was moved to the tail of partials */
+	DEACTIVATE_REMOTE_FREES,/* Slab contained remotely freed objects */
+	NR_SLUB_STAT_ITEMS };
+
 struct kmem_cache_cpu {
 	void **freelist;	/* Pointer to first free per cpu object */
 	struct page *page;	/* The slab from which we are allocating */
 	int node;		/* The node of the page (or -1 for debug) */
 	unsigned int offset;	/* Freepointer offset (in word units) */
 	unsigned int objsize;	/* Size of an object (from kmem_cache) */
+#ifdef CONFIG_SLUB_STATS
+	unsigned stat[NR_SLUB_STAT_ITEMS];
+#endif
 };

 struct kmem_cache_node {
......
@@ -205,6 +205,19 @@ config SLUB_DEBUG_ON
 	  off in a kernel built with CONFIG_SLUB_DEBUG_ON by specifying
 	  "slub_debug=-".

+config SLUB_STATS
+	default n
+	bool "Enable SLUB performance statistics"
+	depends on SLUB
+	help
+	  SLUB statistics are useful to debug SLUBs allocation behavior in
+	  order find ways to optimize the allocator. This should never be
+	  enabled for production use since keeping statistics slows down
+	  the allocator by a few percentage points. The slabinfo command
+	  supports the determination of the most active slabs to figure
+	  out which slabs are relevant to a particular load.
+	  Try running: slabinfo -DA
+
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPT && (TRACE_IRQFLAGS_SUPPORT || PPC64)
......
@@ -250,6 +250,7 @@ enum track_item { TRACK_ALLOC, TRACK_FREE };
 static int sysfs_slab_add(struct kmem_cache *);
 static int sysfs_slab_alias(struct kmem_cache *, const char *);
 static void sysfs_slab_remove(struct kmem_cache *);
+
 #else
 static inline int sysfs_slab_add(struct kmem_cache *s) { return 0; }
 static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p)
@@ -258,8 +259,16 @@ static inline void sysfs_slab_remove(struct kmem_cache *s)
 {
 	kfree(s);
 }
+
 #endif

+static inline void stat(struct kmem_cache_cpu *c, enum stat_item si)
+{
+#ifdef CONFIG_SLUB_STATS
+	c->stat[si]++;
+#endif
+}
+
 /********************************************************************
  * 			Core slab cache functions
  *******************************************************************/
@@ -1364,17 +1373,22 @@ static struct page *get_partial(struct kmem_cache *s, gfp_t flags, int node)
 static void unfreeze_slab(struct kmem_cache *s, struct page *page, int tail)
 {
 	struct kmem_cache_node *n = get_node(s, page_to_nid(page));
+	struct kmem_cache_cpu *c = get_cpu_slab(s, smp_processor_id());

 	ClearSlabFrozen(page);
 	if (page->inuse) {

-		if (page->freelist != page->end)
+		if (page->freelist != page->end) {
 			add_partial(n, page, tail);
-		else if (SlabDebug(page) && (s->flags & SLAB_STORE_USER))
-			add_full(n, page);
+			stat(c, tail ? DEACTIVATE_TO_TAIL : DEACTIVATE_TO_HEAD);
+		} else {
+			stat(c, DEACTIVATE_FULL);
+			if (SlabDebug(page) && (s->flags & SLAB_STORE_USER))
+				add_full(n, page);
+		}
 		slab_unlock(page);
 	} else {
+		stat(c, DEACTIVATE_EMPTY);
 		if (n->nr_partial < MIN_PARTIAL) {
 			/*
 			 * Adding an empty slab to the partial slabs in order
@@ -1388,6 +1402,7 @@ static void unfreeze_slab(struct kmem_cache *s, struct page *page, int tail)
 			slab_unlock(page);
 		} else {
 			slab_unlock(page);
+			stat(get_cpu_slab(s, raw_smp_processor_id()), FREE_SLAB);
 			discard_slab(s, page);
 		}
 	}
@@ -1400,6 +1415,9 @@ static void deactivate_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
 {
 	struct page *page = c->page;
 	int tail = 1;
+
+	if (c->freelist)
+		stat(c, DEACTIVATE_REMOTE_FREES);
 	/*
 	 * Merge cpu freelist into freelist. Typically we get here
 	 * because both freelists are empty. So this is unlikely
@@ -1429,6 +1447,7 @@ static void deactivate_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)

 static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
 {
+	stat(c, CPUSLAB_FLUSH);
 	slab_lock(c->page);
 	deactivate_slab(s, c);
 }
@@ -1511,6 +1530,7 @@ static void *__slab_alloc(struct kmem_cache *s,
 	slab_lock(c->page);
 	if (unlikely(!node_match(c, node)))
 		goto another_slab;
+	stat(c, ALLOC_REFILL);
 load_freelist:
 	object = c->page->freelist;
 	if (unlikely(object == c->page->end))
@@ -1525,6 +1545,7 @@ static void *__slab_alloc(struct kmem_cache *s,
 	c->node = page_to_nid(c->page);
 unlock_out:
 	slab_unlock(c->page);
+	stat(c, ALLOC_SLOWPATH);
 out:
 #ifdef SLUB_FASTPATH
 	local_irq_restore(flags);
@@ -1538,6 +1559,7 @@ static void *__slab_alloc(struct kmem_cache *s,
 	new = get_partial(s, gfpflags, node);
 	if (new) {
 		c->page = new;
+		stat(c, ALLOC_FROM_PARTIAL);
 		goto load_freelist;
 	}
@@ -1551,6 +1573,7 @@ static void *__slab_alloc(struct kmem_cache *s,
 	if (new) {
 		c = get_cpu_slab(s, smp_processor_id());
+		stat(c, ALLOC_SLAB);
 		if (c->page)
 			flush_slab(s, c);
 		slab_lock(new);
@@ -1610,6 +1633,7 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
 			object = __slab_alloc(s, gfpflags, node, addr, c);
 			break;
 		}
+		stat(c, ALLOC_FASTPATH);
 	} while (cmpxchg_local(&c->freelist, object, object[c->offset])
 								!= object);
 #else
@@ -1624,6 +1648,7 @@ static __always_inline void *slab_alloc(struct kmem_cache *s,
 	else {
 		object = c->freelist;
 		c->freelist = object[c->offset];
+		stat(c, ALLOC_FASTPATH);
 	}
 	local_irq_restore(flags);
 #endif
@@ -1661,12 +1686,15 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 {
 	void *prior;
 	void **object = (void *)x;
+	struct kmem_cache_cpu *c;

 #ifdef SLUB_FASTPATH
 	unsigned long flags;

 	local_irq_save(flags);
 #endif
+	c = get_cpu_slab(s, raw_smp_processor_id());
+	stat(c, FREE_SLOWPATH);
 	slab_lock(page);

 	if (unlikely(SlabDebug(page)))
@@ -1676,8 +1704,10 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	page->freelist = object;
 	page->inuse--;

-	if (unlikely(SlabFrozen(page)))
+	if (unlikely(SlabFrozen(page))) {
+		stat(c, FREE_FROZEN);
 		goto out_unlock;
+	}

 	if (unlikely(!page->inuse))
 		goto slab_empty;
@@ -1687,8 +1717,10 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	 * was not on the partial list before
 	 * then add it.
 	 */
-	if (unlikely(prior == page->end))
+	if (unlikely(prior == page->end)) {
 		add_partial(get_node(s, page_to_nid(page)), page, 1);
+		stat(c, FREE_ADD_PARTIAL);
+	}

 out_unlock:
 	slab_unlock(page);
@@ -1698,13 +1730,15 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	return;

 slab_empty:
-	if (prior != page->end)
+	if (prior != page->end) {
 		/*
 		 * Slab still on the partial list.
 		 */
 		remove_partial(s, page);
+		stat(c, FREE_REMOVE_PARTIAL);
+	}
 	slab_unlock(page);
+	stat(c, FREE_SLAB);
 #ifdef SLUB_FASTPATH
 	local_irq_restore(flags);
 #endif
@@ -1758,6 +1792,7 @@ static __always_inline void slab_free(struct kmem_cache *s,
 			break;
 		}
 		object[c->offset] = freelist;
+		stat(c, FREE_FASTPATH);
 	} while (cmpxchg_local(&c->freelist, freelist, object) != freelist);
 #else
 	unsigned long flags;
@@ -1768,6 +1803,7 @@ static __always_inline void slab_free(struct kmem_cache *s,
 	if (likely(page == c->page && c->node >= 0)) {
 		object[c->offset] = c->freelist;
 		c->freelist = object;
+		stat(c, FREE_FASTPATH);
 	} else
 		__slab_free(s, page, x, addr, c->offset);
@@ -3980,6 +4016,62 @@ static ssize_t remote_node_defrag_ratio_store(struct kmem_cache *s,
 SLAB_ATTR(remote_node_defrag_ratio);
 #endif

+#ifdef CONFIG_SLUB_STATS
+
+static int show_stat(struct kmem_cache *s, char *buf, enum stat_item si)
+{
+	unsigned long sum = 0;
+	int cpu;
+	int len;
+	int *data = kmalloc(nr_cpu_ids * sizeof(int), GFP_KERNEL);
+
+	if (!data)
+		return -ENOMEM;
+
+	for_each_online_cpu(cpu) {
+		unsigned x = get_cpu_slab(s, cpu)->stat[si];
+
+		data[cpu] = x;
+		sum += x;
+	}
+
+	len = sprintf(buf, "%lu", sum);
+
+	for_each_online_cpu(cpu) {
+		if (data[cpu] && len < PAGE_SIZE - 20)
+			len += sprintf(buf + len, " c%d=%u", cpu, data[cpu]);
+	}
+	kfree(data);
+	return len + sprintf(buf + len, "\n");
+}
+
+#define STAT_ATTR(si, text) 					\
+static ssize_t text##_show(struct kmem_cache *s, char *buf)	\
+{								\
+	return show_stat(s, buf, si);				\
+}								\
+SLAB_ATTR_RO(text);						\
+
+STAT_ATTR(ALLOC_FASTPATH, alloc_fastpath);
+STAT_ATTR(ALLOC_SLOWPATH, alloc_slowpath);
+STAT_ATTR(FREE_FASTPATH, free_fastpath);
+STAT_ATTR(FREE_SLOWPATH, free_slowpath);
+STAT_ATTR(FREE_FROZEN, free_frozen);
+STAT_ATTR(FREE_ADD_PARTIAL, free_add_partial);
+STAT_ATTR(FREE_REMOVE_PARTIAL, free_remove_partial);
+STAT_ATTR(ALLOC_FROM_PARTIAL, alloc_from_partial);
+STAT_ATTR(ALLOC_SLAB, alloc_slab);
+STAT_ATTR(ALLOC_REFILL, alloc_refill);
+STAT_ATTR(FREE_SLAB, free_slab);
+STAT_ATTR(CPUSLAB_FLUSH, cpuslab_flush);
+STAT_ATTR(DEACTIVATE_FULL, deactivate_full);
+STAT_ATTR(DEACTIVATE_EMPTY, deactivate_empty);
+STAT_ATTR(DEACTIVATE_TO_HEAD, deactivate_to_head);
+STAT_ATTR(DEACTIVATE_TO_TAIL, deactivate_to_tail);
+STAT_ATTR(DEACTIVATE_REMOTE_FREES, deactivate_remote_frees);
+
+#endif
+
 static struct attribute *slab_attrs[] = {
 	&slab_size_attr.attr,
 	&object_size_attr.attr,
@@ -4009,6 +4101,25 @@ static struct attribute *slab_attrs[] = {
 #endif
 #ifdef CONFIG_NUMA
 	&remote_node_defrag_ratio_attr.attr,
+#endif
+#ifdef CONFIG_SLUB_STATS
+	&alloc_fastpath_attr.attr,
+	&alloc_slowpath_attr.attr,
+	&free_fastpath_attr.attr,
+	&free_slowpath_attr.attr,
+	&free_frozen_attr.attr,
+	&free_add_partial_attr.attr,
+	&free_remove_partial_attr.attr,
+	&alloc_from_partial_attr.attr,
+	&alloc_slab_attr.attr,
+	&alloc_refill_attr.attr,
+	&free_slab_attr.attr,
+	&cpuslab_flush_attr.attr,
+	&deactivate_full_attr.attr,
+	&deactivate_empty_attr.attr,
+	&deactivate_to_head_attr.attr,
+	&deactivate_to_tail_attr.attr,
+	&deactivate_remote_frees_attr.attr,
 #endif
 	NULL
 };
......
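
The show_stat() function in the diff emits each counter as a single line: the
sum over all online cpus, followed by " cN=V" for every cpu with a nonzero
count. A userspace reader of that format might look roughly like the sketch
below. The function name demo_parse_stat_line and its interface are
assumptions made for this example and are not part of the slabinfo tool.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a line of the form "SUM cN=V cM=W ...\n".
 * Stores the leading sum in *sum and each per cpu value in per_cpu[N]
 * (entries for cpus >= max_cpus are skipped). Returns the number of
 * per cpu entries stored, or -1 if the line does not start with a number.
 */
static int demo_parse_stat_line(const char *line, unsigned long *sum,
				unsigned *per_cpu, int max_cpus)
{
	char *end;
	int found = 0;

	*sum = strtoul(line, &end, 10);
	if (end == line)
		return -1;

	for (;;) {
		int cpu, used;
		unsigned val;

		/* %n records how many characters this entry consumed. */
		if (sscanf(end, " c%d=%u%n", &cpu, &val, &used) != 2)
			break;
		if (cpu >= 0 && cpu < max_cpus) {
			per_cpu[cpu] = val;
			found++;
		}
		end += used;
	}
	return found;
}
```

A caller that only needs the total can stop after the strtoul() step, since
the sum always comes first on the line.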