    mm/memcg: improve refill_obj_stock() performance · 5387c904
    Waiman Long authored
    There are two issues with the current refill_obj_stock() code.  First of
    all, when nr_bytes exceeds PAGE_SIZE, it calls drain_obj_stock() to
    atomically flush out the remaining bytes to obj_cgroup, clear cached_objcg
    and do an obj_cgroup_put().  It is likely that the same obj_cgroup will be
    used again, which leads to another call to drain_obj_stock() and
    obj_cgroup_get() as well as an atomic retrieval of the available bytes
    from obj_cgroup.  That is costly.  Instead, we should just uncharge the
    excess pages, reduce the stock bytes and be done with it.  The
    drain_obj_stock() function should only be called when obj_cgroup changes.
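
    The difference between the two behaviours can be illustrated with a
    small userspace model.  This is only a sketch, not the kernel code:
    the model_* names, the plain (non-atomic, non-per-CPU) counters and the
    4k MODEL_PAGE_SIZE are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for an obj_cgroup; plain counters, no atomics. */
struct model_objcg {
	unsigned int nr_charged_bytes;	/* sub-page bytes parked in the objcg */
	unsigned int uncharged_pages;	/* whole pages handed back so far */
};

/* Hypothetical stand-in for the per-CPU object stock. */
struct model_stock {
	struct model_objcg *cached_objcg;
	unsigned int nr_bytes;
	unsigned int drains;		/* drains that actually flushed a cached objcg */
};

/* Flush the stock back to the cached objcg and forget the objcg. */
static void model_drain(struct model_stock *stock)
{
	struct model_objcg *old = stock->cached_objcg;

	if (!old)
		return;
	stock->drains++;
	old->uncharged_pages += stock->nr_bytes >> MODEL_PAGE_SHIFT;
	old->nr_charged_bytes += stock->nr_bytes & (MODEL_PAGE_SIZE - 1);
	stock->nr_bytes = 0;
	stock->cached_objcg = NULL;
}

/* Switch the stock to a different objcg, pulling in its parked bytes. */
static void model_switch(struct model_stock *stock, struct model_objcg *objcg)
{
	model_drain(stock);
	stock->cached_objcg = objcg;
	stock->nr_bytes = objcg->nr_charged_bytes;
	objcg->nr_charged_bytes = 0;
}

/* Old behaviour: overflowing a page drains the whole stock. */
static void model_refill_old(struct model_stock *stock,
			     struct model_objcg *objcg, unsigned int nr_bytes)
{
	if (stock->cached_objcg != objcg)
		model_switch(stock, objcg);
	stock->nr_bytes += nr_bytes;
	if (stock->nr_bytes > MODEL_PAGE_SIZE)
		model_drain(stock);
}

/* New behaviour: uncharge only the excess pages, keep the objcg cached. */
static void model_refill_new(struct model_stock *stock,
			     struct model_objcg *objcg, unsigned int nr_bytes)
{
	if (stock->cached_objcg != objcg)
		model_switch(stock, objcg);
	stock->nr_bytes += nr_bytes;
	if (stock->nr_bytes > MODEL_PAGE_SIZE) {
		objcg->uncharged_pages += stock->nr_bytes >> MODEL_PAGE_SHIFT;
		stock->nr_bytes &= MODEL_PAGE_SIZE - 1;
	}
}

/* Refill the same objcg nr_calls times and report how many drains occurred. */
static unsigned int model_count_drains(bool use_new, unsigned int nr_calls,
				       unsigned int nr_bytes)
{
	struct model_objcg objcg = { 0, 0 };
	struct model_stock stock = { NULL, 0, 0 };
	unsigned int i;

	for (i = 0; i < nr_calls; i++) {
		if (use_new)
			model_refill_new(&stock, &objcg, nr_bytes);
		else
			model_refill_old(&stock, &objcg, nr_bytes);
	}
	return stock.drains;
}
```

    In this model, four 3000-byte refills to the same objcg make
    model_count_drains(false, 4, 3000) report two drains, each followed by
    a re-fetch of the parked bytes, while model_count_drains(true, 4, 3000)
    reports none.  Both variants hand back the same two pages.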
    
    Secondly, when charging an object of size not less than a page in
    obj_cgroup_charge(), it is possible that the remaining bytes to be
    refilled to the stock will overflow a page and cause refill_obj_stock() to
    uncharge 1 page.  To avoid the additional uncharge in this case, a new
    allow_uncharge flag is added to refill_obj_stock() which will be set to
    false when called from obj_cgroup_charge() so that an uncharge_pages()
    call won't be issued right after a charge_pages() call unless the objcg
    changes.
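
    The charge-then-uncharge sequence described above can be reproduced
    with a userspace model of the charge path.  This is a sketch under
    stated assumptions: the model_* names are hypothetical, there is a
    single objcg that never changes, and the page size is taken to be 4k.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for the per-CPU stock of one fixed objcg. */
struct model_stock {
	unsigned int nr_bytes;		/* pre-charged bytes held in the stock */
	unsigned int charged_pages;	/* pages charged so far */
	unsigned int uncharged_pages;	/* pages uncharged so far */
};

/* Refill the stock; uncharge excess pages only when the caller allows it. */
static void model_refill(struct model_stock *stock, unsigned int nr_bytes,
			 bool allow_uncharge)
{
	stock->nr_bytes += nr_bytes;
	if (allow_uncharge && stock->nr_bytes > MODEL_PAGE_SIZE) {
		stock->uncharged_pages += stock->nr_bytes >> MODEL_PAGE_SHIFT;
		stock->nr_bytes &= MODEL_PAGE_SIZE - 1;
	}
}

/*
 * Charge an object of the given size: use the stock if it is large enough,
 * otherwise charge whole pages and refill the stock with the leftover bytes.
 */
static void model_charge(struct model_stock *stock, unsigned int size,
			 bool allow_uncharge)
{
	unsigned int nr_pages, nr_bytes;

	if (stock->nr_bytes >= size) {
		stock->nr_bytes -= size;
		return;
	}

	nr_pages = size >> MODEL_PAGE_SHIFT;
	nr_bytes = size & (MODEL_PAGE_SIZE - 1);
	if (nr_bytes)
		nr_pages++;		/* round up to whole pages */

	stock->charged_pages += nr_pages;
	if (nr_bytes)
		model_refill(stock, MODEL_PAGE_SIZE - nr_bytes, allow_uncharge);
}

/* Pages uncharged by one charge of `size` starting from `stock_bytes`. */
static unsigned int model_uncharges_after_charge(unsigned int stock_bytes,
						 unsigned int size,
						 bool allow_uncharge)
{
	struct model_stock stock = { stock_bytes, 0, 0 };

	model_charge(&stock, size, allow_uncharge);
	return stock.uncharged_pages;
}
```

    With 1000 bytes already in the stock, charging a 4296-byte object
    charges two pages and refills 3896 bytes, which brings the stock to
    4896 bytes.  With allow_uncharge set to true the model immediately
    uncharges one page; with it set to false the extra bytes simply stay in
    the stock and no uncharge follows the charge.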
    
    A multithreaded kmalloc+kfree microbenchmark was run with 96 testing
    threads on a 2-socket 48-core 96-thread x86-64 system.  Before this
    patch, the total number of kmalloc+kfree operations done per second by
    all the testing threads for a 4k large object was 4,304 kops/s
    (cgroup v1) and 8,478 kops/s (cgroup v2).  After applying this patch, the
    numbers were 4,731 kops/s (cgroup v1) and 418,142 kops/s (cgroup v2)
    respectively.  This represents a performance improvement of 1.10X
    (cgroup v1) and 49.3X (cgroup v2).
    
    Link: https://lkml.kernel.org/r/20210506150007.16288-4-longman@redhat.com
    Signed-off-by: Waiman Long <longman@redhat.com>
    Reviewed-by: Shakeel Butt <shakeelb@google.com>
    Cc: Alex Shi <alex.shi@linux.alibaba.com>
    Cc: Chris Down <chris@chrisdown.name>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Muchun Song <songmuchun@bytedance.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Wei Yang <richard.weiyang@gmail.com>
    Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
    Cc: Yafang Shao <laoar.shao@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>