mm/list_lru: optimize memcg_reparent_list_lru_node()

Since commit 2c80cd57 ("mm/list_lru.c: fix list_lru_count_node() to be race free"), we are tracking the total number of lru entries in a list_lru_node in its nr_items field. In the case of memcg_reparent_list_lru_node(), there is nothing to be done if nr_items is 0. We don't even need to take the nlru->lock, as no new lru entry could be added by a racing list_lru_add() to the draining src_idx memcg at this point.

On systems that serve a lot of containers, there can be thousands of list_lru's present, because each container may mount its own container-specific filesystems. As a typical container uses only a few cpus, it is likely that only the list_lru_node's that contain those cpus will be utilized while the rest remain empty. In other words, there can be a lot of list_lru_node's with 0 nr_items.

Skipping the lock/unlock operation and the load of a cacheline from memcg_lrus saves a sizeable number of cpu cycles. The savings can be substantial when thousands of list_lru_node's have 0 nr_items.

Link: https://lkml.kernel.org/r/20220309144000.1470138-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
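
For illustration only, the early exit described in this message can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel change itself: struct lru_node_model, lru_node_model_init(), reparent_node_model() and the lock_acquisitions counter are hypothetical names invented here, a pthread mutex stands in for nlru->lock, and a C11 atomic stands in for however the kernel reads nr_items.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a list_lru_node (not the kernel struct). */
struct lru_node_model {
	pthread_mutex_t lock;            /* plays the role of nlru->lock */
	atomic_long nr_items;            /* total entries on this node, all memcgs */
	long src_items;                  /* entries owned by the draining memcg */
	long dst_items;                  /* entries owned by the parent memcg */
	unsigned long lock_acquisitions; /* instrumentation for this sketch only */
};

static void lru_node_model_init(struct lru_node_model *n, long src, long dst)
{
	assert(n != NULL && src >= 0 && dst >= 0);
	pthread_mutex_init(&n->lock, NULL);
	atomic_init(&n->nr_items, src + dst);
	n->src_items = src;
	n->dst_items = dst;
	n->lock_acquisitions = 0;
}

/* Move all src entries to dst; returns the number of entries moved. */
static long reparent_node_model(struct lru_node_model *n)
{
	long moved;

	/*
	 * Lockless early exit: an empty node has nothing to move, and the
	 * commit message argues that no racing add can target the draining
	 * memcg at this point, so the lock is not needed for this check.
	 */
	if (atomic_load_explicit(&n->nr_items, memory_order_relaxed) == 0)
		return 0;

	pthread_mutex_lock(&n->lock);
	n->lock_acquisitions++;
	moved = n->src_items;
	n->dst_items += moved;
	n->src_items = 0;
	/* nr_items is a per-node total, so moving entries leaves it unchanged. */
	pthread_mutex_unlock(&n->lock);

	return moved;
}
```

In this model, calling reparent_node_model() on a node initialised with no entries returns 0 without ever touching the mutex, which is the saving the message describes; a node with entries still takes the lock once and moves them as before.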